LLOD schema for Simplified Offensive Language Taxonomy in multilingual detection and applications

Authors

LEWANDOWSKA-TOMASZCZYK Barbara; BĄCZKOWSKA Anna; DONTCHEVA-NAVRÁTILOVÁ Olga; LIEBESKIND Chaya; VALUNAITE OLEŠKEVIČIENE Giedre; ŽITNIK Slavko; TROJSZCZAK Marvin; POVOLNÁ Renata; SELMISTRAITIS Linas; UTKA Andrius; GUDELIS Dangis

Year of publication 2023
Type Article in a scholarly journal
Journal / Source Lodz Papers in Pragmatics
MU Faculty / Unit

Faculty of Education

Citation
Web https://www.degruyter.com/document/doi/10.1515/lpp-2023-0016/html
DOI http://dx.doi.org/10.1515/lpp-2023-0016
Keywords offensive language; offensive language taxonomy; annotation; LLOD; linguistic linked open data; hate speech
Description The goal of the paper is to present a Simplified Offensive Language (SOL) Taxonomy and its application and testing in the Second Annotation Campaign, conducted between March and May 2023 on four languages (English, Czech, Lithuanian, and Polish), so that it can be verified and located in LLOD. With reference to the previous offensive language taxonomic models, proposed mostly by the same COST Action Nexus Linguarum WG 4.1.1 team, the number and variety of the categories underwent definitional revision, and the present typology was tested in annotation on publicly available offensive language datasets for each of the four languages. The results of the annotation are presented. As the inter-annotator agreement for the SOL categories and their aspects falls within accepted statistical values, we propose this taxonomy as a core ontology that represents the encoding of the supported offensive languages, and we justify its use on new data in terms of a more universal Linguistic Linked Open Data (LLOD) schema.
