Call for Participation

Dear All,

This is the call for participation in the GermEval 2019 Shared Task on the Identification of Offensive Language (Task 2). We invite everyone from academia and industry to take part in this shared task on identifying offensive language in German.

Introduction

Offensive language is commonly defined as hurtful, derogatory or obscene comments made by one person to another. This type of language is increasingly found on the web. As a consequence, many operators of social media websites can no longer monitor user posts manually. Therefore, there is a pressing demand for methods to automatically identify suspicious posts.

This second shared task on the topic aims to intensify research on the identification of offensive content in German-language microposts. Offensive comments are to be detected in a set of German tweets. We focus on Twitter since tweets can be regarded as a prototypical type of micropost.

The workshop for this year’s edition of the shared task is planned to be held in conjunction with the Conference on Natural Language Processing (KONVENS) in Nürnberg in October 2019.

Data

The training and test data from 2018 serve as example data for subtask I and subtask II and are available from the website (URL see above). An evaluation script can be downloaded there as well.

The training data, which will be released in April 2019, can be downloaded after registering with the organizing committee. The task evaluations will take place in July 2019.

Tasks

As in 2018, we offer the two subtasks described below. In 2019, we additionally offer a third subtask on explicit and implicit offensive language, which is also described below.

Participants in this year’s shared task can choose to participate in one, two or all of the subtasks.

Subtask I — Binary classification

The task is to decide whether a tweet includes some form of offensive language or not.

Subtask II — Fine-grained classification

In addition to detecting offensive language tweets, we distinguish between three subcategories:

PROFANITY: the tweet uses profane words but clearly does not intend to insult anyone.

INSULT: unlike PROFANITY, the tweet clearly intends to offend someone.

ABUSE: unlike INSULT, the tweet does not just insult a person but represents the stronger form of abusive language.

Subtask III – Classification of explicit and implicit offensive language

In addition to detecting offensive language tweets, we distinguish between two subcategories:

 

EXPLICIT: an offensive tweet that directly expresses hate, condemnation, superiority etc. towards an explicitly or implicitly given target.

IMPLICIT: an offensive tweet where the expression of hate, condemnation, superiority etc. directed towards an explicitly or implicitly given target has to be inferred from the ascription of (hypothesized) target properties that are insulting, degrading, offending, humiliating etc.

Subtask III is cast as a two-way classification task in which a tweet is either explicitly offensive (EXPLICIT) or implicitly offensive (IMPLICIT).
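
For orientation, the label sets of the three subtasks could be written down as in the following minimal Python sketch. The labels for Subtasks II and III are taken from the descriptions above. The binary label names OFFENSE and OTHER, and the helper to_binary, are assumptions made for illustration only, since this call does not name the labels of Subtask I. The released data and the evaluation script remain the authoritative reference.

```python
# Labels for Subtasks II and III are quoted from the task description.
FINE_GRAINED_LABELS = ("PROFANITY", "INSULT", "ABUSE")
EXPLICITNESS_LABELS = ("EXPLICIT", "IMPLICIT")

# ASSUMPTION: the call does not name the binary labels of Subtask I.
# "OFFENSE" and "OTHER" are hypothetical placeholders.
OFFENSE = "OFFENSE"
OTHER = "OTHER"


def to_binary(fine_grained_label: str) -> str:
    """Collapse a Subtask II label into a (hypothetical) Subtask I label.

    Any label that is not one of the three offensive subcategories
    is treated as non-offensive.
    """
    if fine_grained_label in FINE_GRAINED_LABELS:
        return OFFENSE
    return OTHER
```

Under these assumptions, to_binary("INSULT") returns "OFFENSE", while any label outside the three subcategories maps to "OTHER".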

 

Timeline

  • March 2019: Call for Participation
  • April 2019: Release of Training Data
  • June 2019: Registration Deadline
  • July 2019: Release of Test Data
  • August 2019: Submission of System Runs, System Description paper and Survey
  • August 2019: Feedback on System Description papers
  • August 2019: Final Submission of System Description papers
  • October 2019: Workshop co-located with KONVENS-2019

GermEval

GermEval is a series of shared task evaluation campaigns that focus on natural language processing for the German language. So far, there have been four iterations of GermEval, each with a different type of task. GermEval shared tasks have been run informally by self-organized groups of interested researchers. However, the last shared task as well as the one for 2019 were endorsed by special interest groups within the German Society for Computational Linguistics (GSCL). All iterations of GermEval shared tasks held their concluding workshop in conjunction with either the GSCL or the KONVENS conference, both of which are held every two years, depending on which of them took place that year.

In 2019, for the first time, GermEval comprises more than one shared task:

GermEval 2019 Task 1 — Shared task on hierarchical classification of blurbs

GermEval 2019 Task 2 — Shared Task on the Identification of Offensive Language (this task)

 

Contact email

Mailing group

Please join our discussion group at in order to receive announcements and participate in discussions.

Best regards,

The GermEval 2019 Task 2 Organizers:

Manfred Klenner (University of Zurich)

Josef Ruppenhofer (Institute for German Language, Mannheim)

Melanie Siegel (Darmstadt University of Applied Sciences)

Julia Maria Struß (University of Applied Sciences Potsdam)

Michael Wiegand (Institute for German Language, Mannheim/Heidelberg University)