Web4/LS-W4-Aero-roberta-r4-hate-speech

Model Summary

Web4/LS-W4-Aero-roberta-r4-hate-speech is a text classification model for hate speech detection. Linkspreed UG developed it by fine-tuning the facebook/roberta-hate-speech-dynabench-r4-target model. It is described as identifying hate speech in both English and German.

Intended Use

The model's primary use is classifying text to detect hate speech. Typical applications are content moderation, filtering, and analysis of online communications, where it helps platforms identify harmful content automatically and maintain a safe and inclusive digital environment.
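The classification use case above can be sketched with the Transformers text-classification pipeline. This is a minimal sketch under assumptions, not the developers' reference code: the helper names load_classifier and classify are hypothetical, and the label strings the model returns are not stated in this summary, so read them from the model's config (id2label) rather than hard-coding them.

```python
MODEL_ID = "Web4/LS-W4-Aero-roberta-r4-hate-speech"


def load_classifier(model_id=MODEL_ID):
    """Build a text-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so classify() can be exercised without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify(texts, classifier):
    """Return a (text, label, score) tuple for each input text.

    `classifier` is any callable with the pipeline's interface: it takes a
    list of strings and returns a list of {"label": ..., "score": ...} dicts.
    """
    texts = list(texts)
    # truncation=True cuts inputs that exceed the model's maximum length.
    results = classifier(texts, truncation=True)
    return [(t, r["label"], float(r["score"])) for t, r in zip(texts, results)]
```

Calling classify(["some comment"], load_classifier()) would return one tuple per input. The score is the model's confidence in the returned label, not a calibrated probability of harm.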

Risks and Limitations

The model card explicitly outlines several risks and limitations common to hate speech detection models:

Bias: The model may exhibit biases from its training data, potentially leading to false positives (legitimate expressions misidentified as hate speech) or false negatives (missed subtle or new forms of hate speech).
Demographic Bias: The model could disproportionately flag content from specific demographic groups if the training data was unbalanced.
Language Nuance: It may have difficulty with contextual or evolving forms of hate speech, such as slang, code words, or euphemisms.

Due to these limitations, the developers strongly recommend human review for critical applications and regular auditing of the model's performance on diverse datasets.
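The two recommendations above, human review and regular auditing, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the thresholds 0.95 and 0.3, the binary "hate" label string, and the record layout are hypothetical and do not come from the model card. Thresholds in particular would need tuning on your own labeled data.

```python
from collections import defaultdict


def route(label, score, hate_label="hate", auto_threshold=0.95, review_threshold=0.3):
    """Decide what to do with one prediction: 'flag', 'human_review', or 'allow'.

    Assumes a binary classifier, so the confidence in the hate label is
    `score` when the predicted label is hate and `1 - score` otherwise.
    """
    p_hate = score if label == hate_label else 1.0 - score
    if p_hate >= auto_threshold:
        return "flag"
    if p_hate >= review_threshold:
        return "human_review"  # uncertain cases go to a person
    return "allow"


def false_positive_rate_by_group(records, hate_label="hate"):
    """Compute the false positive rate per group.

    `records` is an iterable of (group, true_label, predicted_label) tuples
    taken from a labeled audit set. Groups with no non-hate examples are omitted.
    """
    negatives = defaultdict(int)
    false_positives = defaultdict(int)
    for group, true_label, predicted_label in records:
        if true_label != hate_label:
            negatives[group] += 1
            if predicted_label == hate_label:
                false_positives[group] += 1
    return {group: false_positives[group] / n for group, n in negatives.items()}
```

A large gap in false positive rate between groups on the same audit set is one signal of the demographic bias described above. In critical applications, even items routed to 'flag' may still warrant human confirmation.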

Technical Details

Model Type: Text Classification (Hate Speech Detection)
Fine-tuned from: facebook/roberta-hate-speech-dynabench-r4-target
Languages: English, German
Parameters: 125 million
License: Apache 2.0
Developers: Linkspreed UG
Framework: Transformers
File Type: Safetensors
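Some of the details listed above can be checked after loading the model. The following is a rough sketch, assuming the transformers and torch packages are installed; load_model and count_parameters are hypothetical helper names, and the result is meant to be compared against the 125 million figure rather than taken as an expected output.

```python
MODEL_ID = "Web4/LS-W4-Aero-roberta-r4-hate-speech"


def load_model(model_id=MODEL_ID):
    """Load the sequence-classification model from its Safetensors weights."""
    # Imported lazily because it requires transformers and a network download.
    from transformers import AutoModelForSequenceClassification

    return AutoModelForSequenceClassification.from_pretrained(
        model_id, use_safetensors=True
    )


def count_parameters(model):
    """Sum the element counts of all parameter tensors in a PyTorch module."""
    return sum(p.numel() for p in model.parameters())
```

After model = load_model(), count_parameters(model) gives the parameter total, and model.config.id2label shows the label names the classifier actually emits.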

Link to Model Card

For further details and to access the model files, please visit the official Hugging Face model card: https://huggingface.co/Web4/LS-W4-Aero-roberta-r4-hate-speech