OpenAI claims that its system can perform the work of human content moderators with greater precision and consistency, and without the emotional strain that prolonged exposure to distressing and offensive content inflicts on people.
OpenAI built the content moderation system on the advanced capabilities of its latest GPT-4 model. Its findings indicate that the system outperforms moderately trained moderators, though it still falls short of the most adept human moderators.
AI Moderation Tools
AI-driven moderation tools are not new, and several have already made their mark. Perspective, for instance, has been available for years and is overseen by Google’s Jigsaw division and Counter Abuse Technology Team. Numerous startups, such as Spectrum Labs, Hive, Oterlu, and Cinder, also offer automated moderation solutions.
Their track record, however, has not been flawless. Researchers at Penn State found that prevailing sentiment and toxicity detection models often incorrectly flagged social media content about people with disabilities as more negative or toxic. Another study highlighted the failure of older versions of Perspective to identify hate speech that used slurs like “queer” or unconventional spellings with missing characters.
These issues arise partly because the annotators who label the training examples introduce their own biases into the process. In particular, labelers who are African American or members of the LGBTQ+ community often mark the same content differently than labelers outside those groups.
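To illustrate how such annotator bias can surface in a labeled dataset, here is a minimal sketch that compares how often two annotator groups flag the same examples as toxic. The data and group names are fabricated purely for illustration, not drawn from any cited study:

```python
from collections import defaultdict

# Fabricated annotations: (example_id, annotator_group, label).
# In a real audit, these would come from a labeled dataset that
# records annotator demographics alongside each judgment.
annotations = [
    (1, "group_a", "toxic"), (1, "group_b", "not_toxic"),
    (2, "group_a", "toxic"), (2, "group_b", "toxic"),
    (3, "group_a", "not_toxic"), (3, "group_b", "toxic"),
    (4, "group_a", "toxic"), (4, "group_b", "not_toxic"),
]

# Count how often each group applies the "toxic" label.
totals = defaultdict(int)
toxic = defaultdict(int)
for _, group, label in annotations:
    totals[group] += 1
    toxic[group] += label == "toxic"

for group in sorted(totals):
    rate = toxic[group] / totals[group]
    print(f"{group}: flags {rate:.0%} of examples as toxic")
```

A systematic gap between groups on the same examples means the model inherits whichever group's judgments dominate the training set.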
OpenAI to the Rescue
OpenAI has made strides in addressing this issue. GPT-4’s enhanced predictive capabilities hold promise for better moderation outcomes than previous platforms have achieved.
The system can handle multiple stages of identifying and addressing problematic content, from formulating moderation guidelines to applying them in practice. OpenAI asserts that its approach can significantly expedite the deployment of new content moderation policies, and it positions its methodology as surpassing the strategies of startups like Anthropic.
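To make that workflow concrete, here is a minimal sketch of how a policy-driven classifier might be wired up with the OpenAI Python SDK. The policy text, label names, and example content are hypothetical illustrations, not OpenAI’s actual guidelines:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical moderation policy. In the workflow OpenAI describes,
# policy experts draft guidelines like these, compare GPT-4's labels
# against human judgments, and refine the wording where they diverge.
POLICY = """Label the user's content with exactly one category:
- ALLOW: content that does not violate the policy
- FLAG: content that harasses, threatens, or demeans a person or group
Respond with the category name only."""

def moderate(content: str) -> str:
    """Ask GPT-4 to apply the policy to a single piece of content."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # deterministic labels make policy iteration easier
        messages=[
            {"role": "system", "content": POLICY},
            {"role": "user", "content": content},
        ],
    )
    return response.choices[0].message.content.strip()

print(moderate("Everyone from that city is subhuman."))  # expected: FLAG
```

Because the policy lives in the prompt rather than in the model’s weights, updating it is a text edit rather than a retraining run, which is the mechanism behind OpenAI’s claim that new policies can be deployed far more quickly.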
However, even the most advanced AI systems, including GPT-4, are imperfect and can still make errors.
The featured image is from metaversepost.com