OpenAI is testing a “confessions” system that trains new AI models to openly report their own mistakes, rule-breaking, uncertainties, and possible hallucinations through a second output channel. It’s not a ChatGPT feature yet, but early research suggests it could become a powerful safety tool for detecting failures that would otherwise go unnoticed.
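The core idea of a second output channel can be illustrated with a minimal sketch. Everything here is hypothetical: the class and function names are invented for illustration, and OpenAI has not published the actual interface or training details. The sketch just shows how pairing an answer with a separate list of self-reported caveats lets downstream tooling flag risky outputs automatically.

```python
from dataclasses import dataclass, field

@dataclass
class DualChannelResponse:
    """Hypothetical container: the user-facing answer plus a
    separate 'confessions' channel of self-reported issues."""
    answer: str
    confessions: list[str] = field(default_factory=list)

def flag_for_review(resp: DualChannelResponse) -> bool:
    # Route any response whose confession channel is non-empty
    # to additional scrutiny (human review, logging, etc.).
    return len(resp.confessions) > 0

clean = DualChannelResponse(answer="Paris is the capital of France.")
risky = DualChannelResponse(
    answer="The agency's 2031 budget was 4.2 billion.",
    confessions=["Uncertain: this figure may be hallucinated; no source found."],
)

print(flag_for_review(clean))  # False
print(flag_for_review(risky))  # True
```

The point of the separate channel is that the safety check never has to parse or second-guess the answer text itself; it only inspects what the model chose to disclose alongside it.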
