Sam Altman announced a mysterious ChatGPT upgrade for the default GPT-4o model on Friday without revealing too many details about it. “We updated GPT-4o today!” Altman said on X. “Improved both intelligence and personality,” he teased. In real use, ChatGPT turned out to be more sycophantic than ever, annoying users in the process.
The weekend wasn’t even over when Atman acknowledged the problems with ChatGPT’s personality. He said OpenAI will deploy fixes on Sunday and in the following week. More importantly, the CEO said OpenAI will share its learnings from this mishap. “It’s been interesting,” he teased.
Another 48 hours later, OpenAI rolled back the ChatGPT personality for all Free users, with Altman saying paid accounts would also get the previous personality version. More interesting is OpenAI’s more detailed blog post on the matter that starts explaining what went wrong with the latest ChatGPT personality improvement update that made the AI become too agreeable and sycophant-y.
OpenAI explained the personality upgrades it planned for last week’s ChatGPT upgrade. The company wanted to make the default ChatGPT personality “more intuitive and effective across a variety of tasks.”
The result was an AI chatbot looking to please the user, which was quite disturbing. I might not have received such responses in my brief interactions with ChatGPT over the weekend, but I certainly noticed the ones other people shared online.
Why did it happen? OpenAI says that it uses instructions in its Model Spec when shaping the model behavior. “We also teach our models how to apply these principles by incorporating user signals like thumbs-up / thumbs-down feedback on ChatGPT responses.”
This is where OpenAI messed up, apparently. “In this update, we focused too much on short-term feedback, and did not fully account for how users’ interactions with ChatGPT evolve over time,” OpenAI says. “As a result, GPT‑4o skewed towards responses that were overly supportive but disingenuous.”
OpenAI explains that ChatGPT’s default personality should reflect its mission. It should be “useful, supportive, and respectful of different values and experience.” But “unintended side effects” can appear when looking to make the AI useful and supportive. Also, OpenAI says a single ChatGPT default can’t meet the needs of a massive user base. Some 500 million people use ChatGPT every week, according to the blog.
OpenAI isn’t just rolling back the ChatGPT personality to the previous state. It’s also looking to realign the model to prevent sycophancy in the future by applying the following:
- Refining core training techniques and system prompts to explicitly steer the model away from sycophancy.
- Building more guardrails to increase honesty and transparency(opens in a new window)—principles in our Model Spec.
- Expanding ways for more users to test and give direct feedback before deployment.
- Continue expanding our evaluations, building on the Model Spec(opens in a new window) and our ongoing research, to help identify issues beyond sycophancy in the future.
OpenAI also noted that ChatGPT users should have more control over the AI’s personality and make adjustments. That’s possible right now with custom instructions, but OpenAI wants to create easier ways for users to tweak the personality. OpenAI says users will be able to “give real-time feedback to directly influence their interactions and choose from multiple default personalities.”
It’s unclear when that will happen or how the real-time feedback will appear. ChatGPT users already have the chance to submit feedback on how the AI handles answers. You’ll routinely see ChatGPT offer two types of responses, asking you to choose your favorite. That concerns the way ChatGPT presents information in response to prompts. But future feedback tests might also focus on personality.
I’m speculating here because it’s unclear how OpenAI plans to let users alter the ChatGPT personality in real-time in the future. Presumably, that work is just getting started, and it’ll take a while to see palpable results.
This AI personality work might not seem like a big deal to some people, sure. But this isn’t just about sycophancy. It’s about developing safe AI, and that involves getting its personality right.
Meanwhile, I’m just glad the sycophancy is going away from ChatGPT, though, again, I haven’t experienced it myself.