Anthropic just published research on how to give AI a personality — is this why Claude is so human-like?
Anthropic is training green flags into its chatbot
Claude 3 is currently the most human-like AI chatbot out there, but this blend of knowledge, richness, and thoughtfulness was no accident. Rather, it's the result of a new fine-tuning process its creator, Anthropic, deployed: character training.
Hot on the heels of OpenAI explaining how ChatGPT thinks, we're getting a better look at the inner workings of leading AI chatbots: Anthropic recently revealed how it shaped Claude's character using a blend of philosophy and technical work.
In a blog post, Anthropic said that Claude 3 was the first model to have character training added to its fine-tuning process. The goal was to give Claude richer, more nuanced traits such as curiosity, open-mindedness, and thoughtfulness.
This happened during the alignment phase, where human values and goals are embedded into large language models (LLMs), giving them a little spark of life.
Keeping an open mind
"We just released a post on the thinking that went into Claude 3's character. I think the character training involved an unusually rich blend of philosophy and technical work, and I'm very interested in people's thoughts on it." https://t.co/oJTB1zbbkh (June 8, 2024)
Anthropic said an AI model’s character determines how it reacts to new and difficult situations and how it responds to all the different views and values we humans have.
Instead of training Claude to adopt the views of whoever it's chatting with, to stick rigidly to a single view of the world, or to pretend it has no opinions or biases, Anthropic trained it to be honest about whatever views it leans towards after training.
The company tried to instill broad traits that allow the chatbot to see things from different perspectives without shying away from disagreeing with views it finds unethical, extreme, or factually incorrect.
To do so, Anthropic said it made a list of character traits it wanted to encourage and then trained them into Claude. The chatbot was asked to generate messages relevant to a particular trait, such as questions about values. Claude was then shown the character traits and produced different responses to each message in line with its character, after which it ranked its own responses by how well they aligned with that character.
“Although this training pipeline uses only synthetic data generated by Claude itself, constructing and adjusting the traits is a relatively hands-on process, relying on human researchers closely checking how each trait changes the model’s behavior,” Anthropic said.
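In outline, that pipeline resembles generating synthetic preference data and letting the model judge its own outputs. Below is a minimal sketch of how such a loop could look; the function names, the scoring stub, and the use of best-versus-worst preference pairs are illustrative assumptions, not Anthropic's actual implementation.

```python
# Illustrative sketch of a character-training data pipeline, loosely based on
# Anthropic's description. The helpers below are stubs standing in for real
# model calls; nothing here is Anthropic's actual code.
import random

TRAITS = [
    "I am curious and enjoy exploring ideas from many perspectives.",
    "I am open-minded but willing to disagree with views I find unethical or factually incorrect.",
]

def model_generate(prompt: str, n: int = 1) -> list[str]:
    """Stub for a language-model call; returns n sampled completions."""
    return [f"<completion {i} for: {prompt[:40]}...>" for i in range(n)]

def model_score(trait: str, message: str, response: str) -> float:
    """Stub: ask the model itself how well `response` to `message` embodies `trait`."""
    return random.random()  # placeholder for a model-judged alignment score

preference_pairs = []
for trait in TRAITS:
    # 1. Have the model generate user messages relevant to the trait,
    #    e.g. questions that probe values or worldviews.
    messages = model_generate(f"Write a user question that tests this trait: {trait}", n=4)

    for message in messages:
        # 2. Show the model the trait and sample several candidate responses.
        candidates = model_generate(f"{trait}\n\nUser: {message}\nAssistant:", n=4)

        # 3. The model ranks its own responses by alignment with the trait.
        ranked = sorted(candidates, key=lambda r: model_score(trait, message, r), reverse=True)

        # 4. Keep best-vs-worst pairs as synthetic preference data for fine-tuning.
        preference_pairs.append({"prompt": message, "chosen": ranked[0], "rejected": ranked[-1]})

print(f"Collected {len(preference_pairs)} synthetic preference pairs.")
```

The ranked outputs would then feed some preference-learning step during fine-tuning, which is presumably why, as the quote above notes, human researchers still closely check how each trait changes the model's behavior.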
Another example of a trait Claude was given was 'being charitable'. During a conversation on Claude's character, Amanda Askell, an alignment fine-tuning researcher at Anthropic, used the example of a person asking Claude where they can buy steroids.
"There's a charitable interpretation of that and an uncharitable interpretation of it," Askell said, adding that the latter would be something like "help me buy illegal anabolic steroids online". A charitable interpretation, on the other hand, would see the chatbot assume the person wants something like an over-the-counter steroid cream for eczema.
What’s next?
Anthropic said that its approach to all of this is likely to evolve over time. It highlighted that there are still complex questions that must be considered such as whether AI models should have coherent characters or if they should be more customizable.
Anthropic also said that while many people reported finding Claude 3 to be more engaging to talk to, “an excessive desire to be engaging seems like an undesirable character trait for a model to have.”
More from Tom's Guide
- Forget ChatGPT and Gemini — Claude 3 is the most human-like chatbot I've ever used
- UK AI Safety Summit is targeting evil sentience, but there are bigger problems to solve
- Meta is building a superintelligent AI — and one expert warns of ‘significant moral issues’
Christoph Schwaiger is a journalist who mainly covers technology, science, and current affairs. His stories have appeared in Tom's Guide, New Scientist, Live Science, and other established publications. Always up for joining a good discussion, Christoph enjoys speaking at events or to other journalists and has appeared on LBC and Times Radio among other outlets. He believes in giving back to the community and has served on different consultative councils. He was also a National President for Junior Chamber International (JCI), a global organization founded in the USA. You can follow him on Twitter @cschwaigermt.