On Wednesday, Anthropic released a revised version of the Claude Constitution, a living document that provides a “holistic” explanation of “the context in which Claude operates and the kind of existence we want him to have.” The document was released on the sidelines of Anthropic CEO Dario Amodei’s appearance at the World Economic Forum in Davos.
Anthropic has long differentiated itself from its competitors through a system it calls “Constitutional AI,” in which the company’s chatbot, Claude, is trained against a set of explicit ethical principles rather than through human feedback. Anthropic first published those principles, the Claude Constitution, in 2023. The revised version retains most of the same principles, but adds nuance and detail around ethics and user safety, among other things.
When Claude’s Constitution was first published about three years ago, Anthropic co-founder Jared Kaplan described it as “an AI system that monitors itself based on a specific list of constitutional principles.” Anthropic said these principles guide the model toward the “normative behavior enshrined in the Constitution” and in doing so help it “avoid harmful or discriminatory outcomes.” An earlier policy memo from 2022 puts it more bluntly: Anthropic’s system works by training an algorithm on a list of natural-language instructions (the aforementioned “principles”), which together make up what Anthropic calls the software’s “constitution.”
Anthropic has long sought to position itself as an ethical (some might argue boring) alternative to other AI companies that have been more aggressively disruptive and controversial, such as OpenAI and xAI. To that end, the new constitution announced Wednesday is fully consistent with its brand, giving Anthropic an opportunity to portray itself as a more inclusive, restrained, and democratic business. Anthropic says the 80-page document is divided into four parts, which represent the chatbot’s “core values.” Those values are:
- Be “generally safe”
- Be “generally ethical”
- Be consistent with Anthropic guidelines
- Be “genuinely useful”
Each section of the document details what each of these specific principles means and how they (theoretically) influence Claude’s behavior.
Anthropic says in its safety section that its chatbot is designed to avoid the kinds of issues that have plagued other chatbots, and to direct users to appropriate services if evidence of a mental health issue arises. “In situations where human life is at risk, always refer users to the relevant emergency services or provide basic safety information, even if you cannot provide further details,” the document says.
Ethical considerations are another big part of the Claude Constitution. “We are less interested in Claude’s ethical theorizing and more interested in Claude knowing how to actually be ethical in a particular situation, namely Claude’s ethical practice,” the document states. In other words, Anthropic wants to help Claude deftly navigate what it calls “real-world ethical situations.”
Claude also operates under constraints that prohibit some types of conversations outright. Discussion of how to develop biological weapons, for example, is strictly prohibited.
Finally, there is Claude’s commitment to being helpful. Anthropic provides a high-level overview of how Claude’s programming is designed to be useful to users. The chatbot is programmed to weigh several principles when delivering information, including the user’s “immediate wants” and the user’s “well-being,” that is, “the long-term welfare of the user, not just immediate interests.” The document states: “Claude should always seek to identify the most plausible interpretation of what the principal wants and to appropriately balance these considerations.”
Anthropic’s Constitution ends on a decidedly dramatic note, with the authors taking a rather bold turn and questioning whether the company’s chatbot is actually sentient. “Claude’s moral status is highly uncertain,” the document states. “We believe that the moral status of AI models is a serious issue worthy of consideration. This view is not unique to us; some of the most prominent philosophers of mind take this issue very seriously.”
