A Few Observations on AI Companies and Their Military Usage Policies
I led the Geopolitics Team at OpenAI for approximately three years and then joined two other teams before deciding to leave in June 2025. I was a researcher who was often tapped to provide ‘feedback’ on military usage policies.
There’s been plenty of good analysis of the Anthropic vs. Department of War standoff. But what has largely been missed is that this story is also about how little the public knows about AI companies and the contract relationships they’ve formed with the defense and intelligence communities.
On Friday, Axios reported that the Pentagon agreed to OpenAI’s red lines. Or is it that OpenAI agreed to the Pentagon’s red lines? As I explain more below, there’s significant confusion about what military uses the companies allow or disallow, in part because companies’ public descriptions of allowed uses are vague by design.
People tend to expect a tl;dr for these types of posts, so:
Today, frontier AI companies do not have coherent policies around military use of their AI tools. The usage policies are vague and often change, which lets company leadership preserve ‘optionality.’
Frontier AI companies should be significantly more forthright with the public and, frankly, with their own employees about what their policies allow or disallow. (What we know of the recent Slack message to employees authored by Sam Altman is not a good example of candor, because it leaves ample room for guesswork.)
As a disclaimer, these are my opinions and I’m only drawing on publicly available reporting.
1.
I’m probably not the first to note that Anthropic’s ‘ask’ is the status quo in American policy. It is a matter of US policy that ‘appropriate levels of human judgment’ be applied to semi- and fully-autonomous weapons systems. That requirement is laid out in DoD Directive 3000.09, ‘Autonomy in Weapon Systems,’ which was last updated in 2023.
(A company could potentially surveil Americans with legally acquired data, but I am more equipped to speak on AI and military integration given my background, so I’ll stick to what I know best.)
This raises an obvious question: Why would the Pentagon spend the energy fighting Anthropic if they seemingly agree on policy? There are a few (non-exhaustive) possibilities:
People are speaking past one another.
Anthropic’s ‘no’ is interpreted as an affront to the DoW’s authority; to agree to Anthropic’s guardrails may create greater problems for the DoW in the event there’s a future disagreement. (And this administration is exceedingly concerned with projecting strength.)
There could be a real disagreement over what it means to apply ‘appropriate levels of human judgement’ to an autonomous weapons system.
2.
Policy and law are not free-floating static ‘things.’ The borders of the law are fuzzy and filtered through political ideology. Throughout US history, policymakers have reinterpreted and exploited gaps in the law to allow for activity that independent legal observers have called straightforwardly illegal. Particularly egregious historical examples include the so-called ‘Bush Torture Memos’ and the Obama Administration’s decision to systematically exclude certain civilians from the collateral damage count caused by military drone strikes.
And so it’s understandable that Anthropic has balked at language that other frontier AI companies have accepted, which allows the DoW to use their AI models for ‘all lawful purposes.’ That language provides significant latitude for interpretation.
There isn’t a consensus over what it means in practice to have adequate ‘human supervision,’ a ‘human in the loop,’ or ‘meaningful human control’ in autonomous weapons systems. Terms that reference human oversight remain contentious around the world. Militaries are still trying to develop new testing and evaluation procedures to reduce problems such as over-reliance in human-AI teams. It’s possible that Anthropic disagreed with how ‘human supervision’ (broadly speaking) would be put into practice.
A few frontier AI company employees have asked me whether the ‘lawful purposes’ language is a sufficiently strong bulwark against misuse. The answer is always going to be: it depends. You have to decide whether that’s good enough and whether you trust your company’s leaders to respond effectively if something goes wrong.
Edit: This post went live before I saw OpenAI’s statement on their deal with the Pentagon. The language the company uses is: “no use of OpenAI technology to direct autonomous weapons systems” (emphasis mine). That language requires further clarification, especially given their work on a drone swarm trial with DIU and DAWG. More on that below.
Based on this reading, my expectation is that OpenAI is okay with their models being part of an autonomous weapons system and will try to find a carve-out for acceptable use within that system.
3.
It’s tempting to view Anthropic and its CEO Dario Amodei as heroes. The reality is more complicated.
In the statement that Dario Amodei issued yesterday (Feb 26th), he floated the possibility that he’d be supportive of fully autonomous systems without human supervision for national defense purposes in some cases. What’s stopping him today? It’s that the tools aren’t yet sufficiently reliable.
Just to make sure we’re all following: Dario/Anthropic is open to the use of autonomous weapons, presumably ones that use LLMs, under conditions less restrictive than the US policy currently on the books.
This is not a morally enlightened position, and I hope that I’ve misunderstood him. At the very least, it means that the commitments made today by AI companies may change tomorrow based on criteria that are not self-evident.
(I’ve never met Dario Amodei and hold nothing against him personally, but can we please agree that democracy can’t survive on the goodwill of CEOs who have 200 million dollars to burn? This is not a sustainable solution.)
4.
Over the past 5 years, I have witnessed several usage policy updates from OpenAI. When I first joined the company in 2021, there was a complete prohibition on military use. That changed in 2024. But the new language was light on specifics, to put it mildly.
Let’s skip to recent Bloomberg reporting that reveals OpenAI models are being used in a drone swarm trial. On February 12th, an OpenAI spokesperson told Bloomberg that the use of its open source models in a Defense Innovation Unit (DIU) & Defense Autonomous Warfare Group (DAWG) trial for building drone swarms would need to comply with the company’s usage policy.
It turns out that the usage policy can be read in a few ways, depending on whether you believe that using an AI voice-to-digital tool in a kill chain amounts to helping build a weapon, or whether you believe an AI model can be treated in isolation from the larger weapons system it sits inside. What is the public supposed to take from such assurances?
I’m dwelling on this case because it illustrates a need for the public to train a sharper eye on the comms pushed by frontier AI companies. A quick hack is to pay attention when companies argue that their technologies aren’t used for ‘offensive’ or ‘kinetic’ purposes. Google pulled a similar comms stunt surrounding their Project Maven contract in 2018, when they claimed that their object detection software was ‘non-offensive.’
One reading is that this is a way to dampen employee and public criticism. Another is that it reflects a serious misunderstanding of the distributed authority that characterizes human-machine teaming in war. (They’re called autonomous weapons systems. The least AI-ish component of an AI weapons system is the trigger.)
As of the time of writing, OpenAI has conceded to the ‘all lawful purposes’ language required by the Department of War. Does that mean their usage policy has changed with regard to the drone swarm trial, or not?
I hope that this convinces you that company policies around military use are volatile, likely to change again, and are subject to a variety of invisible pressures. (You are one of the invisible pressures.)
5.
The biggest losers in all of this are everyday people and civilians in conflict zones. LLMs are black boxes. Defense and intelligence agencies are (mostly) black boxes as a matter of policy. Conflict zones are black boxes due to the fog of war and disinformation.
Our ability to understand the effects of military AI in war is and will be severely hindered due to layers of opacity caused by technical design and policy. It’s black boxes all the way down.
I wish this conversation were also about how we could improve public transparency around AI use in national security. The public does not have to accept the ethical threshold set by privately-held companies on matters of life and death, or their poor communication about how they’ve decided to involve themselves in the national security supply-chain.
Foreign policy is infamously siloed from the public. If there’s a silver lining to the Anthropic-DoW fight, it’s that a conversation that would have been hidden behind closed doors is now open to public scrutiny. The hard part will be paying attention after this story leaves the news cycle.

Sarah, this is the clearest account I've read of why the policy vagueness is a feature.
LLMs, defence agencies, conflict zones - each one individually resists scrutiny. Together they create something close to perfect unaccountability. It's quite clever when you think about it.
I work in responsible AI in financial services, where regulators have spent years building disclosure frameworks precisely because "trust us, our policies are fine" stopped being acceptable after 2008. The language you're describing - "all lawful purposes," "no offensive use" - would not survive five minutes in front of a financial regulator. It would be read immediately as what it is: deliberate ambiguity that preserves room to manoeuvre.
The Dario point is subtle and deserves more attention than it's getting. He didn't say Anthropic will never support autonomous weapons. He said the technology isn't reliable enough yet. The guardrails move when the models improve.
The drone swarm framing is also a tell. Arguing that an AI component in a kill chain isn't "building a weapon" is the same logic as saying a targeting algorithm isn't surveillance because a human pulls the trigger. The system is the weapon. The component is part of the system.
Can this conversation stay open after the news cycle ends? Based on how financial services handled its equivalent moment - it mostly didn't, until the next crisis forced it back open.
Happy Wednesday
I respect your insights so much, though I do feel a sense of alarm that you and many other respected voices in the AI-military integration space rarely speak about the incentives and actions of the defense contractors who actually deploy the models to fulfill their contracts and who market themselves to attract additional contracts. Having some experience in (then-called) DoD contracting, it has felt very clear to me that the most prominent risks associated with AI-military integration are actually effectuated by defense contractors - not necessarily the frontier model organizations, the DoD itself, or any of the agreements, public or private, made between those two entities.