AI technology is poised to transform national security. In the United States, experts and policymakers are already experimenting with large language models that can aid strategic decision-making in conflicts, and with autonomous weapons systems (or, as they are more commonly called, “killer robots”) that can make real-time decisions about what to target and whether to use lethal force.
But these new technologies also pose enormous risks. The Pentagon is filled with some of the country’s most sensitive information. Putting that information in the hands of AI tools makes it more vulnerable, both to foreign hackers and to malicious inside actors who want to leak information, as AI can comb through and summarize massive amounts of information better than any human. A misaligned AI agent can also quickly lead to decision-making that unnecessarily escalates conflict.
“These are really powerful tools. There are a lot of questions, I think, about the security of the models themselves,” Mieke Eoyang, the deputy assistant secretary of Defense for cyber policy during the Joe Biden administration, told POLITICO Magazine in a wide-ranging interview about these concerns.
In our conversation, Eoyang also pointed to expert fears about AI-induced psychosis, in which long conversations with a poorly calibrated large language model can leave a user detached from reality, as well as to the risk that AI-assisted decision-making could spiral into ill-advised escalation of conflicts. And at the same time, there’s a somewhat countervailing concern she discussed — that many of the guardrails in place on public LLMs like ChatGPT or Claude, which discourage violence, are in fact poorly suited to a military that needs to be prepared to take lethal action.
Eoyang still sees a need to think quickly about how to deploy these tools — in the parlance of Silicon Valley, “going fast” without “breaking things,” as she wrote in a recent opinion piece. How can the Pentagon innovate and minimize risk at the same time? The first experiments hold some clues.
This interview has been edited for length and clarity.
Why specifically are current AI tools poorly suited for military use?
There are a lot of guardrails built into the large language models used by the public that are useful for the public, but not for the military. For instance, you don’t want your average civilian user of AI tools trying to plan how to kill lots of people, but it is explicitly part of the Pentagon’s mission to think about and be prepared to deliver lethality. So there are things like that which may not be consistent between a civilian AI model and a military one.
Why is the tweak not as simple as giving an existing, public AI agent more leeway on lethality?
A lot of the conversations around AI guardrails have been about how to ensure that the Pentagon’s use of AI does not result in overkill. There are concerns about “swarms of AI killer robots,” and those worries are about the ways the military protects the rest of us. But there are also concerns about the Pentagon’s use of AI that are about the protection of the Pentagon itself, because in an organization as large as the military, there are going to be some people who engage in prohibited behavior. When an individual inside the system engages in that prohibited behavior, the consequences can be quite severe, and I’m not even talking about things that involve weapons, but things that might involve leaks.
Even before AI adoption, we've had individuals in the military with access to national security systems download and leak large quantities of classified information, either to journalists or even just on a video game server to try and prove someone wrong in an argument. People who have AI access could do that on a much bigger scale.
What does a disaster case for internal AI misuse look like?
In my last job at the Pentagon, a lot of what we worried about was how technology could be misused, usually by adversaries. But we must also realize that adversaries can masquerade as insiders, so you have to worry about malicious actors getting their hands on all of those tools.
There are any number of things you might be worried about. There’s information loss, and there’s compromise that could lead to other, more serious consequences.
There are consequences that could come from someone’s use of AI leading them to a place of AI psychosis, where they might engage in behaviors in the physical world that are at odds with reality. That could be very dangerous given the access that people in the military have to weapons systems.
There are also the “swarms of killer robots” worries, which involve escalation management. How do you ensure that you’re not engaging in overkill? How do you ensure that the AI is responding in the way that you want? Those are other challenges that the military is going to have to worry about and get its AI to help think through.
On that last point, we published a piece in POLITICO Magazine recently from Michael Hirsh in which he reported that almost all public AI models preferred aggressive escalation toward a nuclear war when presented with real-life scenarios. They didn’t seem to understand de-escalation. Has that been your experience in working with these tools?
I think one of the challenges you have with AI models, especially those trained on the past corpus of human writing, is that the tendency toward escalation is already a human cognitive bias. It already happens without AI. So what you’re enabling with AI is for that bias to come through faster. And unless you’re engineering in some way to say, “Hey, check your cognitive biases,” it will give you that escalatory response.
So does the Pentagon need to develop its own AI tools?
I think they need to be working on how to develop tools that are consistent with the ways the Pentagon operates, which are different from the ways the civilian world operates, and for different purposes. But it really depends on which mission set we’re talking about. A lot of this conversation has been about large language models and decision support. There’s a whole different branch of AI that the military needs to engage in, and it’s about navigating the physical world. That’s a totally different set of challenges and technologies.
When you think about unmanned systems, how do they navigate the world? That’s technology like self-driving cars. The inputs are not the same as taking in large quantities of human text. They’re about: How do you make sense of the physical world?
Is there a general need for more understanding of the utility of AI in the military? What are some ways that high-ranking officials at the Pentagon misunderstand AI?
It’s not a fully baked technology yet, so moves like shifting consideration of AI into the Pentagon’s research and development space, which the Donald Trump administration did, make a lot of sense. That lets you test these new features and refine the technology, so that when these tools land on the desks of a wider range of Pentagon personnel, some of the kinks have already been worked through.
What’s the way forward with AI tools when it’s so difficult to prevent misuse?
One of the things we need to do going forward is to be much more specific about the particular missions for which we are thinking about adopting AI into the Pentagon. The Pentagon is a trillion-dollar enterprise, and it’s going to have a lot of the same business functions as any other business in the United States — basic functions like booking travel or payroll.
And then there are areas that are more military-unique, and those may deserve more specialized study, because there is no civilian ecosystem that is also involved in testing and developing those technologies. The Pentagon may have to fund its own research into things like understanding unidentified objects coming toward the United States, building robots that can navigate a battlefield, or making sense of many different strands of intelligence reporting.