Books

The AI Commander

Centaur Teaming, Command, and Ethical Dilemmas

by James Johnson · 2024 · Oxford University Press

James Johnson, a senior lecturer in strategic studies at the University of Aberdeen, writes on nuclear strategy, emerging technology and the future of warfare. The AI Commander is his attempt to think through what happens when artificial intelligence moves from the margins of military operations into the seat of command itself, and what the human officer becomes when an algorithm is offering, or making, the next decision.

The book’s central argument is that the most consequential question is not whether AI will replace generals, but how the partnership between human and machine will be designed. Johnson borrows the chess term centaur to describe a hybrid commander in which silicon handles speed, pattern-matching and breadth while flesh handles judgement, accountability and the moral weight of killing. He argues that the cognitive shortcuts machines impose — speed pressure, automation bias, the seductive clarity of a confidence score — could quietly hollow out the deliberative core that makes command lawful and legitimate in the first place.

Across its chapters Johnson moves through the technical landscape of machine learning, decision-support tools and lethal autonomous weapons, then sets that landscape against the long tradition of military command from Clausewitz onward. He examines how AI compresses the OODA loop, how generative and predictive systems reshape intelligence and targeting, and how machine speed interacts with nuclear command and control — a domain where minutes already feel short and where shaving them further may be destabilising. He works through case studies and scenarios involving great-power competition with China and Russia, the war in Ukraine, and the prospect of AI-enabled escalation in a Taiwan crisis. Ethical dilemmas run throughout: meaningful human control, the diffusion of responsibility across software vendors, programmers and operators, the brittleness of black-box models in the friction of real combat, and the risk that adversaries will read each other’s automated behaviour as intent rather than artefact. He draws on just war theory, international humanitarian law and behavioural psychology to test whether existing frameworks can hold under these new conditions.

The book sits on the bridge between two literatures that often talk past one another: the technical writing on AI safety and autonomy, and the strategic writing on deterrence and command. Johnson is conversant in both, which makes The AI Commander particularly useful for officers, policymakers and analysts who need a careful, sceptical reading of centaur warfare rather than either a sales pitch or a panic. It is a measured book about an unmeasured shift in how wars will be decided.

James Johnson’s The AI Commander appeared from Oxford University Press in 2024, near the close of a decade in which the conversation about artificial intelligence in warfare had moved from speculative to operational. Drones over Ukraine, decision-support tools fielded under the United States Department of Defense’s Joint All-Domain Command and Control initiative, and the steady proliferation of computer-vision targeting systems had all collapsed the comfortable distance between the academic AI-in-war literature and the things commanders were actually using. Johnson, a senior lecturer in strategic studies at the University of Aberdeen and the author of an earlier study of AI and nuclear stability, set out to address a narrower question than the broader survey works that had come before: not whether autonomous weapons are coming, but how human commanders and the machines that advise them actually share decision-making, and what happens when that sharing goes wrong.

The book sits within a small but active scholarly conversation. Paul Scharre’s Army of None and Four Battlegrounds had mapped the terrain of autonomous weapons and great-power AI competition. Kenneth Payne’s I, Warbot had examined the strategic logic of machine intelligence in conflict. Christian Brose’s The Kill Chain had argued for radical reorganization of United States forces around networked sensing. Heather Roff, Michael Horowitz, and Avi Goldfarb had each contributed pieces of an emerging argument about the limits of algorithmic judgment in violent contexts. Johnson’s contribution is to take seriously a concept that the other authors mention without dissecting — the idea of the centaur, the human and the machine fused into a single decision-making unit — and to ask whether that metaphor survives contact with the realities of high-stakes military command.

The centaur metaphor originates in chess. After Garry Kasparov’s defeat by Deep Blue in 1997, a format known as advanced chess emerged in which human players paired with computer engines competed against each other. For a brief period the pairings outperformed both pure humans and pure machines, and the metaphor migrated into management literature, into intelligence analysis, and finally into the writings of United States defense officials looking for an organising principle for their AI investments. Robert Work, the former deputy secretary of defense whose Third Offset speeches did much to popularize the term inside the Pentagon, called human-machine teaming the central design idea of future force structure. Johnson’s central argument is that this importation was less rigorous than it looked. Chess, he observes, is a clean environment with perfect information, fixed rules, and unambiguous outcomes. Military command is none of those things. The cognitive partnership that worked in advanced chess does not straightforwardly transfer to a battlefield where the information is partial, the rules are contested, and the outcomes resist clean evaluation.

From this starting point the book builds a more uncomfortable picture. The human element of a centaur team, Johnson argues, is not a stable ballast against machine error. It is itself a cognitive system with predictable failure modes: automation bias, the well-documented tendency to over-trust algorithmic recommendations even in the face of contradicting evidence; cognitive offloading, the gradual erosion of skills and intuitions that humans no longer practise because the machine does them; mode confusion, the misreading of what the system is doing or about to do. Each of these is well established in the human-factors literature on aviation, medicine, and process control, and Johnson’s contribution is to import that literature into the military command context and ask what it implies. The implication, in his telling, is that giving a commander an AI advisor does not produce a strictly better decision-maker. It produces a different kind of decision-maker, one whose particular failure modes are not yet well understood and whose performance on edge cases — the cases that matter most in war — is essentially untested.

The structure of the book takes the reader through this argument by stages. Early chapters establish the theoretical framing: what a centaur team is, where the term comes from, how it has been adopted in defense circles, and what the gap between metaphor and practice looks like. Middle chapters turn to the cognitive science of human-machine collaboration, drawing on decades of research into trust calibration, situational awareness, and the human side of complex sociotechnical systems. From there the book moves into the specifically military terrain: command culture in different services and different nations, the role of operational tempo and the pressure to compress decision cycles, and the way that AI-enabled adversaries create a competitive dynamic in which slow human deliberation becomes a liability rather than a virtue. A later chapter takes the argument into the nuclear domain, returning to the territory of Johnson’s earlier work and asking what AI integration into nuclear command and control might mean for stability, for accidental escalation, and for the strategic logic of deterrence itself. The book closes with chapters on ethics, accountability, and the responsibility gap — the question of who answers, legally and morally, when a centaur team produces a catastrophic outcome that neither the human nor the machine could be cleanly said to have authored alone.

The examples and evidence are drawn from the contemporary inventory. Johnson discusses Project Maven, the United States military’s computer-vision targeting effort that became a flashpoint over Google’s involvement and that has since become embedded as a standard piece of the targeting workflow. He examines the Joint All-Domain Command and Control concept and the broader push to network sensors, shooters, and decision-aids into a single fabric. He looks at Russian and Chinese investments in AI for military purposes, and at the strategic-stability problem created by the asymmetric speed and opacity of those investments. He returns repeatedly to the Soviet-era Perimeter or Dead Hand arrangement, the semi-automated nuclear retaliation system reported to remain in some form in Russian service, as a case study in what happens when the pressure for assured response leads to delegation of consequential decisions to machinery. Israeli use of algorithmic targeting, the autonomous loitering munitions employed in Nagorno-Karabakh and Ukraine, and experimentation with AI co-pilots in air-combat exercises such as the Defense Advanced Research Projects Agency’s AlphaDogfight trials all supply concrete instances of the abstract problem.

Within those examples the author repeatedly returns to a small set of recurring failure modes. One is the speed trap. An adversary willing to delegate decisions to machines runs a faster decision cycle. The pressure on the cautious side to match that speed creates a ratchet: each step toward delegation reduces the time available for human deliberation, which in turn creates further pressure to delegate. Johnson is careful to note that this dynamic is not new — Cold War nuclear command and control struggled with the same problem under different technology — but it is sharper under AI because the speed differentials are larger and the systems involved are less interpretable. A second is the brittleness of the AI advisor under conditions outside its training distribution. Machine-learning systems behave smoothly inside the envelope of cases they were trained on and badly outside it. The cases that matter in war — surprise, deception, novel tactics — are precisely the ones likely to lie outside the envelope. The commander relying on an AI advisor is most at risk of being misled at exactly the moment when good judgment matters most. A third is the loss of the deliberative space that has traditionally surrounded consequential military decisions. Johnson is explicit that this is partly a cultural argument: command, in its older form, was an art that included the slow weighing of options, the consultation of advisors, and the accumulation of judgment over a career. The substitution of an algorithmic advisor in that space does not merely add a tool. It changes the practice.

The treatment of ethics and accountability follows from these observations. If a centaur team produces a bad outcome — a strike on the wrong target, a misjudged escalation, a failure to act — the question of responsibility becomes harder to answer than in the traditional case. The human commander is in some sense in the loop, but the loop has been so compressed and so mediated by the advisor that meaningful agency may be hard to locate. The machine cannot be held to account in any morally serious sense. The designers of the system are removed in time and space from the decision. The contractors and procurement officials and political authorities who fielded the system have hands in the chain, but distant ones. Johnson surveys the existing literature on the responsibility gap — work by Robert Sparrow, by Heather Roff, and by others — and concludes that the centaur framing makes the problem worse rather than better. By distributing the decision across the team, it creates a situation in which everyone is partly responsible and no one is fully so.

The reception of the book has tracked the broader fault lines in the field. Reviewers sympathetic to Johnson’s framing have welcomed the corrective to optimistic accounts of human-machine teaming, particularly the way the book imports the human-factors literature into a domain that has often treated it as peripheral. Readers more invested in the operational promise of AI have pushed back on what they see as an excessive focus on failure modes at the expense of the cases in which AI advisors clearly do help. Some have argued that Johnson underweights the practical reforms — training curricula, interface design choices, doctrinal guardrails — that the human-factors community has developed to mitigate the failure modes he describes. The book has been read alongside Scharre’s Four Battlegrounds and Payne’s I, Warbot as part of a small canon of book-length treatments of strategic AI, and within that canon it is the volume that takes the cognitive science most seriously. Johnson has been a regular voice in the policy conversation since publication, contributing to debates about a possible understanding among the major powers on AI in nuclear command, which has emerged as one of the harder arms-control questions of the period.

For a reader working through the literature on AI in war today, The AI Commander serves a particular function. It is not the introductory survey — Scharre’s Army of None still does that job better — and it is not the operational concept book that Brose’s The Kill Chain offers. It is the book to read for the question of what happens at the seam between human and machine in command, and what the cognitive science actually says about that seam. It pairs naturally with Payne’s work on machine strategy and with Johnson’s own earlier book on AI and nuclear stability. It is less concerned with the politics of regulation than works by writers such as Frank Sauer, and it does not engage in detail with the legal-humanitarian debate that animates much of the international discussion at the Convention on Certain Conventional Weapons meetings in Geneva. Readers interested in those threads will want to look elsewhere.

The durable element of the book is the analytic frame. The specific systems will date (Project Maven will give way to its successors, the JADC2 vocabulary will be displaced by some new acronym, the particular Russian and Chinese programmes Johnson discusses will evolve), but the underlying questions about how humans and machines share decision-making under pressure, about the failure modes of the partnership, and about who answers for the outcomes will remain. As militaries integrate language-model-based interfaces, automated planning tools, and increasingly autonomous platforms, the centaur question becomes more pressing, not less. What is likely to age less well is the calibration of Johnson’s worry against the worry of his contemporaries. The decade since the centaur concept entered defense discourse has been one in which the operational integration of AI has run ahead of the theoretical scrutiny. The AI Commander is one of the more serious attempts to narrow that gap by giving the scrutiny a coherent shape, and on that count it will be read for a while yet, whatever happens to the particular battlefields and budget lines it describes.

computers

Publisher's description

This book addresses the largely neglected question of how the fusion of machines into the war machine will affect the human condition of warfare. It emphasizes the "mind" and the mechanisms of thought (intelligence, consciousness, emotion, memory, experience, etc.) to consider the effects of AI and autonomy on the human condition of war.