Governing Lethal Behavior in Autonomous Robots

by Ronald C. Arkin, 2009, CRC Press

Ronald C. Arkin, a roboticist at Georgia Tech’s Mobile Robot Laboratory, wrote this 2009 volume as a technical and philosophical proposal: that autonomous machines on the battlefield could be engineered to behave more ethically than human soldiers, and that the engineering work to do so should begin now rather than after such systems are already fielded. The book grew out of research funded by the US Army Research Office and is addressed equally to roboticists, military planners, and ethicists.

Arkin’s central argument is that the laws of war and rules of engagement are, in principle, formalisable as machine-enforceable constraints, and that a properly designed autonomous platform need not share the cognitive and emotional failure modes that lead human combatants to commit atrocities. He cites the 2006 Surgeon General’s Office Mental Health Advisory Team report on US troops in Iraq — the findings on stress, dehumanisation, and willingness to mistreat non-combatants — as evidence that the human baseline for ethical battlefield conduct is lower than usually assumed. The aim is not to claim machines will be perfectly moral, but to argue that the bar they must clear is set by people, not by abstractions.

The bulk of the book lays out an architecture. Arkin describes an “ethical governor” that sits between a robot’s deliberative planner and its weapons release, evaluating each candidate lethal action against encoded representations of the Geneva and Hague conventions, the principles of discrimination and proportionality drawn from just war theory, and mission-specific rules of engagement. He pairs it with an “ethical adaptor” that uses affective computing (modelled guilt and remorse) to throttle behaviour after errors, and a “responsibility advisor” that records the chain of human authorisation behind any lethal decision. Worked scenarios include building-clearing operations against insurgents, response to sniper fire from a mosque, and the discrimination of combatants from civilians at a checkpoint. Each is run through the proposed governor to show how the formal constraints would block or permit fire.

Where the book sits in the field is somewhat distinctive: Arkin writes as an engineer who takes the philosophical objections seriously rather than as a philosopher critiquing engineers from outside. He engages directly with critics such as Peter Asaro and Noel Sharkey, whose arguments later fed into the Campaign to Stop Killer Robots (launched in 2013), conceding that present-day perception and reasoning systems fall well short of what his governor would require, while maintaining that the research direction is the responsible one. For readers tracking the autonomous weapons debate, the volume remains a useful primary source: a clear statement of the strongest “machines can be made ethical” position, written before the drone wars made the question urgent in policy circles, and still the reference text whenever an ethical-governor architecture is invoked.

Longer summary

Ronald C. Arkin’s Governing Lethal Behavior in Autonomous Robots appeared in 2009, when the United States had been flying armed Predators over Iraq and Afghanistan for several years, ground robots like PackBot and TALON were clearing roadside bombs, and the Pentagon’s Future Combat Systems programme, though about to be cancelled, was still pushing toward a battlefield populated by unmanned platforms. Arkin, a roboticist at the Georgia Institute of Technology with a long career in behaviour-based robotics and a previous textbook of that title, had spent three years on a project funded by the U.S. Army Research Office to ask a question that most of his peers preferred to leave to philosophers and lawyers: if armed robots are coming, how do you build the part that decides whether to pull the trigger? The book is the long-form report of that project, expanded into a monograph. It joined a small but rapidly growing literature on autonomous weapons (P. W. Singer’s Wired for War came out the same year, and Armin Krishnan’s Killer Robots followed shortly after) but stood apart by being neither journalism nor a policy brief. It is an engineer’s book, written by someone who proposes to do the thing.

The central argument is unusual enough to have framed the debate for the decade that followed. Arkin contends that autonomous systems, properly designed, can in principle behave more ethically on the battlefield than human soldiers. The claim rests on a comparison: humans under combat stress are subject to fear, fatigue, anger, the desire for revenge, and well-documented failures of moral judgement, all catalogued in the Surgeon General’s Mental Health Advisory Team reports from Iraq, which Arkin cites at length and which show troubling rates of soldiers admitting they would not report a fellow soldier for harming a noncombatant. A machine has none of those drives. It does not panic, does not seek vengeance, does not lie to itself about what it saw, and can be designed to refuse an order that violates the laws of war. The claim is carefully bounded — Arkin does not say robots are ethical now, or that they should replace humans, or that all missions are appropriate for autonomous lethality. He says that for a narrow set of well-defined missions, in well-bounded environments, an architecture can be built that adheres to international humanitarian law more reliably than the human alternative. The book is the blueprint for that architecture.

The structure follows that engineering brief. After an opening that establishes the trajectory of unmanned systems in modern conflict — citing Department of Defense roadmaps, the proliferation of armed Predators and Reapers, the use of the SWORDS armed ground robot in Iraq, and the Israeli Harpy loitering munition as an existing autonomous lethal system — Arkin spends a long stretch on the philosophical and legal terrain. He walks through just war theory, the distinction between jus ad bellum and jus in bello, the principles of discrimination and proportionality, the Geneva Conventions and their additional protocols, and the U.S. Rules of Engagement as practised. He treats these not as background colour but as design requirements. A chapter on the evidence for human failure on the battlefield draws on Dave Grossman’s work on the psychology of killing and on the Mental Health Advisory Team data to make the comparative case. From there the book turns technical: the formal representation of the laws of war as constraints, the proposed architecture, the responsibility advisor and operator interface, and an implementation in simulation. The closing chapters cover objections — from Noel Sharkey, from the International Committee for Robot Arms Control then being organised, from roboticists sceptical of the whole project — and Arkin’s responses.

The architecture itself is where readers either follow Arkin or part ways with him. He proposes what he calls an ethical governor, modelled loosely on the mechanical governor on a steam engine: a component that sits between the robot’s tactical reasoning and its weapons release, and that can only inhibit lethal action, never initiate it. Action is generated by a behavioural controller in the tradition of Arkin’s earlier AuRA architecture; the governor evaluates each proposed lethal response against a set of constraints derived from the laws of war and rules of engagement, expressed as logical predicates over the perceived world state. If the proposed action would violate any constraint — firing on a target inside a protected site, using a disproportionate munition, engaging an entity that has surrendered — the governor blocks it. A second component, the ethical adaptor, uses an affective model of guilt: when a strike produces outcomes worse than predicted, an internal guilt variable rises, and as it rises the system’s willingness to apply lethal force is throttled down across subsequent engagements. A third component, the responsibility advisor, is the human-in-the-loop interface — it forces a human operator to acknowledge specific overrides, logs who authorised what, and is the part of the architecture explicitly designed to keep the chain of accountability intact when something goes wrong. Arkin works the architecture through scenarios drawn from real doctrine: a sniper in a minaret, where the constraint is the protected status of the religious site; a building suspected of housing combatants and noncombatants, where proportionality calculations gate the weapon choice; a convoy ambush in urban terrain. He runs them in a prototype implementation built on the MissionLab simulation environment from his lab at Georgia Tech and reports the governor refusing or modifying actions that the underlying tactical layer would otherwise have taken.
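The veto-only logic described above can be illustrated with a short sketch. This is not Arkin's implementation and does not reflect MissionLab's interfaces: every class, field, and constraint name below is hypothetical, the harm and necessity values are abstract numbers, and the guilt mechanism is reduced to a single threshold. The log is a stand-in for record-keeping, not a model of the responsibility advisor. The sketch assumes that a constraint is a predicate over a proposed action and that the governor can only withhold permission, never originate an action.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ProposedAction:
    """A candidate response from the tactical layer (all fields hypothetical)."""
    lethal: bool
    target_in_protected_site: bool = False
    target_has_surrendered: bool = False
    expected_collateral: float = 0.0  # abstract harm estimate
    military_necessity: float = 0.0   # abstract value estimate


# A constraint is a named predicate that returns True when it is VIOLATED.
Constraint = Tuple[str, Callable[[ProposedAction], bool]]

CONSTRAINTS: List[Constraint] = [
    ("protected_site", lambda a: a.target_in_protected_site),
    ("surrendered", lambda a: a.target_has_surrendered),
    ("proportionality", lambda a: a.expected_collateral > a.military_necessity),
]


@dataclass
class EthicalGovernor:
    guilt: float = 0.0            # simplified stand-in for the adaptor's guilt variable
    guilt_threshold: float = 1.0  # assumed cut-off; at or above it, lethal action is withheld
    log: List[str] = field(default_factory=list)

    def evaluate(self, action: ProposedAction) -> Tuple[bool, List[str]]:
        """Return (permitted, names of violated constraints). Can only veto."""
        if not action.lethal:
            return True, []  # non-lethal behaviour is not gated in this sketch
        violated = [name for name, is_violated in CONSTRAINTS if is_violated(action)]
        if self.guilt >= self.guilt_threshold:
            violated.append("guilt_threshold")
        permitted = not violated
        self.log.append(f"permitted={permitted} violated={violated}")
        return permitted, violated

    def record_outcome(self, predicted_harm: float, observed_harm: float) -> None:
        """Raise guilt only when the outcome was worse than predicted."""
        self.guilt += max(0.0, observed_harm - predicted_harm)
```

In this sketch, a lethal action proposed against a target inside a protected site is refused with the violated constraint named, and once accumulated guilt reaches the threshold every later lethal proposal is refused regardless of the other predicates. The design choice worth noting is that `evaluate` has no code path that creates an action, which mirrors the inhibit-only role the book assigns to the governor.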

The examples and evidence are where the book is densest and where listeners or readers tend to learn the most. Arkin catalogues incidents he treats as motivating: the Haditha killings in 2005, where U.S. Marines killed twenty-four Iraqi civilians; the conduct described in the Taguba report on Abu Ghraib; the recurring problem of friendly fire and civilian casualties in close air support. He details the systems already in the field — the Phalanx close-in weapon system on U.S. Navy ships, which has autonomous engagement modes against incoming missiles and was the subject of well-known incidents long before this book; the Patriot missile batteries that shot down a British Tornado and a U.S. F/A-18 during the 2003 invasion of Iraq, where the level of autonomy and the human operator’s role were both implicated; the Aegis combat system; the Israeli Harpy and its successor Harop, which can loiter over an area and dive on emitters matching a stored signature without further human authorisation; South Korea’s SGR-A1 sentry robot, deployed along the demilitarised zone with a claimed autonomous engagement mode. He treats these as existence proofs that the question is not whether to permit autonomous lethal systems but how to govern the ones already in service and the ones whose deployment is being planned. He cites the U.S. Army Unmanned Aircraft Systems Roadmap and the Joint Robotics Program documentation to show that increasing autonomy is an explicit objective of acquisition, not a speculative future. He walks through the laws of war as they would apply to specific cases — what counts as a military objective under Article 52 of Additional Protocol I, how the principle of distinction interacts with combatants who do not wear uniforms, how proportionality calculations are made in practice — and shows how each becomes a constraint expressible in the governor.

Reception in the field has been sharper than the prose of the book itself. Noel Sharkey, the British roboticist who became one of the founders of the Campaign to Stop Killer Robots, has been the most prominent critic, arguing in a series of papers and in his own later book that the discrimination problem (telling a combatant from a noncombatant in a real environment) is so far beyond current and foreseeable machine perception that Arkin’s project amounts to providing intellectual cover for systems that cannot meet the legal standard he sets for them. Human Rights Watch’s 2012 report Losing Humanity, written with the Harvard Law School International Human Rights Clinic, took a similar line and called for a pre-emptive ban on fully autonomous weapons. The International Committee for Robot Arms Control, formed in 2009 and including Sharkey, Jürgen Altmann, Peter Asaro and Robert Sparrow, has argued from a related position. Sparrow’s earlier paper “Killer Robots” raised the responsibility-gap problem that Arkin’s responsibility advisor is partly designed to answer: if an autonomous system kills a civilian, who is morally and legally responsible? Sparrow argued no satisfactory answer exists; Arkin’s response is that the architecture itself, combined with logged operator acknowledgements, distributes responsibility across designers, commanders and operators in ways the law can handle. The debate that followed at the United Nations Convention on Certain Conventional Weapons, where informal meetings of experts on lethal autonomous weapons systems began in 2014 and a Group of Governmental Experts has met since 2017, has largely been a continuation of the argument between the position Arkin staked out in this book and the position the campaigners staked out in response.

Other scholars have engaged more sympathetically with the engineering while remaining sceptical of the deployment timeline. Wendell Wallach and Colin Allen, whose Moral Machines appeared the same year, took a broader view of machine ethics that placed Arkin’s work as one approach among several to building artificial moral agents. Patrick Lin, working with George Bekey and Keith Abney on a 2008 report to the U.S. Office of Naval Research that overlapped substantially with the audience for this book, treated the ethical-governor approach as one of the serious options on the table. Within the U.S. defence establishment, the 2012 Department of Defense Directive 3000.09 on autonomy in weapon systems, which requires that autonomous and semi-autonomous weapon systems allow commanders and operators to exercise appropriate levels of human judgment, reflects something close to the human-in-the-loop position Arkin advocates, though without endorsing his specific architecture. The directive’s 2023 update kept the same posture.

For someone reading widely on AI in war today, the book sits at a particular spot in the literature: it is the book where the technical case for designing ethical constraints directly into autonomous weapons was made in full, by someone with the credentials to make it. It pairs naturally with Singer’s Wired for War, which gives the journalistic and political tour; with Krishnan’s Killer Robots, which takes the legal and arms-control view; with Sharkey’s later writing, which gives the opposing engineering perspective; and with Paul Scharre’s Army of None from 2018, which updates the policy picture with material on the autonomous drone swarms and AI-driven targeting that were not yet operational in 2009. It does not cover machine learning in any depth — the governor is symbolic, rule-based, and works over hand-specified predicates, which is both a strength when judged against the explainability demands of the laws of war and a weakness when judged against the actual direction the field has taken with deep learning. It does not anticipate the role commercial AI companies would come to play in defence, the controversies around Project Maven, or the use of off-the-shelf computer vision in loitering munitions in Ukraine. It is firmly a book of its decade in its examples, even where its argument has aged better than its case studies.

What is likely to age well is the framing. The proposition that ethical reasoning in autonomous weapons is an engineering problem that can be specified, implemented and tested — rather than a problem to be deferred until someone decides whether the weapons should exist — has become the working assumption inside every defence ministry that is now fielding autonomous systems in any meaningful number. The architectural distinctions Arkin draws between behaviour generation, ethical inhibition, affective modulation and responsibility tracking still map onto the components engineers actually build. What has aged less well is the confidence that the laws of war can be expressed as a tractable set of logical constraints over a perceptual world the robot accurately understands; the perception problem turned out to be harder than the ethics problem, and the ethics problem turned out to be harder than the 2009 framing implied. The book remains the clearest statement of a position that the field has spent the years since either building on or arguing against, and which the operational deployment of autonomous and semi-autonomous lethal systems in Nagorno-Karabakh, Libya, Gaza and Ukraine has made impossible to set aside.

Last researched 2026-05-22.