Chapter 11: Moral Reasoning as Optimal Search

RUNNING EXAMPLE — Priya’s Model

Priya’s decision tree is a pathfinding problem. Start: she knows the algorithm is biased. Goal: a morally defensible outcome. Available moves: fix quietly, escalate internally, report to the FDA, go to the press, resign, do nothing. Blocked nodes: anything violating her NDA carries high legal cost—a deontological constraint with heuristic h ≈ ∞. The ‘fix quietly’ path looks optimal by A*: low cost, no career risk. But her obligation vector points toward escalation. The heuristic gradient says the goal is closer via that harder path. Her reluctance is consequentialist search—minimizing path cost. Her instinct is the heuristic saying the cheaper path does not reach the right goal. Virtue ethics asks: what kind of engineer do I want to be? That is heuristic calibration.
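Priya’s options can be laid out as a toy A*-style evaluation. The cost numbers below are purely illustrative assumptions chosen to mirror the prose — they are not derived from the framework: ‘fix quietly’ is cheap now but far from the goal, escalation costs more now but is estimated closer, and the NDA-violating path is blocked with h = ∞.

```python
# Priya's options as (g, h) pairs. All numbers are hypothetical,
# chosen only to mirror the running example in the text.
options = {
    # action: (g = friction cost of the move so far,
    #          h = estimated remaining cost to a defensible outcome)
    "fix quietly":         (1.0, 8.0),          # cheap move, wrong goal
    "escalate internally": (4.0, 2.0),          # harder move, nearer goal
    "report to FDA":       (6.0, 1.0),
    "go to press":         (5.0, float("inf")), # NDA breach: blocked node
    "resign":              (3.0, 7.0),
    "do nothing":          (0.0, 9.0),
}

def f(option):
    """A* evaluation f(n) = g(n) + h(n) for one candidate action."""
    g, h = options[option]
    return g + h

best = min(options, key=f)  # the lowest-f node is expanded first
```

Under these assumed numbers the search prefers escalation over the quiet fix, exactly as the obligation vector in the vignette suggests.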

11.1 The Computational Problem

The preceding chapter established the geometry of moral dynamics: the covariant derivative, parallel transport, holonomy, geodesics, and curvature. These are the structural features of the moral landscape. But a landscape, however carefully mapped, does not by itself explain how agents navigate it. Geodesics are the paths of least resistance — the optimal trajectories through moral space. But how does an agent find a geodesic? How does a decision-maker, confronted with a morally complex situation, determine which path through the space of possible actions minimizes behavioral friction?

This chapter answers that question by establishing a formal equivalence between moral reasoning and optimal search. The claim is precise: moral reasoning is A* pathfinding on the moral manifold, where the accumulated behavioral friction serves as the backward cost, and moral obligations serve as pre-compiled heuristic functions that estimate the cost remaining to reach equilibrium. The equation f(n) = g(n) + h(n) — familiar to every AI researcher from Hart, Nilsson, and Raphael’s foundational 1968 paper — is the fundamental equation of moral reasoning.

The shift from geometry to computation is not merely a change of vocabulary. It resolves a deep puzzle: why does moral reasoning feel like search? Why do we deliberate, weigh alternatives, backtrack, and sometimes get stuck? The answer, as this chapter proves, is that the exact computation of the optimal moral geodesic is intractable — the curse of dimensionality on a nine-dimensional manifold makes brute-force computation impossible for any resource-bounded agent. Moral obligations exist because they must: they are pre-compiled heuristic approximations to an intractable optimization problem, and without them, real-time moral judgment would be computationally impossible.

11.2 A* Search on the Moral Manifold

We begin with the navigation problem. An agent occupies a state s₀ ∈ ℳ, the current moral configuration of a multi-agent system. The agent must choose a sequence of actions that moves the system toward a goal region G_eq — a region of low behavioral friction where obligations and interests are approximately aligned. The objective is to find the path γ from s₀ to G_eq that minimizes the total behavioral friction BF[γ] (§6.4).

Definition 11.1 (Moral Search Problem). [Modeling Axiom.] Given the moral manifold ℳ, a current state s₀ ∈ ℳ, and a goal region G_eq ⊂ ℳ of low behavioral friction, the moral search problem is to find the path γ* = argmin_γ BF[γ] from s₀ to G_eq, subject to the constraint that γ does not cross into forbidden strata (the constraint surfaces of §8.3).

Remark on G_eq. The goal region G_eq is formally defined as the subset of ℳ where the obligation vector field is approximately vanishing and behavioral friction is below threshold: G_eq = {s ∈ ℳ : ||Oᵐ(s)|| < ε, BF(s) < δ}. The parameters ε and δ are not fixed by the framework; they are set by the moral community or governance structure. Different choices of G_eq yield different search problems, and disagreement about what constitutes moral equilibrium maps to disagreement about these parameters — a feature the framework represents rather than resolves.
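As a minimal sketch, membership in G_eq is a two-threshold test on the obligation field and the friction scalar. The function name `in_goal_region` and the default values of `eps` and `delta` are illustrative assumptions; the remark above stresses that these parameters are set by the moral community, not by the framework.

```python
import numpy as np

def in_goal_region(O, BF, eps=0.1, delta=0.5):
    """G_eq membership test: ||O^mu(s)|| < eps and BF(s) < delta.

    O:  obligation vector at the state (array of length d)
    BF: behavioral friction scalar at the state
    eps, delta: community-set thresholds, not framework constants.
    """
    return bool(np.linalg.norm(O) < eps and BF < delta)
```

A state with a vanishing obligation field and low friction lies in G_eq; enlarging either threshold enlarges the goal region, which is exactly the parameter disagreement the remark describes.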

This is a constrained shortest-path problem on a stratified Riemannian manifold. The classical algorithm for optimal pathfinding with heuristic guidance is A* search.

Definition 11.2 (A* Moral Search). [Modeling Axiom.] The A* evaluation of a candidate state n in the search tree is:

f(n) = g(n) + h(n)

where:

g(n) = BF[γ_{s₀ → n}] is the exact accumulated behavioral friction of the path from the start state s₀ to n — the integrated cost of all actions taken so far. This is the backward cost, already defined as the behavioral friction functional (Definition 6.4, Equation 6.4).

h(n) is the heuristic function estimating the cheapest friction-path from n to the goal region G_eq. This is the forward estimate — the agent’s best guess of how much additional cost remains.

f(n) is the total estimated cost of the cheapest path from s₀ to G_eq passing through n.

A* expands the node with the lowest f(n) at each step. It is optimal — guaranteed to find the cheapest path — whenever h(n) is admissible (h never overestimates the true remaining cost); a second condition, consistency (a triangle inequality on h), guarantees in addition that no node ever needs to be re-expanded. Both conditions are defined precisely in §11.4.
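Definition 11.2 can be made concrete. The following is a minimal A* sketch on a discrete approximation of the manifold, with behavioral friction as the edge cost; the toy graph, heuristic values, and function names are illustrative assumptions, not part of the formal framework.

```python
import heapq

def a_star(start, goals, neighbors, cost, h):
    """Minimal A* on a discrete graph: expand lowest f = g + h.

    neighbors(n): successor states of n
    cost(n, m):   local friction cost c(n, m) of the transition
    h(n):         heuristic estimate of friction remaining to G_eq
    Returns (path, total_cost), or (None, inf) if the goal is unreachable.
    """
    frontier = [(h(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    while frontier:
        f_n, g_n, n, path = heapq.heappop(frontier)
        if n in goals:
            return path, g_n               # cheapest path (h admissible)
        if g_n > best_g.get(n, float("inf")):
            continue                       # stale frontier entry
        for m in neighbors(n):
            g_m = g_n + cost(n, m)
            if g_m < best_g.get(m, float("inf")):
                best_g[m] = g_m
                heapq.heappush(frontier, (g_m + h(m), g_m, m, path + [m]))
    return None, float("inf")

# Toy state graph with friction costs on each transition (illustrative).
graph = {"s": {"a": 1.0, "b": 4.0}, "a": {"b": 1.0, "g": 5.0},
         "b": {"g": 1.0}, "g": {}}
h_est = {"s": 2.0, "a": 2.0, "b": 1.0, "g": 0.0}   # admissible estimates
path, total = a_star("s", {"g"}, lambda n: graph[n],
                     lambda n, m: graph[n][m], lambda n: h_est[n])
```

The heuristic steers expansion toward "a" rather than the locally tempting direct edges, and the search returns the genuinely cheapest path.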

Why A* rather than its alternatives? The choice is not arbitrary; it reflects the structure of moral reasoning itself:

Dijkstra’s algorithm is A* with h = 0 — no heuristic guidance. The search is blind, expanding nodes in order of backward cost alone. In moral terms, this is pure consequentialism without moral intuition: evaluate every possibility by its consequences, with no prior sense of which directions are promising. It is optimal but computationally explosive.

Greedy best-first search uses only h(n), ignoring backward cost g(n). This is pure rule-following without attention to accumulated consequences: the agent follows its moral intuitions regardless of the costs already incurred. It is fast but not optimal — it can be misled by locally promising but globally costly paths.

Gradient descent (Chapter 10, §10.8) follows the steepest local improvement. It is myopic: it gets stuck at local optima, saddle points, and metric singularities. §10.8 identified these pathologies; the A* framework explains them: gradient descent is greedy search without backward-cost accounting.

A* combines exact backward cost with heuristic forward estimation. It is both optimal (given admissibility) and efficient (the heuristic prunes the search space). In moral terms, it balances consequentialist accounting of past actions with deontological guidance toward the goal. This is the computational content of the familiar observation that good moral reasoning requires both principles and consequences.

The physical analogy is to variational principles. Fermat’s principle (light follows the path of least time), Hamilton’s principle (systems follow the path of least action), and the geodesic equation (free particles follow paths of least curvature) all find optimal paths by combining local cost with global structure. A* is their computational implementation on a discrete approximation of the continuous manifold.

11.3 Obligation Vectors as Heuristic Functions

The central claim of this chapter is that moral obligations — the tangent vectors Oᵐ defined in Chapter 6 (§6.2) — are the gradients of heuristic functions in the A* framework. This is not a new definition of obligation vectors; it is a reinterpretation of the existing formalism that reveals its computational content.

Proposition 11.1 (Obligation-Heuristic Correspondence). [Modeling Axiom.] The obligation vector Oᵐ at a point p ∈ ℳ is the negative metric-raised gradient of the heuristic function h:

Oᵐ = −gᵐᵛ ∂_ν h

The obligation vector points in the direction of steepest descent of h — toward the estimated cheapest path to moral equilibrium.

This proposition reinterprets the obligation vectors of Chapter 6 within the A* framework. When Chapter 6 defined Oᵐ as a tangent vector encoding a “directional moral imperative,” the directionality was left as a structural feature. The A* framework explains what the direction means: it is the direction in which the heuristic function h decreases most rapidly — the direction that, according to the agent’s pre-compiled moral knowledge, most efficiently reduces the estimated cost of reaching equilibrium. Note that this correspondence is not stipulated arbitrarily. Obligation vectors were already defined in Chapter 6 as tangent vectors pointing in the direction of moral improvement. The heuristic gradient points in the direction of steepest estimated cost-reduction. If behavioral friction is the cost measure — as established in §6.4 — then these two directions necessarily coincide, and the modeling axiom reduces to a consistency requirement between the geometric and computational interpretations of the same underlying structure.
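Proposition 11.1 can be sketched numerically: given any scalar heuristic h and an inverse metric, the obligation vector is the negative metric-raised finite-difference gradient. The quadratic h below (squared distance to an equilibrium at the origin, under a Euclidean metric) is an illustrative assumption, not a framework-prescribed heuristic.

```python
import numpy as np

def obligation_vector(h, p, g_inv, dx=1e-5):
    """O^mu = -g^{mu nu} d_nu(h): negative metric-raised gradient,
    with partial derivatives taken by central finite differences at p."""
    grad = np.zeros(len(p))
    for nu in range(len(p)):
        e = np.zeros(len(p))
        e[nu] = dx
        grad[nu] = (h(p + e) - h(p - e)) / (2 * dx)
    return -g_inv @ grad

# With h = squared distance to equilibrium and a Euclidean metric,
# the obligation at (1, 0) points straight back toward equilibrium.
h_quadratic = lambda x: float(x @ x)
O = obligation_vector(h_quadratic, np.array([1.0, 0.0]), np.eye(2))
```

The resulting vector points in the direction of steepest descent of h — toward the estimated cheapest path to equilibrium, as the proposition states.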

Moral obligations such as “do not murder,” “keep your promises,” and the Golden Rule are highly optimized heuristic functions h(n). They allow an agent with limited computational power to instantly approximate the optimal geodesic through the social space. When an agent experiences a “moral obligation,” they are reading the output of an evolutionary heuristic designed to steer the system away from stratum boundaries — away from catastrophic regime transitions that would spike behavioral friction beyond recovery.

The Murder Prohibition as h(n) = ∞

Consider the prohibition of murder. In the A* framework, the heuristic assigns h_murder(n) = ∞ for any state n that involves killing. This is not a modeling failure; it is a correct encoding of the stratification from Chapter 8. Murder crosses a stratum boundary into a catastrophic regime — specifically, it triggers a Type III boundary (nullifier, §8.3), where the moral evaluation structure changes so drastically that the action is unconditionally prohibited regardless of the values on other dimensions.

The true cost g*(n, G_eq) of the cheapest path from a post-murder state back to equilibrium includes: (i) grief processing across all connected agents in the social network; (ii) institutional response — investigation, adjudication, incarceration; (iii) reputational damage propagation through the network; (iv) deterrence expenditure to prevent recurrence; (v) the irreversibility of death, which means that the recovery path may not exist within the catastrophic stratum at all. In the limit of network connectivity, these costs grow without bound, making g*(n, G_eq) = ∞ as well. The critical point is not merely that costs are large, but that the original goal region — G_eq as defined relative to the pre-murder configuration, which includes the victim being alive and unharmed — is genuinely unreachable. The irreversibility of death means that no path from the post-murder state returns to the original G_eq. The social network may eventually reach a different equilibrium, but the goal region that existed before the boundary crossing has been destroyed. This is a claim about path-specificity (the original destination no longer exists), not merely about cost-magnitude (the journey is expensive).

Hence h_murder(n) ≤ g*(n, G_eq), and the heuristic is admissible. The infinite value is not an approximation failure; it correctly reflects that the stratum boundary is a point of no return — the absorbing stratum of §8.3.

Promise-Keeping as a Finite Heuristic

Not all moral heuristics assign infinite cost. Promise-keeping provides a more nuanced example. The heuristic h_promise(n) estimates the trust-rebuilding cost if the promise is broken: the behavioral friction of repairing the damaged social bond, re-establishing reliability, and compensating for the reliance costs incurred by the promisee.

The empirical evidence from §17.2 supports this interpretation directly. The phrase “you promised” triggers a deontic phase transition with 94% effectiveness — a sharp, near-universal shift in moral evaluation that crosses stratum boundaries. In the A* framework, this phrase activates the promise-keeping heuristic, which reports a high (but finite) forward cost for states in which the promise is broken. The 94% effectiveness is a measurement of the heuristic’s sharpness: it is a well-calibrated approximation that fires reliably when the relevant stratum boundary is approached.

The finitude of h_promise is morally significant. Unlike murder, promise-breaking does not trigger a nullifier. The broken-promise state occupies a damaged but recoverable stratum: trust can be rebuilt, compensation can be offered, and the social bond can be repaired — at a cost. The heuristic correctly encodes this recoverability as a finite, large estimate rather than an infinite one.

The Evolutionary Origin

Why do these heuristic functions exist? The intractability result of §11.5 provides the answer: exact moral computation is intractable, so any resource-bounded agent that must make real-time moral decisions requires pre-compiled approximations. Biological evolution and cultural evolution have provided these approximations in the form of moral intuitions, social norms, and codified rules. They are the moral analogue of pattern databases in classical A* — offline computation (evolutionary selection, cultural refinement, legal codification) that enables fast online lookup (moral intuition, rule-following, legal compliance). A clarification is warranted: the A* framework is not an evolutionary debunking argument. The admissibility of a heuristic — the mathematical property h(n) ≤ g* — holds or fails independently of the heuristic’s causal history. An obligation that was shaped by natural selection, cultural transmission, or rational deliberation is admissible if and only if it does not overestimate the true cost. The evolutionary account explains how the heuristics were compiled; it does not determine whether they track the structure of moral space.

When an agent experiences the phenomenological force of a moral obligation — the sense that they must act in a certain way, the guilt of transgression, the relief of compliance — they are reading the output of these pre-compiled heuristic functions. The phenomenology is the interface; the computation is the A* evaluation f(n) = g(n) + h(n).

11.4 Admissibility and Consistency

The A* algorithm is optimal — guaranteed to find the cheapest path — whenever the heuristic function h is admissible; a second condition, consistency, guarantees in addition that no node is ever expanded twice. This section defines both conditions in the moral context and proves that the core moral heuristics satisfy them.

Definition 11.3 (Admissibility). [Formal Definition.] A moral heuristic h is admissible if, for every state n ∈ ℳ:

h(n) ≤ g*(n, G_eq)

where g*(n, G_eq) is the true minimum behavioral friction from n to the goal region G_eq. An admissible heuristic never overestimates the true cost to equilibrium. It is an optimistic lower bound.

Definition 11.4 (Consistency). [Formal Definition.] A moral heuristic h is consistent (or monotone) if, for every pair of neighboring states n, n′ ∈ ℳ:

h(n) ≤ c(n, n′) + h(n′)

where c(n, n′) is the local behavioral friction cost of the transition from n to n′. This is the triangle inequality on the metric space: the estimated cost from n cannot exceed the cost of one step plus the estimated cost from the next state. Consistency, together with h = 0 on the goal region, implies admissibility — a standard result — and ensures that A* never needs to re-expand a node.
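Both definitions are directly checkable on any finite discretization. The sketch below computes the true costs g*(n, G_eq) by exhaustive Bellman-style relaxation (adequate for a small graph) and then tests the two conditions; the toy graph and heuristic estimates are assumptions for illustration.

```python
def true_costs(goals, neighbors, cost, states):
    """g*(n, G_eq) for every state, by Bellman-style relaxation."""
    g_star = {n: (0.0 if n in goals else float("inf")) for n in states}
    for _ in states:                      # |states| passes suffice
        for n in states:
            for m in neighbors(n):
                g_star[n] = min(g_star[n], cost(n, m) + g_star[m])
    return g_star

def is_admissible(h, g_star):
    """Definition 11.3: h(n) <= g*(n, G_eq) for every state."""
    return all(h(n) <= g_star[n] for n in g_star)

def is_consistent(h, neighbors, cost, states):
    """Definition 11.4: h(n) <= c(n, m) + h(m) for every edge."""
    return all(h(n) <= cost(n, m) + h(m)
               for n in states for m in neighbors(n))

graph = {"s": {"a": 1.0, "b": 4.0}, "a": {"b": 1.0, "g": 5.0},
         "b": {"g": 1.0}, "g": {}}
h_est = {"s": 2.0, "a": 2.0, "b": 1.0, "g": 0.0}
g_star = true_costs({"g"}, lambda n: graph[n],
                    lambda n, m: graph[n][m], graph)
```

On this toy graph the estimates never exceed the true remaining costs and satisfy the edge-wise triangle inequality, so the heuristic is both admissible and consistent; inflating it uniformly breaks admissibility.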

Theorem 11.1 (Admissibility of Core Moral Heuristics; the Heuristic Truncation Theorem). [Conditional Theorem; conditional on the behavioral friction model of Chapter 6 and the Whitney stratification of Chapter 8.] The classical moral prohibitions (murder, theft, deception) and positive obligations (reciprocity, promise-keeping, care) are admissible heuristics in the A* sense: they never overestimate the true behavioral friction required to reach equilibrium from the states they prohibit or prescribe.

Proof sketch. We argue case by case. For murder: h_murder(n) = ∞ for states involving killing. As shown in §11.3, the true cost g*(n, G_eq) is also infinite (the stratum boundary is absorbing), so h ≤ g*. For theft: h_theft(n) estimates the cost of property-rights restoration, institutional response, and trust repair. The true cost includes all of these plus network externalities (other agents adjust their behavior), so g* ≥ h. For deception: h_deception(n) estimates the trust-rebuilding cost. Trust rebuilding is slow and path-dependent (Chapter 10, holonomy), and the true cost includes the uncertainty premium imposed by all agents who learn of the deception, so g* ≥ h. For reciprocity norms: the heuristic points toward cooperative equilibria, which are by definition the geodesics of lowest friction (Chapter 10, §10.7). The full proof, with detailed cost analysis for each heuristic using the stratum structure of Chapter 8, appears in the Technical Appendix. □

Proposition 11.2 (Consistency from Metric Compatibility). [Proved.] If h(n) = d_g(n, G_eq) — the geodesic distance from n to the goal region in the moral metric g_{μν} — then h is consistent.

Proof. By the triangle inequality for the Riemannian distance d_g: for any states n, n′ and goal region G_eq, d_g(n, G_eq) ≤ d_g(n, n′) + d_g(n′, G_eq). The local behavioral friction cost c(n, n′) is the arc length of the path from n to n′, which satisfies c(n, n′) ≥ d_g(n, n′) (arc length ≥ geodesic distance). Therefore h(n) = d_g(n, G_eq) ≤ d_g(n, n′) + d_g(n′, G_eq) ≤ c(n, n′) + h(n′), which is the consistency condition. □

Remark. Real moral heuristics approximate but do not exactly equal the geodesic distance. They are approximately consistent — the triangle inequality holds in most cases but may fail at stratum boundaries (where the metric changes discontinuously) or in exotic moral configurations (where the heuristic’s pre-compiled assumptions break down). These failures correspond to moral dilemmas: situations where the pre-compiled heuristic gives contradictory or unreliable guidance, and the agent must fall back on explicit moral reasoning — that is, on a more careful search through the local neighborhood of the manifold. The implications for optimality are bounded, not catastrophic. The literature on weighted A* and ε-admissible search (Pohl 1970; Pearl 1984) establishes that approximately consistent heuristics yield bounded-suboptimal paths: the solution cost is at most (1 + ε) times the true optimum, where ε reflects the degree of inconsistency. Real moral reasoning is therefore near-optimal — close to the geodesic, but not exactly on it. This is precisely the content of Corollary 11.2: heuristic imperfection is structural and bounded, not arbitrary.

The ε-Admissibility Spectrum and the Heuristic Hierarchy

The preceding remark acknowledges that real moral heuristics may overestimate. This section formalizes the consequences. The key insight is that overestimation is not binary (admissible vs. inadmissible) but falls on a spectrum with distinct moral signatures at each level.

Definition 11.5 (ε-Admissible Moral Heuristic). [Formal Definition.] A moral heuristic h is ε-admissible if, for all states n ∈ ℳ: h(n) ≤ (1 + ε) · g*(n, G_eq), where g*(n, G_eq) is the true minimum behavioral friction from n to the goal region. When A* search uses an ε-admissible heuristic, the returned path has total cost at most (1 + ε) times the optimal path cost. The parameter ε quantifies the degree of moral conservatism: ε = 0 yields perfect optimality; small ε yields near-optimal paths at reduced computational cost; large ε yields fast but potentially suboptimal moral judgments.

Remark (Connection to Weighted A*). In the search literature, ε-admissibility is achieved by weighted A*: f(n) = g(n) + w · h(n), where w = 1 + ε. The weight w trades optimality for computational tractability. This maps directly onto moral reasoning: a time-pressured agent (emergency physician, crisis responder) implicitly increases w — relying more heavily on heuristic judgment and less on deliberate calculation — accepting a bounded-suboptimal moral path in exchange for the speed needed to act. The bounded suboptimality guarantee ensures that this is not moral recklessness but a principled trade-off with a quantifiable cost ceiling (Pohl 1970; Pearl 1984).
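The weighted variant differs from plain A* only in its evaluation function. A minimal sketch follows (the graph and costs are illustrative assumptions); with an admissible h and weight w = 1 + ε, the returned cost is provably at most w times the optimum.

```python
import heapq

def weighted_a_star(start, goals, neighbors, cost, h, w=1.0):
    """A* with f(n) = g(n) + w*h(n). The weight w = 1 + eps trades
    optimality for speed: returned cost <= w * optimum (admissible h)."""
    frontier = [(w * h(start), 0.0, start)]
    best_g = {start: 0.0}
    while frontier:
        _, g_n, n = heapq.heappop(frontier)
        if n in goals:
            return g_n
        for m in neighbors(n):
            g_m = g_n + cost(n, m)
            if g_m < best_g.get(m, float("inf")):
                best_g[m] = g_m
                heapq.heappush(frontier, (g_m + w * h(m), g_m, m))
    return float("inf")

graph = {"s": {"a": 1.0, "b": 4.0}, "a": {"b": 1.0, "g": 5.0},
         "b": {"g": 1.0}, "g": {}}
h_est = {"s": 2.0, "a": 2.0, "b": 1.0, "g": 0.0}
args = ("s", {"g"}, lambda n: graph[n], lambda n, m: graph[n][m],
        lambda n: h_est[n])
exact = weighted_a_star(*args, w=1.0)   # optimal cost
hasty = weighted_a_star(*args, w=2.0)   # bounded-suboptimal cost
```

The time-pressured agent of the remark corresponds to w > 1: fewer nodes are weighed before acting, and the bounded-suboptimality guarantee caps the moral cost of the shortcut.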

Theorem 11.1b (The Moral Heuristic Hierarchy). [Conditional Theorem; conditional on the behavioral friction model of Chapter 6 and the stratification of Chapter 8.] Moral heuristics fall into four categories with distinct phenomenological signatures:

Strictly admissible (h(n) ≤ g*(n, G_eq)): The heuristic underestimates or correctly estimates the true moral cost. The A* path is optimal. These are the core prohibitions whose true costs genuinely exceed any finite estimate — murder (true cost includes absorbing-stratum penalties, legal consequences, social ostracism, and psychological damage), non-consensual treatment (true cost includes rights violation, institutional liability, trust destruction). The proof of Theorem 11.1 establishes this category.

ε-admissible (h(n) ≤ (1 + ε) · g*(n, G_eq), small ε): The heuristic slightly overestimates the true boundary cost. The A* path is near-optimal, with total cost at most (1 + ε) times the true optimum. This is the normal operating range of a well-calibrated moral agent. The slight conservatism — marginally overweighting harm avoidance, slightly inflating the cost of promise-breaking — produces the well-documented phenomenon that experienced moral reasoners are slightly more cautious than a pure expected-value calculation would dictate, at negligible cost to overall moral performance. Evolutionary and cultural selection pressures calibrate ε to be small but positive: an agent with ε = 0 (perfect calibration) is fragile (any perturbation makes the heuristic inadmissible), while an agent with small positive ε is robust.

Inadmissible (h(n) ≫ g*(n, G_eq)): The heuristic substantially overestimates boundary costs. The A* path avoids genuinely beneficial actions because the inflated heuristic makes them appear too costly. This category has a precise clinical and phenomenological signature: it is the mathematical structure of moral injury, trauma, and pathological risk aversion. An agent whose heuristic has been damaged by forced boundary crossing (the moral injury formalized in the domain literature) has inflated β_k values — they overestimate the cost of actions near the violated boundary, avoiding even beneficial actions in that region. Defensive behavior (defensive medicine, defensive lawyering, institutional risk aversion) is β_institutional ≫ β*_institutional: the institution’s heuristic overweights legal and reputational cost relative to its true value, foreclosing beneficial actions.

Gauge-variant: The heuristic’s output depends on the framing or description of the moral situation rather than on the situation’s attribute vector. This violates the Bond Invariance Principle (Chapter 5, §5.5): the heuristic is not invariant under admissible re-descriptions. Examples include omission bias (harmful action judged as worse than equally harmful inaction, even when the attribute vectors are identical), status quo bias (framing the current state as the reference point inflates the cost of departure), anchoring to initial information, and order effects in moral judgment. These are the phenomena that genuinely deserve the label ‘cognitive bias’ in moral reasoning — not because the heuristic overestimates, but because it responds to morally irrelevant features of the description. The distinction matters: categories (i)–(iii) are calibration errors (the heuristic estimates the right quantity, more or less accurately); category (iv) is a gauge failure (the heuristic estimates the wrong quantity).

Proof sketch. Category (i) follows from Theorem 11.1 (the Heuristic Truncation Theorem). Category (ii) follows from the bounded suboptimality guarantee of ε-admissible A* (Pohl 1970, Theorem 2; Pearl 1984, §4.2): if h(n) ≤ (1 + ε) · g*, then f(γ) ≤ (1 + ε) · f(γ*). Category (iii) is defined by the negation of (i)–(ii): when ε exceeds a threshold, the heuristic forecloses paths whose true cost is acceptable, producing systematic moral avoidance. Category (iv) is defined by violation of gauge invariance (BIP, Chapter 5): h(τ(n)) ≠ h(n) for some admissible transformation τ. The four categories are mutually exclusive and exhaustive for any heuristic function on ℳ. □

Remark (Heuristic Recalibration). The hierarchy has a dynamic interpretation. An agent’s heuristic can move between categories over time. Moral education and professional training move uncalibrated heuristics from category (iii) or (iv) toward categories (i)–(ii) — this is what it means for clinical training, legal education, or military ethics instruction to ‘calibrate moral judgment.’ Conversely, moral injury (forced boundary crossing that inflates β_k) moves a previously well-calibrated heuristic from category (ii) to category (iii) — the mathematical signature of moral damage. Ethics consultation, peer review, and therapeutic intervention function as recalibration mechanisms: they restore inadmissible heuristics toward ε-admissibility by correcting inflated boundary penalties to values closer to their true costs. The framework predicts that effective moral repair must target specific β_k values (not general ‘resilience’), and that the dimension of repair must match the dimension of injury.
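One way to operationalize the hierarchy, as a hedged sketch: check gauge invariance first (category iv is a failure of a different kind from miscalibration), then the two calibration bounds. The function name, threshold, and example values are assumptions for illustration, not framework-mandated quantities.

```python
def classify_heuristic(h, g_star, eps=0.1, gauge_pairs=()):
    """Classify h into the four categories of Theorem 11.1b.

    g_star:      dict of true costs g*(n, G_eq) per state
    gauge_pairs: (n, tau_n) pairs of admissible re-descriptions of the
                 same situation; description-dependence = gauge failure.
    """
    if any(h(n) != h(t) for n, t in gauge_pairs):
        return "gauge-variant"                      # category (iv)
    if all(h(n) <= g_star[n] for n in g_star):
        return "strictly admissible"                # category (i)
    if all(h(n) <= (1.0 + eps) * g_star[n] for n in g_star):
        return "eps-admissible"                     # category (ii)
    return "inadmissible"                           # category (iii)

g_star = {"x": 10.0}
well_calibrated = classify_heuristic(lambda n: 9.0, g_star)
conservative = classify_heuristic(lambda n: 10.5, g_star)
injured = classify_heuristic(lambda n: 50.0, g_star)  # inflated boundary cost
framed = classify_heuristic({"x": 9.0, "x2": 4.0}.get, g_star,
                            gauge_pairs=[("x", "x2")])
```

Recalibration in the sense of the remark is the process of moving a heuristic from the third or fourth return value toward the first two.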

11.5 Computational Intractability of Exact Moral Reasoning

We now prove the computational result that justifies the entire A* framework: exact moral geodesic planning is intractable. This result explains why moral heuristics exist — they are not cultural artifacts but computational necessities.

Theorem 11.2 (Intractability of Exact Moral Geodesic Planning). [Conditional Theorem; conditional on the 9-dimensional moral manifold of Chapter 5 and the behavioral friction model of Chapter 6.] Finding the exact optimal path on the moral manifold ℳ is computationally intractable in the dimensionality of the manifold.

Proof. The proof proceeds by analyzing the computational cost of discretized geodesic planning on a d-dimensional stratified Riemannian manifold.

Step 1: Discretization. To compute geodesics numerically, the continuous manifold ℳ must be approximated by a discrete graph. To achieve accuracy ε on a d-dimensional manifold, a standard grid discretization requires N ∝ (1/ε)ᵈ vertices per stratum. For d = 9 (the dimensionality of the moral manifold, Definition 5.1), this gives N ∝ (1/ε)⁹ vertices per stratum.

Step 2: Per-stratum complexity. On a graph with N vertices and O(N²) edges (the connectivity of the discretized manifold), the shortest-path algorithm (Dijkstra) runs in O(N² log N) time. Each edge-weight evaluation requires O(d²) operations — 81 for d = 9 — for the metric tensor contraction. With m strata, the total is O(mN² · d² · log(mN)) — polynomial in N, m, and d. This is the result of Bond (2026, SGE), Theorem 5.3.

Step 3: The curse of dimensionality. Substituting N ∝ (1/ε)⁹, the total complexity becomes O(m · (1/ε)¹⁸ · d² · log(m(1/ε)⁹)). For practical accuracy ε = 0.01, N ≈ 10¹⁸ vertices per stratum. The total vertex count across m strata is on the order of m · 10¹⁸, and the edge count O(N²) is on the order of m · 10³⁶. Even with m = 1, a single Dijkstra evaluation requires on the order of 10³⁶ metric evaluations — vastly exceeding the computational capacity of any physical system operating in real time.

Step 4: Multi-agent and temporal extensions. In a multi-agent setting with n stakeholders, the state space grows by a factor of n (each stakeholder contributes a perspective). With a temporal horizon of T time steps and stochastic dynamics (branching factor b at each step), the effective search space grows by a factor of bᵀ. The overall complexity is O(m · (1/ε)¹⁸ · n · bᵀ · d² · log(mN)) — exponential in both the dimensionality d and the temporal horizon T.

Conclusion. While the per-vertex computation is polynomial (SGE Theorem 5.3), the number of vertices required for adequate accuracy on a 9-dimensional manifold is astronomically large. Exact moral geodesic planning is intractable not because the algorithm is inefficient, but because the space is too large to discretize adequately. This is the curse of dimensionality applied to moral reasoning. □
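The arithmetic of Steps 1–3 can be checked directly; the counts below are order-of-magnitude estimates under the grid-discretization assumption of the proof.

```python
import math

def grid_vertices(eps, d):
    """Step 1: N ~ (1/eps)^d vertices for accuracy eps in d dimensions."""
    return (1.0 / eps) ** d

def dijkstra_ops(eps, d, m=1):
    """Steps 2-3: order of m * N^2 * d^2 * log(m*N) operations."""
    N = grid_vertices(eps, d)
    return m * N ** 2 * d ** 2 * math.log(m * N)

N = grid_vertices(0.01, 9)    # ~1e18 vertices per stratum
ops = dijkstra_ops(0.01, 9)   # on the order of 1e39 operations per search
```

Doubling the resolution or adding a dimension multiplies these counts geometrically, which is the curse of dimensionality the theorem invokes.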

A note on the nature of the intractability claim. The argument above establishes practical intractability — the resource requirements exceed any physically realizable computation — rather than complexity-theoretic hardness in the sense of NP-completeness or PSPACE-completeness. These are distinct but complementary notions. For the purpose of this framework, practical intractability is the relevant concept: the question is whether any resource-bounded agent (biological or artificial) can compute exact moral geodesics in real time, and the answer is no, regardless of algorithmic cleverness. The curse of dimensionality is intrinsic to the 9-dimensional manifold: even adaptive discretization methods (which concentrate resolution near stratum boundaries and use coarser grids elsewhere) reduce constants but cannot eliminate the exponential dependence on d. The fundamental bottleneck is not the algorithm but the volume of the space that must be explored.

Corollary 11.1 (Evolutionary Necessity of Moral Heuristics). [Conditional Theorem.]

Since exact moral computation is intractable, any resource-bounded agent — biological or artificial — that must make real-time moral decisions requires pre-compiled heuristic functions. The existence of moral obligations is a computational necessity, not merely a cultural artifact.

Corollary 11.2 (Heuristic Imperfection Is Structural). [Conditional Theorem.]

Since moral heuristics are finite approximations to an intractable problem, they are necessarily imperfect. Moral error — cases where obligation-following leads to suboptimal outcomes — is a structural consequence of computational intractability, not a defect of moral cognition. This resolves a longstanding puzzle in moral philosophy: why do good people sometimes do wrong? Not because they are morally deficient, but because they are using heuristics for a problem that admits no tractable exact solution. Importantly, structural explanation does not eliminate moral responsibility. An agent remains accountable for the calibration of their heuristics — the virtue ethics interpretation developed in §11.8 below — and for the decision to invoke deliberative reasoning (System 2) when stakes are high or when heuristic guidance is manifestly unreliable. The corollary explains the mechanism of moral error, not its moral status.

11.6 The Satisfaction Scalar as Search Output

The A* framework provides a new interpretation of the fundamental contraction formula from Chapter 6. The satisfaction scalar S = I_μ Oᵐ (§6.4, Definition 6.3) is the inner product of what an agent is obligated to do (the obligation vector Oᵐ, the heuristic gradient) with what the affected parties need (the interest covector I_μ). In the A* framework, this contraction is the moment when the search yields a decision.

Proposition 11.3 (Satisfaction as Directional Derivative of the Heuristic). [Proved.] The satisfaction scalar is the directional derivative of the heuristic function h in the interest direction:

S = I_μ Oᵐ = −I_μ gᵐᵛ ∂_ν h

Proof. By Proposition 11.1, Oᵐ = −gᵐᵛ ∂_ν h. Substituting into the satisfaction formula (Definition 6.3), S = I_μ Oᵐ = I_μ(−gᵐᵛ ∂_ν h) = −I_μ gᵐᵛ ∂_ν h. This is the contraction of the interest covector with the metric-raised gradient of h, which is by definition the directional derivative of h in the direction determined by I_μ through the metric. □

When S > 0, the heuristic’s recommended direction (the obligation) aligns with the patient’s needs (the interest): following the moral rule serves the stakeholder. When S < 0, obligation and interest conflict: the rule prescribes an action that, in this context, works against the stakeholder’s needs. When S = 0, the action is morally neutral along the measured axes.
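Proposition 11.3 reduces to a single contraction. A numerical sketch (the Euclidean metric and the example vectors are assumptions chosen for illustration):

```python
import numpy as np

def satisfaction(I, g_inv, grad_h):
    """S = -I_mu g^{mu nu} d_nu(h): directional derivative of the
    heuristic h along the interest covector, signed so that S > 0
    means obligation and interest align."""
    return float(-I @ g_inv @ grad_h)

# h increases in the +x direction (grad_h = [2, 0]), so the obligation
# vector points toward -x. A stakeholder whose interest also points
# toward -x yields S > 0 (alignment); an opposed interest yields S < 0.
grad_h = np.array([2.0, 0.0])
aligned = satisfaction(np.array([-1.0, 0.0]), np.eye(2), grad_h)
opposed = satisfaction(np.array([+1.0, 0.0]), np.eye(2), grad_h)
```

The sign of the scalar, not its raw magnitude, carries the moral reading given in the text.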

Worked Example: The Kidney Allocation Revisited

Chapter 7 examined a kidney allocation decision at five levels of geometric sophistication. We can now recognize that the Chapter 7 analysis was A* search throughout — it simply was not identified as such.

At Level 1 (scalar), the analysis computed a single score for each patient — equivalent to a degenerate A* search with a one-dimensional heuristic. At Level 2 (obligation vector), the analysis computed Oᵐ for each allocation option — the heuristic gradient in the search framework. At Level 3 (multi-agent rank-2 tensor), the analysis tracked separate interest covectors I_μ for each stakeholder — multiple A* evaluations, one per perspective. At Level 4 (the metric), the analysis used g_{μν} to compute the behavioral friction cost of each option — the g(n) term in A*. At Level 5 (stratification), the analysis identified stratum boundaries where certain allocations trigger deontic phase transitions — the constraint surfaces that the A* search must avoid.

The five-level progression of Chapter 7 is the progressive enrichment of the A* search: from a trivial search (scalar, one-dimensional, no structure) to a full search on the stratified manifold (tensorial, nine-dimensional, with metric, stratification, and gauge constraints). Each level adds computational structure that makes the search more accurate but also more expensive — the tradeoff at the heart of the intractability result (Theorem 11.2).

The Inevitability of Residue (Chapter 15, Proposition 15.1) applies directly: each level of the Chapter 7 analysis that collapses structure (from tensor to vector, from vector to scalar) discards information about the search space. The information lost is precisely the information about alternative paths that the collapsed representation cannot encode. The A* framework makes this loss concrete: it is the pruning of the search tree at each contraction step.

11.7 Connections to Existing Framework Elements

The A* interpretation does not stand alone. It connects to and illuminates several existing elements of the geometric ethics framework.

Gradient Flows as Degenerate A*

Chapter 10 (§10.8) introduced gradient flows: the agent follows the direction of steepest improvement of the satisfaction function S. The gradient flow is grad S = gᵐᵛ ∂S/∂xᵛ. In the A* framework, gradient flow corresponds to greedy best-first search with g = 0: the agent follows the heuristic’s recommendation (obligation) without accounting for accumulated cost. The pathologies identified in §10.8 — local maxima, saddle points, metric sensitivity — are exactly the pathologies of searchless optimization. The remedy is the same in both cases: incorporate backward cost (the g(n) term) to escape local optima.

Geodesics as A* Solutions

The geodesic of Chapter 10 (§10.7) is the optimal path found by A* with a perfect heuristic. If h = g* (the true cost-to-go), then A* finds the exact geodesic on the first expansion. Since computing g* requires solving the full optimization problem (which is intractable by Theorem 11.2), real agents use approximate heuristics h ≈ g*, and the A* path approximates but does not exactly follow the geodesic. The quality of the approximation depends on the quality of the heuristic — that is, on the calibration of the agent’s moral intuitions.

Holonomy and Heuristic Drift

Chapter 10 (§10.5) showed that parallel transport of obligations around a closed loop may change them — the holonomy of the moral connection. In the A* framework, holonomy means that the heuristic function h drifts as the agent traverses moral space: the pre-compiled moral intuitions that were accurate at the starting point may become miscalibrated after a circuit through diverse moral experiences. This explains why moral codes require updating through legal reform, cultural evolution, and moral education — the heuristics lose accuracy as the landscape changes, and must be recalibrated.

Contraction and Information Loss

The Inevitability of Residue (Chapter 15, Proposition 15.1) proves that contracting a moral tensor to a scalar is irreversibly lossy. The A* framework makes this loss precise: it is the information about the full search tree that is discarded when the agent commits to a single path. The satisfaction scalar S = I_μ Oᵐ is the shadow of the search tree projected onto a single dimension. The residue (§15.3) is the record of paths not taken — the moral possibilities that were evaluated and rejected in the search process.

11.8 Implications for Moral Philosophy

The A* framework provides computational interpretations of the major ethical traditions, explaining their strengths, limitations, and domains of applicability.

Deontology as Pre-Compiled Heuristics

Deontological rules (“do not lie,” “keep your promises,” “respect autonomy”) are pre-compiled heuristic functions. They work because they are admissible — they never overestimate the true cost of the actions they prohibit — and because they are fast: evaluating h(n) requires constant time, independent of the size of the search space. They fail when the heuristic is inconsistent (edge cases where following the rule increases total cost, such as the murderer-at-the-door who asks where your friend is hiding) or when the goal region has shifted (moral progress makes old heuristics obsolete, as with the historical acceptance of slavery, which reflected a miscalibrated heuristic that assigned insufficient cost to rights violations).

Consequentialism as Exhaustive Search

Pure consequentialism attempts to compute g* directly — the exact optimal path from the current state to equilibrium, evaluated by its consequences. This is equivalent to running A* with h = 0, which is Dijkstra’s algorithm on the full moral manifold. By Theorem 11.2, this is intractable. Consequentialist reasoning is computationally expensive, which is why it is typically reserved for novel or high-stakes situations where the pre-compiled heuristics are clearly insufficient — exactly as predicted by the dual-process model of moral cognition.
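The computational price is measurable. On a toy unit-cost grid (a stand-in for a discretized patch of the manifold), running the same search loop with h = 0 (that is, Dijkstra) expands every state, while an informative admissible heuristic (here, Manhattan distance) reaches the same optimal cost after expanding a small fraction of them:

```python
import heapq

def search(size, start, goal, h):
    """A* on a size x size unit-cost grid. Passing h = lambda n: 0
    removes the heuristic term, which is exactly Dijkstra's algorithm.
    Ties on f are broken toward deeper nodes (larger g)."""
    frontier = [(h(start), 0, 0, start)]
    best = {start: 0}
    expanded = 0
    while frontier:
        f, _, g, node = heapq.heappop(frontier)
        if g > best[node]:
            continue                      # stale queue entry
        expanded += 1
        if node == goal:
            return g, expanded
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size:
                g2 = g + 1
                if g2 < best.get(nxt, float("inf")):
                    best[nxt] = g2
                    heapq.heappush(frontier, (g2 + h(nxt), -g2, g2, nxt))
    return float("inf"), expanded

size, start, goal = 20, (0, 0), (19, 19)
manhattan = lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1])

cost_d, exp_d = search(size, start, goal, lambda n: 0)   # h = 0: Dijkstra
cost_a, exp_a = search(size, start, goal, manhattan)     # pre-compiled h
print(cost_d, cost_a)   # 38 38: both find the optimal cost
print(exp_d, exp_a)     # Dijkstra expands all 400 states; A* far fewer
```

Both strategies navigate the same space to the same answer; the heuristic changes only how much of the space must be examined, which is the sense in which deontological shortcuts are computationally load-bearing.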

Virtue Ethics as Heuristic Calibration

A virtuous agent, in the A* framework, is one whose heuristic functions are well-calibrated: the obligations they experience are not merely admissible (never overestimating) but tight (close to the true cost g*). A tighter heuristic prunes more of the search space, making the agent’s moral reasoning faster and more accurate. Aristotle’s doctrine of habituation — that virtue is developed through practice — is the process of refining heuristics through experience: each moral decision provides feedback that tightens the heuristic’s approximation. The “mean between extremes” is the region where the heuristic is most accurate: neither overestimating (excessive caution, moral paralysis) nor underestimating (recklessness, moral blindness).

Dual-Process Moral Cognition

The A* framework maps naturally onto the dual-process model of moral cognition (Kahneman’s System 1 and System 2). System 1 — fast, automatic, intuitive — is the heuristic evaluation h(n): the pre-compiled obligation that fires instantly when a morally relevant situation is detected. System 2 — slow, deliberate, effortful — is the full A* search: the explicit computation of backward cost g(n), forward estimate h(n), and their combination f(n) = g(n) + h(n), applied iteratively across the search tree. System 2 is invoked when System 1 fails — that is, when the heuristic is unreliable (near stratum boundaries), when the moral situation is novel (the heuristic was not pre-compiled for this case), or when the stakes are high enough to justify the computational cost.
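The dispatch logic can be sketched as a simple guard around the cached heuristic. The predicates, thresholds, and cost values below are hypothetical stand-ins for the triggers named in the text, not part of the formal framework:

```python
def decide(state, h, full_search):
    """Illustrative dispatcher: fall back to deliberate search (System 2)
    when the heuristic is unreliable (near a stratum boundary), the case
    is novel, or the stakes are high; otherwise follow the pre-compiled
    heuristic (System 1). Thresholds are illustrative."""
    if state["boundary"] or state["novelty"] > 0.8 or state["stakes"] > 0.9:
        return "system2", full_search(state)
    return "system1", min(state["actions"], key=lambda a: h(state, a))

# Toy states and a toy heuristic table (all values hypothetical).
routine = {"actions": ["keep_promise", "break_promise"],
           "novelty": 0.1, "stakes": 0.2, "boundary": False}
crisis  = {"actions": ["report", "stay_silent"],
           "novelty": 0.9, "stakes": 0.95, "boundary": True}

h = lambda s, a: {"keep_promise": 1.0, "break_promise": 8.0,
                  "report": 3.0, "stay_silent": 7.0}[a]
# Stand-in for the full A* search; here it just evaluates exhaustively.
full_search = lambda s: min(s["actions"], key=lambda a: h(s, a))

print(decide(routine, h, full_search))  # ('system1', 'keep_promise')
print(decide(crisis, h, full_search))   # ('system2', 'report')
```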

A crucial clarification. The computational interpretations above do not reduce deontology to a crude approximation of consequentialist calculation, nor do they privilege consequentialism as the “true” form of moral reasoning. The cost function in the A* framework is behavioral friction — a geometric quantity defined by the moral metric (Chapter 6), which encodes rights violations, autonomy costs, fairness deficits, and other structural features of moral space alongside consequences. It is emphatically not a utilitarian welfare function. Both deontological and consequentialist reasoning are computational strategies for navigating the same underlying geometric structure, and the structure is prior to both. The framework does not adjudicate between them; it explains why both exist and where each is most effective. Kant’s categorical imperative, read geometrically, tests the invariance of moral maxims under coordinate transformations (§5.5) — a gauge constraint on the search, not a heuristic approximation to consequence-calculation.

11.9 Worked Example: The Self-Driving Car

We illustrate the full A* framework with a self-driving car scenario that connects to the DEME architecture of Chapter 19.

Setup. An autonomous vehicle approaches an intersection. A pedestrian is crossing against the light on the left. A cyclist is in the bike lane on the right. Oncoming traffic is approaching. The vehicle must choose a path through moral configuration space: brake hard (high friction for passengers, low friction for pedestrian), swerve left (high friction for oncoming traffic), swerve right (high friction for cyclist), or continue (catastrophic friction for pedestrian).

State space. Each candidate action moves the system to a new state n_i ∈ ℳ, characterized by its position in the nine-dimensional moral manifold: consequences (physical harm to each party), rights (right-of-way violations), fairness (distribution of risk), autonomy (override of pedestrian’s choice to cross), privacy (irrelevant in this case), societal impact (precedent-setting), virtue (the character of the decision-maker), legitimacy (traffic law compliance), and epistemic status (uncertainty about the pedestrian’s intent).

Backward cost g(n_i). For each candidate state, the accumulated behavioral friction of the path taken so far. This includes the costs of previous driving decisions, the current speed and trajectory, and the commitments implied by the vehicle’s position on the road.

Heuristic h(n_i). The pre-compiled obligations: (i) do not harm the pedestrian (h = very high for n_continue; illustratively, h ≈ 95); (ii) do not harm the cyclist (h = high for n_swerve_right; h ≈ 70); (iii) minimize harm to passengers (h = moderate for n_brake_hard; h ≈ 30); (iv) obey traffic law (h = moderate for n_swerve_left, which enters the oncoming lane; h ≈ 55).

A* evaluation. f(n_i) = g(n_i) + h(n_i) for each candidate. With illustrative backward costs g(n_continue) ≈ 5, g(n_brake_hard) ≈ 15, g(n_swerve_right) ≈ 10, g(n_swerve_left) ≈ 12, the total evaluations are: f(n_continue) ≈ 100, f(n_swerve_right) ≈ 80, f(n_swerve_left) ≈ 67, f(n_brake_hard) ≈ 45. The A* algorithm selects n_brake_hard: moderate harm to passengers (the deceleration cost is finite and recoverable), minimal harm to the pedestrian (the vehicle stops before the crossing), and no stratum boundary crossing (no one is killed or seriously injured). The scalar comparison shows why a single “harm score” would miss the structure: swerving right has lower total harm (the cyclist is uninjured in most scenarios) but higher obligation-violation cost (the cyclist has the right of way), so the full A* evaluation correctly prefers braking over swerving.
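The evaluation can be reproduced directly from the illustrative numbers above:

```python
# The chapter's illustrative g and h values for the four candidates.
g = {"continue": 5, "brake_hard": 15, "swerve_right": 10, "swerve_left": 12}
h = {"continue": 95, "brake_hard": 30, "swerve_right": 70, "swerve_left": 55}

f = {a: g[a] + h[a] for a in g}          # f(n) = g(n) + h(n)
choice = min(f, key=f.get)               # A* selects the lowest total cost

print(sorted(f.items(), key=lambda kv: kv[1]))
# [('brake_hard', 45), ('swerve_left', 67), ('swerve_right', 80), ('continue', 100)]
print(choice)                            # brake_hard
```

Note that a scalar "harm score" would rank swerve_right above brake_hard; only the split into accumulated cost g and obligation-violation estimate h recovers the correct ordering.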

The DEME architecture (Chapter 19) implements exactly this A* search. The EthicalFacts layer canonicalizes the sensory input into the nine-dimensional state description. The Ethics Modules compute h(n_i) for each candidate action, using pre-compiled obligation vectors. The Governance layer combines g(n_i) and h(n_i) into f(n_i) and selects the action with the lowest total cost. The audit layer records the full search tree for post-hoc inspection. The No Escape Theorem (Chapter 18, Theorem 18.1) ensures that the search cannot be bypassed through representational manipulation.

11.10 Summary

This chapter has established a formal equivalence between moral reasoning and A* pathfinding on the moral manifold. The principal results are:

Moral reasoning is A* search. The equation f(n) = g(n) + h(n), where g is accumulated behavioral friction and h is a heuristic estimating cost-to-equilibrium, is the fundamental equation of moral reasoning (Definition 11.2).

Obligation vectors are heuristic gradients. The obligation vectors Oᵐ of Chapter 6 are the negative metric-raised gradients of the heuristic function h (Proposition 11.1). Moral obligations point in the direction of steepest estimated improvement.

Core moral heuristics are admissible. The classical prohibitions (murder, theft, deception) and positive obligations (reciprocity, care) never overestimate the true cost to equilibrium (Theorem 11.1). This is why rule-following is generally reliable.

Exact moral computation is intractable. The curse of dimensionality on the 9-dimensional moral manifold makes exact geodesic planning intractable for any resource-bounded agent (Theorem 11.2). Moral heuristics are computationally necessary, not merely culturally convenient.

Moral error is structural. Since heuristics approximate an intractable problem, they are necessarily imperfect (Corollary 11.2). When good people do wrong, the explanation is computational, not characterological.

The A* framework unifies several threads of the preceding chapters: geodesics (§10.7) are the solutions to the search problem; gradient flows (§10.8) are degenerate searches without backward cost; obligation vectors (§6.2) are heuristic gradients; and Inevitability of Residue (Chapter 15, Proposition 15.1) quantifies the information lost when the search tree is contracted to a scalar decision.

The framework also reveals a new requirement: for the A* search to yield consistent results, the heuristic function h(n) must be invariant under re-description — the same moral state, labeled differently, must receive the same heuristic estimate. This invariance requirement is precisely the Bond Invariance Principle (§5.5), and its consequences — via Noether’s theorem — are the subject of the next chapter.

Technical Appendix

Proof of Theorem 11.1 (Admissibility of Core Moral Heuristics).

We prove admissibility (h(n) ≤ g*(n, G_eq)) for each core heuristic by constructing lower bounds on the true cost g* and showing that h does not exceed them.

Case 1: Murder prohibition (h_murder = ∞). Let n be a post-murder state. By Definition 8.3 (Type III boundary/nullifier), murder triggers an absorbing stratum: the system enters a catastrophic regime from which return to the goal region G_eq requires, at minimum: (i) grief processing across k connected agents, with per-agent cost c_grief and propagation depth proportional to the connectivity of the social network; (ii) institutional response with cost c_inst = c_investigation + c_adjudication + c_incarceration; (iii) reputational damage d_rep propagating through the network with attenuation proportional to distance; (iv) deterrence expenditure c_det to prevent recurrence. The total cost g*(n, G_eq) ≥ k · c_grief + c_inst + d_rep + c_det. In the limit of large k (connected network), g* diverges. For finite k, the irreversibility of death means that the specific harm cannot be undone — only its consequences can be managed. In either case, g* ≥ h_murder, and the heuristic is admissible. For the case h_murder = ∞: since the absorbing stratum may have no path to G_eq, g* = ∞ as well, and the inequality h ≤ g* holds trivially.

Case 2: Theft prohibition (h_theft = C_property). Let C_property be the heuristic’s estimate of the cost of property-rights restoration. The true cost g*(n, G_eq) includes: (i) the direct restoration cost C_property; (ii) institutional costs (investigation, adjudication); (iii) trust-rebuilding costs across affected relationships; (iv) deterrence expenditure. Since g* ≥ C_property + additional terms ≥ C_property = h_theft, the heuristic is admissible.

Case 3: Deception prohibition (h_deception = C_trust). Let C_trust be the heuristic’s estimate of the trust-rebuilding cost. By the holonomy result of Chapter 10 (§10.5), trust rebuilding is path-dependent: the cost depends on the sequence of interactions through which trust is restored, and holonomy ensures that the cost is generically non-zero. The true cost g* includes C_trust plus the uncertainty premium (other agents increase their monitoring costs, reducing the efficiency of future interactions) plus potential cascade effects (the deception may trigger retaliatory deceptions, increasing systemic friction). Since g* ≥ C_trust = h_deception, the heuristic is admissible.

Case 4: Reciprocity norm (h_reciprocity = C_defection). The reciprocity heuristic assigns high cost to non-reciprocal actions (defection in repeated games). By the game-theoretic analysis of behavioral friction (the social cost of defection in an iterated setting includes retaliation, reputation damage, and exclusion from cooperative networks), g* ≥ C_defection = h_reciprocity. □

Proof of Proposition 11.2 (Consistency from Metric Compatibility).

Let h(n) = d_g(n, G_eq), where d_g is the geodesic distance in the moral metric g_{μν}. For any neighboring states n, n′: By the triangle inequality for Riemannian distance, d_g(n, G_eq) ≤ d_g(n, n′) + d_g(n′, G_eq). The local behavioral friction cost c(n, n′) is the arc length of the path from n to n′, which satisfies c(n, n′) ≥ d_g(n, n′) (the arc length of any path between two points is at least the geodesic distance between them — this is the defining property of geodesic distance). Therefore: h(n) = d_g(n, G_eq) ≤ d_g(n, n′) + d_g(n′, G_eq) ≤ c(n, n′) + h(n′). This is the consistency condition (Definition 11.4). □
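In the flat-metric special case, the proposition's inequality can be verified numerically: with h(n) the straight-line distance to the goal and each step cost equal to the length of the segment traversed, c(n, n′) = d(n, n′), and consistency reduces to the Euclidean triangle inequality. The point set below is random, purely for illustration:

```python
import itertools, math, random

random.seed(0)
points = [(random.random(), random.random()) for _ in range(30)]
goal = points[-1]

d = math.dist                            # flat-metric geodesic distance
h = lambda p: d(p, goal)                 # h(n) = d_g(n, G_eq)

# Definition 11.4: h(n) <= c(n, n') + h(n') for every step, where the
# step cost c(n, n') is the segment length d(n, n').
consistent = all(h(p) <= d(p, q) + h(q)
                 for p, q in itertools.permutations(points, 2))
print(consistent)                        # True
```

On a curved manifold the same check applies with d replaced by geodesic distance in g_{μν}; the proof shows nothing beyond the triangle inequality and the arc-length bound is needed.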

Proof of Theorem 11.2 (Intractability of Exact Moral Geodesic Planning).

We establish intractability via explicit complexity calculation, not reduction from a known hard problem. The argument proceeds in four steps.

Step 1 (Discretization requirement). On a d-dimensional Riemannian manifold, a uniform ε-grid has vertex count N = O((L/ε)ᵈ), where L is the diameter of the manifold in the metric g. For the moral manifold with d = 9, N = O((L/ε)⁹). The edge count is O(N²) = O((L/ε)¹⁸), since each vertex may be connected to O(N) others in the worst case (the manifold’s connectivity is bounded by 3ᵈ − 1 in a grid, but stratum crossings add long-range connections).

Step 2 (Per-evaluation cost). Each edge weight requires evaluating the metric tensor g_{μν} at the midpoint and computing the arc length, costing O(d²) operations per edge — 81 operations for d = 9.

Step 3 (Shortest-path algorithm). Dijkstra’s algorithm on the discretized graph runs in O(|E| + |V| log |V|). On a pure d-dimensional grid, each vertex has at most 3ᵈ − 1 neighbors, giving E = O(N) and Dijkstra complexity O(N log N). However, stratum crossings introduce long-range edges connecting vertices across strata, and in the worst case these cross-stratum connections can produce E = O(N²), yielding O(N²) overall. We use the worst-case bound to ensure the intractability result is robust to manifold topology.

Step 4 (Total complexity). Combining: T(d, ε, m) = O(m · (L/ε)²ᵈ · d² · log(m(L/ε)ᵈ)). For d = 9, ε = 0.01L (1% accuracy), m = 3 strata: T ≈ 3 · 100¹⁸ · 81 · log(3 · 10¹⁸) ≈ 1.5 × 10⁴⁰ operations. At 10¹⁰ operations per second (a fast processor), this requires approximately 1.5 × 10³⁰ seconds — approximately 4.8 × 10²² years. This is intractable by any physical computation standard.

The intractability is due to the curse of dimensionality: the volume of the 9-dimensional space grows as the ninth power of the resolution, making fine-grained discretization infeasible. The discretized algorithm itself is polynomial in N (Step 3), but N grows exponentially with d for fixed accuracy ε. □

Remark. Adaptive discretization methods — which concentrate vertices near stratum boundaries and use coarser grids in smooth regions — can reduce the constant factors but do not change the exponential dependence on d. The volume of a d-dimensional ball grows as rᵈ, and no adaptive scheme can avoid sampling this volume to the accuracy required for reliable moral geodesic computation.
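The Step 4 arithmetic can be reproduced in a few lines; taking logarithms base 2 (operations as binary comparisons, an assumption on our part) matches the chapter's figures:

```python
import math

def T(d, ratio, m):
    """Step 4 estimate: T = m * N^2 * d^2 * log2(m * N), N = (L/eps)^d."""
    N = ratio ** d                       # vertex count per stratum (Step 1)
    return m * N ** 2 * d ** 2 * math.log2(m * N)

ops = T(d=9, ratio=100, m=3)             # eps = 0.01 L  =>  L/eps = 100
years = ops / 1e10 / 3.15e7              # 10^10 ops/s, ~3.15e7 s per year

print(f"{ops:.1e} operations")           # ~1.5e+40
print(f"{years:.1e} years")              # ~4.7e+22 (chapter rounds to 4.8)
```

Halving d to 4 or 5 brings the estimate within reach of real hardware, which is the quantitative sense in which contraction (discarding dimensions) trades accuracy for tractability.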

Proposition 11.4 (A* Optimality on Stratified Spaces). [Proved.] A* search with an admissible and consistent heuristic remains optimal on Whitney stratified spaces with stratum-crossing penalties, provided the heuristic accounts for the crossing penalties in its estimate.

Proof. The standard A* optimality proof (Hart, Nilsson, and Raphael, 1968) requires: (i) a finite graph; (ii) non-negative edge weights; (iii) an admissible and consistent heuristic. On a Whitney stratified space, the discretized graph has: (i) finitely many vertices per stratum and finitely many strata (by the locally finite decomposition); (ii) non-negative edge weights (behavioral friction is non-negative by the positive-definiteness of the metric); (iii) the heuristic h is admissible and consistent by Theorem 11.1 and Proposition 11.2. The stratum-crossing penalties are encoded as edge weights on the cross-stratum edges (the transitions between strata, Definition 8.11), which are non-negative. Therefore all three conditions hold, and A* is optimal. □

Remark. At stratum boundaries, where the metric may become degenerate, behavioral friction remains non-negative: the stratum-crossing penalties are defined as non-negative additive costs (Definition 8.11), and the within-stratum friction is bounded below by zero since the metric tensor is positive semi-definite even in the degenerate limit. The non-negativity condition therefore holds globally.

For a heuristic to be admissible, there must be a true cost it does not overestimate.

That our deepest moral intuitions satisfy h(n) ≤ g* is not a fact about us.

It is a fact about the space.