Chapter 29: Open Problems

RUNNING EXAMPLE — Priya’s Model

Six months later. Priya has rebuilt TrialMatch as a tensor-valued system. Open problems remain. She cannot measure the moral curvature of her patient manifold—she does not know whether the path-dependence she observed is local or global. She cannot determine the metric signature—are some dimensions truly independent, or should they be coupled? She does not know how to handle patients whose moral situations span multiple strata simultaneously. Mrs. Voss, meanwhile, was matched to BEACON-8 by the new system. She drives 4.5 hours each way, every three weeks. The new system did not shorten the drive. It made the drive visible.

29.1 What We Do Not Know

The preceding nineteen chapters have developed a mathematical framework for moral reasoning and begun to test it against data. The framework makes structural claims: that moral reasoning is multi-dimensional, that obligations transform as vectors, that moral space is stratified and curved, that re-description symmetry implies conservation of harm, that structural containment of AI is mathematically possible.

Some of these claims have empirical support. The Bond Invariance Principle is confirmed by cross-lingual experiments (Chapter 17): deontic structure transfers at 100% across 11 languages in the LaBSE model (model-mediated; see §17.7). The stratification of moral space is supported by semantic gate discreteness and nullifier universality in the Dear Abby corpus. The conservation of harm is consistent with the pattern of cross-lingual invariance and anomaly.

But much remains unknown. The specific geometry of the moral manifold — its metric, connection, curvature, and topology — has not been measured. The quantum extension makes predictions that have not been tested in specifically moral contexts. The governance account of the metric has not been distinguished empirically from realist or constructivist alternatives. The No Escape Theorem’s assumptions have not been verified in deployed systems.

This chapter surveys the open problems. It is organized into five parts: the empirical program (§29.2–29.3), the mathematical foundations (§29.4–29.6), the AI implementation challenges (§29.7–29.8), the philosophical questions (§29.9–29.10), and the institutional requirements (§29.11). Each section states the problem precisely, explains why it matters, identifies what progress would look like, and, where possible, proposes a research strategy.

The chapter is deliberately honest about what the framework has not achieved. A framework that overstates its results undermines trust; one that identifies its open problems with precision invites productive research. The problems listed here are not defects. They are the agenda for the next generation of work.

29.2 The Empirical Program for Moral Curvature

The Curvature Measurement Problem

Chapter 10 developed the mathematics of moral curvature: the Riemann tensor $R^{\mu}{}_{\nu\alpha\beta}$ measures the path-dependence of parallel transport, that is, the failure of moral obligations to return to their starting orientation when carried around a loop in moral space. The moral holonomy (the group of rotations generated by parallel transport around all loops) encodes the global curvature structure.

But curvature has not been directly measured.

The Dear Abby corpus (Chapter 17) provides indirect evidence: context-dependent moral weighting implies a non-constant metric, and a non-constant metric generically produces nonzero curvature. But “generically produces” is not “has been measured.” The framework predicts curvature; the data has not confirmed it.

Why Curvature Matters

Curvature is the framework’s deepest structural prediction. If moral space is flat — if parallel transport is path-independent and moral obligations return unchanged when carried around any loop — then the framework’s dynamics reduce to a much simpler theory, and the analogy with general relativity loses its force. Nonzero curvature is the signature that distinguishes geometric ethics from a decorated version of multi-criteria decision analysis.

Moreover, the magnitude and pattern of curvature would constrain the “moral field equation” — the relationship between the distribution of moral commitments and the curvature of moral space (Chapter 10, §10.6). Without curvature data, the field equation is a speculation; with it, the equation becomes a testable hypothesis.

What Progress Looks Like for Curvature

Minimal result: Evidence that parallel transport of a moral obligation around a specific loop in moral space produces a measurable rotation — a holonomy different from the identity. This would confirm nonzero curvature somewhere in moral space.

Stronger result: Measurement of curvature components $R^{\mu}{}_{\nu\alpha\beta}$ at specific points in the moral manifold, determining, for example, whether curvature is concentrated near stratum boundaries (as the stratification theory predicts) or distributed throughout the manifold.

Strongest result: A curvature map — a systematic measurement of the Riemann tensor across a region of moral space, revealing the curvature’s magnitude, sign, and variation.

Research Strategy for Curvature

Holonomy experiments. Design longitudinal studies in which subjects carry a moral commitment through a sequence of contexts and return to the starting context. For example:

Start: Subject evaluates the obligation of promise-keeping in the context of personal friendship. Record the obligation vector $O^{\mu}(0)$.

Transport 1: Shift to the context of family obligation. Subject evaluates the “same” obligation of promise-keeping. Record $O^{\mu}(1)$.

Transport 2: Shift to institutional duty. Record $O^{\mu}(2)$.

Transport 3: Return to personal friendship. Record $O^{\mu}(3)$.

Holonomy: Compute $\Delta O^{\mu} = O^{\mu}(3) - O^{\mu}(0)$. If $\Delta O^{\mu} \neq 0$ by more than test-retest noise, this is evidence that the moral space has curvature.

The experiment requires careful control: the “same” obligation must be tracked through genuinely different contexts (not merely different descriptions of the same context, which the BIP predicts should produce zero holonomy). The distinction between context change (curvature) and re-description (gauge invariance) must be operationalized.
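To make the holonomy computation concrete, the following is a minimal sketch on a toy space, the unit sphere, rather than on any moral data. It transports a tangent vector around a closed three-leg loop and reports the rotation on return. The function names (`transport`, `holonomy_angle`) and the choice of loop are illustrative assumptions, not part of the framework.

```python
import numpy as np

def transport(v, p, q):
    """Parallel-transport tangent vector v from unit point p to unit point q
    along the connecting great-circle arc (Rodrigues rotation about p x q).
    Assumes p and q are neither equal nor antipodal."""
    axis = np.cross(p, q)
    s = np.linalg.norm(axis)      # sin(theta) for unit vectors p, q
    c = float(np.dot(p, q))       # cos(theta)
    k = axis / s
    return v * c + np.cross(k, v) * s + k * np.dot(k, v) * (1.0 - c)

def holonomy_angle(loop, v0):
    """Carry v0 around the closed loop; return (rotation angle, final vector)."""
    v = v0.astype(float)
    for p, q in zip(loop, loop[1:] + loop[:1]):
        v = transport(v, p, q)
    cos_angle = np.dot(v, v0) / (np.linalg.norm(v) * np.linalg.norm(v0))
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0))), v

# Toy loop: the three coordinate axes, i.e. one octant of the sphere.
A, B, C = np.eye(3)
angle, v_final = holonomy_angle([A, B, C], v0=np.array([0.0, 1.0, 0.0]))
print(angle, v_final)
```

On this loop the vector returns rotated by a quarter turn, which equals the enclosed area of the octant on a unit sphere. In the proposed experiment the three contexts play the role of the loop's vertices, and the recorded obligation vectors replace the computed transport.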

Geodesic deviation. A complementary approach: measure the divergence of moral assessments for initially nearby cases as context shifts. Two cases that differ slightly in initial conditions (e.g., two promise-keeping situations with slightly different relational contexts) may diverge rapidly as context shifts, the moral analogue of tidal forces. The relative acceleration of the two trajectories (the second derivative of their separation, not the separation rate itself) is proportional to the Riemann tensor, providing a curvature measurement without requiring a full loop.
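For reference, the standard geodesic deviation equation from Riemannian geometry reads as follows. Here $u^{\mu}$ is the tangent to a reference trajectory, $\xi^{\mu}$ is the separation to a neighboring trajectory, and $\tau$ is a path parameter; reading these as moral trajectories is the framework's conjecture, and the overall sign depends on the curvature convention adopted.

```latex
\frac{D^{2}\xi^{\mu}}{d\tau^{2}}
  = R^{\mu}{}_{\nu\alpha\beta}\, u^{\nu} u^{\alpha} \xi^{\beta}
```

The left-hand side is a second derivative, so an experiment of this kind would need at least three observations of the separation along the context shift to estimate it.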

29.3 The Signature and Topology of the Moral Metric

The Signature Problem

A metric on a manifold has a signature: the pattern of positive and negative eigenvalues of the metric tensor gμν. Euclidean space has signature (+,+,+): all directions are positive. Spacetime has Lorentzian signature (-,+,+,+): one direction (time) has opposite sign to the others.

What is the signature of the moral metric?

[Speculation/Extension.] This question was raised in Chapter 3 (§3.6) and deferred. It is among the most consequential open questions of the framework, because the signature determines the qualitative structure of moral space.

If the moral metric is positive-definite (Riemannian signature), then moral space has the geometry of a curved surface: all moral “distances” are positive, all trade-offs have a well-defined cost, and every nonzero obligation vector has positive squared length under the inner product $g_{\mu\nu}O^{\mu}P^{\nu}$. This is the simplest case and the one assumed in most of the preceding chapters.

If the moral metric is indefinite (semi-Riemannian or Lorentzian signature), then some moral directions are “timelike” (negative eigenvalue) and others “spacelike” (positive eigenvalue). This would mean that some pairs of moral values are related not by trade-off (one can be exchanged for another at a cost) but by something more radical: movement in one direction reduces the moral distance in another. There would be null directions (moral “light cones”) along which nonzero displacements have zero length even though the metric itself remains non-degenerate, creating genuine incommensurabilities that are not merely large trade-off costs but structural features of the geometry.

If the moral metric is degenerate (one or more zero eigenvalues), then certain moral dimensions have no weight at all in some contexts — they are invisible to the metric. Sen’s incommensurable capabilities (Chapter 3, §3.5) would appear as degenerate metric components: two capabilities that cannot be traded off because the metric assigns zero weight to their cross-term.
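The three cases above can be told apart numerically once a metric has been estimated at a point. The sketch below classifies a symmetric matrix by the signs of its eigenvalues. The function name `metric_signature`, the tolerance, and the example matrices are illustrative assumptions; no estimated moral metric is used.

```python
import numpy as np

def metric_signature(g, tol=1e-9):
    """Return (n_positive, n_negative, n_zero, kind) for a symmetric matrix g.
    Eigenvalues within tol (relative to the largest magnitude) count as zero."""
    g = np.asarray(g, dtype=float)
    if not np.allclose(g, g.T):
        raise ValueError("metric components must form a symmetric matrix")
    eig = np.linalg.eigvalsh(g)
    scale = max(1.0, float(np.max(np.abs(eig))))
    pos = int(np.sum(eig > tol * scale))
    neg = int(np.sum(eig < -tol * scale))
    zero = len(eig) - pos - neg
    if zero > 0:
        kind = "degenerate"
    elif neg == 0:
        kind = "positive-definite"
    elif pos == 0:
        kind = "negative-definite"
    else:
        kind = "indefinite"
    return pos, neg, zero, kind

print(metric_signature(np.diag([1.0, 1.0, 1.0])))        # Euclidean example
print(metric_signature(np.diag([-1.0, 1.0, 1.0, 1.0])))  # Lorentzian example
print(metric_signature(np.diag([1.0, 0.0])))             # one zero eigenvalue
```

With noisy empirical estimates, the tolerance would need to reflect the measurement error, since a small eigenvalue of uncertain sign is exactly the case the existing data cannot resolve.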

Evidence and Constraints

The Dear Abby corpus provides some indirect evidence. The context-dependent weighting (Chapter 17, §17.2) shows that the moral metric varies across the manifold — some dimensions receive much more weight in some contexts than others. But whether any dimension ever receives negative weight (implying indefinite signature) or zero weight (implying degeneracy) cannot be determined from the existing data.

The framework’s structural constraints (Chapter 9, §9.6) narrow the possibilities: the metric must satisfy positive semi-definiteness on the harm dimension (harm is never negative), monotonicity in obligation (more serious obligations have greater metric weight), and symmetry of the off-diagonal components. These constraints are consistent with positive-definite, indefinite, or degenerate signatures — they do not settle the question.

The Topology Problem

The moral manifold’s global topology is similarly unknown. Chapter 5 (§5.10) raised the question: does M have nontrivial topology — holes, handles, non-orientability?

Holes. If the moral manifold has holes (nontrivial fundamental group), then some loops in moral space cannot be contracted to a point. This would mean that some moral trajectories are topologically distinct: two paths between the same pair of moral situations need not be continuously deformable into one another, even though both endpoints are reachable. Moral dilemmas at stratum boundaries (Chapter 8) might be topological features, not merely geometric ones.

Non-orientability. If the moral manifold is non-orientable (like a Möbius strip), then there exist moral circuits along which the orientation of the obligation vector reverses: an obligation that becomes its opposite when carried around the loop. This would be a striking structural feature with no analogue in simple multi-criteria theories.

Compactness. Is the moral manifold compact (closed and bounded, like a sphere) or non-compact (unbounded, like a plane)? Compactness has consequences for the existence of global extrema: on a compact manifold, a continuous satisfaction function $S = I_{\mu}O^{\mu}$ must achieve a maximum and a minimum; on a non-compact manifold, it need not.

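The topology can be probed indirectly through the holonomy group (§29.2). The Ambrose-Singer theorem ties the restricted holonomy (generated by contractible loops) to the curvature, while any holonomy that appears only around non-contractible loops reflects the fundamental group. But direct topological evidence would require identifying moral circuits that cannot be contracted, a challenging empirical task.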

29.4 The Moral Field Equation

The Field Equation Problem

In general relativity, Einstein’s field equation relates the curvature of spacetime to the distribution of matter and energy:

$$R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} = 8\pi G\, T_{\mu\nu}$$

Chapter 10 (§10.6) posed the analogous question for ethics: is there a “moral field equation” relating the curvature of moral space to the distribution of moral commitments, relationships, and stakes?

The analogy is suggestive:

| General relativity | Moral geometry |
| --- | --- |
| Mass-energy distribution | Distribution of commitments, relationships, stakes |
| Spacetime metric | Moral metric (trade-off structure) |
| Einstein tensor | Moral curvature tensor |
| “Matter tells spacetime how to curve” | “Commitments tell moral space how to curve” |
| “Spacetime tells matter how to move” | “Moral curvature tells obligations how to evolve” |

But the analogy is currently just an analogy. No moral field equation has been derived or proposed. The problem is to determine whether the analogy can be made precise — whether there exists a tensor equation relating the moral Ricci tensor to a “moral stress-energy tensor” that encodes the distribution of moral sources.

Why the Field Equation Matters

A moral field equation would be the deepest result in geometric ethics — the analogue of Einstein’s greatest achievement. It would:

Determine the curvature from the sources. Given a distribution of moral commitments and relationships, the equation would determine the curvature of moral space — and thus the path-dependence of obligations, the geodesic trajectories, and the holonomy.

Constrain the metric. The field equation would provide differential equations for the moral metric, constraining its form in ways that the governance account alone does not.

Unify dynamics and structure. The field equation would connect the dynamics of Chapter 10 (how obligations evolve) with the statics of Chapters 5–8 (the structure of moral space), showing that they are two aspects of the same geometric theory.

What Progress Looks Like for the Field Equation

Necessary condition: Identification of the “moral stress-energy tensor” — the object on the right-hand side of the equation. What are the “sources” of moral curvature? Commitments? Relationships? Stakes? Power differentials? The answer determines the form of the equation.

Intermediate result: A proposed equation that is mathematically consistent (satisfies the contracted Bianchi identity, ensuring conservation), dimensionally correct, and qualitatively right (predicting curvature where curvature is expected).

Full result: A field equation that makes quantitative predictions about the curvature at specific points in moral space, testable against the empirical program of §29.2.
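The consistency requirement in the intermediate result can be stated explicitly. The contracted Bianchi identity is a standard fact of Riemannian geometry; the coupling constant $\kappa$ and the source tensor $T^{\text{moral}}_{\mu\nu}$ below are hypothetical placeholders for whatever a future proposal supplies.

```latex
% Contracted Bianchi identity (holds for any metric-compatible, torsion-free connection):
\nabla^{\mu}\!\left(R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu}\right) = 0

% Consequence for any candidate equation of Einstein form:
R_{\mu\nu} - \tfrac{1}{2} R\, g_{\mu\nu} = \kappa\, T^{\text{moral}}_{\mu\nu}
\quad\Longrightarrow\quad
\nabla^{\mu} T^{\text{moral}}_{\mu\nu} = 0
```

Any proposed moral stress-energy tensor would therefore have to be covariantly conserved, which is a restrictive test a candidate source can fail before any data are collected. If the connection has torsion (§29.6), the identity takes a modified form and this check changes accordingly.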

29.5 Additional Symmetries and Conservation Laws

Beyond the BIP

Chapter 12 established that the BIP — re-description invariance — is a gauge symmetry of the moral Lagrangian, and that Noether’s theorem generates the conservation of harm as the associated charge. But the BIP may not be the only symmetry of the moral Lagrangian.

Temporal symmetry. If the moral Lagrangian is invariant under time translation — if the laws of moral dynamics are the same today as yesterday — then Noether’s theorem generates a conserved “moral energy.” What this energy represents is unclear. In physics, time-translation invariance generates conservation of energy, which constrains dynamical evolution. In ethics, a conserved “moral energy” might constrain how moral situations can evolve: the total “moral tension” of a situation (the degree of unresolved obligation, competing claims, active conflict) might be conserved under some class of processes.

Permutation symmetry. If the moral Lagrangian is invariant under permutation of agents (if swapping the identities of the involved parties does not change the moral evaluation, a strong form of impartiality), then a further invariant may follow. Because permutations are discrete rather than continuous, Noether’s theorem does not apply directly; the associated invariant would be a discrete one, comparable to parity in physics, encoding some form of structural impartiality.

Scale symmetry. If the moral Lagrangian is invariant under scaling — if doubling all moral magnitudes (obligations, interests, stakes) leaves the dynamics unchanged — then a “moral dilatation current” is conserved. Scale invariance would imply that moral dynamics depends only on ratios of moral quantities, not on their absolute magnitudes.

Whether any of these symmetries holds is an open empirical and theoretical question. Each would generate a new conservation law, a new constraint on moral dynamics, and a new testable prediction. The research program is to identify which symmetries the moral Lagrangian possesses — by theoretical analysis (what symmetries are consistent with the known structure?) and empirical testing (do the predicted conservation laws hold in data?).
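As a worked instance of how a continuous symmetry yields a conserved quantity, consider time translation. The expression below is the standard Noether energy for a Lagrangian $L(q, \dot{q}, t)$; treating $q^{i}$ as coordinates on the moral manifold is an assumption made here for illustration only.

```latex
E = \sum_{i} \dot{q}^{\,i}\, \frac{\partial L}{\partial \dot{q}^{\,i}} - L,
\qquad
\frac{dE}{dt} = -\frac{\partial L}{\partial t}
```

If the moral Lagrangian has no explicit time dependence, the right-hand side vanishes and $E$ is constant along solutions. An empirical test of temporal symmetry would look for a quantity of this form that stays fixed as a moral situation evolves.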

The Anomaly Program

The BIP experiments revealed anomalies — systematic deviations from perfect gauge invariance. The O↔C correlative transfer rate is 87%, not 100%; the L↔N rate is 82%. Chapter 12 (§12.8) interpreted these as moral analogues of quantum anomalies in physics: classical symmetries that fail to survive “quantization” (the transition from abstract moral structure to concrete moral cognition).

Understanding the source and structure of these anomalies is an open problem. Possibilities include:

Power asymmetry: The obligation-holder is described with more specificity than the claim-holder, breaking the symmetry between correlatives.

Cognitive salience: Obligations are more cognitively salient than claims, producing a systematic asymmetry in representation.

Genuine symmetry breaking: The BIP symmetry is exact at the level of the moral manifold but spontaneously broken at the level of moral cognition, analogous to spontaneous symmetry breaking in particle physics.

Each hypothesis makes different predictions about the pattern and magnitude of the anomaly, testable against the BIP data and the Dear Ethicist probes.

29.6 Torsion in Moral Space

The Torsion Detection Problem

Chapter 10 (§10.2) noted that the connection on the moral manifold may have torsion — the antisymmetric part of the connection coefficients:

$$T^{\mu}{}_{\alpha\beta} = \Gamma^{\mu}{}_{\alpha\beta} - \Gamma^{\mu}{}_{\beta\alpha}$$

Torsion measures the failure of infinitesimal parallelograms to close: if you transport a vector v along w and simultaneously transport w along v, torsion is the gap between the endpoints.

In moral terms, torsion would mean that the order of moral consideration matters at the infinitesimal level: considering obligation A before obligation B produces a different result than considering B before A, even for infinitesimally similar obligations. This is distinct from holonomy (which involves finite loops) and from the non-commutativity of contraction (Chapter 15, §15.2, which involves the order of different operations).
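Given estimated connection coefficients at a point, the torsion is simply their antisymmetric part in the two lower indices. The sketch below assumes the coefficients are stored as an array indexed `[mu, alpha, beta]`; that storage convention and the toy values are assumptions for illustration.

```python
import numpy as np

def torsion(gamma):
    """Antisymmetrize connection coefficients in their two lower indices.
    gamma[mu, alpha, beta] holds Gamma^mu_{alpha beta}; the result holds
    T^mu_{alpha beta} = Gamma^mu_{alpha beta} - Gamma^mu_{beta alpha}."""
    gamma = np.asarray(gamma, dtype=float)
    return gamma - np.swapaxes(gamma, 1, 2)

# A symmetric (Levi-Civita-like) connection has zero torsion.
symmetric = np.zeros((2, 2, 2))
symmetric[0, 0, 1] = symmetric[0, 1, 0] = 0.5

# Adding an unmatched component breaks the symmetry and produces torsion.
twisted = symmetric.copy()
twisted[1, 0, 1] += 0.3

print(torsion(symmetric))
print(torsion(twisted))
```

The result is antisymmetric by construction, so an order effect measured as A-then-B minus B-then-A maps naturally onto a single independent torsion component per pair of directions.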

Why Torsion Matters

If moral space has torsion, the standard Levi-Civita connection (torsion-free, metric-compatible) is not the correct connection for moral dynamics. The equations of parallel transport, the geodesic equation, and the curvature tensor would all require modification. The conservation laws of Chapter 12 would also be affected, since Noether’s theorem in the presence of torsion generates additional conserved quantities.

The existence or absence of torsion has practical implications for AI systems: if moral space is torsion-free, the order of moral consideration is irrelevant at the infinitesimal level (a simplification that computational systems can exploit); if it has torsion, the order matters everywhere, and systems must be designed to handle it.

Research Strategy for Torsion

Torsion can be detected experimentally by measuring the closure failure of infinitesimal parallelograms. In practice, this means presenting subjects with two moral considerations in both orders (A-then-B and B-then-A) and measuring whether the order produces a systematic difference that survives when the considerations are made arbitrarily similar. If order effects persist even as the considerations converge, moral space has torsion.

The quantum cognition literature (Chapter 13, §13.5) provides relevant methodology: order effects in probability judgments have been documented and modeled using non-commutative quantum probability. Whether specifically moral order effects exhibit the same structure — and whether they persist at the infinitesimal level — is an open empirical question.

29.7 Tensorial Interpretability for AI

The Interpretability Problem

Chapter 18 proposed tensor-valued objectives for AI systems: vector rewards $r^{\mu}(s,a) \in \mathbb{R}^{d}$ replacing scalar rewards, tensor evaluations $T^{\mu}{}_{\nu}$ replacing scalar utilities, and explicit contraction as a separate architectural step. Chapter 19 developed the DEME architecture for implementing this proposal.

But current AI systems — large language models, reinforcement learning agents, neural network classifiers — do not explicitly represent tensors. Their internal representations are high-dimensional vectors in learned embedding spaces, not tensors on the moral manifold. The question of tensorial interpretability is: can we extract tensorial structure from these implicit representations?

Why Interpretability Matters

If we can extract an AI system’s implicit moral tensor — the multi-dimensional structure of its moral evaluations, hidden in its weights and activations — then we can:

Diagnose misalignment by comparing the extracted tensor to the endorsed tensor (Chapter 18, §18.5)

Identify invariance violations by testing whether the extracted tensor transforms correctly under admissible re-descriptions

Detect contraction errors by comparing the system’s implicit contraction with the governance-specified contraction

Monitor drift by tracking the tensor over time and detecting when it changes in morally significant ways

Without tensorial interpretability, the geometric framework applies only to systems designed with tensor-valued objectives (Chapter 19). With it, the framework becomes a diagnostic tool for any AI system.

What Progress Looks Like for Interpretability

Probe extraction. Use probing classifiers to extract moral-dimension activations from neural network hidden states. For each of the nine dimensions of the moral manifold, train a linear probe that predicts the dimension value from the hidden state. The collection of probes constitutes an approximate tensor extraction. Partial progress (February 2026): The BIP v10.16 experiments (§17.10) applied exactly this methodology to a LaBSE-based encoder. A linear probe extracted deontic-dimension activations with 80% F1 across languages and achieved a structural-to-surface ratio of 11.1×, confirming that the trained encoder’s hidden states contain extractable tensorial structure aligned with the moral manifold’s deontic axis. A sobering finding: a linear probe also achieves 99.8% language identification, indicating that current probe-extraction methods cannot fully disentangle structural from surface features.

Representation geometry. Analyze the geometry of the learned embedding space: its metric (local distances between representations), its curvature (how the metric varies), and its topology (connected components, holes). If the embedding space has structure resembling the moral manifold, the system has learned an implicit geometric ethics.

Invariance testing at the representation level. Present equivalent moral situations (related by admissible re-descriptions) and measure whether the hidden-state representations are equivalent. If the representations are invariant, the system has learned the BIP at the representation level; if not, the invariance violation can be localized to specific layers and dimensions.

Contraction identification. Identify the computational step at which the multi-dimensional representation collapses to a scalar output. Characterize the implicit contraction: what weights does it use? What information does it discard? How does it compare to the governance-specified contraction?
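The probe-extraction step can be sketched in a few lines. The example below fits a least-squares linear probe on synthetic hidden states in which one direction carries a scalar signal. Everything here is an assumption for illustration: the data are randomly generated, the function names are hypothetical, and this is not the BIP v10.16 pipeline.

```python
import numpy as np

def fit_linear_probe(hidden, target):
    """Least-squares linear probe with a bias term: target ~ [hidden, 1] @ w."""
    X = np.hstack([hidden, np.ones((hidden.shape[0], 1))])
    w, *_ = np.linalg.lstsq(X, target, rcond=None)
    return w

def probe_predict(w, hidden):
    """Apply a fitted probe to new hidden states."""
    X = np.hstack([hidden, np.ones((hidden.shape[0], 1))])
    return X @ w

# Synthetic stand-in for encoder activations: one hidden direction encodes
# the value of a single moral dimension, plus isotropic noise.
rng = np.random.default_rng(0)
n, d = 400, 16
target = rng.uniform(-1.0, 1.0, size=n)
direction = rng.normal(size=d)
hidden = np.outer(target, direction) + 0.1 * rng.normal(size=(n, d))

# Fit on the first 300 examples, evaluate on the held-out 100.
w = fit_linear_probe(hidden[:300], target[:300])
pred = probe_predict(w, hidden[300:])
residual = np.sum((target[300:] - pred) ** 2)
total = np.sum((target[300:] - target[300:].mean()) ** 2)
r2 = 1.0 - residual / total
print(r2)
```

A high held-out score shows only that the signal is linearly decodable. As the language-identification result above indicates, the same probe design will also decode surface features, so a control probe on a nuisance variable is a natural companion measurement.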

Connection to Existing Work

Mechanistic interpretability research — the program of understanding neural networks by identifying interpretable features, circuits, and algorithms — provides the methodological foundation. The geometric framework adds a specific target: we are not looking for arbitrary interpretable features but for the specific tensorial structure that geometric ethics predicts. This narrows the search space and provides quantitative evaluation criteria: the extracted tensor should transform correctly under the BIP, satisfy the conservation of harm, and produce the correct contraction when contracted with the governance-specified interest covector.

29.8 Scalability of Structural Containment

The Scalability Problem

The No Escape Theorem (Chapter 18, §18.6) proves that structural containment blocks cognitive escape routes. But the theorem is a mathematical result about an idealized architecture. Whether the architecture can be implemented at scale — for systems with billions of parameters, processing millions of inputs per second, deployed across diverse domains — is an engineering question that the theorem does not address.

Key Challenges

Canonicalization at scale. The canonicalizer C must process every input before evaluation. For a high-throughput system (a content-moderation system processing millions of posts per day, for example), the canonicalizer must be fast. But canonicalization is non-trivial: mapping equivalent descriptions to a unique representative requires understanding the equivalence relation, which may be complex. Can canonicalization be performed in real time without sacrificing correctness?

Grounding adequacy. The grounding tensors Ψ must capture all morally relevant features of the domain (axiom G4). For narrow domains (medical devices measuring specific vital signs), this is achievable. For broad domains (a general-purpose language model interacting with any topic), the set of morally relevant features may be open-ended. Can grounding adequacy be achieved for general-purpose AI systems?

Audit storage and processing. Every output must be accompanied by a complete audit artifact (Requirement 3). For high-throughput systems, this generates enormous volumes of audit data. Can audit trails be compressed without losing morally relevant information? Can audit verification be performed in real time?

Multi-system composition. Modern AI deployments involve multiple interacting systems: a language model, a retrieval system, a ranking algorithm, a content filter. The No Escape Theorem applies to a single agent with a single pipeline. How does structural containment compose across multiple interacting systems?

Hardware scalability (partial progress, February 2026). The DEME V3 reference implementation addresses the computational throughput challenge through hardware-accelerated tensor evaluation. Backends for CPU (NumPy), CUDA (GPU), and NVIDIA Jetson (edge) enable evaluation at different scales, from edge devices in smart-home deployments to data-center workloads. Tucker decomposition and tensor-train formats reduce the computational cost of rank-6 tensor operations from $O(k^{6})$ to $O(k^{3}r^{3})$, where $r$ is the decomposition rank. These engineering advances partially address the scalability challenge, though the formal composition guarantees for multi-system containment remain open.

Research Directions

Approximate canonicalization. Develop canonicalizers that are provably ϵ-correct: $\lVert C(x) - C(x') \rVert < \epsilon$ whenever $x \approx x'$, with bounded computation. The ϵ-approximation trades perfect canonicalization for scalability, with the approximation error bounded and auditable. Partial progress: The BIP v10.16 contrastive training (§17.10) achieves approximate canonicalization with 1.2% residual language leakage. In practice this behaves as an ϵ-correct canonicalizer, though not yet with formal provability guarantees.

See also Section 19.13 for tensor-structured canonicalization proposals that partially address the correctness component of approximate canonicalization through architectural rather than statistical guarantees.
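An empirical check of ϵ-correctness over a finite sample of equivalent pairs can be sketched as follows. This is an audit over test cases, not the formal guarantee the research direction calls for. The toy canonicalizer, its vocabulary, and the function names are all hypothetical.

```python
import math

def max_canonical_gap(canonicalize, equivalent_pairs):
    """Largest Euclidean distance between canonical forms of paired inputs."""
    worst = 0.0
    for x, x_prime in equivalent_pairs:
        gap = math.dist(canonicalize(x), canonicalize(x_prime))
        worst = max(worst, gap)
    return worst

def is_epsilon_correct(canonicalize, equivalent_pairs, eps):
    """True if every sampled equivalent pair lands within eps of each other."""
    return max_canonical_gap(canonicalize, equivalent_pairs) < eps

# Hypothetical toy canonicalizer: a case- and order-insensitive word count
# over a tiny fixed vocabulary.
VOCAB = ("promise", "keep", "break")

def toy_canonicalizer(text):
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

equivalent = [("Keep the promise", "the PROMISE keep"),
              ("break promise", "Promise Break")]
distinct = [("keep promise", "break promise")]

print(is_epsilon_correct(toy_canonicalizer, equivalent, eps=1e-6))
print(max_canonical_gap(toy_canonicalizer, distinct))
```

A useful canonicalizer must also keep non-equivalent inputs apart, which is why the sketch reports the gap on a distinct pair as well. A map that sends everything to one point would pass the ϵ test trivially.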

Domain-specific grounding. Rather than seeking universal grounding, develop domain-specific grounding declarations with formal adequacy proofs: for the medical domain, prove that $\Psi_{\text{medical}}$ captures all morally relevant features of medical decision-making; for the financial domain, prove the same for $\Psi_{\text{financial}}$. Each domain-specific proof is tractable; the universal proof may not be.

Compositional containment. Develop a compositional theory of structural containment: if system A is structurally contained and system B is structurally contained, under what conditions is their composition structurally contained? The theory would identify the interface conditions that preserve containment across system boundaries.

29.9 The Metaphysics of Moral Geometry

Is Moral Geometry Real?

The framework developed in this book uses the language of geometry — manifolds, tensors, metrics, curvature — to describe moral reasoning. But is the geometry real? Does moral space actually have curvature, in the way that spacetime actually has curvature? Or is the geometry a useful model — a way of organizing our thinking about ethics that happens to use geometric vocabulary without making geometric claims?

Three positions are available:

Moral geometric realism. The moral manifold is a real structure, and its geometric properties (dimension, curvature, topology) are genuine features of moral reality. The metric is discovered, not stipulated. Moral geometry is to ethics what spacetime geometry is to physics: the true structure of the domain, accessible through mathematical investigation and empirical testing.

Moral geometric instrumentalism. The geometric framework is a useful representational device. It organizes moral reasoning, reveals hidden structure, and enables computation. But the geometry is not “really there” in the way that spacetime geometry is “really there.” The manifold is a model, not a territory.

Moral geometric governance. The position developed in Chapter 9: the geometry is neither simply real nor merely instrumental. The moral metric is governed — the output of legitimate institutional processes. The geometry is real in the sense that it has genuine consequences (it constrains moral reasoning, enables computation, supports conservation laws), but its specific form is the product of governance, not discovery.

The framework is compatible with all three positions. The mathematics does not depend on the metaphysics: the same differential equations, the same conservation laws, the same structural containment results hold regardless of whether the geometry is “really there” or “usefully modeled.” But the metaphysical question matters for the framework’s interpretation and its authority claims. A realist reading gives the framework the authority of discovered truth; an instrumentalist reading gives it the authority of proven utility; a governance reading gives it the authority of legitimate institutional process.

Resolving this question — or determining that it cannot be resolved — is an open philosophical problem with practical implications for how the framework is deployed and defended.

The question “Is the geometry real?” is structurally underdetermined, as it is for all successful mathematical models in science. General relativity’s spacetime curvature may be a feature of reality (the realist reading), a useful fiction that organizes observations (the instrumentalist reading), or an artifact of our coordinate-dependent formulation that dissolves in a more fundamental theory (the structural reading). Physicists disagree about which interpretation is correct, and the disagreement has persisted for a century without impeding the theory’s predictive success. The same underdetermination applies to geometric ethics, and the same resolution is available: the framework works regardless of which interpretation is adopted. Its theorems hold, its predictions can be tested, and its architecture governs under any of the three readings. The metaphysical question is interesting but not urgent.

Empirical evidence could, in principle, distinguish the positions. If the moral manifold’s geometric properties (curvature, holonomy, gauge structure) are discovered independently by researchers using different formalisms and different data — the way general relativity’s predictions were confirmed by gravitational lensing, frame-dragging, and gravitational waves decades after the theory — that would support realism. If the framework proves useful in practice (governing AI systems, structuring policy analysis, clarifying philosophical disputes) but no convergence on its geometric claims emerges from independent research programs, that would support instrumentalism. If the geometric structure turns out to be a necessary consequence of any framework satisfying basic structural requirements — as Theorem 9.2 begins to suggest — that would support the governance reading: the geometry is not a fact about the world but a constraint on how moral governance can coherently operate. All three readings are compatible with the mathematics as developed in this book.

29.10 The Intertheoretic Comparison Problem

The Comparison Problem

Chapter 16 (§16.5) identified the intertheoretic comparison problem: how to compare obligations, satisfactions, and harms across different moral theories that use different metrics. The theory covariance tensor Θ_{jk} encodes cross-theory uncertainty, but comparing quantities across theories requires a meta-metric: a metric on the space of moral theories that enables comparison.

Whether such a meta-metric exists, and what form it takes, is an open problem with deep implications for moral uncertainty and for AI systems that must operate across diverse moral traditions.

The Formal Structure

Let {g^{(1)}, g^{(2)}, …, g^{(K)}} be K admissible moral metrics (corresponding to K moral theories). The intertheoretic comparison problem is to define a distance d(g^{(j)}, g^{(k)}) between metrics that:

Respects moral content. Two metrics that disagree only on cosmetically different dimensions should be close; two that disagree on fundamental trade-offs should be far apart.

Is operationally meaningful. The distance should predict the degree to which verdicts differ: theories that are close should agree on most cases; theories that are far should disagree systematically.

Is computable. The distance should be computable from the metric components, not require infinite comparison across all possible cases.

The space of admissible metrics, equipped with this distance, would be a meta-manifold — a manifold of moral theories. Its geometry (curvature, topology) would encode the structural relationships among moral theories: which theories are neighbors, which are distant, whether the space has distinct regions or is connected.

Approaches

Riemannian distance on the space of metrics. The space of positive-definite symmetric tensors has a natural Riemannian metric (the Fisher-Rao metric or the affine-invariant metric). Applying this to moral metrics gives:

d(g^{(j)}, g^{(k)}) = ‖log((g^{(j)})^{-1} g^{(k)})‖_F

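where ‖⋅‖_F is the Frobenius norm. Because the product of the inverse of one metric with the other is not symmetric in general, the norm is best read through the eigenvalues λ_i of that product, which are real and positive: d = (Σ_i log² λ_i)^{1/2}. In this form the distance is zero when the metrics agree, positive otherwise, and invariant under simultaneous change of coordinates.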
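As a rough illustration of the Riemannian distance above, the following sketch computes it from the eigenvalues of the product of one metric's inverse with the other, which is the coordinate-invariant reading of the formula. The function name `affine_invariant_distance`, the toy 3×3 metrics, and the matrix `J` are assumptions introduced for illustration only; nothing here is taken from the book's software.

```python
import numpy as np

def affine_invariant_distance(g_j, g_k):
    """sqrt(sum_i log(lambda_i)^2), with lambda_i the eigenvalues of g_j^{-1} g_k."""
    # Whiten g_k by the Cholesky factor of g_j: M = L^{-1} g_k L^{-T} is symmetric
    # positive-definite and has the same eigenvalues as g_j^{-1} g_k.
    L = np.linalg.cholesky(g_j)
    M = np.linalg.solve(L, np.linalg.solve(L, g_k).T)
    M = 0.5 * (M + M.T)  # remove round-off asymmetry
    lam = np.linalg.eigvalsh(M)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

# Hypothetical toy metrics: they differ only in the weight on the first dimension.
g_a = np.eye(3)
g_b = np.diag([2.0, 1.0, 1.0])
d_ab = affine_invariant_distance(g_a, g_b)

# An arbitrary invertible change of coordinates, applied to both metrics at once.
J = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])
d_ab_new_coords = affine_invariant_distance(J.T @ g_a @ J, J.T @ g_b @ J)
```

In this toy case the only eigenvalue different from 1 is 2, so `d_ab` equals log 2, and `d_ab_new_coords` agrees with it up to round-off, which is the invariance property claimed in the text.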

Verdict divergence. Define the distance as the expected disagreement on a reference distribution of cases:

d_verdict(g^{(j)}, g^{(k)}) = E_{x∼D}[ |S^{(j)}(x) − S^{(k)}(x)| ]

This is operationally meaningful but dependent on the reference distribution D.
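The dependence on D can be made concrete with a small Monte Carlo sketch. The scoring function used here (the squared length of a case vector under each metric) is a hypothetical stand-in for the verdict function S, chosen only so the example runs; the names `verdict_divergence` and `quadratic_score` and both reference distributions are likewise assumptions.

```python
import numpy as np

def verdict_divergence(score_j, score_k, cases):
    """Sample mean of |S_j(x) - S_k(x)| over a set of reference cases."""
    return float(np.mean([abs(score_j(x) - score_k(x)) for x in cases]))

def quadratic_score(g):
    # Hypothetical stand-in for S: squared length of the case vector under metric g.
    return lambda x: float(x @ g @ x)

rng = np.random.default_rng(0)
g_a = np.diag([1.0, 1.0, 1.0])
g_b = np.diag([2.0, 1.0, 1.0])  # disagrees with g_a only on the first dimension

# Two reference distributions: one varies on every dimension, the other
# barely varies on the dimension where the two metrics disagree.
broad = rng.normal(size=(2000, 3))
narrow = broad * np.array([0.1, 1.0, 1.0])

s_a, s_b = quadratic_score(g_a), quadratic_score(g_b)
d_broad = verdict_divergence(s_a, s_b, broad)
d_narrow = verdict_divergence(s_a, s_b, narrow)
```

The same pair of metrics comes out a hundred times closer under `narrow` than under `broad`, because the disagreement is proportional to the square of the first coordinate. Any use of verdict divergence therefore has to fix and justify the reference distribution first.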

Robust core distance. Define the distance as the volume of the symmetric difference of the robust cores (Chapter 16, §16.6): the set of obligations endorsed by one theory but not the other. Theories with identical robust cores are close; theories with disjoint robust cores are far.

Which approach (or combination) is correct is an open problem.

29.11 Institutional Requirements for Geometric Governance

The Governance Problem

The No Escape Theorem (Chapter 18) reduces AI safety to four tractable (though hard) problems in three categories: governance (specifying adequate grounding tensors), engineering (implementing the containment architecture), and security (protecting the physical infrastructure). Of these, governance is the most challenging, not technically but institutionally.

Geometric AI governance requires institutions that can:

Specify grounding tensors. Determine which physical observables Ψ capture the morally relevant features of a domain. This requires both technical expertise (what can be measured?) and moral judgment (what matters?).

Govern the moral metric. Set the trade-off weights, priority orderings, and veto conditions that constitute the DEMEProfile (Chapter 19). This requires democratic legitimacy: the metric must reflect the community’s considered values, not a technocrat’s assumptions.

Maintain the containment architecture. Ensure that canonicalization, grounding, audit, and verification are correctly implemented and updated. This requires ongoing engineering competence and institutional commitment.

Resist governance attacks. Defend the constraint architecture against agents (human or artificial) who would weaken it for commercial, political, or ideological reasons. This requires institutional resilience and independence.

Evolve with moral progress. Update the governance framework as moral understanding develops — incorporating new moral considerations, revising outdated trade-offs, and responding to novel challenges. This requires adaptive capacity without sacrificing stability.

No existing institution has all five capabilities. Regulatory agencies have legitimacy but often lack technical expertise. Technology companies have engineering competence but lack democratic legitimacy. Standards bodies have stability but often lack adaptive capacity. Academic institutions have intellectual independence but lack enforcement power.

The Institutional Design Problem

Designing institutions that can govern geometric AI ethics is an open problem at the intersection of political theory, institutional design, and engineering management. The geometric framework provides constraints — the institution must be able to specify tensors, govern metrics, and maintain containment — but does not determine the institutional form.

Several models warrant investigation:

Multi-stakeholder governance boards with technical, ethical, and community representation, empowered to set and revise DEMEProfiles for specific domains.

Regulatory sandboxes where structurally contained AI systems are deployed under enhanced monitoring, generating empirical data about governance requirements before general deployment.

Federated governance in which different jurisdictions maintain different DEMEProfiles (metric pluralism), with a shared core (the Geneva EM) ensuring baseline compliance across jurisdictions.

Adaptive regulation that specifies outcome requirements (Bond Index below threshold) rather than process requirements, allowing diverse institutional arrangements to meet the same standard.

29.12 A Taxonomy of Open Problems

We summarize the open problems in a table:

| Problem | Chapter / Section | Type | Difficulty | Impact |
|---|---|---|---|---|
| Direct measurement of curvature | §29.2 | Empirical | High | Confirms/refutes core prediction |
| Metric signature | §29.3 | Empirical + theoretical | High | Determines qualitative geometry |
| Manifold topology | §29.3 | Empirical + theoretical | Very high | Constrains global structure |
| Moral field equation | §29.4 | Theoretical | Very high | Deepest structural result |
| Additional symmetries | §29.5 | Theoretical + empirical | Medium | New conservation laws |
| Anomaly source | Ch. 16, §29.5 | Empirical | Medium | Explains symmetry breaking |
| Torsion | §29.6 | Empirical | Medium | Determines correct connection |
| Tensorial interpretability | §29.7 | Engineering + science | High | Enables diagnosis of any AI |
| Scalable containment | §29.8 | Engineering | High | Practical deployment |
| Moral geometric metaphysics | §29.9 | Philosophical | Medium | Framework interpretation |
| Intertheoretic comparison | §29.10 | Theoretical + empirical | High | Cross-theory uncertainty |
| Institutional design | Ch. 18, §29.11 | Political + institutional | Very high | Governance implementation |

Priority Ordering

Not all problems are equally urgent. We suggest the following priority ordering based on a combination of feasibility and impact:

Immediate priorities (feasible with current methods, high impact):

1. Direct curvature measurement via holonomy experiments (§29.2)
2. Tensorial interpretability via probing methods (§29.7) [initial results from BIP v10.16, February 2026]
3. Anomaly source identification via BIP data analysis (§29.5)

Medium-term priorities (require methodological development):

4. Metric signature determination (§29.3)
5. Torsion detection (§29.6)
6. Scalable containment engineering (§29.8) [V3 hardware backends provide partial progress]
7. Institutional design pilots (§29.11)

Long-term research programs (require substantial theoretical and empirical progress):

8. Moral field equation (§29.4)
9. Manifold topology (§29.3)
10. Intertheoretic comparison (§29.10)
11. Additional symmetries (§29.5)

29.13 What Would Falsify the Framework?

A framework that cannot be falsified is not a scientific theory — it is a dogma. Chapter 17 (§17.7) listed specific predictions that, if violated, would undermine the framework. We consolidate them here:

1. Flat deontic transfer. If the obligation/permission axis failed to transfer across languages — if a situation described as obligatory in English were consistently classified as permissible in Arabic — the BIP’s central prediction would be falsified.

2. Scalar sufficiency. If a carefully controlled study showed that scalar evaluation captures all the morally relevant information — that no multi-dimensional structure exists beyond what a single number preserves — the entire tensor hierarchy would be unnecessary.

3. Path-independence. If parallel transport of obligations were perfectly path-independent — if carrying a moral commitment through any sequence of contexts always returned it unchanged — then moral space is flat, the curvature vanishes, and the dynamics reduce to a simpler theory.

4. Contraction commutativity. If the order of contraction never mattered — if every contraction procedure produced the same scalar — then the moral residue is empty, the arrow-theorem connection (Chapter 15) dissolves, and the richness of the contraction theory is unwarranted.

5. Conservation violation. If relabeling or reframing a moral situation changed the measured harm — genuinely changed it, not merely changed the description — the conservation of harm would be falsified, and the Noether-BIP connection would fail.

6. Invariance irrelevance. If AI systems that violated the BIP (different outputs for equivalent inputs) performed better on moral evaluations than systems that respected it, then invariance is not an alignment criterion but a constraint that harms performance.

None of the six core falsifications listed above has occurred — though two auxiliary predictions were falsified and led to productive framework revision (§13.9, §17.10). The evidence to date is consistent with the framework’s structural predictions. But the predictions are precise, the experiments are (mostly) designable, and the framework stands or falls on their outcomes. This is the appropriate epistemic stance for a mathematical framework with empirical aspirations: hold it provisionally, test it rigorously, and revise if necessary.

29.14 The Shape of the Unknown

This chapter has catalogued what we do not know. The list is long — longer, in some ways, than the list of what we have established. The moral manifold’s specific geometry is unknown. The quantum extension’s predictions are untested in moral contexts. The moral field equation is a conjecture. The institutional requirements for geometric governance are unmet.

But the shape of the unknown is itself informative. We know what questions to ask — and this is the framework’s contribution even where it does not yet provide answers. Before geometric ethics, the question “what is the curvature of moral space?” was not a question at all. Before the conservation of harm, the question “is harm invariant under relabeling?” was a philosophical intuition, not a testable prediction. Before the No Escape Theorem, the question “can structural constraints contain an AI regardless of its intelligence?” was an anxiety, not a theorem.

The open problems are not gaps in a wall. They are doors.

Technical Appendix

Proposition 20.1 (Curvature Detection Criterion). Let γ be a closed loop in the moral manifold M, and let O^μ be an obligation vector parallel-transported along γ. The holonomy ΔO^μ = O^μ_final − O^μ_initial satisfies:

ΔO^μ = ∫_Σ R^μ_{ναβ} O^ν dx^α ∧ dx^β + O(Area²)

where Σ is a surface enclosed by γ (∂Σ = γ). Thus ΔO^μ ≠ 0 implies R^μ_{ναβ} ≠ 0 somewhere on Σ (nonzero curvature), and to leading order the magnitude of ΔO is proportional to the enclosed area times the average curvature.

Proof. By Proposition 10.2, the holonomy around an infinitesimal parallelogram enclosing area δA^{αβ} satisfies H_γ(O)^μ − O^μ = R^μ_{ναβ} O^ν δA^{αβ}. For a finite loop γ enclosing a surface Σ with ∂Σ = γ, we tile Σ with infinitesimal parallelograms and sum the contributions. By the non-abelian Stokes theorem, the total holonomy is the path-ordered exponential of the curvature integrated over the enclosed area: ΔO^μ = ∫_Σ R^μ_{ναβ} O^ν dx^α ∧ dx^β + O(Area²), where the higher-order terms arise from the non-commutativity of successive parallel transports. To first order in the enclosed area, the integral is the average curvature times the area. Hence ΔO^μ ≠ 0 implies R^μ_{ναβ} ≠ 0 somewhere on Σ. □
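A familiar geometric stand-in may help make the criterion concrete. The sketch below is not a model of the moral manifold. It parallel-transports a tangent vector once around a circle of latitude on the unit sphere, where the Gaussian curvature is 1, and compares the resulting rotation angle with the area of the enclosed polar cap. The choice of the sphere, the RK4 integrator, and all function names are assumptions made purely for illustration.

```python
import numpy as np

def transport_around_latitude(theta0, v0, steps=2000):
    """Parallel-transport v = (v^theta, v^phi) once around the circle theta = theta0."""
    s, c = np.sin(theta0), np.cos(theta0)

    def rhs(v):
        # dv^mu/dphi = -Gamma^mu_{phi beta} v^beta on the unit sphere, with
        # Gamma^theta_{phi phi} = -sin*cos and Gamma^phi_{phi theta} = cos/sin.
        return np.array([s * c * v[1], -(c / s) * v[0]])

    v = np.asarray(v0, dtype=float)
    h = 2.0 * np.pi / steps
    for _ in range(steps):  # classical fourth-order Runge-Kutta
        k1 = rhs(v)
        k2 = rhs(v + 0.5 * h * k1)
        k3 = rhs(v + 0.5 * h * k2)
        k4 = rhs(v + h * k3)
        v = v + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return v

def holonomy_angle(theta0):
    """Angle between a vector and its transported copy, in an orthonormal frame."""
    v0 = np.array([1.0, 0.0])  # unit vector pointing along theta
    v1 = transport_around_latitude(theta0, v0)
    # Orthonormal components are (v^theta, sin(theta0) * v^phi).
    e0 = np.array([v0[0], np.sin(theta0) * v0[1]])
    e1 = np.array([v1[0], np.sin(theta0) * v1[1]])
    cos_angle = e0 @ e1 / (np.linalg.norm(e0) * np.linalg.norm(e1))
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

theta0 = 0.3                                      # colatitude of the loop, in radians
angle = holonomy_angle(theta0)
area = 2.0 * np.pi * (1.0 - np.cos(theta0))       # area of the enclosed polar cap
```

For this small loop the measured `angle` matches `area` (curvature 1 times enclosed area) to within the integration error. Because `arccos` returns values in [0, π], the direct comparison only holds while the enclosed area stays below π; larger loops would need the angle tracked continuously.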

Proposition 20.2 (Torsion Detection Criterion). Let v^μ and w^μ be two infinitesimal displacement vectors at a point p ∈ M. Transport v along w and w along v. The closure failure of the resulting parallelogram is:

δ^μ = T^μ_{αβ} v^α w^β

where T^μ_{αβ} = Γ^μ_{αβ} − Γ^μ_{βα} is the torsion tensor. Thus δ^μ ≠ 0 implies nonzero torsion, detectable as an order-dependent residual in moral evaluation.

Proof. Transport v^μ along w: the endpoint is p^μ + w^μ + v^μ + Γ^μ_{αβ} v^α w^β (to first order in the displacements). Transport w^μ along v: the endpoint is p^μ + v^μ + w^μ + Γ^μ_{αβ} w^α v^β. The closure failure is the difference: δ^μ = Γ^μ_{αβ} v^α w^β − Γ^μ_{αβ} w^α v^β = (Γ^μ_{αβ} − Γ^μ_{βα}) v^α w^β = T^μ_{αβ} v^α w^β. Hence δ^μ ≠ 0 iff T^μ_{αβ} ≠ 0 for some components probed by v, w. In the moral context: if evaluating the same case by first considering dimension α then dimension β gives a different result than the reverse order, the connection has torsion. □
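The closure-failure formula is straightforward to evaluate once connection coefficients are available. The sketch below assumes they are stored as an array `gamma[mu, alpha, beta]`; the random coefficients, the three-dimensional toy space, and the function names are illustrative assumptions, not estimates from any moral dataset.

```python
import numpy as np

def torsion(gamma):
    """T^mu_{alpha beta} = Gamma^mu_{alpha beta} - Gamma^mu_{beta alpha}."""
    return gamma - np.swapaxes(gamma, 1, 2)

def closure_failure(gamma, v, w):
    """delta^mu = T^mu_{alpha beta} v^alpha w^beta."""
    return np.einsum("mab,a,b->m", torsion(gamma), v, w)

rng = np.random.default_rng(1)
n = 3

# A torsion-free connection: symmetric in its two lower indices.
sym = rng.normal(size=(n, n, n))
sym = 0.5 * (sym + np.swapaxes(sym, 1, 2))

# Add an antisymmetric piece so that T^0_{12} = 1.
twist = np.zeros((n, n, n))
twist[0, 1, 2], twist[0, 2, 1] = 0.5, -0.5

v = np.array([0.0, 1.0, 0.0])  # displacement along dimension 1
w = np.array([0.0, 0.0, 1.0])  # displacement along dimension 2

delta_sym = closure_failure(sym, v, w)              # parallelogram closes
delta_twisted = closure_failure(sym + twist, v, w)  # parallelogram fails to close
```

With the symmetric connection the residual vanishes for every pair of displacements. With the added twist, the residual appears only in component 0 and flips sign when v and w are swapped, which is the order-dependence the proposition describes.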

Proposition 20.3 (Meta-Metric Existence). The space of admissible moral metrics 𝒢 = {g_{μν} : g satisfies the constraints of Chapter 9, §9.6} admits a natural Riemannian metric inherited from the space of positive-definite symmetric tensors. If we require invariance under simultaneous change of coordinates on M, this meta-metric is unique up to overall scale and one further parameter, the ratio b/a of the two terms in the DeWitt family.

Proof. The space Sym⁺(9) of positive-definite symmetric 9×9 matrices carries the DeWitt metric: G_g(h, k) = Tr(g⁻¹h g⁻¹k) for tangent vectors h, k ∈ T_g Sym⁺(9). This is a Riemannian metric because (i) G_g is bilinear and symmetric in (h, k), (ii) G_g(h, h) = Tr((g⁻¹h)²) ≥ 0 with equality iff h = 0 (g⁻¹h is similar to the symmetric matrix g^{-1/2} h g^{-1/2}, so its eigenvalues λ_i are real and the trace equals Σ_i λ_i²), and (iii) G_g is smooth in g. The admissible metric space Adm(M) ⊂ Sym⁺(9) is an open subset (Theorem 9.1) and therefore inherits G as a Riemannian metric by restriction. For uniqueness: under a coordinate change x → x̃, the metric transforms as g_{μν} → Jᵅ_μ Jᵝ_ν g_{αβ}. The requirement G_{g̃}(h̃, k̃) = G_g(h, k) (diffeomorphism invariance) constrains G to the DeWitt supermetric family G_g(h, k) = a Tr(g⁻¹h g⁻¹k) + b Tr(g⁻¹h) Tr(g⁻¹k) for constants a, b (Ebin 1970; DeWitt 1967), which is positive-definite when a > 0 and b > −a/9. Hence the meta-metric is unique up to overall scale and the ratio b/a. □
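The invariance used in the uniqueness argument can be checked numerically. The sketch below evaluates the two-parameter DeWitt form at a randomly generated positive-definite 9×9 matrix and confirms that its value is unchanged when g, h, and k are all transformed by the same matrix J. The random matrices, the values of a and b, and the function name `dewitt_inner` are assumptions for illustration.

```python
import numpy as np

def dewitt_inner(g, h, k, a=1.0, b=0.0):
    """a Tr(g^-1 h g^-1 k) + b Tr(g^-1 h) Tr(g^-1 k) for symmetric h, k at SPD g."""
    gh = np.linalg.solve(g, h)  # g^-1 h without forming the inverse explicitly
    gk = np.linalg.solve(g, k)
    return float(a * np.trace(gh @ gk) + b * np.trace(gh) * np.trace(gk))

rng = np.random.default_rng(2)
n = 9
A = rng.normal(size=(n, n))
g = A @ A.T + n * np.eye(n)          # a generic symmetric positive-definite matrix
h = rng.normal(size=(n, n)); h = h + h.T
k = rng.normal(size=(n, n)); k = k + k.T
J = np.eye(n) + 0.1 * rng.normal(size=(n, n))  # a well-conditioned coordinate change

before = dewitt_inner(g, h, k, a=1.0, b=0.5)
after = dewitt_inner(J.T @ g @ J, J.T @ h @ J, J.T @ k @ J, a=1.0, b=0.5)

# At the boundary b = -a/n the pure-scaling direction h = g has zero length.
boundary = dewitt_inner(g, g, g, a=1.0, b=-1.0 / n)
```

Here `before` and `after` agree up to round-off for any choice of a and b, which is why invariance alone cannot fix the ratio b/a. The `boundary` value shows where the family stops being positive-definite: at b = −a/9 an overall rescaling of the metric costs nothing.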

The map is not the territory — but a map that accurately shows where the territory is uncharted is more useful than one that pretends every region has been surveyed.

The open problems are the framework’s honest cartography of its own ignorance.