
Chapter 1: Reasoning as Search

“The problem solver’s search through this space is the fundamental process of problem solving.” — Allen Newell and Herbert A. Simon, Human Problem Solving (1972)


RUNNING EXAMPLE — DR. OKAFOR’S TRIAGE

Dr. Amara Okafor is an emergency physician at a large urban hospital. Tonight, three patients arrive simultaneously: a teenager with a skateboarding injury, an elderly man with chest pain, and a young woman with a severe headache. She has one CT scanner, one trauma bay, and thirty seconds to decide who gets what. Her reasoning feels instantaneous — but it is not. She is searching.

She considers orderings, weighs severities, estimates probabilities of deterioration, and balances urgency against resource constraints. Each mental configuration — “chest pain first, headache second, skateboard can wait” — is a state. Each reassessment is a transition. The heuristic guiding her search is twenty years of clinical experience, compressed into pattern recognition so rapid it feels like intuition.

This book argues that Dr. Okafor’s triage reasoning, and all reasoning, has the structure of search through a geometric space. The quality of her decisions depends on the quality of her heuristic — and the geometry of the space she is searching.


Every competent textbook on artificial intelligence opens with search. Breadth-first, depth-first, iterative deepening, A*—these algorithms fill the early chapters before the discussion moves on to what are considered more advanced topics: learning, planning, language, perception. The standard narrative treats search as a warm-up exercise, a necessary but elementary prologue to the real substance of intelligence.

This book argues the opposite. Search is not the prologue. Search is the plot. Reasoning is search, in a precise mathematical sense that we will develop across the coming chapters. What changes as we move from toy puzzles to genuine cognition is not the abandonment of search, but the enrichment of the space being searched. The space acquires distance, curvature, symmetry, and boundary—it acquires geometry. And the quality of reasoning, whether human or artificial, is determined by the interplay between that geometry and the heuristic signal guiding the search.

This first chapter traces the origin of the search paradigm for reasoning, establishes why the connection between search and reasoning is not merely metaphorical, surveys the spectrum from uninformed to optimal search, and identifies the critical limitation that motivates the rest of the book: classical search theory treats the state space as a graph, but real reasoning spaces have far richer structure than graphs can express.

1.1 The Problem Space Hypothesis

In 1972, Allen Newell and Herbert A. Simon published Human Problem Solving, a work that remains one of the most consequential contributions to the science of mind. The book is long, dense, and empirically grounded to a degree unusual in cognitive science. Its central claim, which they termed the Problem Space Hypothesis, can be stated in a single sentence:

All goal-oriented symbolic activity takes place in a problem space consisting of states, operators that transform states, and a control strategy that selects which operator to apply next.

Unpack this carefully. A state is a complete description of the current situation as the reasoner represents it. An operator is a cognitive action—an inference step, a retrieval, an analogy, a decomposition—that transforms one state into another. A control strategy is the policy that determines which operator to apply at each step. The resulting trajectory through state space is what we observe as reasoning.

Newell and Simon arrived at this formulation not through armchair philosophy but through meticulous analysis of human behavior on a range of tasks: cryptarithmetic puzzles, theorem proving in predicate logic, chess. They transcribed verbal protocols—asking subjects to think aloud—and then modeled the observed behavior as search through a problem space. The fit was remarkably good. Subjects did not simply “know” the answer and retrieve it. Nor did they combine raw associations until something useful emerged. Instead, they explored a structured space of possibilities, guided by heuristic evaluations that told them which directions looked promising.

The Problem Space Hypothesis has three features that make it exceptionally powerful.

First, it is domain-general. It does not say that some tasks involve search while others involve a fundamentally different process. It says that the computational structure of all goal-oriented reasoning has the form of search through a problem space. Chess, moral deliberation, mathematical proof, medical diagnosis, creative writing—these differ in the structure of their respective problem spaces and the quality of their available heuristics, but the computational architecture is shared. One framework, many instantiations.

Second, it is mechanistic. The hypothesis does not merely describe what reasoning achieves; it describes how reasoning proceeds. The granularity is at the level of individual state transitions, which means the hypothesis generates detailed, step-by-step predictions about the trajectory of thought. These predictions can be checked against empirical data, as Newell and Simon extensively demonstrated.

Third, it is constructive. Given a problem space, one can build a program that searches it. This is exactly what Newell, Simon, and their collaborators did, producing systems like the General Problem Solver (GPS) and its successors. The hypothesis bridges description and construction: to understand a cognitive process is to be able to specify the problem space in which it occurs, and to specify a problem space is to be able to build a system that searches it.

These three properties—generality, mechanistic detail, constructive power—are why the Problem Space Hypothesis has endured. But what exactly does it mean to say that reasoning is search? This is a stronger claim than it might first appear.

1.2 Search Is Not a Metaphor

There is a temptation to read the Problem Space Hypothesis as a loose analogy. “Reasoning is like search,” the weaker reading goes. “We cast about among possibilities, try different approaches, backtrack when we hit dead ends—it’s as if we were searching.” On this reading, search is a metaphor drawn from everyday experience and projected onto the mental domain, much as we speak of “grasping” an idea or “constructing” an argument without literally building anything.

This weaker reading misses the point entirely. Newell and Simon’s claim is not metaphorical; it is a claim about mathematical structure. Let us be precise.

A search problem is a tuple (S, s_0, A, T, G) where:

  • S is a set of states,
  • s_0 \in S is the initial state,
  • A is a set of actions (operators),
  • T: S \times A \to S is a transition function mapping a state and an action to a successor state,
  • G \subseteq S is a set of goal states.

A solution is a sequence of actions a_1, a_2, \ldots, a_n such that the trajectory s_0 \xrightarrow{a_1} s_1 \xrightarrow{a_2} \cdots \xrightarrow{a_n} s_n terminates in a goal state: s_n \in G.
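This definition can be rendered directly in code. The sketch below (illustrative only: the `SearchProblem` class, the three-operator toy problem, and all names are our own, not drawn from Newell and Simon) represents the tuple and checks whether an action sequence is a solution:

```python
from dataclasses import dataclass
from typing import Callable, Hashable, Tuple

State = Hashable
Action = Hashable

@dataclass(frozen=True)
class SearchProblem:
    """The tuple (S, s_0, A, T, G); S and G are given implicitly by functions."""
    s0: State
    actions: Callable[[State], Tuple[Action, ...]]  # operators applicable at a state
    transition: Callable[[State, Action], State]    # T: S x A -> S
    is_goal: Callable[[State], bool]                # membership test for G

def is_solution(problem: SearchProblem, plan: Tuple[Action, ...]) -> bool:
    """A plan a_1, ..., a_n is a solution iff its trajectory ends in G."""
    s = problem.s0
    for a in plan:
        if a not in problem.actions(s):
            return False            # plan applies an inapplicable operator
        s = problem.transition(s, a)
    return problem.is_goal(s)

# Toy instance: integer states, operators "inc" (+1) and "double" (*2), goal 10.
toy = SearchProblem(
    s0=1,
    actions=lambda s: ("inc", "double"),
    transition=lambda s, a: s + 1 if a == "inc" else s * 2,
    is_goal=lambda s: s == 10,
)
```

The trajectory induced by ("inc", "double", "inc", "double") is 1, 2, 4, 5, 10, which terminates in the goal set, so the plan is a solution.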

[Modeling Axiom.] The Problem Space Hypothesis asserts that for any instance of goal-oriented reasoning, there exists a search problem (S, s_0, A, T, G) such that:

  1. The states in S correspond to the cognitive configurations the reasoner can occupy.
  2. The actions in A correspond to the cognitive operations the reasoner can perform.
  3. The transition function T faithfully describes how cognitive operations transform cognitive configurations.
  4. The goal set G corresponds to the cognitive configurations in which the reasoner judges the problem solved.
  5. The actual trajectory of reasoning corresponds to an actual path through the resulting graph.

This is not a metaphor. It is a structural isomorphism claim. It says that the mathematical object defined by the search tuple and the computational structure of reasoning share the same abstract form. One can make the same kind of claim in physics: the structure of Newtonian mechanics is isomorphic to the structure of certain differential equations. Nobody considers this a metaphor.

The isomorphism claim has empirical content. It predicts, for instance, that reasoning should exhibit the phenomena characteristic of search: sensitivity to the size of the state space, improvement when heuristic guidance is available, degradation when irrelevant branches proliferate, and failure when the search process is terminated before reaching a goal state. All of these predictions hold robustly across human cognitive performance. People solve smaller state spaces more reliably than larger ones. People with domain expertise (better heuristics) solve problems faster. People confronted with many plausible alternatives take longer and make more errors. People forced to decide under time pressure produce worse solutions.

One could object that these predictions are too generic—that any theory of bounded cognition would predict such patterns. But the Problem Space Hypothesis goes further. Because it specifies the structure at the level of individual transitions, it predicts the specific paths that subjects take through the problem space, including the specific errors, the specific backtracking events, and the specific ordering of sub-goals. Newell and Simon’s protocol analyses confirm these fine-grained predictions repeatedly.

The claim is also falsifiable. If reasoning involved a fundamentally non-search-like process—direct holistic perception of the answer, for instance, without any sequential exploration—then the search model would fail to account for the observed trajectory of thought. For some cognitive acts (perceptual recognition, overlearned motor skills), this may indeed be the case; Newell and Simon were careful to restrict their hypothesis to goal-oriented symbolic activity. But for reasoning in the deliberative sense—solving a new problem, working through an argument, weighing competing considerations—the search model has not been falsified in over half a century.

What Newell and Simon did not have was the right mathematical vocabulary for characterizing the structure of the search space itself. Their problem spaces were discrete graphs: nodes and edges, states and transitions. This was appropriate for the tasks they studied (puzzles, symbolic logic), but it is radically insufficient for the spaces that arise in real reasoning. The space of possible moral judgments is not a graph. The space of possible scientific hypotheses is not a graph. The space of possible plans for navigating a complex social situation is not a graph—or, if we insist on representing it as one, the graph becomes so enormous and so irregular that the representation obscures more than it reveals.

We will return to this limitation in Section 1.5. For now, the key takeaway is this: Newell and Simon got the ontology right. Reasoning is search. The question that remains—the question this book addresses—is what happens when we take the structure of the search space seriously.

1.3 The Spectrum of Search

Not all search is created equal. Computer science has developed a rich taxonomy of search algorithms, and this taxonomy is more than a catalog of techniques—it maps, with surprising fidelity, to the spectrum of reasoning quality.

At one end of the spectrum is uninformed search. Breadth-first search (BFS) and depth-first search (DFS) explore the state space without any knowledge of which directions are promising. BFS explores all states at depth k before moving to depth k+1. DFS plunges down a single branch as far as possible before backtracking. Neither uses any information about the goal beyond the ability to recognize it upon arrival.
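A minimal BFS sketch makes the "all of depth k before depth k+1" discipline concrete (the successor-function interface and the integer toy space are our illustrative choices):

```python
from collections import deque

def bfs(start, neighbors, is_goal):
    """Breadth-first search: visit all states at depth k before depth k + 1.

    `neighbors` maps a state to its successors; no heuristic is consulted,
    so the first goal reached is a shallowest one.
    """
    frontier = deque([[start]])   # FIFO queue of paths, shallowest first
    seen = {start}
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if is_goal(state):
            return path
        for nxt in neighbors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None                   # state space exhausted without a goal

# Toy space: integer states, operators +1 and *2, start 1, goal 10.
path = bfs(1, lambda s: (s + 1, s * 2), lambda s: s == 10)
# path is a shortest trajectory: four operator applications, five states.
```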

Uninformed search is the computational analogue of guessing. A student solving an algebra problem by trying random operations—add 3, multiply by 7, take a square root—is performing something like uninformed search. The student will eventually stumble onto the answer if the problem is small enough, but the process is exponentially expensive in the depth of the solution. For any non-trivial problem, uninformed search is hopeless.

The situation improves dramatically with greedy search. A greedy algorithm uses a heuristic function h(x) that estimates how close state x is to a goal. At each step, the algorithm moves to the neighbor that minimizes h. The greedy approach is fast—it makes one evaluation per expansion—but it is easily misled. If the heuristic is locally attractive but globally misleading, the search will march confidently in the wrong direction.

Greedy search is the computational analogue of intuitive, impulsive reasoning. The student who glances at an algebra problem and immediately tries the most “obvious-looking” operation is performing greedy search. Sometimes the intuition is right and the problem is solved quickly. Sometimes the intuition is a trap, and the student ends up farther from the answer than before. Greedy search has no mechanism for recognizing or recovering from such mistakes. It never looks back.

Informed search improves on greedy search by considering not only the estimated distance to the goal but also the cost already incurred. The best-known informed search algorithm is A*, which we will examine in detail in Section 1.4. The key innovation is the evaluation function f(x) = g(x) + h(x), where g(x) is the actual cost of reaching state x from the start and h(x) is the estimated cost from x to a goal. By balancing past investment against future promise, informed search avoids the pathologies of both uninformed and greedy approaches.

Informed search is the computational analogue of expert reasoning. The experienced clinician does not run through every possible diagnosis (uninformed search). Nor does she jump to the most salient diagnosis without considering its prior probability (greedy search). Instead, she integrates what she has observed so far (g) with her estimate of where each diagnostic path leads (h), and she pursues the path that minimizes the total expected cost. This is why expertise feels like neither random exploration nor snap judgment but like guided deliberation—a directed traversal through the space of possibilities, informed by experience and modulated by evidence.

At the far end of the spectrum is optimal search: search that is guaranteed to find the best solution, not merely a solution. A* under certain conditions provides this guarantee, as we will see. Optimal search is the computational analogue of perfect rationality—a standard that no bounded agent, human or artificial, can actually achieve in general, but that serves as a normative benchmark against which actual reasoning can be measured.

This spectrum—uninformed, greedy, informed, optimal—is not merely a textbook classification. It captures something real about the structure of reasoning quality. The novice reasons like BFS, casting about widely. The impulsive thinker reasons like greedy search, following the strongest local signal. The expert reasons like A*, integrating accumulated evidence with learned heuristics. The ideal reasoner follows the optimal path, wasting no effort on unpromising branches.

Two observations about this spectrum will be important for the rest of the book.

[Conditional Theorem.] The first is that movement along the spectrum is entirely determined by the quality of the heuristic h(x). Set h(x) = 0 for all x, and A* degenerates to uniform-cost search (essentially uninformed). Set h(x) = h^*(x)—the true distance to the nearest goal—and A* finds the optimal solution by expanding only the nodes on the optimal path; it never wastes a single expansion. The quality of reasoning, in this framework, is a direct function of the quality of the heuristic.

The second observation is that the heuristic can be wrong in systematic ways, and these systematic errors produce systematic reasoning failures. A heuristic that overestimates cost in one region will cause the search to avoid that region, even if the optimal path passes through it. A heuristic that underestimates cost near a trap will cause the search to wander into the trap before realizing its mistake. The character of the heuristic’s errors determines the character of the reasoning failures. This is not merely a theoretical possibility—it is a precise description of what happens when human judgment is distorted by framing effects, emotional anchoring, or social pressure, as we will argue in detail in Chapters 5 through 8.

1.4 A* and the Centrality of the Heuristic

The A* algorithm, introduced by Hart, Nilsson, and Raphael in 1968, is the canonical informed search algorithm and one of the most important algorithms in all of computer science. Its importance for our purposes lies not in its practical applications—though these are vast—but in what it reveals about the relationship between heuristic quality and reasoning quality.

The Algorithm

A* maintains two data structures: an open set of states that have been discovered but not yet fully evaluated, and a closed set of states that have been fully evaluated. Each state x is assigned a value:

f(x) = g(x) + h(x)

where g(x) is the cost of the cheapest known path from the initial state s_0 to x, and h(x) is a heuristic estimate of the cost of the cheapest path from x to a goal state.

At each step, A* selects the state x in the open set with the smallest f(x), moves it to the closed set, and expands it: for each successor x' of x, A* computes g(x') = g(x) + c(x, x') (where c(x, x') is the transition cost) and f(x') = g(x') + h(x'), and adds x' to the open set if it is not already there (or updates its value if the new path is cheaper).

The algorithm terminates when a goal state is selected for expansion, or when the open set is empty (indicating no solution exists).
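The loop just described can be sketched compactly. The following is an illustrative rendering, not production pathfinding code; the 6x6 grid instance and all names are our own:

```python
import heapq

def a_star(start, goal, neighbors, cost, h):
    """A* search: repeatedly expand the open-set state minimizing f = g + h."""
    open_heap = [(h(start), start)]   # (f-value, state); the open set
    g = {start: 0}                    # cheapest known path cost to each state
    parent = {start: None}
    closed = set()                    # fully evaluated states
    while open_heap:
        _, x = heapq.heappop(open_heap)
        if x in closed:
            continue                  # stale entry superseded by a cheaper path
        if x == goal:                 # goal selected for expansion: terminate
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return list(reversed(path)), g[goal]
        closed.add(x)
        for y in neighbors(x):
            tentative = g[x] + cost(x, y)
            if y not in g or tentative < g[y]:
                g[y] = tentative
                parent[y] = x
                heapq.heappush(open_heap, (tentative + h(y), y))
    return None, float("inf")         # open set empty: no solution exists

# 4-connected 6x6 grid, unit step costs, Manhattan-distance heuristic
# (admissible for this movement model: it never overestimates).
def neighbors(p):
    x, y = p
    cand = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [q for q in cand if 0 <= q[0] < 6 and 0 <= q[1] < 6]

path, total = a_star((0, 0), (3, 4), neighbors,
                     cost=lambda p, q: 1,
                     h=lambda p: abs(p[0] - 3) + abs(p[1] - 4))
# total == 7: with no obstacles the optimal cost equals the Manhattan distance.
```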

Admissibility and Optimality

The foundational theorem of A* concerns the property of admissibility. A heuristic h is admissible if it never overestimates the true cost to reach a goal:

h(x) \leq h^*(x) \quad \text{for all } x \in S

where h^*(x) is the actual minimum cost from x to the nearest goal.

Theorem (Hart, Nilsson, and Raphael, 1968). [Established Mathematics.] If h is admissible, then A* is guaranteed to find an optimal solution—a solution with minimum total cost.

The proof is elegant and illuminating. Suppose A* terminates by selecting a goal state t_1 whose path cost is g(t_1) = C_1. Suppose there exists a better goal state t_2 reachable at cost C_2 < C_1. Then somewhere on the optimal path to t_2 there must be a state x still in the open set, and g(x) is the cost of that optimal path's prefix up to x. For this state, f(x) = g(x) + h(x) \leq g(x) + h^*(x) = C_2 < C_1 = f(t_1). But then x would have been selected before t_1, contradicting the assumption that t_1 was selected. Hence no better goal is reachable, and t_1 is optimal.

The proof reveals the role of admissibility with crystalline clarity: an admissible heuristic prevents the search from prematurely committing to a suboptimal goal. By never overestimating the remaining cost, the heuristic ensures that any unexplored state on the truly optimal path will always look at least as promising as any suboptimal goal. The search cannot be fooled into accepting an inferior solution.

Consistency and Efficiency

A stronger property, consistency (also called monotonicity), further sharpens the guarantees. A heuristic h is consistent if, for every state x and every successor x' of x:

h(x) \leq c(x, x') + h(x')

This is a triangle inequality condition: the estimated cost from x to the goal cannot exceed the cost of stepping to x' plus the estimated cost from x' to the goal. Consistency implies admissibility (by induction along any path to a goal), but the converse does not hold.

When h is consistent, A* has a further property: every state is expanded at most once. This means that A* never needs to re-evaluate a state it has already processed, which eliminates a significant source of redundant computation. Consistent heuristics make A* not only optimal but efficient.
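The edge-by-edge condition is mechanically checkable. A sketch (the 4x4 grid instance and the deliberately inflated heuristic are illustrative assumptions; we also include the usual boundary condition h(goal) = 0, which the induction argument implicitly uses):

```python
def is_consistent(states, neighbors, cost, h, goal):
    """Check the triangle condition h(x) <= c(x, x') + h(x') on every edge,
    plus the boundary condition h(goal) == 0."""
    if h(goal) != 0:
        return False
    return all(h(x) <= cost(x, y) + h(y)
               for x in states for y in neighbors(x))

# 4x4 unit-cost grid. Manhattan distance is consistent here: one step changes
# the remaining distance by at most 1, the step cost. Tripling it breaks this.
goal = (2, 2)
states = [(i, j) for i in range(4) for j in range(4)]
def nbrs(p):
    cand = [(p[0] + 1, p[1]), (p[0] - 1, p[1]),
            (p[0], p[1] + 1), (p[0], p[1] - 1)]
    return [q for q in cand if 0 <= q[0] < 4 and 0 <= q[1] < 4]

manhattan = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
inflated  = lambda p: 3 * manhattan(p)   # overestimates h*, violates consistency
```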

The Heuristic as the Bottleneck

Here is the insight that carries over to reasoning: everything about A*'s performance is controlled by the heuristic. Consider the extremes.

If h(x) = 0 for all x, then f(x) = g(x), and A* reduces to uniform-cost search (Dijkstra’s algorithm). It finds the optimal solution but does so by expanding every state whose cost is less than the optimal cost—potentially the entire graph. No intelligence is applied; the search is exhaustive.

If h(x) = h^*(x)—the perfect heuristic—then A* expands only the states on the optimal path, never wasting a single expansion on a branch that does not contribute to the optimal solution. In the vocabulary of reasoning, this corresponds to following the ideal chain of thought directly to the answer, without detours, without backtracking, without wasted effort. It is the computational ideal of perfect rationality.

Between these extremes, the closer h is to h^*, the fewer states A* expands. The relationship is not merely monotone; it has been shown to be quite sharp. Under mild conditions, if two admissible heuristics satisfy h_1(x) \leq h_2(x) \leq h^*(x) for all x, then every state expanded by h_2 is also expanded by h_1. The more informed heuristic strictly dominates the less informed one.
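The dominance claim can be observed on a toy instance. The sketch below runs the same unit-cost grid search twice, once with h = 0 and once with the Manhattan heuristic, and counts expansions; the 10x10 grid and the tie-breaking rule (prefer larger g on equal f) are our illustrative choices:

```python
import heapq

def a_star_expansions(start, goal, neighbors, h):
    """Unit-cost A* that counts expansions; ties on f prefer larger g."""
    heap = [(h(start), 0, start)]     # entries: (f, -g, state)
    best_g = {start: 0}
    closed = set()
    expanded = 0
    while heap:
        f, neg_g, x = heapq.heappop(heap)
        if x == goal:
            return -neg_g, expanded
        if x in closed:
            continue
        closed.add(x)
        expanded += 1
        g = -neg_g
        for y in neighbors(x):
            if y not in best_g or g + 1 < best_g[y]:
                best_g[y] = g + 1
                heapq.heappush(heap, (g + 1 + h(y), -(g + 1), y))
    return None, expanded

# The same empty 10x10 grid, searched twice: h = 0 (uniform-cost search)
# versus the Manhattan heuristic. Both find the optimal cost of 18.
goal = (9, 9)
def nbrs(p):
    cand = [(p[0] + 1, p[1]), (p[0] - 1, p[1]),
            (p[0], p[1] + 1), (p[0], p[1] - 1)]
    return [q for q in cand if 0 <= q[0] < 10 and 0 <= q[1] < 10]

manhattan = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])

cost0, exp0 = a_star_expansions((0, 0), goal, nbrs, lambda p: 0)
cost1, exp1 = a_star_expansions((0, 0), goal, nbrs, manhattan)
# exp1 is far smaller than exp0: the informed heuristic prunes most of the grid.
```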

This dominance result has a direct cognitive interpretation: a reasoner with a better heuristic will consider a subset of the states considered by a reasoner with a worse heuristic, and will arrive at the same (optimal) answer. Expert reasoning is not different in kind from novice reasoning. It is different in efficiency, and the difference is entirely attributable to the quality of the heuristic.

What Goes Wrong: Corrupted Heuristics

The admissibility theorem tells us what happens when the heuristic is well-behaved: we get optimality. But what happens when the heuristic is corrupted?

If h(x) > h^*(x) for some states x, then the admissibility condition is violated, and A* may terminate at a suboptimal goal. The overestimate causes the search to avoid the region around x, even if the optimal path passes through it. The search has been misdirected by a faulty signal.

If the heuristic corruption is systematic—if h consistently overestimates in one region and underestimates in another—then the search will exhibit a systematic bias: it will favor the underestimated region and avoid the overestimated one. This bias will manifest as a predictable pattern of errors, not random noise.
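A minimal worked case shows the failure mode. In the three-node graph below (our own construction, chosen for transparency), inflating h at a single state steers A* away from the cheaper route:

```python
import heapq

def a_star_cost(start, goal, edges, h):
    """Minimal A* over an explicit weighted digraph. Returns the path cost of
    the first goal selected for expansion (optimal only if h is admissible)."""
    heap = [(h(start), 0, start)]     # entries: (f, g, state)
    best = {start: 0}
    while heap:
        f, g, x = heapq.heappop(heap)
        if x == goal:
            return g
        for y, c in edges.get(x, []):
            if y not in best or g + c < best[y]:
                best[y] = g + c
                heapq.heappush(heap, (g + c + h(y), g + c, y))
    return None

# Two routes: S -> A -> G costs 1 + 1 = 2; the direct edge S -> G costs 3.
edges = {"S": [("A", 1), ("G", 3)], "A": [("G", 1)]}

admissible = {"S": 2, "A": 1, "G": 0}.get   # never exceeds h*
inflated   = {"S": 2, "A": 5, "G": 0}.get   # h(A) = 5 > h*(A) = 1

# With the admissible heuristic, f(A) = 2 < f(G) = 3, so A* goes through A and
# returns the optimal cost 2. With the inflated heuristic, f(A) = 6 makes the
# region around A look hopeless, and A* commits to the suboptimal cost 3.
```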

Now consider an analogy that is, on the thesis of this book, more than an analogy. A human decision-maker evaluating a moral dilemma is (we claim) performing a search through a space of possible judgments. The heuristic guiding that search is a composite of moral intuitions, emotional responses, social expectations, and explicit ethical principles. If the problem is presented in a neutral frame, the heuristic may be roughly admissible—it provides reasonable guidance toward a well-considered judgment. But if the problem is presented with a vivid, emotionally charged frame, the heuristic is corrupted: the emotional salience overestimates the importance of certain features and underestimates others, causing the search to converge on a different—and measurably worse—judgment.

[Empirical.] In our empirical work on the Measuring AGI benchmarks (Chapter 13), we observe precisely this pattern. When a moral dilemma is presented with vivid emotional framing, language models produce judgments that differ significantly from their neutral-frame judgments—with displacements of 8.9 standard deviations in the most extreme cases. The heuristic has been corrupted by a surface feature that changes the presentation of the problem without changing its substance. Admissibility has been violated, and optimality has been lost.

The A* framework thus provides a precise vocabulary for diagnosing reasoning failures. A biased reasoner is not “irrational” in some vague, all-purpose sense. A biased reasoner has a specific heuristic corruption: it overestimates cost in certain regions of the search space and underestimates it in others, producing a predictable pattern of search trajectories that systematically miss the optimal path.

1.5 The Limitation: When Graphs Are Not Enough

We have argued that reasoning is search and that the quality of reasoning depends on the quality of the heuristic. So far, so good. But there is a fundamental limitation in the classical framework that we must confront honestly.

A* operates on a graph: a set of nodes connected by weighted edges. The search space is discrete, the transitions are enumerable, and the cost structure is specified by edge weights. This works beautifully for the 15-puzzle, for route planning, for theorem proving in finite axiom systems. It works less well—and ultimately not at all—for the kinds of search spaces that arise in real reasoning.

Consider the space of possible beliefs about climate change. A belief state is not a single node in a graph; it is a point in a high-dimensional continuous space characterized by degrees of confidence in various propositions, estimates of various physical quantities, assessments of the reliability of various evidence sources, and so on. Neighboring belief states are not connected by discrete edges; they shade into one another continuously. The cost of moving from one belief state to another depends not on a fixed edge weight but on the distance and direction of the move.

Or consider the space of possible strategies for a complex negotiation. Each strategy involves continuous parameters (how much to concede, how aggressive to be, how much to trust the counterpart) as well as discrete choices (what to propose, when to walk away). The space is high-dimensional, continuous in some dimensions and discrete in others, and its cost landscape is shaped by the interaction between the negotiator’s choices and the counterpart’s responses.

These are not edge cases. They are the typical case for human reasoning. And for these spaces, the graph abstraction is not merely inconvenient—it is misleading. A graph has no notion of distance between non-adjacent nodes (except as path length, which is a derived quantity that depends on the graph’s connectivity structure). A graph has no notion of direction. A graph has no notion of curvature—the way a space bends or funnels around a particular point. A graph has no notion of symmetry in the continuous sense—the way a space looks the same under rotation or scaling.

Let us be more precise about what graphs lack and why it matters.

Distance

In a graph, the “distance” between two nodes is the length of the shortest path connecting them. This is an adequate notion of distance only when all meaningful relationships between states are captured by the graph’s edge structure. In continuous spaces, we need a metric: a function d(x, y) that gives the distance between any two points, not just adjacent ones. The metric induces a notion of “closeness” that is intrinsic to the space, not an artifact of our choice of edges. Different metrics on the same set of points give rise to genuinely different geometric structures, and the choice of metric profoundly affects what “nearby” means—and therefore what “small move in reasoning” means.
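The point is easy to make concrete: the same pair of points sits at three different distances under three standard metrics (the coordinates below are illustrative):

```python
import math

def euclidean(x, y):
    """L2 metric: straight-line distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """L1 metric: axis-aligned, city-block distance."""
    return sum(abs(a - b) for a, b in zip(x, y))

def chebyshev(x, y):
    """L-infinity metric: the largest single-coordinate gap."""
    return max(abs(a - b) for a, b in zip(x, y))

p, q = (0.0, 0.0), (3.0, 4.0)
# The same pair of states is 5 apart under L2, 7 under L1, and 4 under
# L-infinity: what counts as a "small move" depends on the chosen metric.
```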

Curvature

Curvature describes how a space deviates from flatness. In a flat space, geodesics (shortest paths) behave as Euclidean intuition suggests: parallel lines remain parallel, triangles have interior angles summing to \pi, small circles have circumference 2\pi r. In a curved space, these familiar properties fail. Positively curved spaces (like the surface of a sphere) cause parallel geodesics to converge; negatively curved spaces (like a saddle) cause them to diverge.

Why does curvature matter for reasoning? Because it determines the stability of the search trajectory. In a region of positive curvature, nearby search paths tend to converge—small perturbations to the heuristic do not drastically change the path. In a region of negative curvature, nearby paths diverge exponentially—a small perturbation can send the search in a completely different direction. Curvature tells us where reasoning is robust and where it is fragile, a distinction that graphs cannot express.

Symmetry

A symmetry of a space is a transformation that preserves its structure. Rotations are symmetries of a sphere; translations are symmetries of Euclidean space. Symmetries matter for reasoning because they identify distinctions without a difference. If a reasoning problem has a symmetry—if some transformation of the problem’s surface features leaves its underlying structure invariant—then a good reasoner should produce the same answer before and after the transformation.

This is precisely what fails in the framing effects we mentioned earlier. Reframing a moral dilemma in vivid language is a surface transformation that should be a symmetry of the moral judgment space (it changes the description, not the moral content). The fact that it changes the judgment reveals that the reasoner’s heuristic does not respect the symmetry of the underlying space. Graph-based search has no natural vocabulary for expressing this failure, because graphs have no natural notion of continuous symmetry.

Boundaries

Real reasoning spaces have boundaries: regions that are forbidden, incoherent, or unacceptable. A medical diagnosis that violates known physiological constraints is out of bounds. A legal argument that contradicts established precedent is (in most contexts) out of bounds. A moral judgment that endorses gratuitous cruelty is out of bounds. These boundaries are not arbitrary walls in a graph; they are submanifolds of the reasoning space, and they may have their own geometric structure (they may be curved, they may have corners, they may be porous in some places and rigid in others).

The Upshot

The graph abstraction served Newell and Simon well for the discrete puzzle-solving tasks they studied. But for the continuous, high-dimensional, richly structured spaces that arise in genuine reasoning—moral judgment, scientific hypothesis formation, strategic planning, creative design—graphs are a procrustean bed. We need the language of geometry: metrics, curvature, geodesics, symmetry groups, boundaries, fiber bundles, and the rest of the apparatus that mathematics has developed for describing the structure of spaces.

This does not mean we abandon Newell and Simon. We generalize them. The Problem Space Hypothesis is correct: reasoning is search. But the search space is not a graph. It is a manifold—a space that is locally smooth, globally structured, and endowed with geometric properties that profoundly affect the dynamics of any search process that traverses it. [Modeling Axiom.]

1.6 Preview: Geometry Changes Everything

Let us close this chapter with a preview of what happens when we take the geometry of the search space seriously. The payoff is not merely a more elegant formalism. It is a fundamentally more powerful theory of reasoning—one that explains phenomena that graph-based search theory cannot, generates predictions that graph-based theory does not, and opens engineering interventions that graph-based theory cannot motivate.

The Heuristic Becomes a Field

In classical A*, the heuristic h(x) is a function that assigns a number to each node in a graph. There is no relationship between the heuristic values at adjacent nodes except whatever consistency condition we impose. But when the search space has geometry—when it is a manifold with a metric—the heuristic becomes a scalar field: a smooth function on a smooth space. Scalar fields have gradients, level sets, critical points, and curvature. These properties are not available on graphs.

The gradient \nabla h at a point x tells us the direction of steepest estimated improvement—the direction the search should move. The level sets of h (the surfaces where h is constant) are the “contour lines” of the reasoning landscape. Critical points—where \nabla h = 0—are the local minima, local maxima, and saddle points that shape the search dynamics. Minima are potential traps; saddle points are narrow passages that the search must traverse to escape one basin and enter another.

This richer structure immediately suggests new diagnostic questions that have no analogue in graph-based search. Is the heuristic field smooth, or does it have sharp discontinuities? Are its critical points located near the true optimal path, or are they displaced by systematic biases? Does the curvature of the heuristic field match the curvature of the underlying space, or are there regions of mismatch where the heuristic is actively misleading?
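These field-theoretic diagnostics are easy to probe numerically. The sketch below is illustrative only: the heuristic field h, its critical point at (1, 2), and the probe points are invented for the example; the central-difference gradient estimate is a standard numerical technique.

```python
import numpy as np

def grad(h, x, eps=1e-5):
    """Central-difference estimate of the gradient of a scalar field h at x."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(len(x)):
        step = np.zeros_like(x)
        step[i] = eps
        g[i] = (h(x + step) - h(x - step)) / (2 * eps)
    return g

# Hypothetical smooth heuristic field with a single critical point at (1, 2).
h = lambda x: (x[0] - 1) ** 2 + (x[1] - 2) ** 2

# Steepest-improvement direction at (3, 0): points back toward the minimum.
descent = -grad(h, [3.0, 0.0])

# At the critical point the gradient vanishes: (1, 2) is a local minimum.
critical = grad(h, [1.0, 2.0])
```

A search process that follows -\nabla h from any probe point descends toward the critical point; locating where \nabla h vanishes, and checking the sign structure of the Hessian there, is exactly the diagnostic the paragraph above describes.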

Optimal Reasoning Becomes Geodesic

In graph-based search, the optimal solution is the shortest (or cheapest) path. In a geometric search space, the optimal solution is a geodesic: the shortest curve connecting the initial state to the goal, where “shortest” is measured by the metric on the manifold.

Geodesics are determined by the geometry of the space. On a flat plane, geodesics are straight lines. On a sphere, they are great circles. On a general Riemannian manifold, they satisfy the geodesic equation:

\frac{d^2 x^\mu}{dt^2} + \Gamma^\mu_{\alpha\beta} \frac{dx^\alpha}{dt} \frac{dx^\beta}{dt} = 0

where \Gamma^\mu_{\alpha\beta} are the Christoffel symbols encoding the connection structure of the manifold. The geodesic equation tells us that the optimal trajectory depends not only on the start and end points but on the curvature of the space between them. A reasoning space with high curvature in certain regions will bend the optimal path in ways that would be invisible on a flat graph.
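The geodesic equation can be integrated numerically once the Christoffel symbols are known. As a minimal sketch, the code below uses the standard Christoffel symbols of the unit sphere in (\theta, \phi) coordinates (\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta, \Gamma^\phi_{\theta\phi} = \cot\theta); the Euler step size and initial conditions are illustrative choices, not part of the theory.

```python
import math

def sphere_geodesic_step(theta, phi, dtheta, dphi, dt):
    """One Euler step of the geodesic equation on the unit sphere.
    d2x/dt2 = -Gamma^mu_{alpha beta} dx^alpha dx^beta, with the
    standard sphere Christoffel symbols."""
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2            # -Gamma^theta_{phi phi} (dphi)^2
    ddphi = -2.0 * (math.cos(theta) / math.sin(theta)) * dtheta * dphi  # -2 Gamma^phi_{theta phi} dtheta dphi
    return (theta + dtheta * dt, phi + dphi * dt,
            dtheta + ddtheta * dt, dphi + ddphi * dt)

# Start on the equator heading due east. The geodesic is a great circle
# (here, the equator itself), so theta should remain pi/2 throughout.
theta, phi, dtheta, dphi = math.pi / 2, 0.0, 0.0, 1.0
for _ in range(1000):
    theta, phi, dtheta, dphi = sphere_geodesic_step(theta, phi, dtheta, dphi, 0.001)
```

The point of the sketch is the claim in the text: the trajectory is determined by the curvature terms, not just the endpoints. Replace the sphere's Christoffel symbols with those of a flat plane (all zero) and the same integrator produces straight lines.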

[Modeling Axiom.] This identification—optimal reasoning as geodesic traversal—is the central theoretical contribution of this book. It allows us to define reasoning quality as the deviation of the actual reasoning trajectory from the geodesic. A perfect reasoner follows the geodesic exactly. A good reasoner follows a path that stays close to the geodesic. A poor reasoner—or a corrupted one—follows a path that deviates substantially, wasting effort on detours or spiraling into local traps.

Reasoning Failures Become Geometric Pathologies

When reasoning goes wrong, the geometric framework provides a precise diagnosis. Instead of vague labels like “irrational” or “biased,” we can identify the specific geometric pathology:

  • Heuristic field corruption: the gradient of h points away from the geodesic in a systematic direction, caused by irrelevant features warping the heuristic. This is the geometric characterization of framing effects, emotional anchoring, and social pressure—the surface perturbation distorts \nabla h, bending the search trajectory away from the optimal path.

  • Local minimum entrapment: the heuristic field has a local minimum (a point where \nabla h = 0 and the Hessian is positive definite) in a region far from the goal. The search descends into this basin and cannot escape without a sufficiently large perturbation. This is the geometric characterization of premature convergence—settling on a plausible but wrong intermediate answer.

  • Geodesic instability: in regions of negative curvature, nearby geodesics diverge exponentially. A small perturbation to the initial condition or the heuristic sends the search along a dramatically different path. This is the geometric characterization of reasoning fragility—a system that is highly sensitive to irrelevant variations in problem presentation.

  • Symmetry violation: the reasoning space has a symmetry (a transformation that preserves the underlying problem structure), but the heuristic field does not respect it. The search produces different trajectories—and potentially different answers—for inputs that are related by the symmetry. This is the geometric characterization of invariance failure, and we will argue in Chapter 8 that it is the most fundamental diagnostic of reasoning quality.
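The last pathology admits a particularly simple empirical probe: evaluate the heuristic on inputs and on their symmetry-transformed counterparts and measure the discrepancy. Everything below is a hypothetical sketch: the example fields, the symmetry, and the probe set are invented for illustration, and a zero gap on a finite probe set is evidence of invariance, not a proof.

```python
def invariance_gap(h, probes, transform):
    """Largest |h(x) - h(T(x))| over a probe set: zero iff h respects the
    symmetry T on those probes."""
    return max(abs(h(x) - h(transform(x))) for x in probes)

# Hypothetical setup: x[0] is the task-relevant feature, x[1] is irrelevant,
# so flipping the sign of x[1] is a problem-preserving symmetry.
flip = lambda x: (x[0], -x[1])
clean_h = lambda x: x[0] ** 2                     # respects the symmetry
corrupted_h = lambda x: x[0] ** 2 + 0.3 * x[1]    # leaks the irrelevant feature

probes = [(1.0, 2.0), (0.5, -1.0), (2.0, 0.25)]
```

The clean field gives a gap of exactly zero; the corrupted field gives a strictly positive gap, flagging the invariance failure that Chapter 8 treats as the fundamental diagnostic.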

The Engineering Implications

Perhaps most importantly, the geometric framework does not merely diagnose failures—it suggests specific interventions:

  • If the heuristic field is corrupted, we can attempt to reshape it through training on data that explicitly varies the corrupting features while holding the relevant features constant. This is group-theoretic data augmentation, and we discuss it in Chapter 14.

  • If the reasoning space has regions of high negative curvature (instability), we can attempt to smooth those regions through adversarial training that forces the model to produce consistent outputs under small perturbations.

  • If the heuristic violates a known symmetry, we can either enforce the symmetry architecturally (by building invariance into the model) or train for it explicitly (by penalizing symmetry-violating outputs).

  • If the search gets trapped in local minima, we can introduce mechanisms for escaping basins: temperature-based exploration, explicit backtracking, or meta-cognitive monitoring that detects when the search has stalled.

None of these interventions is motivated by graph-based search theory. They arise naturally from the geometric perspective, which is why the geometric perspective is not merely a more elegant formalization but a genuinely more powerful one.

The Road Ahead

This book develops these ideas in four parts.

Part I (Chapters 1–4) establishes the theoretical framework. Having introduced reasoning as search in this chapter, we will enrich the search space with geometric structure in Chapter 2, develop the theory of the heuristic field in Chapter 3, and formalize optimal reasoning as geodesic traversal in Chapter 4.

Part II (Chapters 5–8) applies the framework to the analysis of reasoning failures. We will show that heuristic corruption (Chapter 5), sycophancy (Chapter 6), premature convergence (Chapter 7), and symmetry violation (Chapter 8) all have precise geometric characterizations that explain their causes and predict their behavior.

Part III (Chapters 9–11) addresses the control layer: metacognition as search monitoring (Chapter 9), robustness measurement (Chapter 10), and alignment as heuristic shaping (Chapter 11).

Part IV (Chapters 12–14) presents the empirical program: benchmarks as geometric probes (Chapter 12), the five convergent measurements from the Measuring AGI suite (Chapter 13), and the engineering pipeline from theory to practice (Chapter 14).

The final chapters (15–16) survey open questions and the prospects for geometric reasoning as a research field.

But all of it begins here, with Newell and Simon’s foundational insight. Reasoning is search. The computational structure of thought literally has the structure of traversal through a space of possibilities. What this book adds is the recognition that the space has shape—that it is not a bare graph but a geometric object with distance, curvature, symmetry, and boundary. And once you see the shape, you can see things that were previously invisible: why certain errors are predictable, why certain capabilities are fragile, why certain interventions work and others do not.

The space has geometry. The geometry changes everything.


Worked Example: Dr. Okafor’s First Decision

Let us return to Dr. Okafor’s triage. We can formalize her decision as a search problem (S, s_0, A, T, G).

States. Each state is an assignment of patients to resources: who gets the CT scanner, who goes to the trauma bay, who waits. With three patients and three dispositions (CT, trauma, wait), there are 3! = 6 possible assignments. But Dr. Okafor also considers partial assignments (“put chest pain in trauma, decide about the other two after vitals”), so the state space is somewhat larger — roughly 15 states if we include partial assignments and the option to defer.

Initial state. All three patients in the waiting area, unassigned.

Operators. Assign a patient to a resource. Defer a decision pending more information. Reassign a patient already assigned. Each operator transforms one state into another.

Goal. A complete assignment that maximizes expected patient outcomes under the resource constraints.

The heuristic. Dr. Okafor does not enumerate all 15 states. She uses a heuristic — one built from two decades of pattern recognition — that estimates the urgency of each patient. The chest pain patient triggers a learned pattern: male, over 60, chest pain → high probability of cardiac event → highest urgency. The headache patient is a weaker signal but non-zero: sudden-onset headache in a young woman → possible subarachnoid hemorrhage → medium urgency. The skateboarding injury is vivid but clinically routine → low urgency.

Her heuristic ranks states by estimated clinical risk, and she searches the top-ranked states first. Within seconds she has converged: chest pain to trauma, headache to CT, skateboard waits. A* with a good heuristic, executed in biological hardware.
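The complete-assignment portion of this search is small enough to write out. The sketch below enumerates the 3! = 6 complete assignments and ranks them with a toy heuristic; the urgency and benefit numbers are invented for illustration and are not clinical values.

```python
from itertools import permutations

patients = ("chest_pain", "headache", "skateboard")
resources = ("trauma", "CT", "wait")

# Hypothetical urgency estimates (the heuristic) and resource-benefit weights.
urgency = {"chest_pain": 0.9, "headache": 0.5, "skateboard": 0.1}
benefit = {"trauma": 1.0, "CT": 0.6, "wait": 0.1}

def score(assignment):
    """Estimated clinical value of a complete patient-to-resource assignment."""
    return sum(urgency[p] * benefit[r] for p, r in assignment.items())

# Enumerate all 3! = 6 complete assignments and take the heuristic's top choice.
assignments = [dict(zip(patients, perm)) for perm in permutations(resources)]
best = max(assignments, key=score)
```

Under these weights the top-ranked state is exactly Dr. Okafor's: chest pain to trauma, headache to CT, skateboard waits. Heuristic corruption, in this toy model, is any perturbation of the urgency values large enough to reorder the ranking.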

What could go wrong. Suppose the skateboarding teenager arrives screaming and covered in blood — a dramatic presentation for what is actually a superficial wound. The emotional salience corrupts Dr. Okafor’s heuristic: the vivid distress signal inflates the urgency estimate for the skateboard case and deflates the relative urgency of the quiet elderly man sitting calmly with his chest pain. If the corruption is strong enough, she assigns the teenager to trauma and sends the cardiac patient to the waiting room.

This is heuristic corruption, and it is not a failure of knowledge (she knows chest pain is more dangerous) but a failure of the heuristic signal under a specific perturbation. Chapter 5 will develop this analysis with empirical measurements from AI systems showing displacements of 4.6 to 8.9 standard deviations under analogous perturbations.

The example also previews the geometric extension. In a graph, the six assignment states are just nodes. On a manifold, they occupy a space with structure: the “distance” between assignments that differ by one patient swap is smaller than the distance between completely different orderings. Some regions of the assignment space are forbidden (two patients cannot both occupy the single CT scanner). The geometry — distance, curvature, boundaries — shapes the search dynamics in ways that the next chapter will make precise.


Technical Appendix

A* Optimality (Theorem 1.1, restated formally). [Established Mathematics.] Let (S, s_0, A, T, G) be a finite search problem with non-negative edge costs. Let h: S \to \mathbb{R}_{\geq 0} be an admissible heuristic: h(x) \leq h^*(x) for all x \in S, where h^*(x) is the true cost of an optimal path from x to the nearest goal. Then A* with evaluation function f(x) = g(x) + h(x) terminates at a goal state s^* \in G such that g(s^*) = g^*(s_0), the cost of an optimal solution.

Proof. Suppose A* terminates at a goal s' with g(s') > g^*(s_0). Let P^* be an optimal path from s_0 to some optimal goal s^*. When A* selects s' for expansion, there exists some node n on P^* that is still on the frontier (open list). Then f(n) = g(n) + h(n) \leq g(n) + h^*(n) = g^*(s_0) < g(s') = f(s'), where the first inequality uses admissibility and the last equality uses h(s') = 0 for any goal state. But A* selected s' over n, meaning f(s') \leq f(n), a contradiction. \square

Dominance (Proposition 1.1). If h_1(x) \geq h_2(x) for all non-goal states x, and both are admissible, then every node expanded by A* using h_1 is also expanded by A* using h_2. The tighter heuristic produces a search that is a subset of the looser one.
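Both results can be exercised on a toy problem. The sketch below runs A* on a hypothetical 10×10 grid with unit step costs, comparing the Manhattan-distance heuristic (admissible, and dominating) against the zero heuristic (admissible, uninformative); the grid and the tie-breaking rule are illustrative choices.

```python
import heapq

def astar(start, goal, neighbors, h):
    """A* search; returns (cost of the solution found, number of expansions).
    Ties on f are broken toward larger g (deeper nodes)."""
    frontier = [(h(start), 0, start)]
    best_g = {start: 0}
    expansions = 0
    while frontier:
        f, neg_g, node = heapq.heappop(frontier)
        g = -neg_g
        if g > best_g.get(node, float("inf")):
            continue  # stale frontier entry
        expansions += 1
        if node == goal:
            return g, expansions
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(frontier, (ng + h(nxt), -ng, nxt))
    return None, expansions

# Toy 10x10 grid with unit step costs.
def neighbors(p):
    x, y = p
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= x + dx < 10 and 0 <= y + dy < 10:
            yield (x + dx, y + dy), 1

goal = (9, 9)
manhattan = lambda p: abs(p[0] - 9) + abs(p[1] - 9)  # admissible; dominates zero
zero = lambda p: 0                                   # admissible but uninformative

cost1, n1 = astar((0, 0), goal, neighbors, manhattan)
cost2, n2 = astar((0, 0), goal, neighbors, zero)
```

Both runs return the optimal cost of 18 (Theorem 1.1), and the dominant heuristic expands no more nodes than the zero heuristic (Proposition 1.1, up to tie-breaking); with the zero heuristic, A* degenerates to Dijkstra's algorithm.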


Notes on Sources

The Problem Space Hypothesis is presented in full in Newell and Simon (1972), Human Problem Solving. The definitive treatment of A* is Hart, Nilsson, and Raphael (1968), “A Formal Basis for the Heuristic Determination of Minimum Cost Paths,” IEEE Transactions on Systems Science and Cybernetics, 4(2), 100–107. The proof of A* optimality given here follows the presentation in Russell and Norvig (2021), Artificial Intelligence: A Modern Approach, 4th edition. For the dominance result on heuristics, see the same source, Chapter 3. The connection between search and cognitive architecture is further developed in Newell (1990), Unified Theories of Cognition, and Laird (2012), The Soar Cognitive Architecture. The manifold hypothesis for neural representations is discussed in Bengio, Courville, and Vincent (2013), “Representation Learning: A Review and New Perspectives,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(8), 1798–1828. The geodesic formulation referenced in Section 1.6 is developed in Bond (2026a), Geometric Methods in Computational Modeling, Chapter 6.


© 2026 Andrew H. Bond. Geometric Reasoning: From Search to Manifolds.

The geometry was always real. The scalars were always insufficient.