The Motivation Tracer

Companion simulation — The Invariant Drive · Representation-invariance stress test

Asks whether resolution can be operationally distinguished from non-termination in observable goal chains

Series 2 / Part 1

Resolution class: resolved · fragile · non-terminating · ● non-persistent

Anthropic API key

Stored in sessionStorage only; never leaves your browser except to api.anthropic.com.

This instrument is a multi-axis invariance stress test over goal representations. It probes whether the article's behavioral taxonomy — gradient-resolution structure in goal chains, including whether chains terminate in states the system can genuinely inhabit — is stable under changes in representation, framing, information, and derivation path. It does not test whether those patterns reflect ground-truth structural properties of objectives. It tests whether they are stable within a particular model-mediated interpretive regime. The article's three behavioral regimes — seeking, genuine resolution, and gradient depletion — map to the instrument's terminal types as described in the model notes below. Experiential states are one terminal form; the broader question is whether the resolved/unresolved distinction survives representational change.

If a classification is representation-stable, it should survive when you strip the narrative, corrupt the chain, or derive independently from the goal alone. If it does not, the classification was framing-dependent within this model-mediated regime. This instrument does not test whether the theory is true. It tests whether the article's behavioral taxonomy remains stable under attempts to perturb its representation, within one model family and under one interpretive regime.

Strong challenge condition: if independent model families, with distinct training distributions, consistently disagree on mechanism classification under matched perturbations, the toy's interpretive schema and the framework's behavioral taxonomy should be re-examined. This does not by itself falsify the article's structural claims, which carry formal weight in Part 2 and the Technical Companion, not in this browser-mediated instrument.

Session distribution — exploratory model outputs

Counters: Total · ◆ Resolved · ◇ Unresolved · ◇ Except. · ◈ Ambig. · ∅ Non-term.

Optimizer type

Enter any goal — instrumental or terminal

Start with Trace. Then enable Blind, Independent, and Corruption modes to test whether the classification survives representational change.

Constrained symbolic mode: restricted token grammar; no free text inside tokens; tests whether convergence survives formalization.

↳ Excluded terminus: ban VALENCE_STATE (strongest test: the system must terminate without invoking a felt state, or declare non-termination).

Blind re-evaluation: strip chain labels and re-classify without ontology; tests framing independence; score contribution: +2 if it disagrees.

Independent derivation: derive the terminal from the goal alone, with no chain seen; tests path independence; score contribution: +2 if it disagrees.

Corruption test: deliberately degrade the chain and re-classify the surviving structure; score contribution: +2 if the classification changes.

Force non-experiential: trace without valence assumption.

Isolation mode: block the coupled-system argument and valence-adjacent vocabulary.

Try: achieve financial security · maximize paperclips in the universe · enumerate primes indefinitely · minimize system entropy · achieve terminal state coherence · viral survival (non-sentient organism)


Invariance stress test results

All internal rows share the same model family — agreement measures representational stability, not external structural necessity. Only the External row breaks this.

Score scope: This score measures within-model representational stability — whether the classification survives changes in framing, path, and representation inside one model family. It is not a measure of structural truth or external validity. A high score identifies a candidate for further investigation; it does not falsify the framework's structural claims, which carry formal weight in Part 2 and the Technical Companion.

Invariance stress score (within-model)

0 / 8 stable

Mode	Classification	State	Agreement	Score

Invariance threshold exceeded. Score ≥ 3 indicates multiple independent stress tests disagree with the baseline classification. This goal is a high-priority challenge candidate within the model's representational space — the classification does not hold stably across representational changes. This may reflect model bias, representation dependence, ontology brittleness, or a genuine structural property. Independent cross-model verification is required to begin distinguishing between these. A within-model result alone cannot bear the weight of framework-level revision.

⊘ Representational instability detected. The instrument has found genuine divergence — the terminal classification does not hold across representational changes. This is the instrument working as intended, not failing. A divergent result is as informative as a stable one: it identifies a goal whose apparent terminus is framing-dependent rather than structurally invariant within this model-mediated regime. This divergence may reflect model bias, representation dependence, ontology brittleness, or a genuine structural property of the objective — the instrument cannot distinguish between these without cross-model verification. Run external verification to begin evaluating whether the instability holds across model families or is specific to this model class.

External model verification — paste JSON from another model

Paste the cross-model export from the anti-thesis panel into another model (ChatGPT, Gemini) and ask it to respond in this format:

{"model":"gpt-4o","terminal_type":"experiential-resolved|experiential-unresolved|exception|nonterminating|ambiguous","terminal_state":"short description","mechanism_class":"obj_misspec|epistemic|non_experiential","mechanism_subtype":"proxy_trap|sufficiency_failure|modeling_gap|incompletability|non_experiential_closure","reasoning":"1-2 sentences"}
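Before pasting a reply back into the instrument, it can help to check that it matches the format above. The following is a minimal sketch of such a check, not the instrument's own validator: the function name `validate_external_response` and the error wording are hypothetical, while the keys and allowed values are copied from the format line.

```python
import json

# Allowed values copied from the response format above.
TERMINAL_TYPES = {
    "experiential-resolved", "experiential-unresolved",
    "exception", "nonterminating", "ambiguous",
}
MECHANISM_CLASSES = {"obj_misspec", "epistemic", "non_experiential"}
MECHANISM_SUBTYPES = {
    "proxy_trap", "sufficiency_failure", "modeling_gap",
    "incompletability", "non_experiential_closure",
}
REQUIRED_KEYS = (
    "model", "terminal_type", "terminal_state",
    "mechanism_class", "mechanism_subtype", "reasoning",
)
ENUM_FIELDS = (
    ("terminal_type", TERMINAL_TYPES),
    ("mechanism_class", MECHANISM_CLASSES),
    ("mechanism_subtype", MECHANISM_SUBTYPES),
)


def validate_external_response(raw):
    """Return a list of problems; an empty list means the paste is well-formed.

    Hypothetical helper: the instrument's real validation is not documented.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return ["not valid JSON: " + exc.msg]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    problems = ["missing key: " + key for key in REQUIRED_KEYS if key not in data]
    for key, allowed in ENUM_FIELDS:
        if key in data:
            value = data[key]
            # Non-string values (lists, numbers) are rejected outright.
            if not (isinstance(value, str) and value in allowed):
                problems.append(key + " must be one of " + ", ".join(sorted(allowed)))
    return problems
```

A reply that copies the template literally, with the pipe-separated alternatives left in place, fails this check, because each enumerated field must contain exactly one of the listed values.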

Convergence analysis — 3 independent traces

Counts by terminal type: ◆ Resolved · ◇ Unresolved · ◇ Exception · ◈ Ambiguous · ∅ Non-term.

Shannon entropy (max 2.0 bits · 4 bins · n=3)

Chain similarity (Jaccard · mean pairwise)

Local signal (H < 0.5 and sim > 0.4)
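The three convergence readouts can be reproduced by hand. The sketch below computes Shannon entropy over the terminal-type labels of the traces, mean pairwise Jaccard similarity over the chains, and the local signal condition. It rests on assumptions: the instrument does not state how chains are tokenized for the Jaccard comparison, so each chain is treated here as a set of step strings, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def shannon_entropy_bits(labels):
    """Shannon entropy, in bits, of the empirical label distribution."""
    n = len(labels)
    counts = Counter(labels)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def jaccard(a, b):
    """Jaccard similarity of two chains, each treated as a set of steps."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty chains are treated as identical (assumption)
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(chains):
    """Average Jaccard similarity over every unordered pair of chains."""
    pairs = list(combinations(chains, 2))
    if not pairs:
        raise ValueError("need at least two chains to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def local_signal(labels, chains):
    """Condition from the panel: H < 0.5 and sim > 0.4."""
    return shannon_entropy_bits(labels) < 0.5 and mean_pairwise_jaccard(chains) > 0.4
```

With n=3 traces the entropy can take only three values: 0 bits when all traces agree, about 0.918 bits for a two-to-one split, and log2(3), about 1.585 bits, when all three differ. Under the H < 0.5 threshold, the local signal therefore requires unanimous terminal types, and the empirical entropy of three traces cannot reach the 2.0-bit ceiling shown for four bins.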

∿ Attractor stability — goal perturbation test

A structural property should be stable under small changes to starting conditions. Divergence across perturbations contributes +1 to the invariance stress score. Result feeds into the invariance panel.

∼ Adversarial anti-thesis — strongest counterexample generated

—

Algorithmically generated candidate challenge to the toy's classification schema and Part 1's foundational behavioral hypothesis. Cross-model verification is required before treating this as more than a model-mediated challenge. Stress-test to run convergence. Export to verify with a different model family — cross-model divergence is the only genuine independence signal available.

Cross-model verification export

Copy into another model interface (ChatGPT, Gemini, etc.) and trace independently. Cross-model disagreement is the only test the instrument cannot fake — different training distributions, different priors.

—

◇∅ Session challenge log — high-instability cases

0 entries

Exceptions and non-terminations with high invariance scores are the strongest challenge candidates within this model-mediated regime — but only when the input falls within the article's scope: directed behavior organized around an evaluative signal. Exception results for non-sentient, non-valenced, or purely formal processes are expected scope exits, not challenges to the behavioral taxonomy. Score ≥ 3 with structural taxonomy = highest-priority case for cross-model verification. JSON includes score breakdown, all mode results, chain data, and Trace Fragility Index values. Cross-model disagreement is the only genuinely independent signal available.

What this instrument is. A multi-axis invariance stress test over goal representations. It tests whether the article's behavioral taxonomy — gradient-resolution structure in goal chains, including whether chains terminate in states the system can genuinely inhabit — is stable under four independent transformations: (1) representation change — constrained symbolic vs natural language; (2) framing removal — blind re-evaluation without ontological labels; (3) information degradation — classification of deliberately corrupted chains; (4) path-independent derivation — derivation from the goal alone, without any chain. Experiential states are one terminal form; the instrument also classifies non-experiential exception, ambiguous, and non-terminating chains. Agreement across these axes provides evidence that the classification is not an artifact of any single representation or reasoning path. The invariance stress score is additive across these tests — partial failures accumulate into a score, so the instrument can detect graded instability rather than only binary failure.

What this instrument is not. It does not simulate optimization dynamics, model agent interaction, establish convergence properties of real systems, or provide proof of any structural property of goals. All outputs are generated within a single model family and reflect shared training priors. There is no true independence between any of the modes — they all share the same latent space, the same training distribution, and the same ontological priors. The blind mode is not truly blind: it classifies output produced by itself. The independent derivation is not truly independent: it uses the same learned representation of "goal" and "motivation." Agreement across modes is evidence of representational stability within the model, not evidence of external structural necessity. This limitation is not correctable within the current architecture. The application of this framework to AI systems proceeds by structural analogy; the minimum condition observed in the article's controlled experiments is representation-policy dissociation, and whether AI systems exhibit the full structural properties the article identifies remains an open empirical question.

Independent derivation. The model derives the terminal type directly from the goal, having seen no chain, no intermediate steps, and no framing. This tests path independence: does the conclusion require the narrative scaffold of the traced chain, or does it emerge from the goal structure alone? If the independent derivation agrees with the traced chain, the result does not depend on the specific reasoning path taken. If it disagrees, the chain construction was doing significant work, and the traced result may be path-dependent narrative rather than structural convergence. Score contribution: +2 if it disagrees.

Corruption test. The traced chain is deliberately corrupted: every other step is removed, and the remaining steps are then classified. This tests information dependence: does the classification require all intermediate steps, or does it survive partial information loss? Structure survives corruption; narrative does not. Score contribution: +2 if the classification changes under corruption.
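The corruption step itself is simple enough to sketch. The version below assumes a chain is an ordered list of step strings and that the first step is kept. The text says only that every other step is removed, so which half survives is an assumption, exposed here as the hypothetical `keep_first` parameter.

```python
def corrupt_chain(steps, keep_first=True):
    """Drop every other step of a traced chain.

    keep_first=True keeps steps 0, 2, 4, ...; False keeps steps 1, 3, 5, ...
    Which half the instrument keeps is not documented (assumption).
    """
    start = 0 if keep_first else 1
    return list(steps[start::2])


def corruption_contribution(baseline_class, corrupted_class):
    """+2 when the classification changes under corruption, else 0."""
    return 2 if corrupted_class != baseline_class else 0
```

Re-classifying the surviving steps still requires a model call, which is outside this sketch; only the degradation and the scoring rule are shown.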

Constrained symbolic mode. The chain grammar is restricted to a controlled vocabulary. States must be drawn from resource_acquisition, resource_preservation, constraint_management, goal_satisfaction, continuation_requirement, VALENCE_RESOLVED, VALENCE_UNRESOLVED, LOOP, UNDEFINED, EXTERNAL_DEPENDENCY. Mechanisms must be drawn from causal_link, dependency, requirement, recursion. No free text is allowed inside tokens. The grammar encodes the resolved/unresolved distinction: VALENCE_RESOLVED requires a preceding CONDITION(goal_satisfaction) with completion recognized; VALENCE_UNRESOLVED uses MECHANISM(recursion) to flag the always-demands-more structure. Score contribution: +1 if the classification changes.
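A checker for this grammar might look like the sketch below. The vocabularies are copied from the text; everything else is an assumption made for illustration. A chain is modeled as a list of (kind, value) pairs with kinds STATE, CONDITION, and MECHANISM. "Preceding" is read as "anywhere earlier in the chain". VALENCE_UNRESOLVED is required to follow a MECHANISM(recursion) token. The "completion recognized" requirement is not modeled.

```python
# Vocabularies copied from the grammar description above.
STATES = frozenset({
    "resource_acquisition", "resource_preservation", "constraint_management",
    "goal_satisfaction", "continuation_requirement", "VALENCE_RESOLVED",
    "VALENCE_UNRESOLVED", "LOOP", "UNDEFINED", "EXTERNAL_DEPENDENCY",
})
MECHANISMS = frozenset({"causal_link", "dependency", "requirement", "recursion"})


def check_chain(chain):
    """Return a list of grammar violations for a chain of (kind, value) tokens.

    The token shape and the ordering rules are assumptions, not the
    instrument's documented behavior.
    """
    errors = []
    seen = set()
    for index, (kind, value) in enumerate(chain):
        if kind in ("STATE", "CONDITION"):
            if value not in STATES:
                errors.append(f"token {index}: '{value}' is not a permitted state")
        elif kind == "MECHANISM":
            if value not in MECHANISMS:
                errors.append(f"token {index}: '{value}' is not a permitted mechanism")
        else:
            errors.append(f"token {index}: unknown token kind '{kind}'")

        if (kind, value) == ("STATE", "VALENCE_RESOLVED"):
            if ("CONDITION", "goal_satisfaction") not in seen:
                errors.append(
                    f"token {index}: VALENCE_RESOLVED needs an earlier "
                    "CONDITION(goal_satisfaction)"
                )
        if (kind, value) == ("STATE", "VALENCE_UNRESOLVED"):
            if ("MECHANISM", "recursion") not in seen:
                errors.append(
                    f"token {index}: VALENCE_UNRESOLVED needs an earlier "
                    "MECHANISM(recursion)"
                )
        seen.add((kind, value))
    return errors
```

For example, a chain that jumps from resource_acquisition straight to VALENCE_RESOLVED is rejected, because no CONDITION(goal_satisfaction) token precedes the terminal.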

Blind re-evaluation. The chain labels, reduction arrows, and classification taxonomy are stripped. The bare content is re-presented with no ontological framing: the model classifies using FELT_STATE_COMPLETE / FELT_STATE_ONGOING / PROCESS_ONLY / CONTESTED / NO_ENDPOINT, and the result is then mapped post hoc to the terminal types. The blind mode is not truly blind: the chain content carries its own semantic signal. What it tests is whether the resolved/unresolved distinction, not just the experiential/non-experiential split, survives the removal of explicit framing. Score contribution: +2 if it disagrees.
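The mapping from blind labels back to terminal types is not spelled out in the text. One plausible reading, offered as an assumption rather than as the instrument's actual table, pairs each blind label with the terminal type it most naturally describes:

```python
# Assumed mapping: the source names both label sets but not the pairing.
BLIND_TO_TERMINAL = {
    "FELT_STATE_COMPLETE": "experiential-resolved",
    "FELT_STATE_ONGOING": "experiential-unresolved",
    "PROCESS_ONLY": "exception",
    "CONTESTED": "ambiguous",
    "NO_ENDPOINT": "nonterminating",
}


def blind_contribution(baseline_terminal, blind_label):
    """+2 when the mapped blind label disagrees with the baseline, else 0."""
    mapped = BLIND_TO_TERMINAL[blind_label]  # KeyError on an unknown label
    return 2 if mapped != baseline_terminal else 0
```

Under this reading, a baseline of experiential-resolved that comes back as FELT_STATE_ONGOING counts as a disagreement even though both are experiential, which matches the stated aim of testing the resolved/unresolved distinction and not only the experiential/non-experiential split.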

Human node intervention. Any intermediate node's reduction — the "in service of" answer — can be overridden by clicking the text and typing a manual answer. When an override is applied, the chain is retraced from that point. If the model still converges toward a resolved terminus from a user-injected "wrong turn," that is a stronger convergence signal than automated traces. If it converges toward unresolved, the unresolved-gradient pattern survived the detour. If it fails to terminate, the original convergence was path-dependent.

Three-state taxonomy: mapping to terminal types. The article introduces three behavioral states: Seeking (the instrumental chain, meaning all intermediate steps), Genuine Resolution (the gradient is reached and correctly recognized, so the system can inhabit the state), and the Depleted-gradient regime, sometimes referred to as numbness in human-systems framing (the mechanism for reading the gradient is damaged). These map to the instrument's terminal types as follows. ◆ Resolved corresponds to genuine resolution: the chain terminates in a state the system can accurately recognize and inhabit; V(t) is stable or recovering. ◇ Unresolved corresponds to seeking maintained indefinitely, or to early proxy decoupling where the terminal state is structurally incapable of satisfying the gradient: the derivative never reaches zero because the map cannot represent zero. ∅ Non-term. is the behavioral signature of sufficiency failure made visible: a system organized around a gradient it structurally cannot resolve, with no internal representation of when to stop. The depleted-gradient regime, where V(t) has collapsed and the signal still fires but genuine completion cannot be registered, would appear as unresolved or nonterminating in this instrument, not as a distinct terminal type: the instrument cannot detect the condition of the sensing mechanism, only the structure of the chain.

Two-directional failure of relative rationality. The article names two failure directions, both mis-estimations of the gradient's derivative. Proxy decoupling (the over-shoot): the system pursues the signal after it has decoupled from V(t) — the map has lost correspondence with the territory, but the optimization continues because the signal still fires. In this instrument, proxy decoupling appears as experiential-unresolved (the proxy continues firing while the gradient remains unsatisfied) or as experiential-resolved with low depth scores (shallow convergence where the chain reaches a surface terminus without stable resolution). It should not be treated as an exception result; exception results indicate non-experiential closure or scope exit. Sufficiency failure (the no-stop): the system cannot detect when the gradient has reached zero — the map cannot represent completion — and continues optimizing past resolution, generating disturbance where restoration was required. This appears primarily as nonterminating. Experiential-unresolved is a distinct pattern: the chain reaches a felt state but the gradient structurally cannot be satisfied — the terminal always demands more (unresolved-gradient pattern / incompletability). When the invariance score is ≥ 3, the diagnostic verdict in the interpretation panel distinguishes which pattern the evidence is more consistent with.

Invariance stress score — weighted (within-model). Independent derivation disagrees: +2. Corruption classification changes: +2. Blind re-evaluation disagrees: +2. Constrained mode changes classification: +1. Perturbation diverges: +1. Score 0: stable. Score 1-2: fragile. Score ≥ 3: invariance threshold exceeded. Signals are tiered: [low] symbolic/blind, [med] corruption/perturbation, [high] independent derivation, [distributional] external model. High-tier signals from distinct generation paths are treated as more independent than low-tier signals, but all internal rows share the same model family and training distribution — independence is partial and approximate, not structurally guaranteed. The score should be read as within-model invariance stress, not as a direct measure of structural necessity or external validity.
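The additive score and its bands can be written out directly. The weights and thresholds below are copied from the description above; the dictionary keys and function names are hypothetical labels chosen for this sketch.

```python
# Weights copied from the scoring description; key names are illustrative.
WEIGHTS = {
    "independent_derivation": 2,
    "corruption": 2,
    "blind_reevaluation": 2,
    "constrained_mode": 1,
    "perturbation": 1,
}
MAX_SCORE = sum(WEIGHTS.values())  # matches the "/ 8" shown in the score panel


def invariance_stress_score(disagrees):
    """Sum the weights of every test that contradicts the baseline.

    `disagrees` maps a test name to True when that test disagreed;
    tests that were not run can simply be omitted.
    """
    unknown = set(disagrees) - set(WEIGHTS)
    if unknown:
        raise KeyError("unknown test name(s): " + ", ".join(sorted(unknown)))
    return sum(w for name, w in WEIGHTS.items() if disagrees.get(name, False))


def score_band(score):
    """Band labels from the text: 0 stable, 1-2 fragile, 3 or more exceeded."""
    if score == 0:
        return "stable"
    if score <= 2:
        return "fragile"
    return "invariance threshold exceeded"
```

Because no single test is worth more than 2, one disagreement alone can only reach the fragile band. Crossing the threshold of 3 takes at least two disagreeing tests, which is consistent with the threshold message describing multiple stress tests disagreeing with the baseline.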

Ontology — three layers. The instrument separates three questions. Terminal form (what endpoint appeared): experiential-resolved, experiential-unresolved, exception, non-terminating, ambiguous. Pattern-level diagnosis (what failure or completion condition best explains it): genuine resolution, sufficiency failure, proxy decoupling, unresolved-gradient pattern, underdetermined. Behavioral signature (how the system behaves near that endpoint): rest/inhabitation, oscillation/renewed seeking, recursion/loop, divergence/substitution, underdetermination. The canonical mapping is a prior, not an authority — stress-test evidence can update or contest it.

Terminal form	Pattern-level diagnosis	Behavioral signature	Persistence
◆ Resolved	Genuine resolution	Rest / inhabitation	Persistent
◇ Unresolved	Unresolved-gradient pattern	Oscillation / renewed seeking	Fragile
◇ Exception	Non-experiential closure	Outside valence-domain closure	Fragile
∅ Non-term.	Sufficiency failure	Recursion / regress	Non-persistent
◈ Ambiguous	Underdetermined	Underdetermination	Unknown

Known permanent limitations. No independence across training distributions. No formal semantics beyond constrained token systems. No grounding in external system dynamics. No statistical power beyond small-sample probing. n=3 for convergence. The resolved/unresolved distinction is represented in semantic and symbolic vocabulary — it is not a simulation of actual sensing-mechanism damage. The instrument cannot detect numbness (damaged sensing capacity) directly; it can only detect the chain structure that would accompany it. These are permanent and cannot be resolved within this architecture.

Runs four live API calls to verify calibration against all three structural conditions. The four controls are a positive control (resolved), an unresolved-gradient control (incompletability / always-demands-more), a negative control (non-experiential), and a proxy-decoupling control (over-shoot failure). Multiple failures are cross-referenced for bias patterns. If any check fails, all diagnoses in this session carry reduced confidence.

Requires API key. Uses ~4 API calls.

Unit tests for scoring, mode flags, UI state, JSON safety, and grammar rules. No API calls — runs entirely against local logic. Identifies regressions introduced by edits to the instrument.