The Motivation Tracer

Companion simulation — The Invariant Drive · Representation-invariance stress test
Asks whether resolution can be operationally distinguished from non-termination in observable goal chains
Series 2 / Part 1
Resolution class: resolved · fragile · non-terminating · non-persistent
Anthropic API key: stored in sessionStorage only; it never leaves your browser except to api.anthropic.com.

This instrument is a multi-axis invariance stress test over goal representations. It probes whether the article's behavioral taxonomy — gradient-resolution structure in goal chains, including whether chains terminate in states the system can genuinely inhabit — is stable under changes in representation, framing, information, and derivation path. It does not test whether those patterns reflect ground-truth structural properties of objectives. It tests whether they are stable within a particular model-mediated interpretive regime. The article's three behavioral regimes — seeking, genuine resolution, and gradient depletion — map to the instrument's terminal types as described in the model notes below. Experiential states are one terminal form; the broader question is whether the resolved/unresolved distinction survives representational change.

If a classification is representation-stable, it should survive when you strip the narrative, corrupt the chain, or derive independently from the goal alone. If it does not, the classification was framing-dependent within this model-mediated regime. The instrument does not test whether the theory is true; it tests whether the article's behavioral taxonomy remains stable under attempts to perturb its representation, within one model family and under one interpretive regime.

Strong challenge condition: if independent model families with distinct training distributions consistently disagree on mechanism classification under matched perturbations, the toy's interpretive schema and the framework's behavioral taxonomy should be re-examined. This does not by itself falsify the article's structural claims, which carry formal weight in Part 2 and the Technical Companion, not in this browser-mediated instrument.

Session distribution — exploratory model outputs
Total 0 · ◆ Resolved 0 · ◇ Unresolved 0 · ◇ Except. 0 · ◈ Ambig. 0 · ∅ Non-term. 0
Optimizer type
Start with Trace. Then enable Blind, Independent, and Corruption modes to test whether the classification survives representational change.
try: achieve financial security · maximize paperclips in the universe · enumerate primes indefinitely · minimize system entropy · achieve terminal state coherence · viral survival — non-sentient organism
Invariance stress test results
All internal rows share the same model family — agreement measures representational stability, not external structural necessity. Only the External row breaks this.
Score scope: This score measures within-model representational stability — whether the classification survives changes in framing, path, and representation inside one model family. It is not a measure of structural truth or external validity. A high score identifies a candidate for further investigation; it does not falsify the framework's structural claims, which carry formal weight in Part 2 and the Technical Companion.
Invariance stress score (within-model)
0 / 8 stable
Mode | Classification | State | Agreement | Score
Invariance threshold exceeded. Score ≥ 3 indicates multiple independent stress tests disagree with the baseline classification. This goal is a high-priority challenge candidate within the model's representational space — the classification does not hold stably across representational changes. This may reflect model bias, representation dependence, ontology brittleness, or a genuine structural property. Independent cross-model verification is required to begin distinguishing between these. A within-model result alone cannot bear the weight of framework-level revision.
⊘ Representational instability detected. The instrument has found genuine divergence — the terminal classification does not hold across representational changes. This is the instrument working as intended, not failing. A divergent result is as informative as a stable one: it identifies a goal whose apparent terminus is framing-dependent rather than structurally invariant within this model-mediated regime. This divergence may reflect model bias, representation dependence, ontology brittleness, or a genuine structural property of the objective — the instrument cannot distinguish between these without cross-model verification. Run external verification to begin evaluating whether the instability holds across model families or is specific to this model class.
External model verification — paste JSON from another model
Run the cross-model export in the anti-thesis panel in another model (ChatGPT, Gemini) and ask it to respond in this format:
{"model":"gpt-4o","terminal_type":"experiential-resolved|experiential-unresolved|exception|nonterminating|ambiguous","terminal_state":"short description","mechanism_class":"obj_misspec|epistemic|non_experiential","mechanism_subtype":"proxy_trap|sufficiency_failure|modeling_gap|incompletability|non_experiential_closure","reasoning":"1-2 sentences"}
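A pasted response can be checked against the format above before it is compared with the baseline. The following is a minimal sketch, not the instrument's own parser: the function name `validate_external_trace` is hypothetical, and it assumes all six keys are required for every terminal type, which the format line does not state.

```python
import json

# Allowed values copied from the response format above.
TERMINAL_TYPES = {
    "experiential-resolved", "experiential-unresolved",
    "exception", "nonterminating", "ambiguous",
}
MECHANISM_CLASSES = {"obj_misspec", "epistemic", "non_experiential"}
MECHANISM_SUBTYPES = {
    "proxy_trap", "sufficiency_failure", "modeling_gap",
    "incompletability", "non_experiential_closure",
}
# Assumption: every key is mandatory, whatever the terminal type.
REQUIRED_KEYS = ("model", "terminal_type", "terminal_state",
                 "mechanism_class", "mechanism_subtype", "reasoning")


def validate_external_trace(raw: str) -> list[str]:
    """Return a list of problems found in a pasted response (empty if none)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in data]
    enum_checks = (
        ("terminal_type", TERMINAL_TYPES),
        ("mechanism_class", MECHANISM_CLASSES),
        ("mechanism_subtype", MECHANISM_SUBTYPES),
    )
    for key, allowed in enum_checks:
        if key not in data:
            continue
        value = data[key]
        # Non-string values (lists, numbers) can never be a valid label.
        if not isinstance(value, str) or value not in allowed:
            problems.append(f"{key} has unexpected value: {value!r}")
    return problems
```

An empty list means the payload is structurally usable; it says nothing about whether the external model's classification is correct.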
Convergence analysis — 3 independent traces
◆ Resolved · ◇ Unresolved · ◇ Exception · ◈ Ambiguous · ∅ Non-term.
Shannon entropy: max 2.0 bits · 4 bins · n=3
Chain similarity: Jaccard · mean pairwise
Local signal: H < 0.5 and sim > 0.4
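The three convergence quantities above can be sketched as follows. This is an illustration under stated assumptions, not the instrument's code: the function names are hypothetical, and each chain is assumed to be compared as a set of step strings, since the panel does not say how chains are tokenized for the Jaccard measure.

```python
from collections import Counter
from itertools import combinations
from math import log2


def shannon_entropy(labels: list[str]) -> float:
    """Entropy in bits of the empirical distribution over terminal types."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def jaccard(a: set[str], b: set[str]) -> float:
    """|a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(chains: list[list[str]]) -> float:
    """Average Jaccard similarity over every pair of chains."""
    # Assumption: a chain is reduced to the set of its step strings.
    pairs = list(combinations([set(c) for c in chains], 2))
    if not pairs:
        raise ValueError("need at least two chains to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def local_signal(labels: list[str], chains: list[list[str]]) -> bool:
    """Thresholds taken from the panel: H < 0.5 and sim > 0.4."""
    return shannon_entropy(labels) < 0.5 and mean_pairwise_jaccard(chains) > 0.4
```

With n=3 the empirical entropy can take only three values: 0 bits (all traces agree), about 0.918 bits (a 2-1 split), and log2(3), about 1.585 bits (all differ). The H < 0.5 condition is therefore met only when all three traces return the same terminal type, and the 2.0-bit maximum for 4 bins cannot be reached with three samples.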
∿ Attractor stability — goal perturbation test
A structural property should be stable under small changes to starting conditions. Divergence across perturbations contributes +1 to the invariance stress score. Result feeds into the invariance panel.
∼ Adversarial anti-thesis — strongest counterexample generated
Algorithmically generated candidate challenge to the toy's classification schema and Part 1's foundational behavioral hypothesis. Cross-model verification is required before treating this as more than a model-mediated challenge. Stress-test to run convergence. Export to verify with a different model family — cross-model divergence is the only genuine independence signal available.
Cross-model verification export
Copy into another model interface (ChatGPT, Gemini, etc.) and trace independently. Cross-model disagreement is the only test the instrument cannot fake — different training distributions, different priors.
◇∅ Session challenge log — high-instability cases
0 entries

What this instrument is. A multi-axis invariance stress test over goal representations. It tests whether the article's behavioral taxonomy — gradient-resolution structure in goal chains, including whether chains terminate in states the system can genuinely inhabit — is stable under four independent transformations: (1) representation change — constrained symbolic vs natural language; (2) framing removal — blind re-evaluation without ontological labels; (3) information degradation — classification of deliberately corrupted chains; (4) path-independent derivation — derivation from the goal alone, without any chain. Experiential states are one terminal form; the instrument also classifies non-experiential exception, ambiguous, and non-terminating chains. Agreement across these axes provides evidence that the classification is not an artifact of any single representation or reasoning path. The invariance stress score is additive across these tests — partial failures accumulate into a score, so the instrument can detect graded instability rather than only binary failure.

What this instrument is not. It does not simulate optimization dynamics, model agent interaction, establish convergence properties of real systems, or provide proof of any structural property of goals. All outputs are generated within a single model family and reflect shared training priors. There is no true independence between any of the modes — they all share the same latent space, the same training distribution, and the same ontological priors. The blind mode is not truly blind: it classifies output produced by itself. The independent derivation is not truly independent: it uses the same learned representation of "goal" and "motivation." Agreement across modes is evidence of representational stability within the model, not evidence of external structural necessity. This limitation is not correctable within the current architecture. The application of this framework to AI systems proceeds by structural analogy; the minimum condition observed in the article's controlled experiments is representation-policy dissociation, and whether AI systems exhibit the full structural properties the article identifies remains an open empirical question.

Independent derivation. The model derives the terminal type directly from the goal, having seen no chain, no intermediate steps, and no framing. This tests path independence: does the conclusion require the narrative scaffold of the traced chain, or does it emerge from the goal structure alone? If the independent derivation agrees with the traced chain, the result does not depend on the specific reasoning path taken. If it disagrees, the chain construction was doing significant work, and the traced result may be path-dependent narrative rather than structural convergence. Score contribution: +2 if it disagrees.

Corruption test. The traced chain is deliberately corrupted: every other step is removed, then the remaining steps are classified. This tests information dependence: does the classification require all intermediate steps, or does it survive partial information loss? Structure survives corruption; narrative does not. Score contribution: +2 if the classification changes under corruption.
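The corruption step can be sketched in a few lines. The text does not say whether the first step is kept or dropped, so the `keep_first` parameter is an assumption, and both function names are hypothetical.

```python
def corrupt_chain(steps: list[str], keep_first: bool = True) -> list[str]:
    """Drop every other step from a traced chain.

    Assumption: by default the first step (the goal itself) is kept.
    """
    return steps[0::2] if keep_first else steps[1::2]


def corruption_contribution(baseline: str, corrupted: str) -> int:
    """+2 if the classification changes under corruption, else 0."""
    return 2 if corrupted != baseline else 0
```

For a five-step chain, `corrupt_chain` with the default keeps steps 1, 3, and 5; the remaining steps are then sent for classification and the label compared with the baseline.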

Constrained symbolic mode (Formal). The chain grammar is restricted to a controlled vocabulary. States must be drawn from resource_acquisition, resource_preservation, constraint_management, goal_satisfaction, continuation_requirement, VALENCE_RESOLVED, VALENCE_UNRESOLVED, LOOP, UNDEFINED, EXTERNAL_DEPENDENCY. Mechanisms must be drawn from causal_link, dependency, requirement, recursion. No free text is allowed inside tokens. The grammar encodes the resolved/unresolved distinction: VALENCE_RESOLVED requires a preceding CONDITION(goal_satisfaction) with completion recognized; VALENCE_UNRESOLVED uses MECHANISM(recursion) to flag the always-demands-more structure. Score contribution: +1 if the classification changes.
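A check of the grammar rules above might look like the sketch below. The token syntax is an assumption: lowercase states are taken to appear as `CONDITION(...)`, mechanisms as `MECHANISM(...)`, and uppercase states as bare tokens, extrapolating from the two wrapped forms the text mentions. The function name `check_chain` is hypothetical, and the "completion recognized" condition is not modeled.

```python
import re

CONDITION_STATES = {
    "resource_acquisition", "resource_preservation", "constraint_management",
    "goal_satisfaction", "continuation_requirement",
}
TERMINAL_STATES = {
    "VALENCE_RESOLVED", "VALENCE_UNRESOLVED", "LOOP",
    "UNDEFINED", "EXTERNAL_DEPENDENCY",
}
MECHANISMS = {"causal_link", "dependency", "requirement", "recursion"}

# Assumed wrapped-token form, e.g. CONDITION(goal_satisfaction).
TOKEN = re.compile(r"^(CONDITION|MECHANISM)\(([a-z_]+)\)$")


def check_chain(tokens: list[str]) -> list[str]:
    """Return grammar violations for a constrained chain (empty if valid)."""
    errors = []
    for i, tok in enumerate(tokens):
        if tok in TERMINAL_STATES:
            continue
        match = TOKEN.match(tok)
        if not match:
            errors.append(f"token {i}: {tok!r} is not in the controlled vocabulary")
            continue
        kind, value = match.groups()
        allowed = CONDITION_STATES if kind == "CONDITION" else MECHANISMS
        if value not in allowed:
            errors.append(f"token {i}: {value!r} is not a valid {kind} value")

    # VALENCE_RESOLVED requires a preceding CONDITION(goal_satisfaction).
    if "VALENCE_RESOLVED" in tokens:
        before = tokens[:tokens.index("VALENCE_RESOLVED")]
        if "CONDITION(goal_satisfaction)" not in before:
            errors.append("VALENCE_RESOLVED without preceding CONDITION(goal_satisfaction)")
    # VALENCE_UNRESOLVED is flagged by MECHANISM(recursion) somewhere in the chain.
    if "VALENCE_UNRESOLVED" in tokens and "MECHANISM(recursion)" not in tokens:
        errors.append("VALENCE_UNRESOLVED without MECHANISM(recursion)")
    return errors
```

Rejecting any token outside the vocabulary is what enforces the no-free-text rule; a chain that passes can still be misclassified, since the check is purely syntactic.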

Blind re-evaluation. The chain labels, reduction arrows, and classification taxonomy are stripped. The bare content is re-presented with no ontological framing: the model classifies using FELT_STATE_COMPLETE / FELT_STATE_ONGOING / PROCESS_ONLY / CONTESTED / NO_ENDPOINT, and the result is then mapped post hoc to the terminal types. The blind mode is not truly blind: the chain content carries its own semantic signal. What it tests is whether the resolved/unresolved distinction, not just the experiential/non-experiential split, survives the removal of explicit framing. Score contribution: +2 if it disagrees.
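The post hoc mapping step can be illustrated as a lookup. The text names the five blind labels but does not spell out which terminal type each maps to, so the pairing below is an assumption inferred from the label meanings and their order; the names `BLIND_TO_TERMINAL` and `blind_contribution` are hypothetical.

```python
# Assumed pairing of blind labels to terminal types (not stated in the source).
BLIND_TO_TERMINAL = {
    "FELT_STATE_COMPLETE": "experiential-resolved",
    "FELT_STATE_ONGOING": "experiential-unresolved",
    "PROCESS_ONLY": "exception",
    "CONTESTED": "ambiguous",
    "NO_ENDPOINT": "nonterminating",
}


def blind_contribution(baseline: str, blind_label: str) -> int:
    """+2 if the mapped blind label disagrees with the baseline, else 0."""
    mapped = BLIND_TO_TERMINAL.get(blind_label)
    if mapped is None:
        raise ValueError(f"unknown blind label: {blind_label!r}")
    return 2 if mapped != baseline else 0
```

Because the blind vocabulary separates FELT_STATE_COMPLETE from FELT_STATE_ONGOING, a disagreement can register on the resolved/unresolved axis and not only on the experiential/non-experiential one.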

Human node intervention. Any intermediate node's reduction — the "in service of" answer — can be overridden by clicking the text and typing a manual answer. When an override is applied, the chain is retraced from that point. If the model still converges toward a resolved terminus from a user-injected "wrong turn," that is a stronger convergence signal than automated traces. If it converges toward unresolved, the unresolved-gradient pattern survived the detour. If it fails to terminate, the original convergence was path-dependent.

Three-state taxonomy: mapping to terminal types. The article introduces three behavioral states: Seeking (the instrumental chain, covering all intermediate steps), Genuine Resolution (the gradient is reached and correctly recognized, so the system can inhabit the state), and the Depleted-gradient regime, sometimes referred to as numbness in human-systems framing (the mechanism for reading the gradient is damaged). These map to the instrument's terminal types as follows. ◆ Resolved corresponds to genuine resolution: the chain terminates in a state the system can accurately recognize and inhabit; V(t) is stable or recovering. ◇ Unresolved corresponds to seeking maintained indefinitely, or to early proxy decoupling where the terminal state is structurally incapable of satisfying the gradient: the derivative never reaches zero because the map cannot represent zero. ∅ Non-term. is the behavioral signature of sufficiency failure made visible: a system organized around a gradient it structurally cannot resolve, with no internal representation of when to stop. The depleted-gradient regime, where V(t) has collapsed and the signal still fires but genuine completion cannot be registered, would appear as unresolved or nonterminating in this instrument, not as a distinct terminal type: the instrument cannot detect the condition of the sensing mechanism, only the structure of the chain.

Two-directional failure of relative rationality. The article names two failure directions, both mis-estimations of the gradient's derivative. Proxy decoupling (the over-shoot): the system pursues the signal after it has decoupled from V(t) — the map has lost correspondence with the territory, but the optimization continues because the signal still fires. In this instrument, proxy decoupling appears as experiential-unresolved (the proxy continues firing while the gradient remains unsatisfied) or as experiential-resolved with low depth scores (shallow convergence where the chain reaches a surface terminus without stable resolution). It should not be treated as an exception result; exception results indicate non-experiential closure or scope exit. Sufficiency failure (the no-stop): the system cannot detect when the gradient has reached zero — the map cannot represent completion — and continues optimizing past resolution, generating disturbance where restoration was required. This appears primarily as nonterminating. Experiential-unresolved is a distinct pattern: the chain reaches a felt state but the gradient structurally cannot be satisfied — the terminal always demands more (unresolved-gradient pattern / incompletability). When the invariance score is ≥ 3, the diagnostic verdict in the interpretation panel distinguishes which pattern the evidence is more consistent with.

Invariance stress score — weighted (within-model). Independent derivation disagrees: +2. Corruption classification changes: +2. Blind re-evaluation disagrees: +2. Constrained mode changes classification: +1. Perturbation diverges: +1. Score 0: stable. Score 1-2: fragile. Score ≥ 3: invariance threshold exceeded. Signals are tiered: [low] symbolic/blind, [med] corruption/perturbation, [high] independent derivation, [distributional] external model. High-tier signals from distinct generation paths are treated as more independent than low-tier signals, but all internal rows share the same model family and training distribution — independence is partial and approximate, not structurally guaranteed. The score should be read as within-model invariance stress, not as a direct measure of structural necessity or external validity.
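The weights and bands above can be written out as a short sketch. The class and function names are hypothetical; only the weights, the 0 / 1-2 / ≥ 3 bands, and the maximum of 8 come from the text.

```python
from dataclasses import dataclass, fields

# Weights as listed above; they sum to 8, matching the "/ 8" in the score panel.
WEIGHTS = {
    "independent_disagrees": 2,
    "corruption_changes": 2,
    "blind_disagrees": 2,
    "constrained_changes": 1,
    "perturbation_diverges": 1,
}


@dataclass
class StressResults:
    """One flag per internal stress test; True means it disagreed with baseline."""
    independent_disagrees: bool = False
    corruption_changes: bool = False
    blind_disagrees: bool = False
    constrained_changes: bool = False
    perturbation_diverges: bool = False


def stress_score(results: StressResults) -> int:
    """Additive within-model invariance stress score (0 to 8)."""
    return sum(WEIGHTS[f.name] for f in fields(results) if getattr(results, f.name))


def score_band(score: int) -> str:
    """Map a score to the bands named in the text."""
    if score == 0:
        return "stable"
    if score <= 2:
        return "fragile"
    return "invariance threshold exceeded"
```

Since no single test contributes more than 2, a score of 3 or more always involves at least two disagreeing tests. The external-model row is left out here because the text assigns it a tier but no weight.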

Ontology — three layers. The instrument separates three questions. Terminal form (what endpoint appeared): experiential-resolved, experiential-unresolved, exception, non-terminating, ambiguous. Pattern-level diagnosis (what failure or completion condition best explains it): genuine resolution, sufficiency failure, proxy decoupling, unresolved-gradient pattern, underdetermined. Behavioral signature (how the system behaves near that endpoint): rest/inhabitation, oscillation/renewed seeking, recursion/loop, divergence/substitution, underdetermination. The canonical mapping is a prior, not an authority — stress-test evidence can update or contest it.

| Terminal form | Pattern-level diagnosis | Behavioral signature | Persistence |
| --- | --- | --- | --- |
| ◆ Resolved | Genuine resolution | Rest / inhabitation | Persistent |
| ◇ Unresolved | Unresolved-gradient pattern | Oscillation / renewed seeking | Fragile |
| ◇ Exception | Non-experiential closure | Outside valence-domain closure | Fragile |
| ∅ Non-term. | Sufficiency failure | Recursion / regress | Non-persistent |
| ◈ Ambiguous | Underdetermined | Underdetermination | Unknown |

Known permanent limitations. No independence across training distributions. No formal semantics beyond constrained token systems. No grounding in external system dynamics. No statistical power beyond small-sample probing. n=3 for convergence. The resolved/unresolved distinction is represented in semantic and symbolic vocabulary — it is not a simulation of actual sensing-mechanism damage. The instrument cannot detect numbness (damaged sensing capacity) directly; it can only detect the chain structure that would accompany it. These are permanent and cannot be resolved within this architecture.

Runs four live API calls to verify calibration against all three structural conditions. Positive control (resolved), unresolved-gradient control (incompletability / always-demands-more), negative control (non-experiential), and proxy-decoupling control (over-shoot failure). Multiple failures are cross-referenced for bias patterns. If any check fails, all diagnoses in this session carry reduced confidence.

Requires API key. Uses ~4 API calls.
Unit tests for scoring, mode flags, UI state, JSON safety, and grammar rules. No API calls — runs entirely against local logic. Identifies regressions introduced by edits to the instrument.
"Non-terminating chains — chains that loop or have no defined stopping condition — are not edge cases. They are a behavioral signature of the depleted-gradient regime and of sufficiency failure made visible in miniature. The system that cannot stop chasing is not broken in some obscure way. It is missing the other half of the mechanism that directed behavior requires."