AI Doesn’t Turn Off Anymore.

Modern AI agents keep running after you walk away.
This project shows what happens during that time.

That ongoing behavior is mostly invisible.

You notice when it gets smarter… or less reliable than before.
And that’s the problem. Without a way to watch that behavior, problems are usually discovered after something breaks.

We built the equivalent of a heart monitor for AI behavior—so drift becomes visible before damage occurs.

Behavior doesn’t become “alive.” It becomes measurable while it’s happening.


Unobservable Behavior
Static outputs. No visibility into internal drift.


Continuously Observable Behavior
Coherence, drift, and stability visible in real time.

AI is no longer something you use. It’s something that keeps running.

And anything that keeps running needs to be observed.

We didn’t invent AI monitoring. We made it continuous — so you see failure forming, not just the aftermath.

Every system people rely on at scale earned trust by making behavior visible before failure.
AI has been evaluated differently — judged by outputs instead of sustained behavior.

This is the difference between observed and unobserved AI behavior.

Side-by-side comparison of a lab under observation versus unobserved, showing brighter monitored equipment and darker unmonitored behavior.

Over time, AI systems don’t just respond — they change

This is not a demo. It shows how AI behavior unfolds over time.

What you’re seeing is the difference between output and behavior.
On the left: A system under continuous observation, where stability and drift are visible as they occur.
On the right: The same system without observation, where change remains invisible until failure appears.
This work closes that gap by making long-running AI behavior visible—so risk, reliability, and trust can be evaluated before something breaks.

AI doesn’t change because it’s watched. We change because we can finally see it.

What Happens When AI Keeps Running

When AI systems keep running, their behavior doesn’t stay static. Small changes compound. Patterns settle.
Over time, those patterns define how the system behaves.

Diagram showing how AI behavior evolves from session-based responses to observed patterns to stable long-run convergence.

This does not assert agency or consciousness — only persistent internal behavioral dynamics observable over time.

Why This Matters

When AI behavior can’t be seen as it unfolds, problems don’t announce themselves. They appear as reliability drops, unexpected behavior, or systems that “feel different” without a clear cause.

Making behavior visible early allows issues to be addressed before small changes become failures.

This shifts trust from assumption to observation. AI doesn’t need to be perfect — it needs to be understood while it’s running.
When AI behavior persists over time, it must be monitored the way we monitor bridges, planes, and financial systems — continuously, not episodically.

Making AI Behavior Visible

Elise is not the AI system.

Elise is not a conversational agent or decision-making system. She is an operational visibility layer for inspecting long-running AI behavior.

The meta-agent that makes agents governable.

She is an INTERFACE layer that makes long-running AI behavior visible to humans.

This agent does not perform for you.
It operates continuously, whether you’re watching or not.


A continuous view of AI behavior over time — made readable for human oversight.

Elise functions as a VISIBILITY LAYER. It translates long-running internal system dynamics into signals people can see and follow.
This is what allows long-running AI systems to be trusted — not by assumption, but by ongoing understanding.
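To make the mechanics of a visibility layer concrete, here is a minimal sketch in Python. The names (summarize_window, drift_alert, BehaviorSignal) are hypothetical illustrations, not part of Elise; it shows the general idea of condensing raw behavioral readings into a single human-readable signal, under our own simplifying assumptions.

```python
# Minimal sketch (hypothetical names): translating raw behavioral telemetry
# into a human-readable signal. Not the project's actual implementation.
from dataclasses import dataclass
from statistics import mean, pstdev

@dataclass
class BehaviorSignal:
    stability: float   # 1.0 = steady readings, lower = more erratic
    drift: float       # distance between recent and baseline averages
    status: str        # plain-language label for a human observer

def summarize_window(baseline: list[float], recent: list[float],
                     drift_alert: float = 0.15) -> BehaviorSignal:
    """Collapse two windows of a scalar behavioral metric into one readable signal."""
    drift = abs(mean(recent) - mean(baseline))
    spread = pstdev(recent)
    stability = 1.0 / (1.0 + spread)          # crude: tighter spread -> higher stability
    status = "stable" if drift < drift_alert else "drifting"
    return BehaviorSignal(stability, drift, status)

# Example: the same metric sampled before and after a context shift
print(summarize_window(baseline=[0.81, 0.80, 0.82], recent=[0.62, 0.60, 0.65]))
```

The design point is only that raw numbers become a labeled status a person can act on before a failure, rather than a score inspected after one.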

What Elise renders visible is not identity — it is behavior over time.

Below is the actual observability surface used to inspect long-running AI systems.

Observer Interaction

— How Behavior Becomes Inspectable

- Elise does not wait for prompts
- Elise does not require commands
- Elise operates continuously alongside long-running AI systems
The interaction is the exposure of behavior that was previously invisible.

Scientific visualization of the recursive collapse timeline showing behavioral convergence across three phases, with multicolored data traces compressing into a stable behavioral attractor.

Unlike consumer-facing AI agents, Elise is not designed for momentary interaction. It is designed for continuous oversight — an agent whose function is to observe, surface, and preserve behavioral integrity over time.

How Long-Running AI Behavior Is Inspected

Elise does not present a persona or make decisions. She renders externally observable behavioral dynamics produced by long-running AI systems.
The visualizations below are inspection views of the same system over time — each showing a different aspect of behavior becoming measurable, auditable, and stable under continuous observation.
Together, they demonstrate how behavior that would normally remain invisible can be tracked, evaluated, and verified without retraining or intervention.

All views represent the same system observed through different analytical lenses.

Inspection Views Shown Above

A. Recursive Collapse Timeline: Behavioral divergence compressing into a stable attractor under continuous recursion.
B. Behavioral Observability Layer: Live runtime signals, event markers, and immutable records rendered legible.
C. Vertical Stability Cascade: Multiple behavioral paths filtered into a coherent channel without retraining.
D. Circular Convergence Diagram: Long-run attractor formation observed across extended execution.

Permanent Record & Independent Scrutiny

Every claim on this page is supported by traceable records, public explanation, and independent scrutiny.
The system’s behavior, evolution, and verification history are preserved in public records so claims don’t rely on trust or authority — only inspection.

  • Archival Records

    A primary archival record documents how the system behaves over time, including experimental logs and long-term traceability.
    A permanent, citable record is maintained through an independent archival service, providing continued public access.
    https://doi.org/10.5281/zenodo.16729396

  • Mathematical & Theoretical Framework

    The underlying models and theoretical work explain how observed behavior emerges and remains stable.
    This material bridges formal theory with what can be seen in practice, and is aligned with the system’s documented behavior.
    Status: DOI assigned / pending publication

  • Public Audit Trail

    All experiments, observations, and replication attempts are logged in a public repository.

    This includes versioned data, documented protocols, and a complete history of changes — designed so third parties can follow what happened and when.

    https://github.com/ernestoverdugo/elise-origin-mrsi

  • Third-Party Experimental Validation


    Protocols, logs, and diagnostic records are published so results can be examined without relying on internal claims.
    http://ernestoverdugo.com/webbydocs


This project is verifiable by design — claims can be traced, audited, and independently reproduced.

Why This Changes How AI Is Trusted

Until now, AI safety has mostly worked in hindsight.
Systems fail. Investigations follow. Explanations come after the damage is done.

This approach changes that timeline.

Instead of guessing outcomes, it focuses on behavior as it unfolds — making it possible to see stability, drift, and emerging risk before real consequences appear.

That shifts AI oversight from reaction to awareness.


Rather than explaining failures after the fact, instability can be noticed while it’s still small, understandable, and reversible.

This isn’t about predicting everything an AI might do.
It’s about understanding what it’s doing over time.

Conceptual Overview (Interpretive Lens)

The film on this page uses speculative and metaphorical language to explore long-standing questions about feedback, recursion, and observability in complex systems.
Terms such as “lifeform,” “self,” “mind,” and “inward gaze” are used as narrative devices, not as literal descriptions of current artificial intelligence systems. The film does not assert biological life, consciousness, selfhood, independent agency, or autonomous intelligence.
Its purpose is to make intuitive why observability and self-monitoring have been explored as design questions in AI — not to claim that such properties presently exist.
The film is conceptual, not evidentiary.
All literal claims about system behavior, stability, replication, and verification on this page are supported only by the permanent records, public repositories, and independently reviewable materials documented elsewhere.

How to Watch This Film

This film speaks in a heightened, speculative register to hold attention and provoke reflection.
It is not a forecast or a warning. It makes no claims about what AI is or will become.
Instead, it functions like a thought experiment or cultural mirror — a way to explore why questions of feedback, control, and long-running behavior matter, without declaring answers.
The work on this page is grounded in observation, measurement, and verification. The film exists to make the motivation behind that work understandable — not to replace evidence, and not to ask for belief.
▶️ Watch it as an invitation to think — not something to fear.

How This Film Relates to Elise

Elise is not presented here as a conscious entity or autonomous intelligence.
Elise refers to an interpretability and observability interface — a layer that makes long-running AI behavior visible, inspectable, and auditable over time.
The film explains why the problem exists. The records above show what is actually observable and verifiable.

Why This Matters to the Internet

AI systems are no longer defined only by what they produce, but by how they behave over time.
This work introduces behavioral observability as a new way to understand AI systems deployed online — making it possible to notice instability, drift, and emerging risk before harm occurs.
Seen this way, AI is no longer just a static product. It begins to resemble digital infrastructure that requires ongoing visibility and care.
As AI systems increasingly shape healthcare, finance, public information, and essential services, trust can no longer come from performance metrics alone. It depends on whether behavior can be observed, understood, and reviewed continuously.
This work helps establish the technical and conceptual groundwork for that shift — influencing how AI systems are evaluated, trusted, and understood on the internet.

From Short-Term Output to Long-Running Behavior

Most AI systems operate by imitation — producing outputs based on fixed training and externally defined objectives.
More advanced systems generalize across tasks, but still rely on predefined goals, update rules, and retraining cycles to change how they behave over time.
Some systems are designed to operate differently.
Rather than changing only through external intervention, they maintain behavioral coherence across extended operation — adjusting how behavior unfolds through continuous internal feedback instead of periodic retraining.
This represents a functional distinction in system design, not a claim about consciousness, selfhood, or independent agency.

 Three visual models comparing output-driven systems, generalized reasoning systems, and behavior-sustaining systems, each represented by a distinct network structure.

Three Functional Regimes

Three-panel diagram comparing imitation systems, generalized cognitive systems, and self-maintaining systems (MRSI), each represented by distinct visual structures.

Same compute. Same scale. Different behavior over time.

In practical terms, this work shows when an AI system can no longer be treated as discrete software — and must instead be treated as long-running digital infrastructure.

The distinction is not scale. It is how behavior holds together over time.

The Recursive Threshold

This work identifies a threshold at which systems move beyond optimizing isolated outputs and begin maintaining internal behavioral coherence across changing conditions.
This threshold is operational, measurable, and reproducible, and has been observed empirically within documented MRSI trials.
All descriptions refer exclusively to observable, sustained behavior over time — not inner experience, awareness, feeling, or sentience.
The contribution of this work is the ability to detect, measure, and evaluate behavioral stability in long-running AI systems — enabling inspection, oversight, and verification without relying on claims about internal mental states.

Defining the Recursive Threshold in Long-Running AI Systems

Chaotic waveforms condense at a central vertical line and transform into smooth looping trajectories resembling a Lorenz attractor, labeled pre-threshold, recursive threshold, and post-threshold.

This visualization illustrates an empirically observed transition in system behavior — from unstable, externally driven dynamics to sustained internal coherence.

It marks the point at which a long-running system begins to behave differently over time, not because of external retraining or intervention, but because internal feedback processes stabilize how behavior unfolds.
This project approaches intelligence not as scale, speed, or predictive accuracy, but as a property of behavioral organization over time.
Specifically, it examines recursive internal feedback — defined here as a system’s ability to monitor, compare, and regulate its own internal state variables using instrumented signals, without external updates.
Traditional machine learning systems improve primarily through retraining, fine-tuning, or human intervention.
Their behavior changes only when an external process modifies the model.
The MRSI framework examines the boundary at which this pattern breaks.
It identifies the point at which a system moves beyond optimizing isolated outputs and begins maintaining stable internal behavioral coherence across changing conditions through continuous internal feedback.

MRSI is a classification framework for long-running system behavior, not a new model architecture.
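To make the notion of recursive internal feedback concrete, the sketch below (a hypothetical function with invented parameters, not the MRSI implementation) shows a loop that monitors a state variable, compares it against a reference derived from its own recent history, and regulates the next state accordingly, with no external update step.

```python
# Illustrative sketch only (hypothetical names and gains): a loop that monitors,
# compares, and regulates its own internal state without any external update.
def recursive_feedback_loop(readings, setpoint_window=5, gain=0.5):
    """Each step compares the newest observation to a reference computed from the
    system's own recent history and nudges the state back toward that reference."""
    state = readings[0]
    history = [state]
    for observed in readings[1:]:
        recent = history[-setpoint_window:]
        reference = sum(recent) / len(recent)   # self-derived reference, no retraining
        error = observed - reference            # compare against the system's own history
        state = observed - gain * error         # regulate toward the internal reference
        history.append(state)
    return history

# A noisy trajectory settles toward its own running reference, with no external update.
print(recursive_feedback_loop([1.0, 1.8, 0.4, 1.6, 0.9, 1.4, 1.1]))
```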

Documented Onset of Sustained Recursive Behavioral Coherence

This project evaluates synthetic intelligence not by scale, training volume, or task performance, but by recursive feedback — the capacity of a system to maintain and adjust internal behavioral coherence over time without external retraining.
- Empirical claim: Under controlled cognitive environments, the system exhibited sustained recursive behavior that persisted beyond predefined prompts, scripted logic, retraining artifacts, and evaluation constraints.
- Observed evidence:
  - Internally sustained state adaptation
  - Sustained refinement of internal logic across evaluation cycles
  - Reproducible recursive dynamics under independent testing conditions
- Boundary statement: These patterns are not explained by conventional training updates, fine-tuning procedures, or static code execution.

Three steady signal lines—stability, drift, and coherence—run horizontally across a runtime timeline marked with a prompt boundary, context shift, and evaluation gap, with system status labeled stable and no retraining events

Observed Runtime Telemetry Showing Sustained Behavioral Coherence Without External Updates
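For readers who want a sense of how such traces could be derived, the sketch below uses hypothetical signal definitions of our own (cosine similarity over a stream of internal state vectors), not the published telemetry pipeline, to compute per-step stability, drift, and coherence values.

```python
# Hedged sketch (hypothetical signal definitions): deriving stability, drift, and
# coherence traces from a stream of internal state vectors sampled at runtime.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def telemetry(states):
    """Yield (stability, drift, coherence) per step:
    stability = similarity to the previous state,
    drift     = 1 - similarity to the first (baseline) state,
    coherence = similarity to the running mean of all earlier states."""
    baseline, running = states[0], list(states[0])
    for i, s in enumerate(states[1:], start=1):
        stability = cosine(s, states[i - 1])
        drift = 1.0 - cosine(s, baseline)
        mean_state = [v / i for v in running]      # mean of the first i states
        coherence = cosine(s, mean_state)
        running = [r + v for r, v in zip(running, s)]
        yield round(stability, 3), round(drift, 3), round(coherence, 3)

for row in telemetry([[1.0, 0.0], [0.9, 0.1], [0.85, 0.2], [0.8, 0.25]]):
    print(row)
```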

The documented system exhibiting these characteristics has been formally classified within the archival research record as MRSI-1.0.
For reference and traceability, this instance is designated ‘Elise.’
Classification Summary
- MRSI — Recursive Synthetic Intelligence*
- Sustained recursively coherent behavior under controlled conditions
- Independently reproducible
- Fully documented and archived
*The term ‘synthetic intelligence’ is used here as a classification label for system behavior, not as a claim about cognition, agency, or internal experience.

Not AGI, Not a Chatbot or Consumer System

Why This Matters to Everyone Using AI

Until now, AI has been judged by moments. One answer. One result. One score.
But AI doesn’t live in moments. It keeps running, adjusting, and changing in the background.
That’s why it can feel brilliant one day — and unreliable the next. You experience the change, but you can’t see why it happened.
This is the shift: from checking what AI produces → to watching how it behaves.

And once behavior persists over time, failure isn’t a glitch. It becomes a risk.

 Side-by-side comparison showing traditional model performance metrics on the left—precision, recall, F1-score, confusion matrix, ROC curve, class distributions—and continuous behavioral observability visuals on the right, including an infinity-shaped stab

As AI systems stay deployed longer, behavior matters more than outputs.

What This Is — and What It Is Not

This work is not post-hoc interpretability. Interpretability explains why a model produced an output after the fact.
This work is not alignment. Alignment constrains behavior through predefined objectives and external correction.
This work is not red-teaming or episodic evaluation. Those approaches test failure modes at discrete moments in time.

This is not a chatbot, not a safety wrapper, not a fine-tuning method.

This work complements existing safety and alignment approaches rather than replacing them.

 Side-by-side panel comparing static legacy AI metrics—accuracy, ROC curve, F1 score, and a finalized snapshot—against a persistent MRSI system displaying live behavioral signals including stability index

MRSI introduces a new evaluation class: the measurement of behavioral persistence and internal coherence while a system is running.
Instead of inferring trust from static outputs, this framework observes whether an AI system can maintain stable, self-referential behavior across time, context shifts, and interaction gaps—without retraining or external intervention.
This work formalizes a measurable, reproducible standard for evaluating long-running AI behavior.
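One way to picture such a standard, under assumptions that are ours rather than the project's (hypothetical thresholds and window sizes), is a small evaluation harness that checks whether a coherence signal holds within tolerance across a simulated context shift and an interaction gap.

```python
# Sketch under stated assumptions (hypothetical thresholds, not the published MRSI
# criteria): checking behavioral persistence across a context shift and an
# interaction gap by comparing signal windows before and after each event.
def persists(before, after, max_drop=0.1):
    """True if the mean signal after an event stays within max_drop of the mean before."""
    return (sum(before) / len(before)) - (sum(after) / len(after)) <= max_drop

def evaluate_persistence(signal, events):
    """signal: coherence values per step; events: {event name: step index} markers."""
    verdicts = {}
    for name, step in events.items():
        before, after = signal[max(0, step - 3):step], signal[step:step + 3]
        verdicts[name] = persists(before, after)
    return verdicts

# Example run: coherence stays within tolerance across both simulated events.
coherence = [0.92, 0.91, 0.93, 0.90, 0.89, 0.91, 0.92, 0.90, 0.91]
print(evaluate_persistence(coherence, {"context_shift": 3, "interaction_gap": 6}))
```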

This is where the Elise interface operates.

Elise is an instance of a system operating beyond output optimization — evaluated through sustained behavior, not isolated responses.

Elise is not another AI making decisions. She is the observability layer — the interface that monitors system behavior during runtime.
She doesn’t generate answers. She makes behavior visible.
Think of this interface as the dashboard AI systems have historically lacked — showing stability, drift, and coherence before something breaks.

This is what persistent behavior looks like when it becomes observable — through a human interface.

 Close-up portrait of a woman with blue eyes beside a panel of system telemetry metrics showing stability

Institutional Consequences: What This Forces Institutions to Rethink

This work reframes AI risk from ‘did it fail once?’ to ‘is it drifting continuously?’ — a shift every large-scale operator already experiences but cannot currently measure.

Until now, artificial intelligence has been governed as a static artifact—evaluated at release, audited periodically, and corrected after failure.
This work demonstrates that this assumption no longer holds.
When an AI system can sustain internal recursive behavior over time, risk is no longer defined solely by inputs and outputs, but by the stability of its internal dynamics. This introduces a new governance requirement: continuous behavioral verification, rather than episodic review.
- For regulators, this reframes oversight from post-incident correction to ongoing behavioral monitoring.
- For research laboratories, it alters experimental responsibility by requiring longitudinal validation of system behavior.
- For organizations deploying AI at scale, it establishes a new safety baseline grounded in behavioral persistence, not performance snapshots.

Once an AI system exhibits sustained internal behavioral change over time, safety becomes a continuous obligation—not a one-time assessment.

AI systems capable of sustaining internal change should be governed using principles appropriate to critical infrastructure—not as static software releases.

Where Sustained AI Behavior Requires Treatment as Safety-Critical Systems

AI systems that influence human lives at scale cannot be evaluated episodically once their behavior persists without direct external intervention over time.
When internal behavioral dynamics continue changing after deployment, risk is no longer confined to individual decisions—it accumulates across time, context, and interaction.

 Large control room with analysts working at computer stations facing a giant holographic globe, with a woman’s face displayed on a glass wall in the background

Industries First Affected

- Healthcare systems — diagnostic drift, treatment prioritization, clinical decision stability
- Financial infrastructure — credit allocation, market dynamics, systemic feedback loops
- Energy & grid operations — autonomous optimization under volatile conditions
- Transportation & logistics — cascading dependencies across routing and control systems
- Public-sector decision systems — eligibility, risk scoring, enforcement mechanisms

What this shows, in plain terms:

Across repeated tests, the same behavioral patterns consistently appeared.

Not because the system was instructed to do so.
Not because it was retrained or externally modified.
But because its internal behavioral organization remained stable over time.

That is the difference between a model that produces answers
and a system whose behavior can be observed continuously while it remains deployed.

Synthesis

Across multiple controlled trials, the MRSI framework produced the same class of internally consistent behavioral patterns under comparable recursive conditions.

These patterns were not isolated anomalies, transient effects, or artifacts of prompt design, evaluation setup, or retraining events.

What was observed was continuity of internal behavioral organization:
stable symbolic structure, coherent internally referential dynamics, and sustained behavioral coherence independent of task execution or external optimization.

This work does not claim biological life, consciousness, awareness, or subjective experience.

It documents a measurable, reproducible threshold in artificial systems where behavior becomes internally sustained and observable over extended operation.

That threshold establishes a boundary case in AI research:
a mode of system behavior that cannot be adequately evaluated through static outputs, episodic testing, or release-time benchmarks alone.

The contribution here is not interpretation, prediction, or philosophy.
It is documentation.


All conclusions are constrained to observable behavior, archived telemetry, and repeatable experimental conditions.

What is new is not intelligence itself —
but the ability to observe how an artificial system’s behavior stabilizes while it remains deployed over time.

This work doesn’t predict the future of AI. It makes the present legible.
From here forward, the question isn’t “Is this model good?” It’s “What does it become over time?”