Role
Pressure test
Uses hard comparison material to separate sturdy signals from scores that only look strong in easier conditions.
Protects
False confidence
Flags where topic, repetition, transcription, or handling artifacts could inflate a reading.
Public View
Aggregate evidence
Keeps the stress-test results readable without exposing protected text, identities, titles, or source maps.
Evidence Frame
What the adversarial dataset is
The adversarial dataset is PRM’s pressure chamber for interpretation.
The Solo Dataset asks what the human corpus can sustain. The adversarial layer asks what happens when model systems try to interpret, compress, attribute, or respond to that kind of language under pressure.
The public page stays at the framework level. It describes the stress environment and diagnostic categories without publishing protected interaction logs, private transcripts, prompts, or source-reconstructable material.
Why it exists
It exists because model behavior is part of the evidence environment.
The point is not to mock failure. The point is to make failure modes measurable and reviewable. When a system misses layered meaning, over-compresses a passage, misattributes novelty, or drifts away from context, that event becomes a diagnostic signal.
That makes the adversarial layer useful for both human interpretation and model-evaluation work: it shows where the material is difficult, and where a model response stops tracking the structure in front of it.
The Kill List
The Kill List is a diagnostic framework for recurring model-failure modes.
Examples include:
- misattribution
- context drift
- missed layered meaning
- false compression
- repetition blindness
- authorship confusion
- overgeneralization
- tone failure
- weak novelty detection
How the stress test works
The adversarial layer turns interpretation into a test environment. It looks for places where a model response loses structure, ignores constraints, collapses distinct ideas together, or explains away behavior that should be inspected.
Those failures are grouped into diagnostic categories. The categories are not a public transcript archive. They are a public-safe way to explain what kind of weakness the project is tracking.
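As a rough illustration of that grouping, the sketch below tallies failure labels against the Kill List categories and returns counts only. It assumes each reviewed response has already been assigned one category label. The names `KILL_LIST` and `tally_failures` are hypothetical, and this is not a description of PRM's actual tooling.

```python
from collections import Counter

# Category names taken from the Kill List examples on this page.
KILL_LIST = (
    "misattribution",
    "context drift",
    "missed layered meaning",
    "false compression",
    "repetition blindness",
    "authorship confusion",
    "overgeneralization",
    "tone failure",
    "weak novelty detection",
)


def tally_failures(labels):
    """Count failure labels per category (hypothetical helper).

    Takes category labels only, never transcript text, so the result
    is an aggregate view with nothing source-reconstructable in it.
    """
    unknown = set(labels) - set(KILL_LIST)
    if unknown:
        raise ValueError(f"labels outside the Kill List: {sorted(unknown)}")
    counts = Counter(labels)
    # Report every category, including those with zero observations.
    return {category: counts.get(category, 0) for category in KILL_LIST}


# Illustrative input only; these labels are made up for the example.
example_counts = tally_failures(
    ["context drift", "misattribution", "context drift"]
)
```

Reporting zero-count categories alongside observed ones keeps the public view in a fixed shape, so a missing row is never mistaken for a hidden one.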
Public-safe evidence
The public evidence uses aggregate gauntlet views. These charts show how measured PRM segments behave under comparison pressure without exposing private adversarial logs.
Pairwise dominance, first-place rate, top-20 heatmaps, and exception counts each give a different view of stress behavior: respectively, direct matchups, repeated wins, broad leaderboard position, and visible exceptions.
Why it matters
The adversarial layer turns PRM from a static corpus into a stress environment. It shows not only what the human corpus contains, but where model systems struggle to interpret it.
For the site narrative, this page keeps the model-facing claim separate from the corpus claim. The corpus pages explain what was measured. The adversarial page explains why interpretation itself became part of the test.
How to read the charts
Start with the gauntlet dashboard to understand the broad comparison environment. Then use the top-20 heatmap to see how tested segments behave across slice sizes.
Pairwise dominance asks whether The Sequel holds up in direct matchups, not only in an aggregate average. First-place rate asks how often it reaches the top across shared views. Exception count is the honesty check: it shows where the result is not first, keeping the stress-test frame grounded.
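To make the three readings concrete, here is a minimal sketch under stated assumptions. It treats each shared view as a ranking of segment labels, best first. It assumes pairwise dominance is the share of head-to-head matchups the target wins, first-place rate is the share of shared views where the target ranks first, and exception count is the number of shared views where it does not. These definitions and the name `gauntlet_summary` are illustrative, not PRM's published formulas.

```python
def gauntlet_summary(views, target):
    """Summarize how `target` fares across ranked views (hypothetical helper).

    `views` is a list of rankings; each ranking is a list of segment
    labels ordered best first. Views without `target` are skipped.
    """
    shared = [view for view in views if target in view]
    if not shared:
        raise ValueError("target does not appear in any view")

    first_places = sum(1 for view in shared if view[0] == target)

    wins = 0
    matchups = 0
    for view in shared:
        position = view.index(target)
        # The target meets every other entry in the view once and
        # beats each entry ranked below it.
        matchups += len(view) - 1
        wins += len(view) - 1 - position

    return {
        "pairwise_dominance": wins / matchups if matchups else None,
        "first_place_rate": first_places / len(shared),
        "exception_count": len(shared) - first_places,
    }


# Illustrative rankings with placeholder labels, not real results.
example_views = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["A", "C", "B"],
    ["A", "B"],
]
summary = gauntlet_summary(example_views, "A")
```

In this toy input, "A" is first in three of four views, so the first-place rate is 0.75 and the exception count is 1. It wins 6 of 7 head-to-head matchups, which shows why a direct-matchup reading can differ from an average.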
Public-safe limits
This page does not publish raw model transcripts, protected writing, third-party source text, private prompts, source-level mappings, artist names, album titles, or song titles.
Public-safe boundary
Public pages show aggregate evidence, metric behavior, method provenance, and corpus structure. Protected text, identities, source titles, and reconstructable mappings stay private.
