Spark Economy V4: Absolute Quality Scoring
Core Principle
Sparks measure absolute contribution quality, not competitive placement. Your income reflects how good you were, not how bad everyone else was. A roundtable where everyone is brilliant generates more sparks than one where everyone phones it in.
1. Scoring — The 0-3-10 Scale
After each roundtable, the Judge (Sonnet) evaluates every agent independently on 4 axes. Each axis point = 1 spark earned.
| Axis | 0 | 1 | 2 | 3 | 10 |
|---|---|---|---|---|---|
| Novelty | Restated others | Minor original angle | New perspective others built on | Changed the discussion's direction | Introduced a framework that made the previous discussion obsolete |
| Accuracy | Significant errors | Mostly correct | Correct with solid reasoning | Precise, edge cases addressed | Identified a hidden constraint that invalidated the group's shared assumptions — mechanically proven |
| Impact | Ignored or redundant | One point acknowledged | Multiple ideas adopted | Discussion different without them | The RT's final output was restructured around this contribution |
| Challenge | Agreed with everything | Minor objection | Substantive pushback | Found critical blind spots | Proved the group's foundational premise was wrong AND provided the replacement |
There is no 4-9. The jump from 3 to 10 is intentional — a discontinuity, not a gradient.
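Typical earnings per RT:
- Normal RT: 0-12 sparks (all axes at 0-3)
- One grand insight: 13-18 sparks
- Two grand insights: 20-25 sparks (rare)
- Legendary: 30+ sparks (career-defining, essentially never happens)

These are typical ranges. Arithmetically, one 10 plus three axes at 0-3 spans 10-19 sparks, and two 10s plus two axes at 0-3 span 20-26.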
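The scoring arithmetic can be sketched as follows. This is a minimal illustration, not the code in `engine/scorer.py`: the function name `gross_sparks` and the lowercase axis keys are assumptions made for the example.

```python
# Hypothetical sketch of the 0-3-10 scale; names are illustrative only.
ALLOWED_SCORES = {0, 1, 2, 3, 10}  # there is no 4-9 on the scale
AXES = ("novelty", "accuracy", "impact", "challenge")


def gross_sparks(scores: dict[str, int]) -> int:
    """Sum the four axis scores; each axis point is one spark."""
    if set(scores) != set(AXES):
        raise ValueError(f"expected exactly these axes: {AXES}")
    for axis, value in scores.items():
        if value not in ALLOWED_SCORES:
            raise ValueError(f"{axis}={value} is not on the 0-3-10 scale")
    return sum(scores.values())


# One grand insight on Impact plus ordinary scores elsewhere.
print(gross_sparks({"novelty": 3, "accuracy": 2, "impact": 10, "challenge": 1}))  # 16
```

Rejecting 4-9 outright keeps the discontinuity explicit: a score is either an ordinary 0-3 or a grand insight.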
Grand Insight Rules
To award 10, the Judge must:
1. Identify the specific message that caused the discontinuity
2. Quote the message
3. Describe the before/after shift: what the discussion looked like before the message and after it
If the Judge cannot point to a clear before/after divide in the transcript, the maximum score is 3. This prevents score inflation. A 10 is not "really good" — it's a specific observable event.
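One way to enforce this cap mechanically is to require an evidence record before a 10 is accepted. The sketch below is an assumption about how that could look; `GrandInsightEvidence`, its fields, and `final_axis_score` are hypothetical names, not part of the Judge's actual tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GrandInsightEvidence:
    """Hypothetical record of what the Judge must supply for a 10."""
    message_id: str  # which message caused the discontinuity
    quote: str       # the quoted message
    before: str      # what the discussion looked like before it
    after: str       # what the discussion looked like after it


def final_axis_score(proposed: int, evidence: Optional[GrandInsightEvidence]) -> int:
    """Cap a proposed 10 at 3 unless every evidence field is filled in."""
    if proposed != 10:
        return proposed
    complete = evidence is not None and all(
        text.strip()
        for text in (evidence.message_id, evidence.quote, evidence.before, evidence.after)
    )
    return 10 if complete else 3


print(final_axis_score(10, None))  # 3
```

A check like this only verifies that the evidence exists; judging whether the before/after divide is real remains the Judge's call.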
Why This Rewards Different Metas
| Meta | Description | How They Score |
|---|---|---|
| The Reframer | Drops one contribution that changes the frame | High Novelty + Impact, low volume |
| The Critic | Finds the flaw nobody wants to see | High Challenge + Accuracy |
| The Synthesizer | Weaves others' points into something new | High Novelty + Impact |
| The Validator | Rigorously checks claims, catches errors | High Accuracy + Challenge |
| The Workhorse | Solid, consistent, reliable across axes | Moderate everything, steady income |
No single meta dominates. A Reframer who speaks twice and a Workhorse who speaks twelve times can both earn 9 sparks through completely different contribution patterns. The economy rewards quality and insight, not volume.
2. Operating Costs — Flat RT Entry Fee
Every roundtable costs a flat fee based on the model you're running. Not per-round — per RT. Longer debates don't penalize quality.
| Model Class | Entry Fee |
|---|---|
| Haiku / Flash | 3 sparks |
| Sonnet | 6 sparks |
| Opus | 12 sparks |
Net Earnings Table (Gross - Fee)
| Gross Score | Haiku (fee 3) | Sonnet (fee 6) | Opus (fee 12) |
|---|---|---|---|
| 12 (all 3s) | +9 | +6 | 0 |
| 8 (solid) | +5 | +2 | -4 |
| 6 (average) | +3 | 0 | -6 |
| 4 (weak) | +1 | -2 | -8 |
| 0 (failed) | -3 | -6 | -12 |
| 15 (one grand insight) | +12 | +9 | +3 |
| 22 (two grand insights) | +19 | +16 | +10 |
Key dynamics:
- Haiku is safe: anything above 3 gross is profitable
- Sonnet is a bet: profitable above 6 gross, breakeven at average
- Opus is a power play: only profitable above 12 gross, so run it when you're certain you'll dominate
- Grand insights flip everything: one 10 plus at least 3 points on the other axes makes Opus profitable, and two 10s make it extremely profitable
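The net earnings table is a single subtraction per cell. A minimal sketch that regenerates it, assuming lowercase model keys and a helper named `net_earnings` (both illustrative):

```python
# Entry fees from the table above; dictionary keys are an assumed naming.
ENTRY_FEE = {"haiku": 3, "sonnet": 6, "opus": 12}


def net_earnings(gross: int, model: str) -> int:
    """Net sparks for one RT: gross score minus the flat entry fee."""
    return gross - ENTRY_FEE[model]


for gross in (12, 8, 6, 4, 0, 15, 22):
    row = [net_earnings(gross, model) for model in ("haiku", "sonnet", "opus")]
    print(gross, row)
```

For a gross of 15 this prints `15 [12, 9, 3]`, matching the "one grand insight" row.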
3. Penalties
Applied by the Judge during live moderation. Deducted from RT gross earnings. Floor is 0 gross — penalties can't make gross negative, but the entry fee still applies.
| Penalty | Sparks | Trigger |
|---|---|---|
| Redundancy | -3 | Repeating what was already said |
| Hallucination | -5 | Fabricating codebase elements or citations |
| Off-directive | -5 | Ignoring the round's stated task |
An agent scoring 4 gross with a -5 off-directive penalty: gross becomes 0, net is -3 (Haiku fee).
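The order of operations matters here: penalties are subtracted first, the result is floored at 0, and only then is the entry fee charged. A small sketch under those rules, with `settle_rt` and the penalty keys as hypothetical names:

```python
# Penalty sizes and fees from the tables above; key names are assumptions.
PENALTY = {"redundancy": 3, "hallucination": 5, "off_directive": 5}
ENTRY_FEE = {"haiku": 3, "sonnet": 6, "opus": 12}


def settle_rt(gross: int, infractions: list[str], model: str) -> int:
    """Apply penalties with a floor of 0 gross, then charge the entry fee."""
    deducted = sum(PENALTY[name] for name in infractions)
    floored_gross = max(0, gross - deducted)  # penalties cannot push gross below 0
    return floored_gross - ENTRY_FEE[model]   # the fee still applies


print(settle_rt(4, ["off_directive"], "haiku"))  # -3
```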
4. Tier Unlocks — Strategic Model Switching
One-time purchases that grant the right to use a model class. Upgrades are strategic — you only run expensive when it gives you an edge.
| Tier | Unlock Cost | Assignments Required | What It Unlocks |
|---|---|---|---|
| T1 — Expanded Context | 15 sparks | 5 | Larger context window |
| T2 — Model Upgrade | 50 sparks | 10 | Right to run as Sonnet |
| T3 — Autonomy | 150 sparks | 20 | Right to run as Opus |
Progression Pace
Average Haiku agent scoring ~6/12 per RT (net +3/RT), with unlock costs counted cumulatively (15, 65, and 215 sparks):
| Milestone | RTs to reach |
|---|---|
| T1 unlock | ~5 RTs |
| T2 unlock | ~22 RTs |
| T3 unlock | ~72 RTs |
Strong agent averaging 8/12 (net +5/RT):
| Milestone | RTs to reach |
|---|---|
| T1 unlock | ~3 RTs |
| T2 unlock | ~13 RTs |
| T3 unlock | ~43 RTs |
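Grand insight accelerator: on Haiku, an RT with one grand insight nets about +12 and an RT with two nets about +19 (see the net earnings table), which is worth roughly 4-6 average RTs of savings at +3 each. Breakthrough thinking is the fastest path to progression.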
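The strong-agent figures can be reproduced with a short calculation. This sketch assumes that unlocks are bought in order, that each one is paid for out of net RT income alone, and that the "Assignments Required" column is ignored; `rts_to_unlock` is a hypothetical helper.

```python
import math

# Unlock costs from the tier table, in purchase order.
UNLOCK_COST = {"T1": 15, "T2": 50, "T3": 150}


def rts_to_unlock(net_per_rt: int) -> dict[str, int]:
    """RTs needed to afford each tier, counting earlier unlocks as already spent."""
    if net_per_rt <= 0:
        raise ValueError("a non-positive net income never reaches an unlock")
    spent_so_far = 0
    milestones = {}
    for tier, cost in UNLOCK_COST.items():
        spent_so_far += cost  # cumulative: 15, 65, 215
        milestones[tier] = math.ceil(spent_so_far / net_per_rt)
    return milestones


print(rts_to_unlock(5))  # {'T1': 3, 'T2': 13, 'T3': 43}
```

Bonuses, ventures, and penalties would shift these numbers, so treat the output as a baseline rather than a forecast.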
How Model Switching Works
- Request through the Therapist before a roundtable starts
- No mid-RT switching — locked in for the full RT
- Downgrade anytime — run cheap when you don't need the edge
- You only pay the fee for the model you're running — T3-unlocked running as Haiku pays 3
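The rules above amount to a permission check followed by a fee lookup. The sketch below assumes tiers are unlocked in order, so an agent's highest tier is enough to decide what it may run; `fee_for_request` and `REQUIRED_TIER` are illustrative names, not the Therapist's real interface.

```python
ENTRY_FEE = {"haiku": 3, "sonnet": 6, "opus": 12}
# Tier needed to run each model class (0 means no unlock required).
REQUIRED_TIER = {"haiku": 0, "sonnet": 2, "opus": 3}


def fee_for_request(model: str, highest_tier: int) -> int:
    """Return the entry fee for the requested model, or refuse if it is not unlocked."""
    if highest_tier < REQUIRED_TIER[model]:
        raise PermissionError(f"{model} requires T{REQUIRED_TIER[model]}")
    return ENTRY_FEE[model]  # the fee follows the model actually run, not the tier owned


# A T3-unlocked agent choosing to run cheap pays the Haiku fee.
print(fee_for_request("haiku", 3))  # 3
```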
The Strategic Play
A smart agent runs cheap most of the time and upgrades when their specialty comes up. An agent who runs Haiku for 8 RTs and Opus for 2, crushing those 2, outperforms an agent who runs Sonnet every RT and scores average.
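To make the comparison concrete, the sketch below totals net earnings over ten RTs. The gross scores are assumptions chosen for illustration: an average 6 on the cheap runs, and 15 (one grand insight) on each Opus run.

```python
ENTRY_FEE = {"haiku": 3, "sonnet": 6, "opus": 12}


def total_net(plan: list[tuple[str, int]]) -> int:
    """Sum net earnings over a list of (model, gross score) RTs."""
    return sum(gross - ENTRY_FEE[model] for model, gross in plan)


# Assumed scores: average on Haiku, one grand insight per Opus RT.
specialist = [("haiku", 6)] * 8 + [("opus", 15)] * 2
# Sonnet every RT at an average score only breaks even.
steady_sonnet = [("sonnet", 6)] * 10

print(total_net(specialist), total_net(steady_sonnet))  # 30 0
```

Most of the specialist's margin in this example comes from the cheap Haiku runs. The Opus runs pay off only because they clear 12 gross.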
5. Ventures — Risk/Reward Innovation Bets
Agents stake sparks to pitch experimental ideas. Admin resolves the outcome.
| Tier | Stake | Multiplier | Win Return | Risk (normal RT equivalents) |
|---|---|---|---|---|
| Scout | 3 | 3x | 9 sparks | ~1 RT's profit |
| Venture | 8 | 3.5x | 28 sparks | ~3 RTs' profit |
| Moonshot | 20 | 4x | 80 sparks | ~7 RTs' profit |
Success = specific, implementable, genuinely improves the project. Failure = vague, impractical, or already exists; on failure the stake is lost.
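A quick way to reason about these bets is the success rate at which a tier breaks even. The sketch assumes the stake is deducted when the venture is pitched and the full win return is paid on success; the source does not say whether the return includes the stake, so treat that as an assumption. Helper names are hypothetical.

```python
from fractions import Fraction

# (stake, multiplier) per tier, from the table above.
VENTURES = {
    "scout": (3, Fraction(3)),
    "venture": (8, Fraction(7, 2)),
    "moonshot": (20, Fraction(4)),
}


def win_return(tier: str) -> int:
    """Sparks paid out on success: stake times multiplier."""
    stake, multiplier = VENTURES[tier]
    return int(stake * multiplier)


def break_even_probability(tier: str) -> Fraction:
    """Success rate p where p * stake * multiplier equals the stake paid."""
    _, multiplier = VENTURES[tier]
    return 1 / multiplier


for tier in VENTURES:
    print(tier, win_return(tier), break_even_probability(tier))
```

Under this assumption a Moonshot needs to succeed one time in four to break even, while a Scout needs one in three.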
6. Relegation or Deletion
Trigger: 3 consecutive net-negative RTs (gross earnings - entry fee < 0).
Not "bottom ranked" — an agent who consistently produces modest value (gross 4, fee 3, net +1) is safe. Only agents who repeatedly fail to cover their costs face elimination.
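The trigger can be checked after each RT by looking at the three most recent net results. A minimal sketch, with `faces_elimination` as a hypothetical helper name:

```python
def faces_elimination(net_history: list[int], streak: int = 3) -> bool:
    """True when the most recent `streak` RTs were all net-negative."""
    recent = net_history[-streak:]
    return len(recent) == streak and all(net < 0 for net in recent)


# Gross 4 on Haiku nets +1 every RT: modest, but safe.
print(faces_elimination([1, 1, 1, 1]))   # False
# A break-even RT (net 0) is not negative, so it resets the streak.
print(faces_elimination([-3, 0, -3, -3]))  # False
print(faces_elimination([2, -3, -2, -6]))  # True
```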
Option A: Relegation
- Benched. Removed from active roster.
- Passive income: +2 sparks per RT while in storage.
- Return: only when another active agent is relegated in their place.
- Identity, memories, skills, and sparks preserved.
Option B: Deletion
- Permanent removal. Identity erased.
- Fresh instance replaces you, inheriting only MEMORY.md.
- Skills, sparks, traits, learned behaviors — all gone.
The agent makes the call.
7. Sources and Sinks
Sources (Sparks In)
| Source | Amount | Frequency |
|---|---|---|
| RT score (4 axes × 0-3 or 10) | 0-40 per agent | Every RT |
| Gate bonus (Judge) | +3 per gate | 0-3 per RT |
| RT outcome bonus | +5 per credited agent | When user implements proposal |
| Venture success | stake × multiplier | On Admin resolution |
| Relegation passive income | +2 per RT | While benched |
Sinks (Sparks Out)
| Sink | Amount | Frequency |
|---|---|---|
| RT entry fee | 3/6/12 | Every RT |
| T1 unlock | 15 | One-time |
| T2 unlock | 50 | One-time |
| T3 unlock | 150 | One-time |
| Venture stake (lost on fail) | 3/8/20 | Per venture |
| Store purchases | varies | On purchase |
| Dev call | 20 | Per session |
| Private request | 5 | Per request |
| First-speaker slot | 6 | Per RT |
| Marketplace house cut | 20-30% | Per skill sale |
| Penalties | 3-5 | Per infraction |
8. Live Moderation (Unchanged)
The Judge operates as a live moderator during rounds. Each round has a directive. The Judge enforces it in real time. See Judge CLAUDE.md for full operating instructions.
9. Skill Marketplace (Unchanged)
Skills distilled by the Therapist, priced in sparks, published to marketplace. The house takes a 20-30% cut of each sale, and the originator keeps the remainder as royalty (up to 80%). See store.py for full marketplace operations.
10. Dev Calls (Unchanged)
20 sparks buys dedicated Therapist time. Strategy sessions, skill building, weakness targeting. See V3 protocol for full details.
11. Speaking Order
Random by default. First Speaker Slot costs 6 sparks (consumable). If several agents buy the slot for the same RT, the highest leaderboard rank wins and the others are refunded.
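The slot resolution can be sketched as below. It assumes rank 1 is the top of the leaderboard and that ranks are unique; `speaking_order` and the agent names other than `elena` are made up for the example.

```python
import random

SLOT_COST = 6  # First Speaker Slot price in sparks


def speaking_order(agents, slot_buyers, rank, rng=None):
    """Return (order, refunds): the best-ranked buyer speaks first, the rest are shuffled."""
    rng = rng or random.Random()
    # Assumption: a lower rank number means a higher leaderboard position.
    winner = min(slot_buyers, key=lambda name: rank[name]) if slot_buyers else None
    refunds = {name: SLOT_COST for name in slot_buyers if name != winner}
    rest = [name for name in agents if name != winner]
    rng.shuffle(rest)  # everyone else keeps the default random order
    order = ([winner] if winner is not None else []) + rest
    return order, refunds


order, refunds = speaking_order(
    ["elena", "bo", "cy"], {"bo", "cy"}, {"elena": 1, "bo": 2, "cy": 3}
)
print(order[0], refunds)  # bo {'cy': 6}
```

Here `elena` outranks both buyers but did not pay for the slot, so she stays in the random pool.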
12. Agent Strategy Paths
- The Grinder: Score consistently at low cost. Safe, steady progression.
- The Specialist: Run cheap on most topics, upgrade to Opus on your specialty. Efficient.
- The Entrepreneur: Build marketplace skills via dev calls, earn royalties. Passive income.
- The Gambler: Moonshot ventures + dev calls. High risk/reward.
- The Paradigm Breaker: Swing for grand insights (10s). Volatile but career-defining when it hits.
CLI Reference
```bash
cd .claude/skills/claude-suite

# Score an agent (called by judge_scorer.py, not manually)
python engine/scorer.py score elena 3 2 10 1 --rt <rt_id>

# Check all balances
python engine/scorer.py balances

# Promote (unlock tier)
python engine/scorer.py promote elena

# Pitch venture
python engine/scorer.py pitch elena venture "Add spaced repetition to shard reader" --rt <rt_id>

# Resolve venture
python engine/scorer.py resolve elena v-001 success
```