Index
- Quick Start Routes
- Part I: Why Simulation Testing
- Part II: Foundations
- Part III: Building Simulations
- Part IV: Simulating Existing Applications
- Part V: Networking and RPC
- Part VI: Building on Top
- Appendix
A sitemap of every chapter in the Moonpool book. Each entry links to a chapter with a summary of what it covers.
Quick Start Routes
- “What is Moonpool?” — The Case for Simulation, then Why Moonpool Exists
- “How do I write my first simulation?” — Your First Simulation and its sub-chapters
- “How do providers work?” — The Provider Pattern
- “How do I add chaos/faults?” — Chaos in Moonpool
- “How do I use assertions?” — Assertions: Finding Bugs
- “How does networking/RPC work?” — Simulating the Network
- “How do I test an existing app (e.g. axum)?” — Using moonpool-sim Standalone
- “How does multiverse exploration work?” — Multiverse Exploration
- “What assertions are available?” — Assertion Reference
- “What configuration options exist?” — Configuration Reference
Part I: Why Simulation Testing
- The Case for Simulation — Why distributed systems need simulation; the gap between localhost and production; failure statistics
- Prevention vs Discovery — Two testing philosophies: regression (prevention) vs generative (discovery)
- From Mocks to Simulation — Why mocks break at scale; the `#[cfg(test)]` trap; maintenance cost
- A Brief History — FoundationDB simulator origins, TigerBeetle storage faults, Antithesis assertions
- Why Moonpool Exists — Synthesizing ideas from FDB, TigerBeetle, and Antithesis into one framework
Part II: Foundations
- Determinism as a Foundation — Three non-determinism sources: threads, I/O, randomness; why reproducibility matters
- The Single-Core Constraint — Single-threaded execution guarantees one legal ordering; tokio local runtime
- Seed-Driven Reproducibility — One u64 seed controls entire simulation; ChaCha8Rng; cross-platform determinism
- The Provider Pattern — Five traits (Time, Network, Task, Random, Storage) abstract all I/O; swap real vs simulated
- Quick Start: Swapping Implementations — Practical example: generic function running against TokioProviders or SimProviders
- Deep Dive: Why Providers Exist — Problems with `#[cfg(test)]` and mocks; providers eliminate both
- The Five Providers — TimeProvider, NetworkProvider, TaskProvider, RandomProvider, StorageProvider details
- System Under Test vs Test Driver — Process (server code) vs Workload (test driver); two distinct roles
- Process: Your Server — Process trait: `name()`, `run()`; recreated fresh on every boot from factory
- Workload: Your Test Driver — Workload trait: `setup()`, `run()`, `check()`; survives reboots; drives and validates
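The provider idea summarized in Part II (application code written against a trait, with a real and a simulated implementation swapped in behind it) can be illustrated with a minimal sketch. Everything here is hypothetical: `Clock`, `SystemClock`, `SimClock`, and `lease_expired` are stand-ins invented for illustration and are not Moonpool's `TimeProvider` or any other Moonpool API, whose signatures this index does not show.

```rust
use std::cell::Cell;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a time provider (not Moonpool's actual trait).
trait Clock {
    /// Time elapsed since the clock was created.
    fn elapsed(&self) -> Duration;
}

/// "Real" implementation backed by the OS monotonic clock.
struct SystemClock {
    start: Instant,
}

impl Clock for SystemClock {
    fn elapsed(&self) -> Duration {
        self.start.elapsed()
    }
}

/// Simulated implementation: time moves only when the test advances it.
struct SimClock {
    now: Cell<Duration>,
}

impl SimClock {
    fn new() -> Self {
        SimClock { now: Cell::new(Duration::ZERO) }
    }

    /// Move simulated time forward by `by`.
    fn advance(&self, by: Duration) {
        self.now.set(self.now.get() + by);
    }
}

impl Clock for SimClock {
    fn elapsed(&self) -> Duration {
        self.now.get()
    }
}

/// Application logic is generic over the provider and never reads the
/// OS clock directly, so the same code runs in production and simulation.
fn lease_expired<C: Clock>(clock: &C, lease: Duration) -> bool {
    clock.elapsed() >= lease
}
```

With `SimClock`, a test can call `advance` to jump past a lease deadline instantly and deterministically, while production code would pass a `SystemClock` to the same function.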
Part III: Building Simulations
- Your First Simulation — End-to-end walkthrough: KV server process, workload, assertions, builder
- Defining a Process — KvServer implementing Process trait; handling TCP; respecting shutdown
- Writing a Workload — KvWorkload tracking state; sending requests; validating responses
- Configuring the SimulationBuilder — Builder pattern: `.workload()`, `.processes()`, chaos config, iterations
- Running and Observing — `cargo xtask sim run`; reading reports; simulation binary structure
- Chaos Testing vs Simulation — Chaos engineering (production, reactive) vs simulation (deterministic, proactive)
- Chaos in Moonpool — Four fault dimensions: buggify, attrition, network faults, storage faults
- Buggify: Fault Injection — Two-phase activation; testing error paths; FoundationDB-inspired
- Attrition: Process Reboots — Graceful, crash, wipe reboot types; randomized kills; recovery delay
- Network Faults — Connection-level: latency, partition, drops, reordering, clogging
- Storage Faults — TigerBeetle-inspired: corruption, misdirected I/O, phantom writes, sync failures; per-process storage config and crash/wipe scoped by IP
- Assertions: Finding Bugs — Record and continue (Antithesis principle); cascade discovery
- Invariants vs Discovery vs Guidance — Three assertion categories: invariants, sometimes, numeric
- Always and Sometimes — `assert_always!` (must hold) vs `assert_sometimes!` (exploration guidance)
- Numeric Assertions — `assert_always_less_than!`; watermark tracking; explorer optimizes bounds
- Compound Assertions — `assert_sometimes_all!` for simultaneous sub-goals; frontier tracking
- System Invariants — Invariant trait runs after every event; cross-system properties; conservation laws
- Event Timelines — Append-only typed timelines for temporal invariants; fault timeline auto-emitted by simulator
- Designing Workloads That Find Bugs — Targeted adversarial design vs white noise; strategy matters
- Debugging a Failing Seed — Five-step workflow: reproduce, isolate, understand, fix, verify
- Reproducing with FixedCount — Pin seed with `set_debug_seeds()` + `set_iterations(1)`; exact replay
- Reading the Event Trace — Event queue ordering; `RUST_LOG=trace`; causal chain reconstruction
- Common Pitfalls — Don’t `stop().await` in workloads (deadlock); use `drop()` instead
- Discovering Properties — Systematic property discovery using attention focuses; finding where to place assertions and buggify
Part IV: Simulating Existing Applications
- Using moonpool-sim Standalone — Standalone simulation engine for existing code (axum, Postgres, etc.)
- Where to Draw the Line — Fakes vs test containers; binary failure limitations
- Wiring a Web Service — Worked example: axum service in simulation with Store trait fake, chaos, assertions
- What You’re Testing (and What You’re Not) — Tests handler logic and HTTP under chaos; doesn’t test TLS, proxies, startup code
Part V: Networking and RPC
- Simulating the Network — TCP-level simulation; connection-level faults; FlowTransport architecture
- Peers and Connections — Logical connection resilience; reconnection on drop; message draining
- Backoff and Reconnection — Exponential backoff (FDB pattern); prevents storms; 100ms initial, 30s max
- Wire Format — Packet layout: length, checksum, token, payload; CRC32 validation
- RPC with #[service] — Proc macro: write trait, get client/server/endpoints generated
- Defining a Service — `#[service(id = ...)]` trait, request/response types, serialization
- Server, Client, and Endpoints — Server setup, client connection, endpoint routing, RequestStream, ReplyPromise
- Delivery Modes — Four guarantees: send, try_get_reply, get_reply, get_reply_unless_failed_for
- Failure Monitor — Address-level and endpoint-level reachability tracking
- Load Balancing and Fan-Out — `load_balance()` with `QueueModel`, plus four fan-out shapes (all/quorum/race/partial)
- Designing Simulation-Friendly RPC — Idempotent design, versioning, bounded retries, deduplication, causality
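The Backoff and Reconnection entry above gives two concrete parameters: a 100ms initial delay and a 30s cap. A minimal sketch of such a schedule follows. The doubling factor and the name `backoff_delay` are assumptions made for illustration; this index does not state the multiplier or show Moonpool's actual implementation.

```rust
use std::time::Duration;

/// Initial and maximum delays as listed in the index entry.
const INITIAL: Duration = Duration::from_millis(100);
const MAX: Duration = Duration::from_secs(30);

/// Delay before reconnection attempt `attempt` (0-based).
/// Assumes the delay doubles on each attempt and is capped at MAX.
fn backoff_delay(attempt: u32) -> Duration {
    // checked_shl returns None once the shift would overflow u32,
    // in which case we saturate the multiplier instead of panicking.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    // checked_mul returns None on Duration overflow; treat that as "capped".
    INITIAL.checked_mul(factor).map_or(MAX, |d| d.min(MAX))
}
```

Under the doubling assumption the schedule runs 100ms, 200ms, 400ms, and so on, reaching the 30s cap at the tenth attempt (attempt index 9), after which every retry waits the full 30s.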
Part VI: Building on Top
- Multiverse Exploration — Checkpoint-and-branch with fork(); timeline tree; exponential trial reduction
- The Exploration Problem — Sequential Luck Problem: N unlikely events need exponential trials without branching
- Fork at Discovery — Unix fork() copies process; reseed with FNV-1a; tree of timelines
- Coverage and Energy Budgets — Fixed-count splitting; global energy cap; prevents exponential blowup
- Adaptive Forking — Batch-based exploration; productive marks earn more; barren marks cut early
- Multi-Seed Exploration — Coverage-preserving seed transitions; selective reset; explored map carries forward
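The Fork at Discovery entry mentions reseeding with FNV-1a. As a rough illustration, the sketch below implements the standard 64-bit FNV-1a hash and uses it to derive a child seed. The function `child_seed` and the choice to hash the parent seed together with a branch index are assumptions made for this example; how Moonpool actually combines its inputs is not described in this index.

```rust
/// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET: u64 = 0xcbf29ce484222325;
const FNV_PRIME: u64 = 0x100000001b3;

/// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hypothetical helper: derive a deterministic seed for one forked timeline
/// from the parent's seed and the index of the branch being explored.
fn child_seed(parent_seed: u64, branch: u32) -> u64 {
    let mut input = [0u8; 12];
    input[..8].copy_from_slice(&parent_seed.to_le_bytes());
    input[8..].copy_from_slice(&branch.to_le_bytes());
    fnv1a_64(&input)
}
```

Because the derivation is a pure function of its inputs, replaying the same parent seed and branch index always yields the same child seed, which is the property a reproducible timeline tree needs.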
Appendix
- Assertion Reference — Complete table of 15 assertion macros with behavior and parameters
- Crate Map — 8-crate workspace diagram and dependency hierarchy
- Configuration Reference — SimulationBuilder methods, ChaosConfiguration, AttritionConfiguration, exploration
- Fault Reference — Every fault by category with config fields and defaults
- Glossary — Alphabetical definitions: adaptive forking, always assertion, attrition, buggify, coverage bitmap, etc.