Of Faults and Favorites: Model Checking a Pagination Algorithm with Alloy

You can run me!

This article is a literate Alloy 6 model. Download of-faults-and-favorites.md and run alloy6 exec of-faults-and-favorites.md to verify the results yourself.

Preamble: My Background With Model Checkers

I first encountered property testing through PropEr in the Elixir world, and it felt like a superpower. Generate thousands of tests, exercise the bounds of your inputs, validate that properties hold. A lazy programmer's dream.

Model checking takes this further: rather than testing many random cases, you verify that invariants hold across the entire state space.

Back in 2022, I took a short TLA+ training course at the penultimate Strange Loop conference and have wanted to use model checking on a real problem ever since.

Though model checking would have helped many times, I always chose to ship features instead — the gap between knowing what model checkers do and writing an effective model was too wide for any single deadline.

Recently, I ran into a problem at work that seemed like a candidate for model checking. I began drawing all the possible states and realized I didn't trust myself to cover every possible scenario. That's when I remembered that model checking is perfect for this.

With Claude 4.6 Opus available, formal model checking for the algorithms that warrant it is much more tenable. I built this exploration with Claude. I brought the problem and the domain knowledge, the requirements and properties of the system, and used Claude to generate the Alloy syntax and work through the properties.

LLMs handle the Alloy syntax well. An Alloy expert could improve these specifications, but the output is accurate enough for this problem.

The rest of this article covers the problem, the algorithm, and two competing implementations - one with bugs, and one correct.


Table of Contents

What is Model Checking?

We verify software by testing it. Unit tests, integration tests, maybe property-based tests — pick some inputs, check that outputs match expectations. This works, but it has a fundamental limitation: we can only check the cases we think to write. If a pagination bug only surfaces with three events on the same day where one is favorited, we'd better hope someone writes that exact test.

Model checking inverts this. We describe the rules of our system and the properties we want to hold, and the model checker exhaustively searches for violations. If it finds one, it gives us a concrete counterexample. If it doesn't, we know the property holds across every scenario it explored.

We have two flavors:

TechniqueWhat it provesLimitation
Bounded model checkingProperty holds for all states reachable within N stepsOnly covers traces up to length N
Inductive proofProperty holds for all reachable states, periodRequires a strong enough invariant

The inductive approach is the interesting one. It works like mathematical induction: prove the property holds initially (base case), prove that if it holds in state S it still holds after any transition to S' (inductive step), and you've proven it for all reachable states regardless of trace length. No bounds, no caveats.

Why Alloy?

I chose Alloy 6 for this work. A few reasons:

  • Readability. Alloy's syntax is closer to set theory and predicate logic than TLA+'s ASCII notation. The specifications read almost like mathematical statements.
  • Temporal operators. Alloy 6 added always, eventually, after, and before operators for reasoning about state over time, which is exactly what pagination needs.
  • Counterexamples. When a check fails, Alloy gives you a concrete scenario you can step through: not just "your property is violated" but "here is exactly the state sequence that breaks it."
  • Literate markdown. Alloy can execute code blocks inside markdown files. This article is itself an executable specification. Let's start with the Alloy header:
module Pagination

open util/ordering[Day]
open util/ordering[Event] as eventOrd

A brief glossary of Alloy syntax used throughout this article:

KeywordMeaning
sigDeclares a type (like a class or struct)
one sigA singleton — exactly one instance exists
var sig X in YA mutable subset of Y that can change over time
funDefines a function that returns a value
predDefines a predicate (boolean constraint)
factA constraint that always holds (an axiom)
assert / checkA property to verify — Alloy searches for counterexamples
loneZero or one (like Option or nullable)
#Set cardinality (count of elements)
<=>If and only if (biconditional)
Model Checking in Rust

My first attempt at this was with Stateright in Rust. Stateright is a model checker, so it exhaustively explores the entire state space. With Stateright I could cover more of the state space than Alloy allows (up to 12 events rather than 5) in a reasonable amount of time. Alloy's inductive proofs are the decisive advantage: with some additional insight into the model structure, I can mathematically prove the algorithm correct rather than merely test it at scale.

The Problem

Personalization complicates everything!

We are looking to present a list of upcoming events with a very specific ordering. Events must be returned chronologically with the exception that events that a user has "favorited" sort before "non-favortied" events on the same calendar day.

EventDayFavorited
E10
E20
E31
E41
E53

The combined list would be sorted as such:

 E1*, E2, E4*, E3, E5, ...

When you have the entire dataset of events and favorites in memory, this operation is trivial. In fact, it would be equally simple if you could ensure that you have the complete data for any given day. In our situation though, the events come to us paginated from two different sources and the pages don't line up with day boundaries. This is where the "fun" begins and why I'm doing this deep dive.

We are provided with:

A list of the upcoming favorited events sorted chronologically

-and-

A list of all events, including the favorites, sorted chronologically - but the favorites are not identified in this list (AllEvents below)

-or-

A list of all events with the favorites filtered out, sorted chronologically (ExcludeFavorites below)

The question: how do we construct the combined list when the combined list is fetched in pages and the sources are paginated?

The Domain Model

Events occur on a specific day. We use a global event ordering (via util/ordering[Event]) that is consistent with day ordering: all events on an earlier day precede all events on a later day. This means Event$0 is always the chronologically earliest event, Event$1 is next, and so on — making counterexample traces easy to read.

sig Day {}

sig Event {
    day: one Day
}

fact EventOrdering {
    all disj e1, e2: Event | dayBefore[e1.day, e2.day] implies eventOrd/lt[e1, e2]
}

Some events are favorited. We require at least 3 events and at least one favorite to keep scenarios interesting.

sig Favorited in Event {}

fact InterestingScenarios {
    #Event >= 3
    some Favorited
}

Assumptions

A few simplifications:

  • Tokens are cursors. Each source list has an integer cursor (token) tracking position. none means end-of-list.
  • No concurrent mutations. The event data doesn't change while we're paginating.
  • Chronological ordering. Events are totally ordered: first by day, then by position within the same day. The global event ordering is constrained to be consistent with day ordering.
one sig Config {
    pageSize: one Int
}

fact PageSizeBounds {
    Config.pageSize >= 1
}

Helpers

These predicates and functions define ordering rules and shared infrastructure for both implementations.

Day and Event Ordering

pred dayBefore[earlierDay, laterDay: Day] {
    lt[earlierDay, laterDay]
}

pred chronoBefore[earlierEvent, laterEvent: Event] {
    eventOrd/lt[earlierEvent, laterEvent]
}

Merge Ordering

The desired ordering for the final output is: earlier days first, favorites before non-favorites within the same day, and chronological within the same day and favorite status.

The pagination algorithm doesn't have global knowledge of which events are favorited. It only knows an event is favorited if it appeared in the favorites list candidates for the current page. Events sourced from the standard list are treated as non-favorited for merge ordering, even if they are globally favorited.

mergeOrderBefore captures this: it takes a knownFavorites parameter — the set of events sourced from the favorites list — and uses that instead of global Favorited membership.

pred mergeOrderBefore[firstEvent, secondEvent: Event, knownFavorites: set Event] {
    dayBefore[firstEvent.day, secondEvent.day]
    or (firstEvent.day = secondEvent.day and firstEvent in knownFavorites and secondEvent not in knownFavorites)
    or (firstEvent.day = secondEvent.day and
        ((firstEvent in knownFavorites) <=> (secondEvent in knownFavorites)) and
        eventOrd/lt[firstEvent, secondEvent])
}

The Favorited List

Both implementations share the same favorited list: a mapping from position to event, built by counting how many other favorites precede each one chronologically.

fun favList: Int -> Event {
    { position: Int, currentEvent: Favorited |
        position = #{otherEvent: Favorited | chronoBefore[otherEvent, currentEvent]} and
        position >= 0 and position < #Favorited
    }
}

State

The algorithm maintains mutable state: which events have been emitted so far, what was emitted on the last page (for detecting duplicates), and cursor positions into each list.

var sig PriorEmitted in Event {}
var sig LastPageOutput in Event {}

one sig Tokens {
    var stdTok: lone Int,
    var favTok: lone Int
}

fact TokenBounds {
    always (
        (some Tokens.stdTok implies Tokens.stdTok >= 0) and
        (some Tokens.favTok implies Tokens.favTok >= 0)
    )
}

Token Advancement

After emitting a page, we advance each token past contiguous emitted events. "Contiguous" is doing real work here: the algorithm walks forward through emitted positions and stops at the first gap. If position 0 is emitted, position 1 is not, and position 2 is emitted, the token stops at 1.

fun advanceToken[sourceList: Int -> Event, token: lone Int, emittedEvents: set Event]: lone Int {
    no token => none
    else
    let gap = min[{position: Int | position >= token and some sourceList[position] and sourceList[position] not in emittedEvents}] |
    some gap => gap
            else none
}

In plain terms: starting from the current token position, find the first un-emitted event in the list. If there is one, that becomes the new token. If every remaining event has been emitted, the token becomes none (end of list).

Stutter

A stutter transition lets the system idle without loading a new page, standard in temporal models to avoid forcing progress at every step. We clear LastPageOutput since no page was loaded.

pred stutter[] {
    PriorEmitted' = PriorEmitted
    no LastPageOutput'
    Tokens.stdTok' = Tokens.stdTok
    Tokens.favTok' = Tokens.favTok
}

Observable Properties

These describe the output behavior we care about, independent of implementation internals.

No Duplicates

No event should appear on more than one page. Since PriorEmitted is a set, it can't contain duplicates by construction — adding an event that's already present is a no-op. We detect duplicates instead by tracking LastPageOutput separately and using the temporal before operator to check whether an event in the current page was already in PriorEmitted in the previous state. This check appears in the DuplicatesOccur assertions rather than as a predicate here.

Duplicate prevention in the inductive proofs comes from the structural invariants (tokens past emitted + contiguity), which prevent loadMore from selecting already-emitted events in the first place.

Ordering

Within each day, favorites should come before non-favorites. Within the same favorite status, events should be in chronological order.

pred favoritesFirstWithinDay[] {
    all favoritedEvent: Favorited, nonFavoritedEvent: Event - Favorited |
        favoritedEvent.day = nonFavoritedEvent.day and nonFavoritedEvent in PriorEmitted
        implies favoritedEvent in PriorEmitted
}

pred favoritesChronological[] {
    all earlierEvent, laterEvent: Favorited |
        earlierEvent.day = laterEvent.day and laterEvent in PriorEmitted and eventOrd/lt[earlierEvent, laterEvent]
        implies earlierEvent in PriorEmitted
}

pred nonFavoritesChronological[] {
    all earlierEvent, laterEvent: Event - Favorited |
        earlierEvent.day = laterEvent.day and laterEvent in PriorEmitted and eventOrd/lt[earlierEvent, laterEvent]
        implies earlierEvent in PriorEmitted
}

pred orderingInvariant[] {
    favoritesFirstWithinDay[] and
    favoritesChronological[] and
    nonFavoritesChronological[]
}

Inductive Properties

The property we care about (no duplicates) is not inductive by itself. Knowing there are no duplicates in state S doesn't give us enough to prove there won't be duplicates after a transition to S'. We need more context about internal state.

To make the proof work, we strengthen the invariant with two auxiliary properties:

  1. Tokens point past emitted events. If a token is at position P, every event before P has been emitted. When a token is exhausted (none), all events in that list have been emitted. The exhausted case matters: without it, the invariant is vacuously true for none tokens, and the solver can construct unreachable states where a list appears exhausted but events remain unemitted.
  2. Emitted events are contiguous. Within each source list, emitted events form a contiguous prefix with no gaps.

These are internal to the algorithm, not things we'd show a user. But contiguity guarantees the token advancement logic can't re-fetch an already-emitted event, which is what prevents duplicates.

To prove ordering (favorites before non-favorites within each day), we further strengthen the invariant by including orderingInvariant. The token and contiguity properties provide enough context for ordering to be inductive: when a non-favorite on day D is emitted, all favorites on day D are either already emitted (positions before the token) or in the current candidate set (where merge ordering places them first).

We use the exact same strengthened invariant for both implementations. The only difference is which standard list we plug in. One maintains the invariant. The other can't.

pred tokensPointPastEmittedGeneric[stdList: Int -> Event] {
    some Tokens.stdTok implies
        all position: Int | position >= 0 and position < Tokens.stdTok implies
            (some stdList[position] implies stdList[position] in PriorEmitted)

    no Tokens.stdTok implies
        all position: Int | position >= 0 implies
            (some stdList[position] implies stdList[position] in PriorEmitted)

    some Tokens.favTok implies
        all position: Int | position >= 0 and position < Tokens.favTok implies
            (some favList[position] implies favList[position] in PriorEmitted)

    no Tokens.favTok implies
        all position: Int | position >= 0 implies
            (some favList[position] implies favList[position] in PriorEmitted)
}

pred emittedAreContiguousGeneric[stdList: Int -> Event] {
    all earlierPosition, laterPosition: Int |
        earlierPosition < laterPosition and
        some stdList[laterPosition] and
        stdList[laterPosition] in PriorEmitted
        implies
        (some stdList[earlierPosition] implies stdList[earlierPosition] in PriorEmitted)

    all earlierPosition, laterPosition: Int |
        earlierPosition < laterPosition and
        some favList[laterPosition] and
        favList[laterPosition] in PriorEmitted
        implies
        (some favList[earlierPosition] implies favList[earlierPosition] in PriorEmitted)
}

pred inductiveInvariantGeneric[stdList: Int -> Event] {
    tokensPointPastEmittedGeneric[stdList] and
    emittedAreContiguousGeneric[stdList]
}

pred inductiveInvariantWithOrderingGeneric[stdList: Int -> Event] {
    inductiveInvariantGeneric[stdList] and
    orderingInvariant[]
}

Implementation A: AllEvents

The first approach includes all events in the standard list regardless of favorite status. Since the favorited list contains only favorites, favorites appear in both lists.

This was the approach we were hoping to take. It had one flaw that we knew about: Since the events are sorted favorites first, by day it was possible that favorite events might show favorites further into the future than non-favorites. For example:

Events Page 1Favorites Page 1
E1 (Day1)E1 (Day1)
E2 (Day1)E8 (Day3)
E3 (Day1)E9 (Day4)
E4 (Day2)
E5 (Day2)

The combined list would look like this:

┌──Day 1──┐┌Day 2┐┌D3┐ D4
 E1, E2, E3, E4, E5, E8, E9

What's missing are the events from the 2nd page of the "Event Page", E6 and E7 which occur after E5 and also on Day2. These would appear when the next page loads. This was a trade off we were comfortable with. But the proof found a more sinister issue...

We accepted this trade-off. The proof found a worse problem.

fun stdListAllEvents: Int -> Event {
    { position: Int, currentEvent: Event |
        position = #{otherEvent: Event | chronoBefore[otherEvent, currentEvent]} and
        position >= 0 and position < #Event
    }
}

fun pageOutputAllEvents[stdToken, favToken: lone Int]: set Event {
    let stdCandidates = (no stdToken) => none
                    else stdListAllEvents[{position: Int | position >= stdToken}],
        favCandidates = (no favToken) => none
                    else favList[{position: Int | position >= favToken}],
        candidates = stdCandidates + favCandidates |
    { event: candidates | #{otherEvent: candidates | mergeOrderBefore[otherEvent, event, favCandidates]} < Config.pageSize }
}

pred loadMoreAllEvents[] {
    let pageOutput = pageOutputAllEvents[Tokens.stdTok, Tokens.favTok] {
        some pageOutput
        PriorEmitted' = PriorEmitted + pageOutput
        LastPageOutput' = pageOutput
        Tokens.stdTok' = advanceToken[stdListAllEvents, Tokens.stdTok, pageOutput]
        Tokens.favTok' = advanceToken[favList, Tokens.favTok, pageOutput]
    }
}

pred initAllEvents[] {
    no PriorEmitted
    no LastPageOutput
    Tokens.stdTok = 0
    Tokens.favTok = 0
}

pred traceAllEvents[] {
    initAllEvents[]
    always (loadMoreAllEvents[] or stutter[])
}

pred inductiveInvariantAllEvents[] {
    inductiveInvariantGeneric[stdListAllEvents]
}

pred inductiveInvariantWithOrderingAllEvents[] {
    inductiveInvariantWithOrderingGeneric[stdListAllEvents]
}

The set comprehension on the last line of pageOutputAllEvents selects events from the candidate pool where fewer than pageSize other candidates precede them in merge order — effectively, the first pageSize events.

Why It Breaks

The overlap between the two lists creates gaps. Consider three events: E0 on Day0, E1 on Day1 (not favorited), and E2 on Day1 (favorited).

stdListAllEvents: [0→E0, 1→E1, 2→E2]
favList:          [0→E2]

On the first page, the algorithm emits E0 and E2 (E2 comes before E1 because favorites come first within a day). The token advancement for the standard list walks forward from position 0: position 0 (E0) is emitted, advance. Position 1 (E1) is not emitted. Stop. The standard token is now at position 1.

However, position 2 (E2) is emitted. There's a gap. On the next page load, the algorithm fetches candidates from position 1 onward: {E1, E2}. E2 gets emitted again. Duplicate.

The contiguity invariant catches this: after the first page, position 2 is emitted but position 1 is not. The invariant breaks at state 1, and the duplicate materializes at state 2.

Verification (AllEvents)

assert AllEvents_BaseCase {
    initAllEvents[] implies inductiveInvariantAllEvents[]
}

assert AllEvents_InductiveStep {
    always (
        (inductiveInvariantAllEvents[] and loadMoreAllEvents[]) implies after inductiveInvariantAllEvents[]
    )
}

assert AllEvents_Stutter {
    always (
        (inductiveInvariantAllEvents[] and stutter[]) implies after inductiveInvariantAllEvents[]
    )
}

assert AllEvents_DuplicatesOccur {
    traceAllEvents[] implies always (
        all e: LastPageOutput |
            not before (e in PriorEmitted)
    )
}

assert AllEvents_Ordering {
    traceAllEvents[] implies always orderingInvariant[]
}

assert AllEvents_Ordering_InductiveStep {
    always (
        (inductiveInvariantWithOrderingAllEvents[] and loadMoreAllEvents[]) implies after inductiveInvariantWithOrderingAllEvents[]
    )
}

Implementation B: ExcludeFavorites

The second approach excludes favorited events from the standard list. Favorites appear only in the favorited list. The two lists are disjoint.

fun stdListExcludeFav: Int -> Event {
    { position: Int, currentEvent: Event |
        currentEvent not in Favorited and
        position = #{otherEvent: Event | otherEvent not in Favorited and chronoBefore[otherEvent, currentEvent]} and
        position >= 0 and position < #(Event - Favorited)
    }
}

fun pageOutputExcludeFav[stdToken, favToken: lone Int]: set Event {
    let stdCandidates = (no stdToken) => none
                    else stdListExcludeFav[{position: Int | position >= stdToken}],
        favCandidates = (no favToken) => none
                    else favList[{position: Int | position >= favToken}],
        candidates = stdCandidates + favCandidates |
    { event: candidates | #{otherEvent: candidates | mergeOrderBefore[otherEvent, event, favCandidates]} < Config.pageSize }
}

pred loadMoreExcludeFav[] {
    let pageOutput = pageOutputExcludeFav[Tokens.stdTok, Tokens.favTok] {
        some pageOutput
        PriorEmitted' = PriorEmitted + pageOutput
        LastPageOutput' = pageOutput
        Tokens.stdTok' = advanceToken[stdListExcludeFav, Tokens.stdTok, pageOutput]
        Tokens.favTok' = advanceToken[favList, Tokens.favTok, pageOutput]
    }
}

pred initExcludeFav[] {
    no PriorEmitted
    no LastPageOutput
    Tokens.stdTok = 0
    Tokens.favTok = 0
}

pred traceExcludeFav[] {
    initExcludeFav[]
    always (loadMoreExcludeFav[] or stutter[])
}

pred inductiveInvariantExcludeFav[] {
    inductiveInvariantGeneric[stdListExcludeFav]
}

pred inductiveInvariantWithOrderingExcludeFav[] {
    inductiveInvariantWithOrderingGeneric[stdListExcludeFav]
}

Why It Works

With disjoint lists, each event lives in exactly one source. When we emit a favorite from the favorited list, it doesn't exist in the standard list. No gap forms. Token advancement proceeds cleanly through each list, and no event can be fetched a second time from a different source.

The same invariant that AllEvents can't maintain, ExcludeFavorites maintains cleanly. Same proof structure, same auxiliary properties, different result.

Verification (ExcludeFavorites)

assert ExcludeFav_BaseCase {
    initExcludeFav[] implies inductiveInvariantExcludeFav[]
}

assert ExcludeFav_InductiveStep {
    always (
        (inductiveInvariantExcludeFav[] and loadMoreExcludeFav[]) implies after inductiveInvariantExcludeFav[]
    )
}

assert ExcludeFav_Stutter {
    always (
        (inductiveInvariantExcludeFav[] and stutter[]) implies after inductiveInvariantExcludeFav[]
    )
}

assert ExcludeFav_DuplicatesOccur {
    traceExcludeFav[] implies always (
        all e: LastPageOutput |
            not before (e in PriorEmitted)
    )
}

assert ExcludeFav_Ordering_BaseCase {
    initExcludeFav[] implies inductiveInvariantWithOrderingExcludeFav[]
}

assert ExcludeFav_Ordering_InductiveStep {
    always (
        (inductiveInvariantWithOrderingExcludeFav[] and loadMoreExcludeFav[]) implies after inductiveInvariantWithOrderingExcludeFav[]
    )
}

assert ExcludeFav_Ordering_Stutter {
    always (
        (inductiveInvariantWithOrderingExcludeFav[] and stutter[]) implies after inductiveInvariantWithOrderingExcludeFav[]
    )
}

Results

// === AllEvents (expect failures) ===
check AllEvents_BaseCase for 6 Event, 3 Day, 6 Int, 1 steps
check AllEvents_InductiveStep for 6 Event, 3 Day, 6 Int, 2 steps
check AllEvents_Stutter for 6 Event, 3 Day, 6 Int, 2 steps
check AllEvents_DuplicatesOccur for 6 Event, 3 Day, 6 Int, 6 steps
check AllEvents_Ordering for 6 Event, 3 Day, 6 Int, 5 steps
check AllEvents_Ordering_InductiveStep for 6 Event, 3 Day, 6 Int, 2 steps

// === ExcludeFavorites (expect all pass) ===
check ExcludeFav_BaseCase for 6 Event, 3 Day, 6 Int, 1 steps
check ExcludeFav_InductiveStep for 6 Event, 3 Day, 6 Int, 2 steps
check ExcludeFav_Stutter for 6 Event, 3 Day, 6 Int, 2 steps
check ExcludeFav_DuplicatesOccur for 6 Event, 3 Day, 6 Int, 6 steps
check ExcludeFav_Ordering_BaseCase for 6 Event, 3 Day, 6 Int, 1 steps
check ExcludeFav_Ordering_InductiveStep for 6 Event, 3 Day, 6 Int, 2 steps
check ExcludeFav_Ordering_Stutter for 6 Event, 3 Day, 6 Int, 2 steps

In Alloy, check asks whether a counterexample to an assertion exists within the given scope. UNSAT (unsatisfiable) means no counterexample was found — the assertion holds. SAT (satisfiable) means a counterexample exists — the assertion is violated. This is counterintuitive: SAT is the bad result.

Running alloy6 exec of-faults-and-favorites.md produces:

00. check AllEvents_BaseCase           UNSAT
01. check AllEvents_InductiveStep      SAT    ← invariant breaks
02. check AllEvents_Stutter            UNSAT
03. check AllEvents_DuplicatesOccur    SAT    ← duplicates found
04. check AllEvents_Ordering                   UNSAT
05. check AllEvents_Ordering_InductiveStep     SAT    ← ordering breaks too
06. check ExcludeFav_BaseCase                   UNSAT
07. check ExcludeFav_InductiveStep              UNSAT  ← invariant holds
08. check ExcludeFav_Stutter                    UNSAT
09. check ExcludeFav_DuplicatesOccur            UNSAT  ← no duplicates
10. check ExcludeFav_Ordering_BaseCase          UNSAT
11. check ExcludeFav_Ordering_InductiveStep     UNSAT  ← ordering proved
12. check ExcludeFav_Ordering_Stutter           UNSAT

The three SAT results for AllEvents are the key findings:

  • AllEvents_InductiveStep: SAT means Alloy found a state where the invariant holds, but after one loadMore transition it breaks. The strengthened invariant cannot be maintained when the lists overlap. This is the inductive proof that the design is broken at any scale.
  • AllEvents_DuplicatesOccur: SAT means Alloy found a concrete trace where the same event appears in LastPageOutput when it was already in PriorEmitted. This is the bounded proof that duplicates actually happen.
  • AllEvents_Ordering_InductiveStep: SAT means the ordering invariant also cannot be maintained inductively. The overlap between lists allows the standard list to emit events on a day before the favorites list has delivered all favorites for that day.

Every ExcludeFavorites check returns UNSAT. The bounded checks (like DuplicatesOccur) confirm no violations exist within the given scope of 6 events and 3 days. The inductive checks go further: the base case plus inductive step together prove the properties hold in all reachable states, for any event count, day count, page size, or trace length. Induction doesn't depend on scope — if the invariant is preserved across one arbitrary transition, it's preserved across all of them.

Conclusion

Testing tells you "we checked these cases and they passed." Model checking tells you "this property holds for every reachable state." And when a design is broken, you don't get a bug report — you get proof that the design can't satisfy the property at any scale.

The practical barrier has always been the learning curve. TLA+ and Alloy require a different kind of thinking and an understanding of some unfamiliar syntax. Leveraging LLMs allows us to ease into the syntax and concepts, while focusing on the logic.

Not every algorithm needs formal verification. But when subtle interactions between components can produce bugs that testing won't catch until production, a model checker is the right tool. The hard part is unchanged: you need to understand your problem well enough to confirm the checker is checking what you intend.