Contributing a ballpark item

A ballpark item is a single paper’s entry point into the Econ-ARK ecosystem: enough structure, context, and formalization that an ambitious graduate student can progress the paper from “interesting” through “formal recursive model” toward a REMARK or DemARK candidate in one semester.

This file specifies what a ballpark item should contain and how to submit one. For background on the project, see ‘In the ballpark’ of the Econ-ARK ....

Before you start

Is the paper in scope? Ballpark items are papers that are either (a) serious structural models producing interesting quantitative results, or (b) strong empirical evidence that begs for a model. See ‘In the ballpark’ of the Econ-ARK ... for the two tracks (models/ vs empirical/).
Has the paper been cited enough to matter? The paper must have at least 3 citations in Google Scholar to be eligible as a ballpark candidate. This is a hard gate: it filters for papers whose ideas have begun to circulate in the literature, without excluding recent papers that have not yet accumulated many citations. Paste the Google Scholar citation count (as of submission date) into the submission PR description.
Is it already here? Check models/We-Would-Like-In-Econ-ARK/ for an existing subdirectory under the paper’s citekey. If one exists, open a PR improving it rather than creating a parallel entry.
Is it listed but not yet claimed? If there is a subdirectory but it is thin (legacy “slideware” — one notebook of markdown + figures), your contribution can be to refactor it to the canonical structure below.
None of the above? Open a Wanted Ballpark Paper issue using the provided form (citation, DOI, Google Scholar citation count, track, topic, 3-sentence pitch). We will confirm before you invest effort.

Three layers of a ballpark item

A canonical ballpark item has three layers. The exposition layer is required; the formalization layer turns a summary into a modular-DP scaffold (this is the output of a course-project workflow); the asset layer holds source material for human and AI re-reading.

1. Exposition layer — required

Four notebooks assembled by one <citekey>.md, plus a README.md orientation file. Name the notebooks (and the assembler markdown file) with the paper’s citekey prefix, e.g. benhabib2019_intro.ipynb and benhabib2019.md.

| File | Content |
| --- | --- |
| `README.md` | A short orientation file. Written in GitHub-flavored markdown (not MyST) so GitHub auto-renders it inline below the directory listing for anyone landing on the GitHub view of the entry. Use Unicode for math (e.g. `Π_r`, `g(·;τ,r)`, `Π_τ ⊗ Π_r`, `c ≤ a`) rather than $...$. Provides: (1) a prominent callout link to the rendered MyST page (e.g. `> 📖 Rendered version: [econ-ark.github.io/ballpark/<citekey>/](…)`) at the very top, so a directory-view reader can immediately jump to the proper rendered entry; (2) a brief reading guide for any technical artifacts in the directory: for entries with the Formalization layer, name the conventions used (dolo-plus YAML format, the Matsya iteration loop that produced the YAMLs, inline `# unresolved:` flags), the excerpt-plus-YAML pairing pattern, and where the construction audit trail lives; (3) a one-line index of the main files in the directory. Keep it short: orientation only, not duplication. |
| `<citekey>.md` | MyST page with `{include}` directives for the four notebooks below (in order), plus YAML frontmatter giving the rendered title and metadata. Name the file with the paper’s citekey, not `index.md`: mystmd derives the rendered URL slug from the filename, so a citekey-named file gives a clean URL like `/<citekey>/` (e.g. `/benhabib-et-al-2019/`), whereas multiple entries all named `index.md` would collide and yield ugly slugs (`/index-1/`, `/index-2/`, …). |
| `<citekey>_intro.ipynb` | Full citation with DOI link. Original ballpark author (name + date). Updated by (latest + date). 3-sentence pitch: why the paper is in-ballpark for Econ-ARK. |
| `<citekey>_prior-literature.ipynb` | Where the paper sits in the foundational literature (Bewley / Huggett / Aiyagari / de Nardi / ...). Use `{cite:t}` citations rendered from `references.bib`. |
| `<citekey>_summary.ipynb` | Non-technical motivation + findings overview (required at Draft), extended at Primer promotion with a “The Model” section stating the recursive formulation explicitly: no `u(c)` placeholders, explicit CRRA or EZ kernel, explicit bequest function, explicit transitions, explicit shock distributions, explicit constraint set. This “The Model” section is what the formalization layer will build on. |
| `<citekey>_subsequent-literature.ipynb` | Research directions that followed the paper. Cite from `subsequent-literature.bib`. |
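The naming rule in the table above can be sketched in a few lines. This is only an illustration: the helper name `exposition_files` is hypothetical and not part of any Econ-ARK tooling.

```python
# Suffixes of the four exposition notebooks, in the order the table lists them.
EXPOSITION_SUFFIXES = ("intro", "prior-literature", "summary", "subsequent-literature")


def exposition_files(citekey: str) -> list[str]:
    """Return the file names the exposition layer expects for one citekey."""
    notebooks = [f"{citekey}_{suffix}.ipynb" for suffix in EXPOSITION_SUFFIXES]
    # README.md and the assembler page come first, then the notebooks.
    return ["README.md", f"{citekey}.md", *notebooks]


for name in exposition_files("benhabib2019"):
    print(name)
```

Comparing this list against the directory contents is a quick way to spot a notebook that still carries an older naming convention.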

Audience and voice for _intro.ipynb and _summary.ipynb. Write for a reader (PhD student or faculty researcher) who is deciding whether to invest 90 minutes in the paper itself. Two self-checks before submitting: (1) Does the first screen of _summary.ipynb give the paper’s headline quantitative result, in numbers rather than adjectives? (2) Does it explain what makes the result credible, meaning the identification strategy or theoretical mechanism and not just the methodology label? If the reader has to scroll past several paragraphs of prose to find either answer, condense.

The Benhabib_et_al_2019 item is the reference instance of this layer.

2. Formalization layer — stretch (recommended for coursework)

Four files that take the explicit recursive formulation from _summary.ipynb and lift it toward a dolo-plus stage.

| File | Content |
| --- | --- |
| `bellman-excerpt.md` | A standalone modular-DDSL Bellman statement produced by iterating with Matsya (see “The formalization iteration” below). Contains: a comprehensive symbol table; a timing convention (numbered steps within one period); a decomposition of the problem into periods, stages, and perches (arrival $\prec$ / decision ∘ / continuation $\succ$); a table listing state, control, shock, constraint, payoff at their native perches; the stage operator $\mathbb{T} = \mathbb{I} \circ \mathbb{B}$; explicit utility/bequest forms; and, once the iteration has matured, a “Stage composition” subsection and an EGM channel discussion where applicable (aligned to SolvingMicroDSOPs §§12–13). |
| `dolo-plus-draft.yaml` | A minimal one-stage YAML (interior period only is acceptable). Any features that do not map cleanly onto canonical dolo-plus syntax must be flagged inline with a `# workaround:` or `# unresolved:` comment rather than silently fudged. |
| `verification.md` | One paragraph stating what was accepted, edited, or rejected from Matsya’s output, and why, verified against the published paper and not only the ballpark summary. |
| `matsya-session.txt` | A single line: the `--session` string used on every Matsya call for this item (e.g. `topics2026-<citekey>`). Staff can inspect the server-side conversation by session name; you do not paste the transcript. |

The three workaround categories you are most likely to hit are mechanical (non-optimized) deductions, $\hat{\Gamma}^{1-\gamma}$-style value-function scaling inside expectations, and state-contingent shock distributions. Flag them; do not hide them.
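Because the flags are plain inline comments, they can be collected without parsing the YAML at all. The sketch below assumes only the two comment prefixes named above; the function name `find_flags` and the sample field names are hypothetical and do not represent real dolo-plus syntax.

```python
import re

# Matches the two inline flag conventions: "# workaround: ..." and "# unresolved: ...".
FLAG_PATTERN = re.compile(r"#\s*(workaround|unresolved):\s*(.*)")


def find_flags(yaml_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, kind, note) for each flagged line of a YAML draft."""
    flags = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        match = FLAG_PATTERN.search(line)
        if match:
            flags.append((number, match.group(1), match.group(2).strip()))
    return flags


# Hypothetical field names, for illustration only (not dolo-plus syntax).
sample = """\
name: interior-stage
shock: income                   # unresolved: distribution depends on the state
discount: beta                  # workaround: growth scaling folded into beta
"""

for flag in find_flags(sample):
    print(flag)
```

A list like this is also a convenient starting point for the “Known model features requiring attention” section of AGENTS.md.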

The formalization iteration (the meat of the work)

The single hardest — and most valuable — part of the formalization layer is arriving at a decomposition of the paper’s problem into periods, stages, and perches. This is not a one-shot translation; it is a loop between a general-purpose AI (Claude, Cursor) and Matsya, the DDSL-aware evaluator. bellman-excerpt.md is the single evolving artifact that this loop produces — not two separate files for “before Matsya” and “after Matsya.”

The iteration runs as follows:

1. Extract. Ask an AI (Claude Opus recommended) to read the paper (<citekey>.mmd preferred over .pdf) and the recursive formulation in _summary.ipynb, and to draft bellman-excerpt.md as a modular-DDSL Bellman statement: symbol table, timing, a candidate period/stage/perch decomposition, transitions, and movers.
2. Evaluate. Feed bellman-excerpt.md to Matsya (same --session name each time; record it in matsya-session.txt) and ask it to identify missing or under-specified elements of the decomposition: symbols that appear without a row in the symbol table, perches that are referenced but not defined, transitions that are unlabeled, movers that collapse silently rather than being explicitly labeled as identities, constraint sets that are not carried through, parameters whose domain is unstated, etc.
3. Improve. Take Matsya’s critique back to the AI and ask it to revise bellman-excerpt.md to address each flagged gap, either by filling it in from the paper or by marking it as genuinely absent from the paper (a signal that the paper does not pin this down and the formalizer will have to make an explicit modeling choice).
4. Repeat steps 2–3 until one of two terminating conditions is reached:
   - Success: Matsya reports no further missing elements, the symbol table is closed under reference, every perch and transition is either defined or explicitly labeled degenerate/identity, and the decomposition is coherent. The item is ready for the dolo-plus-draft YAML.
   - Failure (informative): It becomes clear that the paper itself does not specify its problem clearly enough to admit a periods/stages/perches decomposition: for example, the terminal-period budget is undefined, the timing of shock realization is ambiguous, or the constraint set changes silently across sections. At this point, stop iterating and record the blocking ambiguities in verification.md under a “Paper under-specifies” heading. The item remains at Primer tier; it cannot reach Formalized until the ambiguities are resolved (by the formalizer making explicit modeling choices, by a companion note reconciling the paper’s inconsistencies, or by correspondence with the authors).

The iteration is where the economics lives. bellman-excerpt.md on its own is a file; the value is in the judgments made during the loop — which of Matsya’s gaps are real, which are artifacts of the evaluator’s priors, which expose genuine ambiguities in the paper. Those judgments are what verification.md records.
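The control flow of the loop can be sketched as follows. Everything here is an assumption made for illustration: `evaluate` stands in for a Matsya call, `improve` for the general-purpose AI, and `max_rounds` is an arbitrary cap. In practice the decision to stop in informative failure is a human judgment, not a counter.

```python
from typing import Callable


def formalization_loop(
    draft: str,
    evaluate: Callable[[str], list[str]],      # hypothetical stand-in for Matsya
    improve: Callable[[str, list[str]], str],  # hypothetical stand-in for the AI
    max_rounds: int = 10,                      # arbitrary cap, an assumption
) -> tuple[str, list[str]]:
    """Alternate Evaluate and Improve until no gaps remain or rounds run out.

    Returns the final draft and the gaps still open (empty on success).
    """
    gaps = evaluate(draft)
    for _ in range(max_rounds):
        if not gaps:
            break
        draft = improve(draft, gaps)
        gaps = evaluate(draft)
    return draft, gaps


# Toy stand-ins so the sketch runs: a "gap" is a required phrase missing from the text.
REQUIRED = ["symbol table", "timing", "perches"]


def toy_evaluate(text: str) -> list[str]:
    return [item for item in REQUIRED if item not in text]


def toy_improve(text: str, gaps: list[str]) -> str:
    return text + "\n" + gaps[0]  # address one flagged gap per round


final, open_gaps = formalization_loop("symbol table", toy_evaluate, toy_improve)
print(open_gaps)  # an empty list corresponds to the Success condition
```

Gaps that remain open when the loop stops are the candidates for the “Paper under-specifies” heading in verification.md.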

3. Asset layer — required in part

| File | Required? | Content |
| --- | --- | --- |
| `<citekey>.pdf` | required | The paper. If the license forbids redistribution, replace with a DOI-only pointer in `_intro.ipynb`. |
| `<citekey>.mmd` | local-only (gitignored, do not commit) | Markdown conversion of the paper PDF for AI ingestion (Cursor, Claude, Matsya). Produce locally via Mathpix (better for math-heavy papers) or `pandoc <citekey>.pdf -o <citekey>.mmd`. `*.mmd` is gitignored at the repo root because the markdown is a derivative work of the publisher’s PDF and inherits its copyright. Each contributor maintains their own local copy. |
| `references.bib` | required | Bib entries cited from `_prior-literature.ipynb` and `_summary.ipynb`. A superset is acceptable: uncited entries (e.g., a broader reading list the contributor maintains) do not need to be pruned. MyST renders only cited entries in the published bibliography. |
| `self.bib` | recommended | The paper’s own bib entry. Keeps the paper citation separable from its context. |
| `subsequent-literature.bib` | required if the notebook is non-empty | Bib entries cited from `_subsequent-literature.ipynb`. |
| Figures / tables (e.g. `fig1.png`, `Table2.png`) | as needed | Use paper’s own labels where possible. |

4. REMARK-ready extension — optional

If the formalization layer has stabilized and you have working code, add a replication/ subdirectory with reproduce.sh, CITATION.cff, binder/environment.yml, and a validated (not draft) dolo-plus stage. At that point you are eligible to move the item to REMARK or DemARK per the criteria in those repos.

Machine-readable metadata (for AI indexing)

Ballpark entries are designed to be discovered and cited by both humans and AI agents. The <citekey>.md frontmatter and an optional AGENTS.md provide the structured signals that make this work.

Required frontmatter fields on `<citekey>.md`

---
title: "<Paper title> — Ballpark Entry"
schema_type: ScholarlyArticle              # schema.org type; Dataset also acceptable
about:
  doi: 10.XXXX/YYYY                        # paper DOI
  authors: [LastName, LastName, LastName]
  year: 2019
  journal: American Economic Review
keywords: [kebab-case, tags]               # free-form topical tags
econ_ark_topic:                            # controlled vocabulary — pick from:
  - HA-macro                               #   HA-macro, lifecycle, wealth-distribution,
  - wealth-distribution                    #   monetary, fiscal-policy, optimal-taxation,
  - lifecycle                              #   housing, labor, business-cycles,
                                           #   computational-methods, open-economy,
                                           #   liquidity-trap, demographics,
                                           #   financial-crisis, inequality
jel: [D31, E21, J62]                       # JEL codes (array)
difficulty: stretch                        # good-first-ballpark | stretch | research-grade
tier: formalized                           # draft | primer | formalized — see "Ballpark tiers" below
has_formalization_layer: true              # true iff the formalization-layer files exist
ballpark_contributor:
  name: "<name>"
  orcid: "0000-0000-0000-0000"             # optional but strongly encouraged
updated_by:                                # one entry per material revision; most recent last
  - name: "<name>"
    orcid: "..."
    date: 2026-01-27
---

MyST renders this frontmatter as JSON-LD on the published page, which Google Scholar, LLM training pipelines, and retrieval agents recognize. The same frontmatter powers the browsable catalog’s filter UI (one source of truth).
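The controlled vocabularies in the frontmatter lend themselves to a mechanical check. The sketch below works on frontmatter that has already been parsed into a dictionary (the parsing step is left out). Treating every top-level key of the example as required is an assumption, and `validate_frontmatter` is a hypothetical helper, not the project's CI.

```python
REQUIRED_FIELDS = (
    "title", "schema_type", "about", "keywords", "econ_ark_topic", "jel",
    "difficulty", "tier", "has_formalization_layer",
    "ballpark_contributor", "updated_by",
)
TIERS = {"draft", "primer", "formalized"}
DIFFICULTIES = {"good-first-ballpark", "stretch", "research-grade"}
TOPICS = {
    "HA-macro", "lifecycle", "wealth-distribution", "monetary", "fiscal-policy",
    "optimal-taxation", "housing", "labor", "business-cycles",
    "computational-methods", "open-economy", "liquidity-trap", "demographics",
    "financial-crisis", "inequality",
}


def validate_frontmatter(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in meta]
    if "tier" in meta and meta["tier"] not in TIERS:
        problems.append(f"tier not in controlled set: {meta['tier']}")
    if "difficulty" in meta and meta["difficulty"] not in DIFFICULTIES:
        problems.append(f"difficulty not in controlled set: {meta['difficulty']}")
    for topic in meta.get("econ_ark_topic", []):
        if topic not in TOPICS:
            problems.append(f"unknown econ_ark_topic: {topic}")
    return problems


# Placeholder values mirroring the example frontmatter above.
entry = {
    "title": "<Paper title> — Ballpark Entry",
    "schema_type": "ScholarlyArticle",
    "about": {"doi": "10.XXXX/YYYY", "year": 2019},
    "keywords": ["kebab-case", "tags"],
    "econ_ark_topic": ["HA-macro", "wealth-distribution", "lifecycle"],
    "jel": ["D31", "E21", "J62"],
    "difficulty": "stretch",
    "tier": "formalized",
    "has_formalization_layer": True,
    "ballpark_contributor": {"name": "<name>"},
    "updated_by": [{"name": "<name>", "date": "2026-01-27"}],
}

print(validate_frontmatter(entry))                       # → []
print(validate_frontmatter({**entry, "tier": "final"}))  # → ['tier not in controlled set: final']
```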

Optional frontmatter extensions (recommended)

doi: 10.5281/zenodo.XXXXXXX                # Zenodo DOI for this ballpark entry itself
superseded_by: https://github.com/econ-ark/REMARK/...   # once promoted
requires: [CRRA, EGM, bequest-utility]     # model features — free-form tags

`AGENTS.md` (required for items with a formalization layer; recommended otherwise)

A short structured brief aimed at coding agents (Claude Code, Cursor, etc.) that a user’s local session will read when the directory is opened. Distinct from the human-readable <citekey>.md. See the Benhabib_et_al_2019 worked example.

Purpose:

Point agents at the right file to read first (the bellman-excerpt.md if present, otherwise the summary notebook — not the paper PDF).
Surface the Matsya session name so new calls continue the existing thread.
List known workarounds / unresolved features so agents don’t re-discover them.
Suggest common next tasks so agents proposing work have a grounded starting point.

How to produce your `AGENTS.md`

Copy the template below into AGENTS.md in your item directory and fill in the six sections. Every section has a grounded source in files you have already produced — you should not be inventing content.

| Section | Where its content comes from |
| --- | --- |
| Paper | `<citekey>_intro.ipynb`: citation, DOI, one-sentence pitch of why the paper is in-ballpark. Copy verbatim; this is the one place duplication with `<citekey>.md` is intentional, because the agent may open `AGENTS.md` first. |
| If a user asks to work on this item | `<citekey>_summary.ipynb` (section “The Model”) is the authoritative recursive statement. `<citekey>.mmd` is the AI-friendly paper source; it is locally-produced (gitignored), so an agent may need to produce one from the PDF if not already present. If your formalization layer is present, point at `bellman-excerpt.md` as “read first” instead of the summary notebook. |
| Formalization status | Tick which layer files you committed: `bellman-excerpt.md`, `dolo-plus-draft.yaml`, `verification.md`, `matsya-session.txt`. Be honest about what is not yet present. |
| Known model features requiring attention | Pull from `verification.md` (the items you rejected or edited) and from the inline `# workaround:` / `# unresolved:` comments in `dolo-plus-draft.yaml`. This is the single most useful section for an agent: it is the list of things it should not re-discover. If the formalization layer is absent, list the model features you already know will be awkward (state-contingent shocks, mechanical deductions, non-standard normalizations, etc.). |
| Common next tasks | List what you intentionally left undone. Examples from real items: “add terminal-period stage to YAML”, “formalize the dynasty wrapper”, “add age-varying wage overrides”. Cite the specific file or line a next-task should touch. Do not list tasks you would have liked to do but have no grounding for. |
| Workflow reminders | Mostly boilerplate. Keep the Matsya session-naming convention (`topics2026-<slug>` for coursework), the paper-verification reminder, and the workaround-comment convention. Delete anything that does not apply to your item. |

Template (copy and fill in):

# Ballpark entry: <Authors> (<Year>)

> Structured brief for coding agents (Claude Code, Cursor, etc.). Human-facing content lives in [`<citekey>.md`](<citekey>.md).

## Paper

- **Citation:** <Authors (Year), "Title," Journal vol(issue), pages>.
- **DOI:** [<doi>](https://doi.org/<doi>)
- **Core model:** <one-sentence description: lifecycle / HA / OLG / ..., key state and control, what's stochastic, what closes the problem>.
- **Why in-ballpark:** <one sentence: what makes this paper interesting for Econ-ARK>.

## If a user asks to work on this item

1. **Read first:** <file> — <why this is authoritative>.
2. **Paper source for AI ingestion:** `<citekey>.mmd` (locally-produced via Mathpix or pandoc; gitignored, not committed). Prefer this over `<citekey>.pdf` for AI ingestion; produce a local `.mmd` from the PDF if one isn't already present.

## Formalization status

- Explicit recursive formulation: <present in `_summary.ipynb` | not yet stated>.
- `bellman-excerpt.md`: <committed | not committed> (product of the Matsya iteration loop).
- `dolo-plus-draft.yaml`: <committed | not committed>.
- `verification.md`: <committed | not committed>.
- `matsya-session.txt`: <committed | not committed>.

## Known model features requiring attention in a formalization pass

- <feature 1>: <what's awkward and why; how you worked around it or plan to>.
- <feature 2>: ...
- <feature 3>: ...

## Common next tasks (grounded)

1. <task 1, tied to a specific file or section>.
2. <task 2>.
3. <task 3>.

## Workflow reminders

- **Matsya session:** use `topics2026-<slug>` for new work on this item.
- **Paper verification:** Matsya output must be checked against the paper PDF (or `.mmd`), not only against the ballpark `_summary.ipynb`.
- **When flagging workarounds in YAML:** use inline `# workaround:` or `# unresolved:` comments rather than silently fudging non-canonical syntax.

AI-assisted drafting (recommended). Once your formalization layer is present, ask a coding agent (Claude Code, Cursor) to draft AGENTS.md from your item’s files:

Read <citekey>.md, <citekey>_intro.ipynb, <citekey>_summary.ipynb, bellman-excerpt.md, dolo-plus-draft.yaml, and verification.md in this directory. Draft an AGENTS.md following the template in the repo-root CONTRIBUTING.md. Do not invent content — if a section lacks a grounded source in these files, write TBD for that section and explain what you would need.

Then review carefully. Agents occasionally invent plausible-sounding “next tasks” or “workarounds” that are not grounded in your verification notes. Rewrite anything you cannot trace to a specific file. The point of AGENTS.md is that a later agent can trust it; that trust is wasted if you let hallucinations through.
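One part of that review is mechanical: confirming that all six template sections survived the drafting pass. A minimal sketch follows, assuming the sections appear as `##` headings whose text begins with the template wording; `missing_sections` is a hypothetical helper.

```python
import re

# The six section headings of the template, matched as prefixes so that
# suffixes such as "(grounded)" are tolerated.
REQUIRED_SECTIONS = (
    "Paper",
    "If a user asks to work on this item",
    "Formalization status",
    "Known model features requiring attention",
    "Common next tasks",
    "Workflow reminders",
)


def missing_sections(agents_md: str) -> list[str]:
    """Return the template sections that have no matching '##' heading."""
    headings = re.findall(r"^##[ \t]+(.+?)[ \t]*$", agents_md, flags=re.MULTILINE)
    return [
        section
        for section in REQUIRED_SECTIONS
        if not any(heading.startswith(section) for heading in headings)
    ]


example = """\
# Ballpark entry: <Authors> (<Year>)

## Paper

## If a user asks to work on this item

## Formalization status

## Known model features requiring attention in a formalization pass

## Common next tasks (grounded)
"""

print(missing_sections(example))  # → ['Workflow reminders']
```

A structural check like this says nothing about whether the content is grounded; that part still needs the manual trace described above.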

Repo-level artifacts (maintained centrally, not per item)

llms.txt at the repo root — a plain-text sitemap for LLMs following the llmstxt.org convention. Update this file when you add or rename an item.
items.json (auto-generated from frontmatter during the MyST build) — machine-readable catalog; one object per item with the full frontmatter flattened.
sitemap.xml and atom.xml — emitted by the MyST build.
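The text above does not spell out how the frontmatter is flattened for items.json. Purely as an illustration of one possible scheme, the sketch below joins nested keys with dots; the dotted-key convention and the `flatten` helper are assumptions, not a description of the actual build step.

```python
import json


def flatten(meta: dict, prefix: str = "") -> dict:
    """Flatten nested mappings into one level using dotted keys (assumed scheme)."""
    flat = {}
    for key, value in meta.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value  # lists and scalars are kept as they are
    return flat


frontmatter = {
    "tier": "primer",
    "about": {"doi": "10.XXXX/YYYY", "year": 2019},
    "jel": ["D31", "E21", "J62"],
}

print(json.dumps(flatten(frontmatter), indent=2))
```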

Content-form conventions for LLM legibility

Every committed .ipynb is also exported to .md at build time. Reviewers and LLMs read the .md; the .ipynb remains authoritative.
.mmd files are local-only (gitignored). A locally-produced Mathpix or pandoc conversion of the paper PDF is much easier for Cursor / Claude / Matsya to ingest, but *.mmd is gitignored at the repo root because the markdown is a derivative of the publisher’s PDF and inherits its copyright. Each contributor produces and maintains their own local copy.
Every equation carries an :alt: attribute describing it in prose, for models that can’t render LaTeX but can read HTML.
Every figure has alt-text (meeting WCAG and improving LLM indexability are the same action).

Model structure as first-class data (stretch)

For items with a committed dolo-plus-draft.yaml, a generated model.json extracts the stage(s) into a programmatic form. This lets retrieval agents answer structural queries like “find all ballpark items with an EGM-compatible interior stage” or “which items have Markov-chain employment states.” The extractor is maintained centrally; contributors do not hand-write model.json.

AI provenance (optional)

If AI tools materially shaped the formalization layer, add ai-provenance.md documenting which tools played which role and linking the session artifacts. This gives both credit and traceability.

What does not belong in a ballpark item

_build/ — gitignored.
Build-artifact directories named by UUID (e.g. LaTeX *.aux, *.out, *.synctex.gz trees) — gitignored.
*.slides.html — generated on demand; the source .ipynb is authoritative.
Duplicate summary notebooks from older naming conventions (e.g. both <citekey>_summary.ipynb and <ShortName>_summary.ipynb) — delete the duplicate at refactor time.
<citekey>.zip — the .pdf is enough; .zip is only appropriate if it contains replication code, in which case it belongs under replication/.
An item-level README.md duplicating the project README.md. The entry’s <citekey>.md is the entry point for site rendering, and the item-level README.md (per its row in the Exposition layer table) serves a different purpose: GitHub-directory orientation in GFM.

Authorship and provenance

The intro notebook carries provenance as visible section content, not buried frontmatter:

**Original ballpark author:** <name>, <YYYY-MM-DD>
**Updated by:** <name>, <YYYY-MM-DD>
**Superseded by:** <link to REMARK / DemARK if promoted>

When you revise an existing item, add (do not overwrite) an Updated by line. When an item is promoted to REMARK or DemARK, add a Superseded by pointer rather than deleting the ballpark entry — the ballpark retains historical interest.

Ballpark tiers

Ballpark items progress through three tiers of increasing formalization completeness — analogous in spirit to REMARK’s standard/published distinction but scoped entirely to pre-implementation work. The ballpark’s job is to land a well-specified model ready for a coder; the implementation step (working reproduce.sh, CITATION.cff, binder/environment.yml) happens in REMARK / DemARK, not here.

Each tier is a plateau with a concrete, reviewable qualifying checklist. Contributors can stop at any tier indefinitely.

| Tier | One-line characterization | Typical effort from the previous tier (AI-assisted, PhD-course-assignment units) |
| --- | --- | --- |
| Draft | Paper identified, claimed, and minimally cataloged. | ≈ 1 weekly assignment (from zero / from a `wanted-ballpark` issue). |
| Primer | A reader can understand the paper and its context without reading the paper. | ≤ 2 weekly assignments (from Draft). |
| Formalized | The model is stated in modular-DDSL form, with a dolo-plus YAML draft. | ≤ 2 weekly assignments (from Primer). |

Each name presupposes the tier below it: a primer is a completed introductory treatment of what a draft only sketches; a formalized specification is the rigorous re-expression of what the primer states informally. Rank order is unambiguous from the names alone.

(A pre-tier state, Wanted, is an open issue labeled wanted-ballpark with bibliographic info. It has no directory.)

Draft

“I am claiming this paper and committing to minimal cataloging.”

Qualifying checklist:

Item directory exists under models/We-Would-Like-In-Econ-ARK/<citekey>/ (or empirical/<citekey>/).
<citekey>.md with required frontmatter (including tier: draft). NOT named index.md — see Exposition layer above for the slug-collision rationale.
<citekey>_intro.ipynb with citation, DOI link, Original ballpark author + date, and a 3-sentence pitch of why the paper is in-ballpark for Econ-ARK.
<citekey>_summary.ipynb with a non-technical motivation + findings overview of the paper (a graduate student can tell from this notebook what the paper is about and what it finds, without having read the paper). The rigorous “The Model” section stating the explicit recursive formulation is not required at Draft — it is added at Primer promotion.
references.bib (may be empty at Draft).
Paper committed as <citekey>.pdf OR replaced by a DOI pointer with a license note in _intro.ipynb.

Draft is the minimum mergeable contribution. It converts a wanted-ballpark issue into a claimed directory.
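The file-existence part of the Draft checklist is easy to run locally before opening a PR. The sketch below assumes the item directory is named after the citekey and checks only the files that are unconditionally required; the paper PDF is left out because a DOI pointer may replace it. `missing_draft_files` and the citekey `smith2020` are hypothetical.

```python
import tempfile
from pathlib import Path


def missing_draft_files(item_dir: Path) -> list[str]:
    """Return the Draft-tier files absent from an item directory."""
    citekey = item_dir.name  # assumes the directory is named by citekey
    required = [
        f"{citekey}.md",
        f"{citekey}_intro.ipynb",
        f"{citekey}_summary.ipynb",
        "references.bib",
    ]
    return [name for name in required if not (item_dir / name).is_file()]


# Demonstration on a throwaway directory with a hypothetical citekey.
with tempfile.TemporaryDirectory() as root:
    item = Path(root) / "smith2020"
    item.mkdir()
    (item / "smith2020.md").write_text("---\ntier: draft\n---\n")
    (item / "references.bib").write_text("")  # may be empty at Draft
    missing = missing_draft_files(item)

print(missing)  # → ['smith2020_intro.ipynb', 'smith2020_summary.ipynb']
```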

Primer

“A graduate student can orient themselves around this paper without reading it.”

Qualifying checklist — everything in Draft, plus:

<citekey>_prior-literature.ipynb situating the paper in its foundational literature, with {cite:t} citations resolving from references.bib. Cite at least 3 and no more than 6 prior papers — enough to establish context, few enough that the notebook stays focused.
<citekey>_summary.ipynb extended (the notebook already exists from Draft, carrying the non-technical motivation + findings overview) with a “The Model” section stating the recursive formulation explicitly: no u(c) placeholders, explicit CRRA or EZ kernel, explicit bequest function (if any), explicit transitions, explicit shock distributions, explicit constraint set.
<citekey>_subsequent-literature.ipynb + subsequent-literature.bib. No hard citation count is required, since recent papers may have few subsequent citations; the notebook should cite whatever subsequent work exists (typically 0–6 papers) and note explicitly if the paper is too recent to have accumulated much. (Paper eligibility itself is gated by the Google-Scholar-≥3 rule in “Before you start.”)
self.bib with the paper’s own bib entry.
(Note: <citekey>.mmd — a locally-produced Mathpix or pandoc conversion of the paper PDF — makes AI ingestion much smoother for Cursor / Claude / Matsya, but *.mmd is gitignored and not part of this checklist. Each contributor maintains their own local copy.)
Root myst.yml updated to add the entry to its TOC (per-entry myst.yml is no longer required — the root config is authoritative for the site build); myst build --html at the repo root completes cleanly with the entry included.
<citekey>.md {include}s all four exposition notebooks in order.

Primer is the current aspirational target for the typical legacy-slideware refactor. Benhabib_et_al_2019 is the reference instance of this tier.
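The 3-to-6 rule for the prior-literature notebook counts unique citation keys, not citation occurrences. A minimal sketch, assuming the notebook's markdown text has already been extracted into a string and that citations use the `{cite:t}` role with backtick-quoted, comma-separated keys; the helper names and the bib keys in the sample are hypothetical.

```python
import re

# Matches {cite:t}`key` and {cite:t}`key1,key2`.
CITE_T = re.compile(r"\{cite:t\}`([^`]+)`")


def prior_literature_keys(text: str) -> set[str]:
    """Collect the unique citation keys used in {cite:t} roles."""
    keys = set()
    for group in CITE_T.findall(text):
        keys.update(key.strip() for key in group.split(",") if key.strip())
    return keys


def citation_count_ok(text: str, low: int = 3, high: int = 6) -> bool:
    """True when the number of unique keys lies in the inclusive range [low, high]."""
    return low <= len(prior_literature_keys(text)) <= high


# Hypothetical bib keys, for illustration only.
sample_text = (
    "The model builds on {cite:t}`bewley1977` and on "
    "{cite:t}`huggett1993,aiyagari1994`; see again {cite:t}`bewley1977`."
)

print(sorted(prior_literature_keys(sample_text)))
print(citation_count_ok(sample_text))
```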

Formalized

“The model has been translated into a modular-DP specification ready for a coder.”

Qualifying checklist — everything in Primer, plus:

bellman-excerpt.md: the product of the Matsya iteration loop described in “The formalization iteration” above. A standalone modular-DDSL Bellman statement containing a comprehensive symbol table, a timing convention, a periods / stages / perches decomposition, the stage operator, and (where the iteration converged rather than terminating in informative failure) a “Stage composition” subsection aligned to SolvingMicroDSOPs §§12–13 and an EGM-channel discussion where the utility is invertible. The two required components in detail:
- Symbol table. Lists every object that appears — or might appear — in the formalized statement of the model: states, controls, shocks, parameters, value functions, marginal-value functions, constraints, income maps, deterministic deductions, normalization factors, timing indices, type / family indices, and any other quantity referenced anywhere in the Bellman equation, transitions, or mover blocks. Each row gives the symbol, its role (state / control / shock / parameter / derived / …), its space or domain, and a one-line description. The intent is that a reader (human or agent) can read the symbol table alone and know what every subsequent symbol in the document means without hunting through prose.
- Perch decomposition. Required for every stage. Names the three perches (arrival $\prec$, decision ∘, continuation $\succ$) and at each perch lists the state variables carried, the value function, and (at the decision perch) the control. Then names the two within-stage transitions: $\mathrm{g}_{\prec\circ}$ (arrival-to-decision, resolving shocks + building the decision-perch state) and $\mathrm{g}_{\circ\succ}$ (decision-to-continuation, the savings or poststate identity). Then names the two movers: the backward mover $\mathbb{B}$ (continuation-to-decision, which performs the $\max$ over the control) and the forward / arrival mover $\mathbb{I}$ (decision-to-arrival, which integrates over next-period shocks). Finally states the stage operator $\mathbb{T} = \mathbb{I} \circ \mathbb{B}$.
  Stub or degenerate perches are fine and should be explicitly labeled as such. A perch can be degenerate in several natural ways: an arrival mover $\mathbb{I}$ can collapse to the identity when there are no within-period shocks (as in Benhabib et al. 2019); an arrival-to-decision transition can be degenerate when the decision-perch state is carried unchanged from the arrival perch; a continuation perch can be a stub when the stage has no intertemporal linkage (rare, but allowed). What is not acceptable is omitting a perch, a transition, or a mover from the decomposition because it happens to be trivial — a reviewer must be able to tell “this is an identity” from “this was forgotten.”
dolo-plus-draft.yaml — one-stage YAML (interior period sufficient); all unresolved features flagged with inline # workaround: or # unresolved: comments.
verification.md — one paragraph stating what was accepted / edited / rejected from Matsya’s output, compared against the published paper (not only the _summary.ipynb).
matsya-session.txt — the --session string used, if AI-assisted; or a file containing N/A — hand-written otherwise.
AGENTS.md — required at Formalized. See the section above for how to produce it.
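As a schematic reading of the perch decomposition required above, the two movers and the stage operator can be written out for a generic stage. This is a sketch under stated assumptions, not official DDSL notation: $x_\prec$, $x_\circ$ denote the arrival and decision states, $c$ the control, $\xi$ the shocks resolved by $\mathrm{g}_{\prec\circ}$, $r$ the within-stage payoff, and $\Gamma(x_\circ)$ the constraint set. No discount factor appears because any discounting is assumed to be absorbed into the continuation value $v_\succ$.

$$
\begin{aligned}
v_\circ(x_\circ) = (\mathbb{B} v_\succ)(x_\circ) &= \max_{c \in \Gamma(x_\circ)} \Bigl\{ r(x_\circ, c) + v_\succ\bigl(\mathrm{g}_{\circ\succ}(x_\circ, c)\bigr) \Bigr\}, \\
v_\prec(x_\prec) = (\mathbb{I} v_\circ)(x_\prec) &= \mathbb{E}_{\xi}\Bigl[ v_\circ\bigl(\mathrm{g}_{\prec\circ}(x_\prec, \xi)\bigr) \Bigr], \\
\mathbb{T} v_\succ &= \mathbb{I}\bigl(\mathbb{B} v_\succ\bigr) = v_\prec .
\end{aligned}
$$

In this notation, a stage with no shocks and an identity transition $\mathrm{g}_{\prec\circ}$ has $\mathbb{I}$ equal to the identity, so $v_\prec = v_\circ$. That is the kind of degenerate case the checklist asks contributors to label explicitly instead of omitting.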

Formalized is the ballpark’s top tier. A Formalized item is ready to be picked up by a coder (human or agent) and promoted to REMARK or DemARK — the implementation work happens there, not here.

Beyond Formalized: promotion out of the ballpark

Once a Formalized item has working code reproducing paper results, it is eligible for promotion to REMARK (for substantial replications) or DemARK (for demonstrations). REMARK itself has a tiering (standard vs. published-with-DOI); those criteria are documented at the REMARK repo and are not this repository’s concern.

When an item is promoted, add a Superseded by pointer in _intro.ipynb rather than deleting the ballpark entry — the entry retains historical and pedagogical interest.

Promotion mechanics within the ballpark

Each tier is a plateau; indefinite residence is fine.
A promotion PR adds the next tier’s files and updates tier: in the frontmatter.
PR title pattern: Promote <citekey> to Primer / Promote <citekey> to Formalized.
The PR body quotes the qualifying checklist for the target tier and ticks each box with a file-line citation.
Tier regression (e.g. Formalized → Primer) is allowed when an item’s formalization is found to be incorrect and is being withdrawn for revision; it should be rare and the PR must explain the defect.
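The title pattern and the tier ordering can be combined in a small helper. The sketch assumes a promotion PR moves an item exactly one tier up, which is one reading of “adds the next tier’s files”; regressions are rejected here because they need a PR that explains the defect. `promotion_title` is a hypothetical name.

```python
# Tiers in rank order, lowest first.
TIERS = ("draft", "primer", "formalized")


def promotion_title(citekey: str, current: str, target: str) -> str:
    """Build the PR title for a one-step promotion; reject anything else."""
    step = TIERS.index(target) - TIERS.index(current)  # ValueError on unknown tier
    if step != 1:
        raise ValueError(f"{current} -> {target} is not a one-step promotion")
    return f"Promote {citekey} to {target.capitalize()}"


print(promotion_title("benhabib2019", "draft", "primer"))
# → Promote benhabib2019 to Primer
```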

Review policy

Review requirements depend on the target tier.

Draft and Primer: self-serve. The contributor opens the PR, ticks the target tier’s qualifying checklist in the PR body with file-line citations, and merges once the automated checks (see below) pass. No designated reviewer is required at these tiers because the qualifying criteria are mechanically checkable.
Formalized: automated gate, then designated human reviewer. The contributor opens the PR the same way, but:
1. Automated rigorous check runs first. CI runs the full Formalized checklist as executable checks (file-existence, YAML validity, MyST build, bib resolution, AGENTS.md section structure, symbol-table presence, perch-decomposition keyword presence, etc. — see “Automated checks” below). The PR cannot be assigned to a human reviewer until CI passes.
2. A human reviewer from REVIEWERS.md (to be added; starts with the maintainer list) must then approve before merge. The reviewer’s job is specifically what CI cannot check: economic correctness of the Bellman equation, correctness of the perch decomposition, defensibility of the YAML workarounds, whether verification.md actually compares against the published paper (rather than merely claiming to), and quality of the model exposition.
Rationale: Formalized is the tier where content can be plausible-looking-but-wrong, and catching that needs a reviewer with DP background. Gating the human review behind CI ensures reviewer time is spent on judgment, not on finding missing files.

Automated checks (CI)¶

A .github/workflows/ballpark-check.yml action (forthcoming in a follow-up PR) will run per-tier checks and post a status on the PR. A contributor’s checklist tick is not sufficient at any tier — CI must also pass.

Per-tier mechanical gates the CI will enforce:

Draft: directory path correct; <citekey>.md frontmatter present with required fields and tier: value in controlled set; <citekey>_intro.ipynb exists and contains citation / DOI / author; <citekey>_summary.ipynb exists (non-technical motivation + findings overview — “The Model” heading is not yet required); references.bib exists; paper .pdf committed or DOI pointer present.
Primer (additive): the remaining exposition notebooks (_prior-literature.ipynb, _subsequent-literature.ipynb) exist; _summary.ipynb now contains a “The Model” heading; _prior-literature.ipynb resolves 3–6 unique {cite:t} references against bib files; self.bib and subsequent-literature.bib exist; root myst.yml includes the entry in its TOC, and myst build --html at the repo root succeeds with the entry built (per-entry myst.yml is no longer required); <citekey>.md {include}s all four notebooks; every {cite:t} resolves; every referenced figure exists. (<citekey>.mmd — gitignored, local-only — is no longer a Primer-tier CI check.)
Formalized (additive): bellman-excerpt.md, dolo-plus-draft.yaml, verification.md, matsya-session.txt, AGENTS.md all exist; dolo-plus-draft.yaml parses as YAML; bellman-excerpt.md contains a markdown table (heuristic: at least one pipe-delimited row with a Symbol column) and references all three perch names (arrival, decision, continuation); AGENTS.md contains the six required top-level sections (heading-based check).
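
As an illustration of the Primer citation gate listed above (3–6 unique {cite:t} references, each resolving against a bib file), here is a minimal Python sketch. It assumes the notebook's markdown text and the bib file contents have already been read into strings. The function names and the regex-based BibTeX parsing are assumptions, not the forthcoming CI implementation.

```python
import re

# MyST textual-citation role: {cite:t}`key` or {cite:t}`key1,key2`.
CITE_T = re.compile(r"\{cite:t\}`([^`]+)`")
# Entry keys in a BibTeX file, e.g. "@article{key," (a simple heuristic).
BIB_KEY = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def cited_keys(markdown_text: str) -> set:
    """Collect the unique keys used in {cite:t} roles."""
    keys = set()
    for group in CITE_T.findall(markdown_text):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def bib_keys(bib_text: str) -> set:
    """Collect the entry keys defined in a BibTeX string."""
    return set(BIB_KEY.findall(bib_text))


def check_prior_literature(markdown_text: str, bib_text: str, lo: int = 3, hi: int = 6) -> list:
    """Return a list of problems; an empty list means the gate passes."""
    cited = cited_keys(markdown_text)
    problems = []
    if not lo <= len(cited) <= hi:
        problems.append(f"expected {lo}-{hi} unique citations, found {len(cited)}")
    missing = sorted(cited - bib_keys(bib_text))
    if missing:
        problems.append("unresolved keys: " + ", ".join(missing))
    return problems
```

Returning a list of problems rather than a boolean lets a CI status message tell the contributor exactly which keys failed to resolve.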

What CI does not check at Formalized (and therefore what the human reviewer is responsible for): the Bellman equation being correct, the perch decomposition being correct, the YAML workarounds being defensible, and verification.md genuinely comparing against the published paper.
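
By contrast, the mechanical part of the Formalized gate on bellman-excerpt.md is easy to automate. The sketch below implements the two heuristics described in this section: a pipe-delimited row with a Symbol column, and a mention of all three perch names. It is a hedged illustration; the function names are hypothetical and the real workflow may apply the heuristics differently.

```python
import re

PERCHES = ("arrival", "decision", "continuation")


def has_symbol_table(text: str) -> bool:
    """True if some pipe-delimited row has a cell reading 'Symbol'."""
    for line in text.splitlines():
        if "|" not in line:
            continue
        cells = [c.strip().lower() for c in line.strip().strip("|").split("|")]
        if "symbol" in cells:
            return True
    return False


def missing_perches(text: str) -> list:
    """Return the perch names that never appear as whole words."""
    lowered = text.lower()
    return [p for p in PERCHES if not re.search(rf"\b{p}\b", lowered)]


def check_bellman_excerpt(text: str) -> list:
    """Return a list of problems; an empty list means the heuristics pass."""
    problems = []
    if not has_symbol_table(text):
        problems.append("no pipe-delimited row with a Symbol column")
    for perch in missing_perches(text):
        problems.append(f"perch not referenced: {perch}")
    return problems
```

A file can pass both heuristics and still state the wrong Bellman equation, which is why the human review step exists.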

Badges¶

Each item’s rendered page carries a tier badge (Draft / Primer / Formalized) at the top. Catalog cards show the badge so visitors can filter by tier (e.g. “show me all Primer items that need promotion to Formalized” — a natural call-to-contribute).

The badge derives from the tier: frontmatter field; the MyST build pipeline renders it automatically. Contributors do not hand-insert badge markdown.
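
To make the derivation concrete, the following Python sketch reads the tier: field from a frontmatter block and validates it against the three badge values. It assumes the frontmatter is a leading block delimited by `---` lines with tier: as a top-level key. The function names are hypothetical, and the actual MyST pipeline may work differently.

```python
ALLOWED_TIERS = {"Draft", "Primer", "Formalized"}


def read_tier(md_text: str):
    """Return the top-level tier: value from a leading '---' block, or None."""
    lines = md_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter
            break
        key, sep, value = line.partition(":")
        # rstrip only: an indented (nested) "tier" key is deliberately ignored.
        if sep and key.rstrip() == "tier":
            return value.strip().strip("'\"")
    return None


def badge_label(md_text: str) -> str:
    """Return the badge text, or raise if the tier is missing or not allowed."""
    tier = read_tier(md_text)
    if tier not in ALLOWED_TIERS:
        raise ValueError(f"tier must be one of {sorted(ALLOWED_TIERS)}, got {tier!r}")
    return tier
```

Only the frontmatter block is scanned, so a stray `tier:` line in the body of the page does not affect the badge.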

Effort calibration (for contributors and instructors)¶

Effort is expressed in PhD-course-assignment units, assuming an AI-assisted workflow (Cursor + Claude + Matsya). These estimates are generous upper bounds:

Step	Upper bound
→ Draft	1 weekly assignment
Draft → Primer	≤ 2 weekly assignments
Primer → Formalized	≤ 2 weekly assignments
Total from zero to Formalized	≤ 5 weekly assignments

These estimates guide course-project scoping: a full semester leaves ample room for a student to take a paper all the way to Formalized and start on the replication step (which then belongs in REMARK, not here).

Pre-merge checklist¶

The target tier determines the checklist. Copy the target tier’s qualifying checklist from the section above into your PR body and tick each box with a file-line citation. In addition, every PR (regardless of tier) must confirm:

<citekey>.md {include}s exactly the four exposition notebooks, in order (Primer and above).
Root myst.yml builds the full site without errors (myst build --html at the repo root, with BASE_URL=/ballpark for a GitHub-Pages-style deploy).
Every {cite:t} reference resolves against the bib files.
Every figure the notebooks reference exists and renders.
_intro.ipynb carries visible Original ballpark author and (if applicable) Updated by lines.
No _build/, UUID build directories, or .slides.html files are committed.
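
The last item in the checklist above lends itself to a quick local check before opening a PR. The Python sketch below flags repo-relative paths that fall under `_build/`, under a UUID-named directory, or that end in `.slides.html`. The function names are hypothetical, and the assumption that build directories use the canonical 8-4-4-4-12 hexadecimal UUID form is not confirmed by this guide.

```python
import re
from pathlib import PurePosixPath

# Canonical 8-4-4-4-12 hex form; whether build directories use exactly this
# form is an assumption.
UUID_DIR = re.compile(
    r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$", re.IGNORECASE
)


def forbidden_reason(path: str):
    """Return why a repo-relative path must not be committed, or None."""
    p = PurePosixPath(path)
    if p.name.endswith(".slides.html"):
        return ".slides.html export"
    # Inspect directory components only (everything except the file name).
    for part in p.parts[:-1]:
        if part == "_build":
            return "_build/ directory"
        if UUID_DIR.match(part):
            return "UUID build directory"
    return None


def find_forbidden(paths):
    """Map each offending path to the reason it is forbidden."""
    return {p: r for p in paths if (r := forbidden_reason(p)) is not None}
```

One way to use it is to pass in the lines printed by `git ls-files` and confirm that the resulting dictionary is empty.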

Submitting¶

Fork the repo and branch from master with a descriptive name (e.g. add-<citekey> or refactor-<citekey>).
Commit the item in its own directory under models/We-Would-Like-In-Econ-ARK/<citekey>/ (or empirical/<citekey>/).
Open a PR titled Add <citekey> or Refactor <citekey>.
In the PR description, state which layers you produced and which you intentionally skipped.