Agent guide
Important
Spec §10.3 stability contract. Content here is what jellycell prompt emits.
Patch-version releases MUST NOT change this page. Minor-version releases can
update it; every change gets a CHANGELOG entry.
What jellycell is
jellycell is a reproducible-analysis notebook tool. You write .py files
in jupytext percent format with optional PEP-723 dependency blocks; jellycell
runs them via a subprocess Jupyter kernel, caches cell outputs by content hash,
and serves a live HTML catalogue.
Agents: run jellycell prompt at the start of any new jellycell project to
pull this guide into your context.
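To make "caches cell outputs by content hash" concrete, here is a minimal sketch of content-addressed caching. It is an illustration only: the function names, the choice of SHA-256, and the inputs mixed into the key are assumptions, not the frozen algorithm in jellycell.cache.hashing.

```python
import hashlib


def cell_cache_key(source: str, dep_keys: list[str], env_hash: str) -> str:
    """Illustrative content hash; NOT jellycell's actual key algorithm."""
    h = hashlib.sha256()
    # Mix in the cell source, an environment fingerprint, and the keys of
    # upstream cells, so a change in any of them produces a new key.
    for part in [source, env_hash, *sorted(dep_keys)]:
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(8, "big"))  # length prefix avoids ambiguous joins
        h.update(data)
    return h.hexdigest()


cache: dict[str, str] = {}  # hypothetical in-memory stand-in for a cache store


def run_cell(source: str, dep_keys: list[str], env_hash: str) -> tuple[str, bool]:
    """Return (key, was_cache_hit); execution is simulated by storing a marker."""
    key = cell_cache_key(source, dep_keys, env_hash)
    hit = key in cache
    if not hit:
        cache[key] = f"output of {source!r}"
    return key, hit


k1, hit1 = run_cell("summary = df.describe()", [], "env-a")
k2, hit2 = run_cell("summary = df.describe()", [], "env-a")    # unchanged: cache hit
k3, hit3 = run_cell("summary = df.describe().T", [], "env-a")  # edited: new key
```

The second call is skipped because nothing that feeds the key changed; editing the source yields a different key, so only that cell would re-run.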
Project layout
Every jellycell project has a jellycell.toml at the root and this canonical
directory layout (all paths configurable via [paths] in jellycell.toml):
my-project/
├── jellycell.toml        # project config
├── notebooks/            # .py source notebooks (jupytext percent format)
├── data/                 # input data, read by jc.load
├── artifacts/            # writable output files (images, parquet, json, ...)
├── site/                 # rendered HTML output
├── manuscripts/          # narrative docs (optional)
└── .jellycell/
    └── cache/            # content-addressed cache (git-ignored)
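As a rough illustration of the tree above, the following sketch creates the same directories by hand in a temporary location. In practice jellycell init does the scaffolding; the helper name `scaffold` is hypothetical, and the empty jellycell.toml is only a placeholder, not a valid configuration.

```python
import tempfile
from pathlib import Path

# Directory names taken from the canonical layout; defaults only, since
# real projects may remap them via [paths] in jellycell.toml.
CANONICAL_DIRS = [
    "notebooks",
    "data",
    "artifacts",
    "site",
    "manuscripts",
    ".jellycell/cache",
]


def scaffold(root: Path) -> list[Path]:
    """Create the canonical directories under root (hypothetical helper)."""
    root.mkdir(parents=True, exist_ok=True)
    (root / "jellycell.toml").touch()  # placeholder file, intentionally empty
    created = []
    for rel in CANONICAL_DIRS:
        directory = root / rel
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "my-project"
    dirs = scaffold(project)
    layout = sorted(p.relative_to(project).as_posix() for p in dirs)
    has_config = (project / "jellycell.toml").is_file()
```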
Single-project vs monorepo
A single-project repo puts one jellycell.toml at the repo root — use this
for a focused analysis. A jellycell monorepo is one Python environment
(pyproject.toml at the root) hosting several jellycell projects side by side,
each with its own jellycell.toml in a subdirectory. Caches, artifacts, and
site/ output stay scoped to each project’s subtree; one AGENTS.md at the
root covers every project inside. jellycell init <subdir> and
jellycell prompt --write <subdir> are monorepo-aware — they detect an outer
AGENTS.md and refuse to scatter duplicates by default. Pass --nested for
intentional inner scoping (polyglot pattern: jellycell’s guide at the Python
subtree, repo-wide AGENTS.md at the git root) or --force to bypass all
checks. From the monorepo root, run commands via
jellycell run <subdir>/notebooks/foo.py or jellycell --project <subdir> render. See project-layout
for the full convention including polyglot patterns and pnpm script recipes.
File format
A notebook is a single .py file in jupytext percent format:
# /// script
# requires-python = ">=3.11"
# dependencies = ["pandas"]
# ///
# %% [markdown]
# # Analysis title
# %% tags=["jc.load", "name=raw"]
import pandas as pd
df = pd.read_csv("data/input.csv")
# %% tags=["jc.step", "name=summary", "deps=raw"]
summary = df.describe()
Rules:
- The PEP-723 block (if present) must be at the top of the file. A lint rule rejects mid-file blocks with a --fix that moves them.
- Cells are separated by # %% markers. Markdown cells use # %% [markdown].
- Cell metadata (especially tags=[...]) goes in the marker line.
- The file is valid Python: uv run --script notebook.py works.
The jc.* API
What you import inside a notebook cell:
import jellycell.api as jc
# writes — all take optional caption=/notes=/tags= metadata
jc.save(obj, "artifacts/summary.json", caption="Headline stats")
jc.save(df, "artifacts/data.parquet")
# jc.figure — two forms:
# 1) Path-only: register an existing image file as an artifact (no re-encode).
jc.figure("artifacts/plot.png", caption="Figure 1: mortality by country")
# 2) Render: save a matplotlib figure (plt is matplotlib.pyplot, imported separately).
jc.figure(
    "artifacts/plot.png",
    fig=plt.gcf(),
    caption="Figure 1: mortality by country",
    notes="DE stands out; JP lowest.",
    tags=["result", "figure"],
)
jc.table(df, name="results", caption="Table 1: per-country totals")
# reads (registers a dep edge on the producing cell when inside a run)
jc.load("artifacts/summary.json")
# explicit deps — AST-walked statically, so they enter the cache key
# before the cell runs (the runner parses jc.deps(...) out of cell source).
jc.deps("raw", "processed")
# function-level memoization — uses the same CacheStore as cells.
@jc.cache
def expensive(x, y):
    return heavy_compute(x, y)
# cache context (read-only)
jc.ctx.notebook # "notebooks/foo.py"
jc.ctx.cell_id # "foo:3"
jc.ctx.cell_name # "summary" or None
jc.ctx.project # the jellycell.paths.Project
jc.ctx.inside_run # True inside `jellycell run`; False standalone
All jc.* calls work standalone (plain file ops) OR inside jellycell run
(path resolution relative to the project root; artifacts tracked in the
cell’s manifest).
Supported jc.save formats: .parquet, .csv, .json, .pkl, .png.
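The listing above notes that jc.deps(...) is read statically from the cell source so that declared deps can enter the cache key before the cell runs. A minimal sketch of that idea with the standard ast module follows; `static_deps` is a hypothetical name, and the real runner may accept more call shapes than this one does.

```python
import ast


def static_deps(cell_source: str) -> list[str]:
    """Collect string-literal arguments of jc.deps(...) without running the cell."""
    found: list[str] = []
    for node in ast.walk(ast.parse(cell_source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_jc_deps = (
            isinstance(func, ast.Attribute)
            and func.attr == "deps"
            and isinstance(func.value, ast.Name)
            and func.value.id == "jc"
        )
        if not is_jc_deps:
            continue
        for arg in node.args:
            # Only literals are knowable statically; variables are skipped.
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                found.append(arg.value)
    return found


CELL = '''
import jellycell.api as jc
jc.deps("raw", "processed")
total = 1 + 1
'''

deps = static_deps(CELL)
no_deps = static_deps("x = 1")
dynamic = static_deps("jc.deps(name)")
```

A dep passed as a variable cannot be resolved this way, which is one reason to keep jc.deps arguments as plain string literals.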
Manuscript generation: jellycell.tearsheets (1.4.0+)
For analyses that emit manuscript-style markdown (FINDINGS.md,
METHODOLOGY.md, per-notebook audit reports) as part of the cache
graph — not as a post-hoc CLI step — call the three helpers in
jellycell.tearsheets from inside a jc.step-tagged cell:
import jellycell.tearsheets as jt
# Summary of estimator results — one H2 + table per top-level key.
jt.findings(
    results={
        "twfe": {"att": 0.2, "n_obs": 1234},
        "cs": {"att": 0.25, "n_obs": 1180},
    },
    out_path="manuscripts/FINDINGS.md",
    project="showcase-rat-containerization",
    template_overrides={
        "author": "Blaise",
        "author_url": "https://ubik.studio",
        "month_year": "April 2026",  # pin for byte-stable regen
    },
)
# Procedural narrative — ordered `section_title → markdown_body`.
jt.methodology(
    spec={
        "Data source": "Panel of 132 stations over 60 weeks.",
        "Identification": "TWFE + Callaway-Sant'Anna robustness checks.",
    },
    out_path="manuscripts/METHODOLOGY.md",
    project="showcase-rat-containerization",
)
# Per-notebook tearsheet (cells, artifacts, JSON summaries) — wraps
# the existing `jellycell export tearsheet` logic. Reads manifests
# from the project's cache.
jt.audit(
    "notebooks/03_temporal_and_equity.py",
    out_path="manuscripts/tearsheets/audit_03.md",
)
Every helper shares a pinnable header (author / author_url / month_year
/ version) via template_overrides so regeneration is byte-stable
when inputs don’t change. Each returns the resolved Path so callers
can chain into jc.save or similar. Parent directories are created.
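To show why pinning header fields matters for byte-stable regeneration, here is a small stand-alone sketch that renders a results mapping to markdown. The output layout and the `render_findings` helper are invented for illustration and do not reproduce what jt.findings actually writes.

```python
def render_findings(results: dict, project: str, header: dict) -> str:
    """Render one H2 + two-column table per top-level key (illustrative layout)."""
    lines = [f"# Findings: {project}", ""]
    # Header values are passed in rather than read from the clock, so the
    # same inputs always produce the same bytes.
    for key in ("author", "author_url", "month_year"):
        if key in header:
            lines.append(f"- {key}: {header[key]}")
    lines.append("")
    for section, metrics in results.items():
        lines += [f"## {section}", "", "| metric | value |", "|---|---|"]
        for name, value in metrics.items():
            lines.append(f"| {name} | {value} |")
        lines.append("")
    return "\n".join(lines)


results = {
    "twfe": {"att": 0.2, "n_obs": 1234},
    "cs": {"att": 0.25, "n_obs": 1180},
}
header = {"author": "Blaise", "month_year": "April 2026"}

first = render_findings(results, "showcase-rat-containerization", header)
second = render_findings(results, "showcase-rat-containerization", header)
same_bytes = first.encode("utf-8") == second.encode("utf-8")
```

Had month_year defaulted to the current date, a regeneration next month would change the file even though no result moved.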
CLI commands
| Command | Purpose |
|---|---|
| | Scaffold a new project |
| | Run lint rules; auto-fix where possible |
| | Execute a notebook (cached cells skipped) |
| | Render HTML to |
| | Serve live catalogue (requires |
| | List cached cell executions |
| | Remove old entries ( |
| | Wipe the cache |
| | Rebuild SQLite index from manifests |
| | Export to |
| | Export to MyST markdown (full notebook + outputs) |
| | Curated markdown tearsheet → |
| | Reproducible |
| | Show existing checkpoints |
| | Extract a checkpoint to a new sibling directory |
| | Scaffold a new notebook |
| | Emit this guide to stdout |
Every command supports --json for machine-readable output with
schema_version: 1.
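A consumer of --json output would typically check schema_version before trusting the rest of the payload. The sketch below assumes the output is a JSON object; only the schema_version field comes from this guide, while the `cells` field in the sample and the `parse_cli_json` helper are hypothetical.

```python
import json

SUPPORTED_SCHEMA_VERSION = 1


def parse_cli_json(stdout: str) -> dict:
    """Parse --json output and refuse payloads with an unexpected schema."""
    payload = json.loads(stdout)
    version = payload.get("schema_version")
    if version != SUPPORTED_SCHEMA_VERSION:
        raise ValueError(f"unsupported schema_version: {version!r}")
    return payload


# Hypothetical payload: everything except schema_version is made up.
sample = '{"schema_version": 1, "cells": []}'
parsed = parse_cli_json(sample)

try:
    parse_cli_json('{"schema_version": 2}')
    rejected = False
except ValueError:
    rejected = True
```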
Idiomatic patterns
- Name every cell (name=foo). Named cells are referenceable from deps=..., discoverable in manifests, and easier to re-run selectively.
- Declare deps explicitly. Running jellycell lint with enforce_declared_deps = true catches undeclared implicit deps.
- Write artifacts under artifacts/, not inline to the notebook’s directory. The enforce_artifact_paths rule catches mistakes.
- Inter-cell data goes through jc.save/jc.load. Cells that are cached don’t re-execute, so their in-memory variables aren’t available to re-executed dependent cells. The jellycell run CLI warns when a run mixes cache hits and re-executions.
- Commit a lockfile for reproducibility. env_hash prefers uv.lock/poetry.lock bytes over the PEP-723 dependencies list, so two environments with matching dep names but different resolved versions don’t silently share caches.
- Keep cells small. One logical operation per cell means the cache is granular. Change one summary → only the summary re-runs.
- PEP-723 [tool.jellycell] overrides apply at file scope. Supported keys: project.name, run.kernel, run.timeout_seconds. Unknown keys raise a lint error (no silent typos).
- Don’t mock the cache. If tests need the cache, use a real CacheStore against tmp_path.
- Share results via tearsheets. jellycell export tearsheet <nb> writes a curated markdown file to manuscripts/tearsheets/<stem>.md: prose from markdown cells, inlined figures via relative paths, JSON summaries flattened to two-column tables. The tearsheets/ subfolder convention keeps auto-generated files separate from hand-authored writeups (paper drafts, memos, thesis chapters) that live at the root of manuscripts/. GitHub renders the result inline, so no HTML preview service is needed; commit it alongside the notebook so reviewers see the latest run without cloning.
- Pick an artifact layout that matches the project. [artifacts] layout in jellycell.toml controls where path-less jc.figure()/jc.table() writes: "flat" (default), "by_notebook" → artifacts/<notebook>/<name>.<ext>, or "by_cell" → artifacts/<notebook>/<cell>/<name>.<ext>. "by_cell" makes every artifact self-identifying, which is handy for agents that need to trace “which cell produced this file?” without opening manifests.
- Commit the story, git-ignore the bulk. For datasets too big to commit, set [artifacts] max_committed_size_mb and let jellycell run warn you when an artifact crosses the line. Commit a tiny headline.json summary instead, git-ignore (or Git-LFS) the bytes, and reviewers regenerate from the seed + notebook locally. See examples/large-data/ for the full pattern.
- Caption your artifacts. jc.save, jc.figure, and jc.table all accept caption="...", notes="...", tags=[...]. Captions become figure/table headings in tearsheets; notes render as an italic subcaption; tags are free-form labels for filtering. All are optional: empty by default, no nagging. See examples/paper/ and examples/ml-experiment/ for the full metadata flow.
- Record the trajectory. jellycell run -m "fixed sign on yoy" appends a timestamped entry to manuscripts/journal.md with the cell summary, any new artifacts, and your message. Opt-out via [journal] enabled = false; append-only from jellycell’s side so hand-edited commentary survives future runs. The journal is the fastest way for a reviewer (or you in six months) to answer “why did the numbers change?”.
- Snapshot a known-good state. jellycell checkpoint create -m "submitted for review" bundles the project into a .tar.gz under .jellycell/checkpoints/. jellycell checkpoint restore <name> extracts into a sibling directory by default; the live project is never touched unless you explicitly opt in with --force. Good for reproducibility handoffs and “safe rollback to before I fixed the sign error” moves.
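The three artifact layouts described in the list above can be sketched as a small path resolver. Several details are assumptions made for illustration: that `<notebook>` is the notebook file's stem, that `<cell>` is the cell name, and that "flat" resolves to artifacts/<name>.<ext>. The `artifact_path` helper is hypothetical.

```python
from pathlib import PurePosixPath


def artifact_path(layout: str, notebook: str, cell: str, name: str, ext: str) -> str:
    """Resolve where a path-less artifact would land for a given layout."""
    stem = PurePosixPath(notebook).stem  # assumption: <notebook> is the file stem
    filename = f"{name}.{ext}"
    base = PurePosixPath("artifacts")
    if layout == "flat":
        return str(base / filename)
    if layout == "by_notebook":
        return str(base / stem / filename)
    if layout == "by_cell":
        return str(base / stem / cell / filename)
    raise ValueError(f"unknown layout: {layout!r}")


flat = artifact_path("flat", "notebooks/foo.py", "summary", "plot", "png")
by_notebook = artifact_path("by_notebook", "notebooks/foo.py", "summary", "plot", "png")
by_cell = artifact_path("by_cell", "notebooks/foo.py", "summary", "plot", "png")
```

With "by_cell", the path alone answers which notebook and cell produced the file, which is the traceability benefit the list describes.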
Invariants (§10 contracts)
Three contracts, stable across patch releases. Living statement + ceremony in reference/contracts:
- §10.1 --json schemas carry schema_version: 1. Additive fields (optional + default) are patch- or minor-safe; renames, removals, and type changes force a major bump.
- §10.2 Cache key algorithm (jellycell.cache.hashing) is frozen unless MINOR_VERSION in _version.py is bumped, which forces every cache to invalidate cleanly and rides a major release.
- §10.3 Agent guide content (what jellycell prompt emits). Typo / clarification edits ride a patch; additive sections ride a minor; rewrites that change existing guidance force a major.
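The §10.1 rule can be read as a simple classification over two schema versions. The sketch below models a schema as a mapping from field name to type name and assumes every added field is optional with a default; `required_bump` is a hypothetical helper, not part of jellycell.

```python
def required_bump(old: dict[str, str], new: dict[str, str]) -> str:
    """Classify a schema change as 'none', 'patch-or-minor', or 'major'."""
    removed = old.keys() - new.keys()
    retyped = {name for name in old.keys() & new.keys() if old[name] != new[name]}
    added = new.keys() - old.keys()
    if removed or retyped:
        # A rename shows up as one removal plus one addition, so it lands here too.
        return "major"
    if added:
        # Assumption: added fields are optional and carry a default.
        return "patch-or-minor"
    return "none"


base = {"schema_version": "int", "cells": "list"}

unchanged = required_bump(base, dict(base))
additive = required_bump(base, {**base, "duration": "float"})
renamed = required_bump(base, {"schema_version": "int", "cell_list": "list"})
retyped = required_bump(base, {"schema_version": "int", "cells": "dict"})
```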