v0 spec — historical genesis

Historical record

This page is the frozen architectural spec that jellycell was built to. It captures what was promised for the initial 1.0.0 cut — phase budgets, the original piggyback map, the first statement of the three §10 contracts. It is preserved verbatim as a record of the project’s starting point; don’t edit it in place to reflect future changes.

For the current authoritative reference, see reference/:

  • Architecture — the living piggyback map + 8-layer dependency order.

  • Contracts — the living statement of §10.1 / §10.2 / §10.3 with their ceremonies.

Phase budgets (§8 below) are retained here as a permanent snapshot of how the implementation order was planned. The numbers still guide scope-creep reviews in the living codebase.

Architecture & build spec

Single jellycell package. [server] extra. Everything below is what was built.


1. Piggyback map

The point of jellycell is the composition, not the parts. Know what each dependency is doing for us so we don’t accidentally duplicate it or fight it.

| Layer | We piggyback on | We own |
|---|---|---|
| Parse .py / .md | jupytext | Tag vocabulary, PEP-723 extraction, dep-graph assembly |
| In-memory notebook IR | nbformat.NotebookNode for I/O | Our own pydantic Notebook / Cell models for logic |
| Kernel subprocess | jupyter-client (KernelManager, BlockingKernelClient) | Per-cell orchestration, timeouts, streaming to the cache |
| Output capture | Jupyter message protocol (via jupyter-client) | Writing mime bundles to the cache |
| Cache store | diskcache for the content-addressable blob store | Key derivation, manifest format, SQLite catalogue index |
| Mime bundle → HTML | nbconvert helpers (NOT its full exporter) | The page shell, navigation, artifact links |
| Markdown rendering | markdown-it-py + MyST plugins | Tag-aware preprocessing |
| Templating | jinja2 | Templates |
| Watching files | watchfiles | Debouncing, mapping file → notebook → clients |
| Live-reload transport | sse-starlette | Event schema, client reconnect story |
| ASGI server | starlette + uvicorn | Routes |
| CLI | typer | Command shape, --json contract |
| Config validation | pydantic + tomllib / tomli-w | Schemas |
| ipynb round-trip | nbformat (write) + our cache (reattach outputs) | Conversion logic |

Two places we could piggyback but shouldn’t:

  • jupyter-cache: it caches whole notebooks for Sphinx-like static builds; its match semantics are “whole notebook matches” with some per-cell logic bolted on. We need per-cell keying with an explicit dep graph and artifact-manifest storage. Writing a thin cache on top of diskcache is smaller than bending jupyter-cache to our model.

  • full nbconvert HTMLExporter: it has its own template system and assumes a notebook-as-document layout. We want a project-wide catalogue. We only need nbconvert’s output transformers (the bits that turn a mime bundle into safe HTML, handle base64 images, etc.). Use those helpers; write the page shell ourselves.

PEP-723 handling (verified against jupytext 1.19): jupytext treats the # /// script block as a regular code cell and on write prepends a # %% marker, mutating the file on first round-trip. The fix is strip-and-reinsert: format/pep723.py extracts the block pre-parse, stashes it in notebook.metadata.jellycell.pep723, and writes it back verbatim ahead of jupytext’s output. Prototype is ~20 lines, round-trip is byte-exact, second round-trip is idempotent. uv run --script still matches the block because PEP 723 requires only that # /// script start a line — position in the file doesn’t matter to it. Convention: the block lives at the top of the file; a lint rule rejects mid-file blocks with a fix hint (see §7).


2. Component architecture

Eight layers. Each one has a clean interface and depends only on the layers below it in the diagram (equivalently, on the subsections that precede it in 2.1 to 2.8).

┌─────────────────────────────────────────────────────┐
│ CLI (typer)                                         │  user + agent entry
├─────────────────────────────────────────────────────┤
│ Server (starlette + SSE + watchfiles)               │  live catalogue
├─────────────────────────────────────────────────────┤
│ Render (jinja + nbconvert output helpers)           │  notebook → HTML
├─────────────────────────────────────────────────────┤
│ Run (jupyter-client subprocess kernel)              │  execute cells
├─────────────────────────────────────────────────────┤
│ API (jc.save / jc.figure / jc.table / jc.load)      │  what notebooks import
├─────────────────────────────────────────────────────┤
│ Cache (diskcache + SQLite index)                    │  content-addressed store
├─────────────────────────────────────────────────────┤
│ Format (jupytext + pydantic IR)                     │  parse .py/.md → Notebook
├─────────────────────────────────────────────────────┤
│ Paths + Config                                      │  project layout, settings
└─────────────────────────────────────────────────────┘

2.1 Paths + Config

  • jellycell.toml at project root defines paths and defaults.

  • pydantic model validates; tomllib reads; tomli-w writes (for init / lint --fix).

  • Paths resolve against project root. Everything downstream takes a Project value — not raw paths — so you can’t accidentally write outside declared roots.
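
A minimal sketch of the "Project value, not raw paths" idea, assuming a hypothetical `Project` dataclass with a `resolve` method; the real model is a pydantic class whose fields are not spelled out here, so every name below is illustrative.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Project:
    """Hypothetical stand-in for the validated Project value."""

    root: Path
    declared: tuple[str, ...]  # declared roots, e.g. ("notebooks", "artifacts")

    def resolve(self, relative: str) -> Path:
        """Resolve `relative` against the root, refusing undeclared locations."""
        candidate = (self.root / relative).resolve()
        for name in self.declared:
            # is_relative_to compares whole path components, so
            # "artifacts2/x" is not treated as living under "artifacts".
            if candidate.is_relative_to((self.root / name).resolve()):
                return candidate
        raise ValueError(f"{relative!r} is outside the declared roots")
```

Because downstream code would only ever receive paths produced by `resolve`, a `..` escape fails at the boundary instead of deep inside the cache or renderer.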

2.2 Format

  • jellycell.format.parse(path) -> Notebook reads the file, strips the PEP-723 block, then wraps jupytext.reads(body, fmt="py:percent").

  • jellycell.format.write(nb, path) writes the PEP-723 block verbatim, then appends jupytext.writes(nb, fmt="py:percent").

  • We keep nbformat’s NotebookNode internally for outputs (mime bundles), but everything we reason about (cells, tags, deps, PEP-723 metadata) lives in pydantic models.

  • Cell tags round-trip unchanged through jupytext’s cell metadata — verified. They’re stored in cell.metadata.tags as a list of strings (["jc.load", "name=raw"]). format/tags.py parses them into a typed CellSpec:

    from typing import Literal

    from pydantic import BaseModel

    class CellSpec(BaseModel):
        kind: Literal["load", "step", "figure", "table", "setup", "note"] = "step"
        name: str | None = None          # from tag `name=...`
        deps: list[str] = []             # from tag `deps=a,b` or calls to jc.deps(...)
        timeout_s: int | None = None     # from tag `timeout=30`
    
  • PEP-723 block is extracted before jupytext touches the file and reattached after write. format/pep723.py holds the regex (per PEP 723’s spec: # /// script to # /// on their own lines), extract(text) -> (block, body), and insert(block, body) -> text. The raw block string is stashed in notebook.metadata["jellycell"]["pep723"] during parsing so the runner and exporters can see it. The block’s [tool.jellycell] table is parsed separately and merged onto the project’s jellycell.toml at file scope.

  • Round-trip guarantee: a notebook whose PEP-723 block is already at the top of the file is byte-exact after read+write. A second read+write is a no-op. A mid-file block is moved to the top on next write (lint warns about this before it happens — see §7).
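
The strip-and-reinsert step can be sketched as follows. The `extract` and `insert` signatures follow the ones named above; the regex is an assumption written for this illustration (LF line endings only), not the contents of format/pep723.py.

```python
from __future__ import annotations

import re

# Opening "# /// script" line, comment-only content lines, closing "# ///" line.
# The lazy quantifier stops at the first closing line.
PEP723_RE = re.compile(
    r"^# /// script[ \t]*\n(?:#(?: .*)?\n)*?# ///[ \t]*(?:\n|$)",
    re.MULTILINE,
)


def extract(text: str) -> tuple[str | None, str]:
    """Split `text` into (block, body); block is None when absent."""
    match = PEP723_RE.search(text)
    if match is None:
        return None, text
    return match.group(0), text[: match.start()] + text[match.end():]


def insert(block: str | None, body: str) -> str:
    """Write the block verbatim ahead of the body."""
    return body if block is None else block + body
```

With the block already at the top, `insert(*extract(text))` returns `text` unchanged; a mid-file block comes back at the top, and a second pass is then a no-op.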

2.3 Cache

This is the most load-bearing subsystem. Keep it small and explicit.

Key derivation — a cell’s cache key is:

sha256(
  cell.source_normalized          # strip trailing ws, normalize line endings
  || sorted_dep_keys              # hashes of every cell this one depends on
  || env_hash                     # lockfile hash, or hash of resolved deps list
  || jellycell.MINOR_VERSION      # invalidate on behavior changes
)
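
A sketch of the key derivation above using hashlib. The function names, the length-prefixed field encoding, and the exact normalization rules are assumptions made for this example; only the four inputs and the use of SHA-256 come from the spec. The `sha256-` prefix mirrors the manifest example.

```python
import hashlib


def normalize_source(source: str) -> str:
    """Normalize line endings and strip trailing whitespace."""
    unified = source.replace("\r\n", "\n").replace("\r", "\n")
    return "\n".join(line.rstrip() for line in unified.split("\n")).rstrip("\n")


def cache_key(source: str, dep_keys: list[str], env_hash: str, minor_version: str) -> str:
    """Hash the normalized source, sorted dep keys, env hash, and version."""
    digest = hashlib.sha256()
    fields = [normalize_source(source), *sorted(dep_keys), env_hash, minor_version]
    for field in fields:
        data = field.encode("utf-8")
        # Length prefix (an assumption) keeps field boundaries unambiguous,
        # standing in for the "||" concatenation in the pseudocode.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return "sha256-" + digest.hexdigest()
```

Sorting the dep keys makes the result independent of declaration order, and because each key folds in the keys of its dependencies, a change upstream changes every key downstream.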

Storage layout (all under .jellycell/cache/):

.jellycell/cache/
├── blobs/                     # diskcache — raw mime-bundle + stream bytes
│   └── <content-hash>/...
├── manifests/
│   └── <cache-key>.json       # one per cell execution; links blobs and artifacts
├── artifacts-index/
│   └── <artifact-sha>.json    # reverse index: artifact path → producing cell(s)
└── state.db                   # SQLite: fast listing for the catalogue

Manifest schema (JSON on disk, pydantic in memory):

{
  "cache_key": "sha256-…",
  "notebook": "notebooks/analysis.py",
  "cell_id": "analysis:3",
  "cell_name": "summarize",
  "source_hash": "sha256-…",
  "dep_keys": ["sha256-…", "sha256-…"],
  "env_hash": "sha256-…",
  "executed_at": "2026-04-17T14:32:01Z",
  "duration_ms": 3421,
  "status": "ok",
  "outputs": [
    {"type": "stream", "name": "stdout", "blob": "sha256-…"},
    {"type": "display_data", "mime": "image/png", "blob": "sha256-…", "w": 800, "h": 600}
  ],
  "artifacts": [
    {"path": "artifacts/analysis/summary.parquet", "sha256": "…", "size": 123456, "mime": "application/x-parquet"}
  ]
}

SQLite index (fast lookups for the catalogue page, not authoritative):

CREATE TABLE cells (
  cache_key TEXT PRIMARY KEY,
  notebook TEXT NOT NULL,
  cell_id TEXT NOT NULL,
  cell_name TEXT,
  executed_at TEXT NOT NULL,
  duration_ms INTEGER NOT NULL,
  status TEXT NOT NULL,
  manifest_path TEXT NOT NULL
);
CREATE INDEX idx_cells_notebook ON cells (notebook);

CREATE TABLE artifacts (
  sha256 TEXT NOT NULL,
  path TEXT NOT NULL,
  size INTEGER NOT NULL,
  mime TEXT,
  producer_cache_key TEXT NOT NULL,
  PRIMARY KEY (sha256, path)
);
CREATE INDEX idx_artifacts_producer ON artifacts (producer_cache_key);

The filesystem is the source of truth; the SQLite index is a derived accelerator. jellycell cache rebuild-index re-scans manifests to repair it.
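
The rebuild can be sketched with the standard library alone. `rebuild_index` is a hypothetical name for whatever jellycell cache rebuild-index calls internally; the table layout follows the schema above, and the manifest keys follow the manifest example.

```python
import json
import sqlite3
from pathlib import Path

SCHEMA = """
CREATE TABLE IF NOT EXISTS cells (
  cache_key TEXT PRIMARY KEY, notebook TEXT NOT NULL, cell_id TEXT NOT NULL,
  cell_name TEXT, executed_at TEXT NOT NULL, duration_ms INTEGER NOT NULL,
  status TEXT NOT NULL, manifest_path TEXT NOT NULL);
CREATE TABLE IF NOT EXISTS artifacts (
  sha256 TEXT NOT NULL, path TEXT NOT NULL, size INTEGER NOT NULL, mime TEXT,
  producer_cache_key TEXT NOT NULL, PRIMARY KEY (sha256, path));
"""


def rebuild_index(cache_dir: Path) -> int:
    """Drop all index rows and re-derive them from manifests/*.json."""
    db = sqlite3.connect(cache_dir / "state.db")
    try:
        db.executescript(SCHEMA)
        with db:  # one transaction: commit on success, roll back on error
            db.execute("DELETE FROM artifacts")
            db.execute("DELETE FROM cells")
            count = 0
            for path in sorted((cache_dir / "manifests").glob("*.json")):
                m = json.loads(path.read_text(encoding="utf-8"))
                db.execute(
                    "INSERT INTO cells VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
                    (m["cache_key"], m["notebook"], m["cell_id"], m.get("cell_name"),
                     m["executed_at"], m["duration_ms"], m["status"],
                     path.relative_to(cache_dir).as_posix()),
                )
                for art in m.get("artifacts", []):
                    db.execute(
                        "INSERT OR REPLACE INTO artifacts VALUES (?, ?, ?, ?, ?)",
                        (art["sha256"], art["path"], art["size"],
                         art.get("mime"), m["cache_key"]),
                    )
                count += 1
        return count
    finally:
        db.close()
```

Since every row is derived from a manifest, deleting state.db loses nothing; running the rebuild twice yields the same index.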

2.4 API (jellycell.api)

What notebooks import. Every call either:

  • happens inside jellycell run — in which case there’s a RunContext available and the call registers an artifact or dep on the current cell, or

  • happens standalone — in which case the call degrades to a plain file op with a no-op manifest side-effect.

# writing
jc.save(obj, path, *, format=None) -> Path
jc.figure(path=None, *, caption=None, fig=None) -> Path
jc.table(df, *, caption=None, name=None) -> Path

# reading (registers a dep edge if inside a run)
jc.load(path) -> Any
jc.path(name) -> Path

# function-level caching (uses same cache store as cells)
@jc.cache
def expensive(x, y): ...

# explicit deps (alternative to cell tags)
jc.deps("raw", "processed") -> None

# context accessors (read-only)
jc.ctx.notebook
jc.ctx.cell_id
jc.ctx.project

Implementation detail: the RunContext is passed via a ContextVar set by the runner before each cell executes. Standalone calls see ctx = None and fall back.

2.5 Run

  • jellycell.run.kernel.Kernel — thin wrapper around jupyter_client.KernelManager + BlockingKernelClient. Owns the subprocess; exposes execute(source) -> CellExecution.

  • CellExecution accumulates outputs (using the Jupyter message protocol) until status: idle arrives, then returns a structured result (outputs, stdout, stderr, error).

  • jellycell.run.runner.Runner(project):

    1. Parse notebook via format.parse.

    2. Compute cache keys for each cell in topological order (deps from tags + jc.deps() calls parsed statically).

    3. For each cell:

      • if cache hit: restore manifest + any artifacts that are missing on disk (from blobs).

      • else: execute via Kernel, write artifacts (via jc.*), build manifest, insert into cache, index.

    4. Emit a RunReport — what ran, what was cached, what failed.

  • Subprocess kernels only. In-process is a premature optimization and a footgun.

  • Parsing jc.deps("a", "b") statically: walk the cell AST, look for calls to jellycell.api.deps or jc.deps at module level. If deps were declared via the cell tag, the static parse just confirms it.
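
The static parse in the last bullet can be sketched with the standard ast module. `static_deps` is a hypothetical name, and the sketch only recognizes string-literal arguments in top-level calls spelled exactly `jc.deps(...)` or `jellycell.api.deps(...)`.

```python
import ast

_DEPS_CALLS = {"jc.deps", "jellycell.api.deps"}


def static_deps(source: str) -> list[str]:
    """Collect string arguments of module-level jc.deps(...) calls."""
    found: list[str] = []
    for stmt in ast.parse(source).body:  # top-level statements only
        if not (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)):
            continue
        call = stmt.value
        # ast.unparse turns the callee back into dotted source text.
        if ast.unparse(call.func) not in _DEPS_CALLS:
            continue
        for arg in call.args:
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                found.append(arg.value)
    return found
```

Calls nested inside functions or conditionals are ignored on purpose, which keeps the dependency graph decidable without executing the cell.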

2.6 Render

  • Input: a Notebook + matching Manifests from the cache.

  • Output: a single self-contained HTML file (per notebook) under reports/.

  • Pipeline:

    1. For each cell: render source via markdown-it-py (for markdown cells) or a syntax-highlighted <pre> (code cells; Pygments).

    2. For each output in a cell’s manifest: transform the mime bundle to safe HTML using nbconvert’s helper functions. Preprocessors such as nbconvert.preprocessors.ClearMetadataPreprocessor are not what we want; we want the internals behind nbconvert.exporters.html: HTMLExporter.resources and filters like strip_ansi, sticking to code paths that do not require pandoc.

    3. Link to artifacts/<notebook>/... for files, not base64-inline them. --standalone flag inlines for shareability.

    4. Wrap in a page template (jinja) with side-nav for TOC, collapsible cells, timing badges.

  • index.html at the project root: table of contents across notebooks, recent runs, artifact browser.

2.7 Server

  • Starlette app with routes:

    • GET / → rendered project index.

    • GET /nb/<path> → rendered notebook page.

    • GET /artifacts/<...> → static file serving from artifacts/.

    • GET /api/state.json → current catalogue state (for custom clients / agents).

    • GET /events → SSE stream of reload events.

  • watchfiles watches notebooks/**, manuscripts/**, artifacts/**, jellycell.toml.

  • On a change:

    • Source file changed → re-parse + re-render that notebook → push {"type": "reload", "path": "/nb/…"}.

    • Artifact changed → push {"type": "artifact", "path": "/artifacts/…"}, client updates image src without a full reload.

  • No write endpoints. Server is read-only.
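
A sketch of the change-to-event mapping, under two assumptions: paths arrive relative to the project root, and a notebook's URL is its path under notebooks/ without the file suffix (inferred from the /nb/analysis example elsewhere on this page). `event_for_change` is a hypothetical name; changes to manuscripts/ and jellycell.toml are left unmapped because their events are not described here.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def event_for_change(relative_path: str) -> dict[str, str] | None:
    """Map a changed file to the SSE payload to push, or None to ignore it."""
    path = PurePosixPath(relative_path)
    if len(path.parts) < 2:
        return None  # top-level files such as jellycell.toml
    top = path.parts[0]
    if top == "artifacts":
        return {"type": "artifact", "path": "/" + path.as_posix()}
    if top == "notebooks":
        page = PurePosixPath(*path.parts[1:]).with_suffix("")
        return {"type": "reload", "path": "/nb/" + page.as_posix()}
    return None
```

Keeping this a pure function makes the event schema testable without starting watchfiles or the server.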

2.8 CLI

  • typer app at jellycell.cli.app:app.

  • One subcommand per file under jellycell.cli.commands/.

  • Every command supports --json. Human mode uses rich; --json mode prints one JSON object to stdout and nothing else.

  • Global flags: --project PATH, --quiet, --verbose, --json.

  • jellycell prompt emits the canonical agent guide to stdout — a single markdown document covering layout, format, tags, API, and the full CLI reference (auto-generated from typer).
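
A sketch of the --json contract: one JSON object on stdout with a top-level schema version. The `RunReport` and `CellResult` names appear in the data flow on this page, but their fields here are hypothetical, and dataclasses stand in for the pydantic models.

```python
import json
from dataclasses import asdict, dataclass, field

SCHEMA_VERSION = 1


@dataclass
class CellResult:
    cell_id: str
    status: str  # "ok", "cached", or "error"


@dataclass
class RunReport:
    notebook: str
    cells: list[CellResult] = field(default_factory=list)


def to_json(report: RunReport) -> str:
    """Serialize the report as a single JSON object with a schema version."""
    payload = {"schema_version": SCHEMA_VERSION, **asdict(report)}
    return json.dumps(payload, sort_keys=True)
```

In --json mode the command would print exactly this string and nothing else, so a consumer can parse stdout directly and branch on `schema_version`.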


3. End-to-end data flow

jellycell run notebooks/analysis.py

1. cli.run.callback(path, --json)
   └─> Project.from_path(cwd) → Project (validated jellycell.toml)
2. format.parse(path) → Notebook (jupytext under the hood)
3. pep723.extract(path) → ScriptMeta (merged into Project)
4. run.Runner(project).run(notebook)
   ├─ ast-walk cells to resolve deps (from tags + jc.deps calls)
   ├─ topological sort → exec order
   ├─ for each cell:
   │   ├─ key = cache.hashing.key(cell, dep_keys, env_hash)
   │   ├─ if cache.store.has(key):
   │   │    load Manifest; restore any missing artifacts from blobs
   │   │    yield CellResult(status="cached")
   │   ├─ else:
   │   │    Kernel.execute(cell.source)
   │   │    → Outputs + artifacts-written-during-run (via jc.* hooks)
   │   │    cache.store.put(key, manifest, blobs)
   │   │    cache.index.insert(manifest)
   │   │    yield CellResult(status="ok"|"error")
   └─ return RunReport
5. --json: print RunReport as JSON; human: rich table
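
The "topological sort → exec order" step in the flow above can be sketched with the standard library's graphlib. `execution_order` is a hypothetical helper, and the input maps each cell name to the names it depends on.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(deps_by_cell: dict[str, list[str]]) -> list[str]:
    """Return cell names so that every cell follows its dependencies."""
    # TopologicalSorter takes {node: predecessors}, which matches
    # "cell -> cells it depends on" directly.
    return list(TopologicalSorter(deps_by_cell).static_order())


def has_cycle(deps_by_cell: dict[str, list[str]]) -> bool:
    """True when the declared deps cannot be ordered."""
    try:
        execution_order(deps_by_cell)
    except CycleError:
        return True
    return False
```

Computing cache keys in this order means each cell's dependency keys already exist when its own key is derived.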

jellycell view

1. cli.view.callback() → requires [server] extra
2. server.app.build(project) → Starlette app
3. watchfiles.watch(project.watched_paths) in a task
   on change → render.rerender(path) + sse.push(ReloadEvent)
4. uvicorn.run(app, host, port)
Client opens /:
  index.html listing notebooks + recent artifacts
Client opens /nb/analysis:
  connects to /events (SSE)
  on reload event → fetch /nb/analysis fragment, swap

4. Repository layout

jellycell/
├── .github/workflows/
│   ├── ci.yml
│   └── release.yml
├── docs/
│   ├── index.md
│   ├── getting-started.md
│   ├── file-format.md
│   ├── project-layout.md
│   ├── cli-reference.md
│   ├── agent-guide.md
│   └── spec/v0.md
├── examples/
│   ├── minimal/
│   ├── paper/
│   └── ml-experiment/
├── src/
│   └── jellycell/
│       ├── __init__.py
│       ├── __main__.py
│       ├── _version.py
│       ├── api.py
│       ├── config.py
│       ├── paths.py
│       ├── format/
│       ├── cache/
│       ├── run/
│       ├── render/
│       ├── export/
│       ├── lint/
│       ├── server/
│       └── cli/
├── tests/
│   ├── unit/
│   ├── integration/
│   └── fixtures/
├── pyproject.toml
├── README.md
├── CHANGELOG.md
├── LICENSE
└── jellycell.toml.example

(See the v0 spec in the repo for the full file-by-file breakdown.)


5. pyproject.toml

See pyproject.toml in-repo for the authoritative version. Key points:

  • hatchling build backend.

  • Python ≥ 3.11, classifiers for 3.11/3.12/3.13.

  • Runtime deps: jupytext, jupyter-client, ipykernel, nbformat, nbconvert, diskcache, typer, rich, pydantic, markdown-it-py, pygments, jinja2, platformdirs, tomli-w.

  • [server] extra: uvicorn, starlette, watchfiles, sse-starlette.

  • [docs] extra: sphinx, sphinx-autodoc2, myst-parser, myst-nb, furo, sphinxcontrib-typer, sphinx-design, sphinx-copybutton.

  • [dev] extra: pytest, pytest-cov, pytest-asyncio, pytest-regressions, ruff, mypy, pre-commit, hatch.

  • Dropped the jc console alias — not worth a potential collision on users’ machines.


6. Config schemas

jellycell.toml

[project]
name = "my-project"

[paths]
notebooks = "notebooks"
data = "data"
artifacts = "artifacts"
reports = "reports"
manuscripts = "manuscripts"
cache = ".jellycell/cache"

[run]
kernel = "python3"
subprocess = true
timeout_seconds = 600

[viewer]
host = "127.0.0.1"
port = 5179
watch = ["notebooks", "manuscripts", "artifacts"]

[lint]
enforce_artifact_paths = true
enforce_declared_deps = false
warn_on_large_cell_output = "10MB"

PEP-723 [tool.jellycell] (overrides at file scope)

# /// script
# requires-python = ">=3.11"
# dependencies = ["pandas", "matplotlib"]
#
# [tool.jellycell]
# project = "paper-2026"
# timeout_seconds = 1200
# ///

7. File format contract

PEP-723 block placement

  • Must be at the top of the file. Any content before the block (other than whitespace) is rejected by the lint rule pep723-position and re-flowed by jellycell lint --fix.

  • Optional but strongly encouraged. A file without a block is valid and uses the project-level jellycell.toml unchanged.

  • [tool.jellycell] is the only sanctioned override table. Other tool tables (e.g. [tool.ruff]) are preserved unchanged.

  • The raw block is available programmatically as notebook.metadata["jellycell"]["pep723"] after parsing.

Cell tags

| Tag | Meaning | Extra attrs |
|---|---|---|
| jc.load | loads input data, conventionally from data/ | name= |
| jc.step | default (transform, compute) | name=, deps=a,b |
| jc.figure | writes an image artifact | name= |
| jc.table | writes a tabular artifact | name= |
| jc.setup | no deps, not cached, runs first | |
| jc.note | markdown-only, not executable | |

Tags live in the jupytext cell header as a comma-separated metadata list, stored as cell.metadata.tags: list[str] in nbformat. This round-trips losslessly through jupytext — verified. Untagged code cells default to jc.step with an auto-generated name (<notebook>:<ordinal>).
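
A sketch of turning a tag list into the typed spec. A dataclass stands in for the pydantic `CellSpec` so the example has no third-party imports, and `parse_tags` is a hypothetical name for what format/tags.py does; unknown tags are ignored here, which is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field

KINDS = {"load", "step", "figure", "table", "setup", "note"}


@dataclass
class CellSpec:
    kind: str = "step"  # untagged cells default to jc.step
    name: str | None = None
    deps: list[str] = field(default_factory=list)
    timeout_s: int | None = None


def parse_tags(tags: list[str]) -> CellSpec:
    """Build a CellSpec from tags like ["jc.load", "name=raw"]."""
    spec = CellSpec()
    for tag in tags:
        if tag.startswith("jc.") and tag.removeprefix("jc.") in KINDS:
            spec.kind = tag.removeprefix("jc.")
        elif tag.startswith("name="):
            spec.name = tag.removeprefix("name=")
        elif tag.startswith("deps="):
            spec.deps = [d for d in tag.removeprefix("deps=").split(",") if d]
        elif tag.startswith("timeout="):
            spec.timeout_s = int(tag.removeprefix("timeout="))
    return spec
```

The auto-generated `<notebook>:<ordinal>` name for unnamed cells would be filled in by the caller, since it depends on the cell's position rather than its tags.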

Example

# /// script
# requires-python = ">=3.11"
# dependencies = ["pandas", "matplotlib"]
# ///

# %% [markdown]
# # Mortality trend analysis

# %% tags=["jc.load", "name=raw"]
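import pandas as pd
import jellycell.api as jc  # assumed import form; §2.4 names jellycell.api as what notebooks import
df = pd.read_csv("data/who_mortality.csv")

# %% tags=["jc.step", "name=summary", "deps=raw"]
summary = df.groupby("country")["deaths"].sum().sort_values()
jc.save(summary, "artifacts/summary.parquet")
jc.table(summary.tail(20), caption="Top 20 countries")  # ascending sort, so the largest totals are at the tail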

# %% tags=["jc.figure", "deps=summary"]
import matplotlib.pyplot as plt
summary.tail(20).plot.barh()
jc.figure(caption="Top 20 by deaths")  # path auto-chosen

8. Build phases (sized in files)

The phases below describe implementation order and per-area file budgets — they are not a release plan. The budgets are scope-creep signals: if changes in one area start pushing a phase’s src count well past its budget, cut features or defer them rather than raising the ceiling.

Files listed are new unless marked [mod].

Phase 0 — Skeleton

Goal: name locked on PyPI, CI green, empty but valid package.

Budget: 3 src files. (__init__, __main__, _version.)

Phase 1 — Format, config, init, lint skeleton

Goal: jellycell init my-project scaffolds a project; jellycell lint passes on it; format parsing works round-trip with jupytext.

Budget: 13 src files + 8 test/fixture files = 21 files.

Phase 2 — Cache + run + API

Goal: jellycell run notebooks/foo.py executes end-to-end; second run hits the cache for unchanged cells; jc.* helpers work inside cells and as plain scripts.

Budget: 13 src files + 9 test files = 22 files. Load-bearing phase.

Phase 3 — Render

Goal: jellycell render notebooks/foo.py produces a self-contained HTML page in reports/.

Budget: 11 src files + 3 test files = 14 files.

Phase 4 — Live viewer

Goal: jellycell view serves the catalogue.

Budget: 5 src files + 2 test files = 7 files.

Phase 5 — Export

Goal: ipynb / MyST markdown export with reattached outputs.

Budget: 4 src files + 3 test files = 7 files.

Phase 6 — Agent surface + polish

Goal: Two examples work. README honest. jellycell prompt emits the canonical agent guide.

Budget: ~21 files touched.


9. Total scope

Target: ~95 files across src + tests + docs + examples, with ~50 files in src/jellycell/ across all phases. Small enough for one person to hold end-to-end. If a phase creeps past its budget while you’re extending it, that’s a scope-creep signal to cut back — the phase budgets are the ceiling.


10. Cross-cutting contracts (lock these early)

Three things that touch every layer and are expensive to change later:

  1. The --json output schema. Every command that produces data (run, render, cache list, lint, export) emits a typed JSON object whose pydantic model lives in the corresponding subpackage. Document schemas in docs/cli-reference.md. Version them: top-level "schema_version": 1.

  2. The cache key algorithm. Defined in cache/hashing.py with a frozen spec in this file. Any change to what goes into the key is a breaking change — bump MINOR_VERSION in _version.py so caches invalidate cleanly. Never change silently.

  3. The agent guide content. Whatever jellycell prompt emits is the contract with agents. If you change it, agents in the wild get different instructions. Keep it stable across patch versions; changes go in minor releases with a note.

Everything else can evolve freely.

Note

The three contracts above are the genesis statement. The living version — with the current ceremony for each and a classification of additive vs. breaking changes — is at reference/contracts.md. Use the living reference for any new ceremony; this section stays frozen as the original promise.