File format

A jellycell notebook is a Python file in jupytext percent format with an optional PEP-723 script block at the top.

Anatomy

# /// script                                         ← PEP-723 block (optional)
# requires-python = ">=3.11"
# dependencies = ["pandas"]
#
# [tool.jellycell]                                   ← optional file-scope overrides
# timeout_seconds = 300
# ///

# %% [markdown]                                      ← markdown cell
# # Title

# %% tags=["jc.load", "name=raw"]                    ← tagged code cell
import pandas as pd
raw = pd.read_csv("data/input.csv")

# %% tags=["jc.step", "name=summary", "deps=raw"]
summary = raw.describe()

PEP-723 block

The block follows PEP 723: a run of comment lines opened by # /// script and closed by # ///, containing TOML metadata.

Rules:

  • Must be at the top of the file. Any content before the block (other than whitespace) is rejected by the lint rule pep723-position. jellycell lint --fix re-flows the file.

  • Optional. A file without a block is valid and inherits project-level config from jellycell.toml.

  • [tool.jellycell] overrides jellycell.toml at file scope. Other [tool.X] tables are preserved unchanged.

  • The raw block text is round-tripped verbatim. jellycell run, jellycell render, and uv run --script <notebook> all see the same block.

Example with overrides:

# /// script
# requires-python = ">=3.11"
# dependencies = ["pandas", "matplotlib"]
#
# [tool.jellycell]
# timeout_seconds = 1200     # override run.timeout_seconds for this file
# ///

Cell markers

Jupytext percent format:

| Marker | Cell type |
| --- | --- |
| # %% | Code cell (untagged) |
| # %% tags=["jc.step", "name=foo"] | Code cell with tags |
| # %% [markdown] | Markdown cell |
| # %% [raw] | Raw cell (rarely needed) |

Cell source is everything between two # %% markers (or from the last marker to end of file).
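
The splitting rule above can be sketched in a few lines of plain Python. This is a simplified illustration, not jupytext's or jellycell's parser: the function name split_cells and the returned dictionary layout are assumptions, and anything before the first marker (such as a PEP-723 block) is simply skipped.

```python
def split_cells(text):
    """Split percent-format text into cells at '# %%' marker lines."""
    cells = []
    for line in text.splitlines():
        if line.startswith("# %%"):
            rest = line[len("# %%"):].strip()
            if rest.startswith("[markdown]"):
                cell_type = "markdown"
            elif rest.startswith("[raw]"):
                cell_type = "raw"
            else:
                cell_type = "code"
            cells.append({"type": cell_type, "marker": line, "lines": []})
        elif cells:
            # Everything up to the next marker (or end of file) is cell source.
            cells[-1]["lines"].append(line)
        # Lines before the first marker are ignored in this sketch.
    return [
        {
            "type": cell["type"],
            "marker": cell["marker"],
            "source": "\n".join(cell["lines"]).strip("\n"),
        }
        for cell in cells
    ]


SAMPLE = '''\
# %% [markdown]
# # Title

# %% tags=["jc.load", "name=raw"]
import pandas as pd
raw = pd.read_csv("data/input.csv")

# %% tags=["jc.step", "name=summary", "deps=raw"]
summary = raw.describe()
'''

cells = split_cells(SAMPLE)
for cell in cells:
    print(cell["type"], "|", cell["marker"])
```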

Cell tags

Tags are given in the tags=[...] list on the marker line. There are two classes of tags:

Kind tags — exactly one per cell. Untagged code cells default to jc.step.

| Kind | Meaning |
| --- | --- |
| jc.load | Loads input data (conventionally from data/) |
| jc.step | Default: transform, compute |
| jc.figure | Writes an image artifact |
| jc.table | Writes a tabular artifact |
| jc.setup | No deps; not cached; runs first |
| jc.note | Markdown-only; not executable |

Attribute tags (key=value):

| Attr | Meaning | Example |
| --- | --- | --- |
| name=... | Cell’s name. Referenceable via deps= elsewhere. | name=summary |
| deps=<name> | Explicit dep. One tag per dep, never comma-separated. | deps=raw, deps=env |
| timeout=N | Per-cell timeout in seconds | timeout=60 |
| tearsheet | Include this cell/artifact in export tearsheet | tearsheet |

deps=a,b,c does not work — nbformat enforces the regex ^[^,]+$ on each individual tag, so a single tag containing a comma (e.g. deps=a,b,c) raises NotebookValidationError before jellycell sees it. Always use one deps= tag per dep: tags=["jc.step", "name=sink", "deps=a", "deps=b", "deps=c"]. jellycell lint has a deps-no-comma rule that catches the broken form with a clear pointer to the fix.
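
To make the tag rules concrete, here is a minimal sketch of turning a tags list into a cell description. The function parse_tags and the shape of its result are hypothetical; unknown tags are silently ignored here, which may differ from what jellycell does.

```python
KIND_TAGS = {"jc.load", "jc.step", "jc.figure", "jc.table", "jc.setup", "jc.note"}


def parse_tags(tags):
    """Interpret a tags list: one optional kind tag plus attribute tags."""
    for tag in tags:
        if "," in tag:
            # Mirrors the rule above: a comma inside a single tag is invalid.
            raise ValueError(f"comma in tag {tag!r}: use one deps= tag per dep")
    kinds = [tag for tag in tags if tag in KIND_TAGS]
    if len(kinds) > 1:
        raise ValueError(f"more than one kind tag: {kinds}")
    spec = {
        "kind": kinds[0] if kinds else "jc.step",  # untagged cells default to jc.step
        "name": None,
        "deps": [],
        "timeout": None,
        "tearsheet": False,
    }
    for tag in tags:
        key, sep, value = tag.partition("=")
        if tag == "tearsheet":
            spec["tearsheet"] = True
        elif sep and key == "name":
            spec["name"] = value
        elif sep and key == "deps":
            spec["deps"].append(value)  # one dep per tag, collected in order
        elif sep and key == "timeout":
            spec["timeout"] = int(value)
    return spec


sink = parse_tags(["jc.step", "name=sink", "deps=a", "deps=b", "deps=c"])
print(sink)
```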

Tags round-trip losslessly through jupytext — verified by tests.

jc.figure accepts path-only invocation (1.3.2+): call jc.figure("artifacts/foo.png") with no fig= and an existing file on disk to register it as an artifact without re-encoding. See the agent guide for the full two-mode signature.

Cache-key semantics

A cell’s cache key is sha256(source, sorted_dep_keys, env_hash, MINOR_VERSION):

  • source — normalized (line endings \n; per-line trailing whitespace stripped; leading/trailing blank lines removed). So cosmetic edits don’t invalidate the cache.

  • sorted_dep_keys — the cache keys of every dep cell. Dep order doesn’t matter; content does.

  • env_hash — sha256 of the PEP-723 dependencies list (or the project’s lockfile hash when that’s wired up).

  • MINOR_VERSION — jellycell’s cache-key version counter. Bumps on cache/hashing algorithm changes (§10.2 contract). Independent of the package semver.

Change any of these → cache miss → re-execute.
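
The four ingredients can be combined roughly as follows. This is a sketch under stated assumptions: the way the fields are serialized before hashing (a JSON list here), the helper names, and the value MINOR_VERSION = 1 are all hypothetical, so the digests will not match jellycell's real cache keys.

```python
import hashlib
import json

MINOR_VERSION = 1  # hypothetical value, for illustration only


def normalize_source(source):
    """Normalize line endings, strip trailing whitespace and outer blank lines."""
    text = source.replace("\r\n", "\n").replace("\r", "\n")
    lines = [line.rstrip() for line in text.split("\n")]
    while lines and lines[0] == "":
        lines.pop(0)
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines)


def env_hash(dependencies):
    """sha256 over the PEP-723 dependencies list (serialization is assumed)."""
    return hashlib.sha256(json.dumps(dependencies).encode("utf-8")).hexdigest()


def cache_key(source, dep_keys, env, minor_version=MINOR_VERSION):
    """Hash normalized source, sorted dep keys, env hash, and version counter."""
    payload = json.dumps(
        [normalize_source(source), sorted(dep_keys), env, minor_version]
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


env = env_hash(["pandas"])
base = cache_key("summary = raw.describe()", ["k_raw", "k_env"], env)
cosmetic = cache_key("\nsummary = raw.describe()   \r\n\n", ["k_env", "k_raw"], env)
edited = cache_key("summary = raw.describe().T", ["k_raw", "k_env"], env)
print(base == cosmetic, base == edited)
```

In this sketch, whitespace-only edits and reordered deps leave the key unchanged, while a real source change, a different dependencies list, or a bumped version counter produce a new key.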

Idempotent round-trip

jellycell.format.parse followed by write reproduces canonical percent-format input byte for byte. The PEP-723 block is preserved verbatim: jupytext’s default behavior would mutate it, so jellycell strips the block and re-inserts it unchanged.

A lint-clean file read + written twice produces no diff.
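
The strip-and-re-insert idea can be illustrated with a toy example. Everything here is a stand-in: split_block, toy_format, and roundtrip are hypothetical names, and toy_format merely imitates a pass that might rewrite comment lines; it is not jupytext's actual behavior.

```python
import re

# Matches a "# /// script ... # ///" block at the very start of the text,
# including the newline that ends the closing line.
BLOCK = re.compile(r"(?m)^# /// script$\s(^#(| .*)$\s)+^# ///$\n?")


def split_block(text):
    """Split text into (verbatim block at the top, remaining body)."""
    match = BLOCK.match(text)
    if match is None:
        return "", text
    return text[: match.end()], text[match.end():]


def toy_format(text):
    """Hypothetical pass that rewrites comment lines by collapsing space runs."""
    return "".join(
        re.sub(r" {2,}", " ", line) if line.startswith("#") else line
        for line in text.splitlines(keepends=True)
    )


def roundtrip(text):
    """Set the block aside, process only the body, then re-insert the block."""
    block, body = split_block(text)
    return block + toy_format(body)


NOTEBOOK = '''\
# /// script
# dependencies = ["pandas"]
#
# [tool.jellycell]
# timeout_seconds = 1200     # override run.timeout_seconds for this file
# ///

# %% tags=["jc.step", "name=summary"]
summary = 1 + 1
'''

once = roundtrip(NOTEBOOK)
twice = roundtrip(once)
print(once == NOTEBOOK, twice == once, toy_format(NOTEBOOK) == NOTEBOOK)
```

Running the formatter over the whole file would alter the aligned comment inside the block, whereas the strip-and-re-insert path returns the input unchanged on both the first and second pass.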

Cell dataflow: artifacts vs in-memory

Cells can refer to values defined in earlier cells in the same kernel process. That works when every cell in a run executes fresh. But cached cells do not re-execute — their manifest is restored, not their code.

If notebook editing causes a partial cache invalidation (one cell changes, its dependents don’t), the runner will re-execute the changed cells in a kernel where the cached cells’ variables were never defined. Dependent re-executed cells that reference those in-memory variables will fail with NameError.

Rule: inter-cell data goes through jc.save + jc.load. That way the value is persisted as an artifact, and any re-executed cell reads it back from disk regardless of whether the producer ran or was a cache hit.

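# Both cells assume pd (pandas) and jc are already imported in the notebook.

# cell "raw"
df = pd.read_csv("data/input.csv")
jc.save(df, "artifacts/raw.parquet")  # persist the value as an artifact

# cell "summary" with deps=raw — safe even if "raw" was cached
df = jc.load("artifacts/raw.parquet")  # read from disk, not from kernel memory
summary = df.describe()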

The jellycell run CLI now warns when a single run mixes cache hits with re-executions; that warning is your cue to double-check that downstream cells use jc.load instead of reaching for variables in the kernel namespace.
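
The failure mode can be reproduced without jellycell at all. In the toy simulation below, run_cells skips cached cells the way a cache hit skips execution, and save/load are JSON-based stand-ins for jc.save/jc.load; all names and the cell sources are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save(value, path):
    """Stand-in for jc.save: persist a value to disk as JSON."""
    Path(path).write_text(json.dumps(value))


def load(path):
    """Stand-in for jc.load: read a value back from disk."""
    return json.loads(Path(path).read_text())


def run_cells(cells, cached=()):
    """Execute cells in a fresh namespace, skipping any that are cache hits."""
    namespace = {"save": save, "load": load}
    for name, source in cells:
        if name in cached:
            continue  # cache hit: the cell's code does not run
        exec(source, namespace)
    return namespace


artifact = str(Path(tempfile.mkdtemp()) / "raw.json")

in_memory = [
    ("raw", "values = [1, 2, 3]"),
    ("summary", "total = sum(values)"),  # reaches into kernel memory
]
via_artifact = [
    ("raw", f"values = [1, 2, 3]\nsave(values, {artifact!r})"),
    ("summary", f"total = sum(load({artifact!r}))"),  # reads the artifact
]

run_cells(via_artifact)  # first run: every cell executes, artifact is written

try:
    run_cells(in_memory, cached={"raw"})
    in_memory_error = None
except NameError as exc:
    in_memory_error = exc  # 'values' was never defined in this kernel

result = run_cells(via_artifact, cached={"raw"})
print(type(in_memory_error).__name__, result["total"])
```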

Applying [tool.jellycell] overrides at runtime

A notebook’s PEP-723 block may set a [tool.jellycell] table that overrides project-level config at file scope. Supported keys:

| Key | Effect |
| --- | --- |
| project.name / name | Display name for this notebook |
| run.kernel / kernel | Jupyter kernel override |
| run.timeout_seconds / timeout_seconds | Per-notebook cell timeout |

Unknown keys raise a lint error — typos like timeouts = 60 don’t silently no-op.
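
A minimal sketch of how such overrides could be layered over project config is shown below. It assumes the short forms (name, kernel, timeout_seconds) are aliases for the dotted keys in the table above, and it raises ValueError as a stand-in for the lint error; apply_overrides and the config dictionary layout are hypothetical.

```python
import copy

SUPPORTED = {("project", "name"), ("run", "kernel"), ("run", "timeout_seconds")}
SHORT_FORMS = {
    "name": ("project", "name"),
    "kernel": ("run", "kernel"),
    "timeout_seconds": ("run", "timeout_seconds"),
}


def iter_overrides(table):
    """Yield ((section, field), value) pairs from a [tool.jellycell] table."""
    for key, value in table.items():
        if isinstance(value, dict):
            # Dotted TOML keys such as run.kernel parse into nested tables.
            for sub_key, sub_value in value.items():
                yield (key, sub_key), sub_value
        else:
            yield SHORT_FORMS.get(key, (key,)), value


def apply_overrides(project_config, table):
    """Return a copy of project_config with file-scope overrides applied."""
    merged = copy.deepcopy(project_config)
    for path, value in iter_overrides(table):
        if path not in SUPPORTED:
            raise ValueError(f"unknown [tool.jellycell] key: {'.'.join(path)}")
        section, field = path
        merged.setdefault(section, {})[field] = value
    return merged


project = {
    "project": {"name": "demo"},
    "run": {"kernel": "python3", "timeout_seconds": 300},
}
effective = apply_overrides(project, {"timeout_seconds": 1200})
print(effective["run"])
```

Rejecting unknown keys up front is what keeps a typo such as timeouts = 60 from being silently ignored.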