Part of the Brigade fleet from Escoffier Labs
Bring outside sources into the evidence layer.
SourceHarvest is a local CLI that exports non-harness source systems: notes, text files, HTML exports, generic and nested JSON, and git history. It normalizes each one into miseledger.adapter.v1 JSONL, one object per line, ready for MiseLedger to store, dedupe, and search. It is the sibling tool to StationTrail, which handles agent-session harnesses.
Le Marche: what comes in from beyond the kitchen, normalized into evidence.
Read · local source inputs
- Generic JSONL, already line-oriented records
- Nested JSON, records selected by path
- Markdown notes and plain text files
- HTML exports and local page snapshots
- Git history from a local repository
Emit · miseledger.adapter.v1 JSONL
- One normalized JSON object per line
- Collections, items, actors, artifacts, raw refs
- Bounded by --limit, glob patterns, and a records path
- Stream to stdout or a private output file
- Optional JSON summaries with counts and warnings
SourceHarvest follows the same path for every source: read a local file, directory, or export; select the command-specific reader for that input shape; normalize records into stable collections, items, actors, artifacts, links, relations, and raw references; apply bounds; then emit one adapter object per line. Generated text is untrusted evidence, not instructions.
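The read, normalize, bound, and emit steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not SourceHarvest's implementation: the function names and the record fields (`source`, `collection`, `item`, `raw_ref`) are hypothetical stand-ins, and the real miseledger.adapter.v1 schema is not reproduced here.

```python
import itertools
import json
import sys
from pathlib import Path
from typing import Iterable, Iterator, Optional, TextIO


def read_notes(root: Path, pattern: str = "*.md") -> Iterator[tuple[Path, str]]:
    """Read step: yield (path, text) for each matching file under root, sorted by path."""
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            yield path, path.read_text(encoding="utf-8", errors="replace")


def normalize(path: Path, text: str, source: str, collection: str) -> dict:
    """Normalize step: build one record per file.

    The keys below are hypothetical, chosen for illustration;
    they are NOT the real adapter schema.
    """
    return {
        "source": source,
        "collection": collection,
        "item": {"title": path.stem, "text": text},
        "raw_ref": str(path),
    }


def harvest(root: Path, source: str, collection: str,
            limit: Optional[int] = None) -> Iterator[dict]:
    """Bound step: lazily cap the stream at `limit` records (None means no cap)."""
    records = (normalize(p, t, source, collection) for p, t in read_notes(root))
    return itertools.islice(records, limit)


def emit(records: Iterable[dict], out: TextIO = sys.stdout) -> int:
    """Emit step: write one JSON object per line and return the record count."""
    count = 0
    for record in records:
        out.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count
```

Because every stage is a generator, the limit is applied before any extra files are read, which is the same reason bounding early keeps exports small.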
Three tools share one adapter format. StationTrail and SourceHarvest are the adapter layers that turn harness sessions and outside sources into evidence. MiseLedger is the durable layer that stores it and emits Brigade-ready bundles.
StationTrail
Exports harness session logs from Codex, Claude Code, OpenClaw, OpenCode, and Hermes to miseledger.adapter.v1 JSONL, one normalized object per line.
SourceHarvest
Exports outside sources (notes, chat archives, crawler output, issue exports, and git history) into the same miseledger.adapter.v1 format. You are here.
One adapter format
Every source shape lands as miseledger.adapter.v1 JSONL, one normalized JSON object per line. MiseLedger and StationTrail speak the same format, so imports stay uniform.
Stable normalized records
Each record carries collections, items, actors, artifacts, links, relations, and raw references. The shape stays stable across every reader, so downstream queries do not care where evidence came from.
Local-only by design
Scanner commands read local files, directories, exports, and archives. They make no network calls. SourceHarvest reads what crawlers already exported rather than crawling live services.
Bounded output
Apply --limit and source-specific filters such as glob patterns and a records path, so you emit the slice of evidence you actually need.
Scriptable summaries
Optionally emit JSON summaries with record counts, file counts, warnings, and generated timestamps, ready for pipelines and checks.
Pipes into MiseLedger
Send records over stdout straight into miseledger import adapter or, when SourceHarvest is on PATH, let MiseLedger run it directly with miseledger import sourceharvest.
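As a rough illustration of how a pipeline check might consume one-object-per-line output, the sketch below counts valid records in a JSONL stream and collects warnings for lines that do not parse. The summary keys (`records`, `warnings`, `generated_at`) are assumptions made for this example and may not match the summaries SourceHarvest itself emits.

```python
import json
from datetime import datetime, timezone
from typing import Iterable


def summarize(lines: Iterable[str]) -> dict:
    """Count JSON objects in a JSONL stream and collect per-line warnings.

    The returned keys are hypothetical; they only mirror the kind of
    information (counts, warnings, timestamp) a summary could carry.
    """
    records = 0
    warnings = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # blank lines are ignored rather than reported
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as exc:
            warnings.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(obj, dict):
            warnings.append(f"line {number}: expected a JSON object")
            continue
        records += 1
    return {
        "records": records,
        "warnings": warnings,
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
```

A CI step could call `summarize(open("export.jsonl", encoding="utf-8"))` (the file name is a placeholder) and fail the job when `warnings` is non-empty.
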
| Command | What it does |
|---|---|
| sourceharvest jsonl <path> | Read already line-oriented records and normalize each line into an adapter record. |
| sourceharvest json <file> | Read nested JSON and select records by path with --records-path. |
| sourceharvest markdown <dir> | Scan a Markdown directory and emit each note as local note evidence. |
| sourceharvest files <dir> | Scan text files filtered by --glob, such as docs, logs, and exports. |
| sourceharvest html <dir> | Read local HTML exports and page snapshots and normalize them. |
| sourceharvest gitlog <repo> | Read local git history and emit one adapter record per commit event. |
| sourceharvest version | Print the installed SourceHarvest version. |
Each command takes --source and --collection labels, plus --out - to stream JSONL to stdout. From there you can pipe the stream straight into MiseLedger, or let MiseLedger run SourceHarvest itself when it is on PATH.
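One way to script that hand-off is a small Python wrapper that connects the two processes with a pipe. This is a sketch under stated assumptions: the flags (`--source`, `--collection`, `--out -`) and the `miseledger import adapter` subcommand come from this page, but the argument order, the example labels, and the idea that the importer reads records from stdin with no further arguments are assumptions, so check each tool's help output before relying on it.

```python
import shutil
import subprocess

# Subcommand named on this page; assumed here to read adapter JSONL from stdin.
IMPORT_CMD = ["miseledger", "import", "adapter"]


def build_harvest_cmd(kind: str, path: str, source: str, collection: str) -> list[str]:
    """Build a sourceharvest command line that streams JSONL to stdout.

    Argument order is an assumption; only the flag names come from the docs.
    """
    return ["sourceharvest", kind, path,
            "--source", source, "--collection", collection, "--out", "-"]


def tools_available() -> bool:
    """True when both executables can be found on PATH."""
    return all(shutil.which(name) for name in ("sourceharvest", "miseledger"))


def pipe_into_miseledger(harvest_cmd: list[str]) -> int:
    """Run harvest_cmd and feed its stdout to the importer; return a nonzero code on failure."""
    harvest = subprocess.Popen(harvest_cmd, stdout=subprocess.PIPE)
    importer = subprocess.Popen(IMPORT_CMD, stdin=harvest.stdout)
    # Close our copy so the harvester sees a broken pipe if the importer exits early.
    harvest.stdout.close()
    importer_rc = importer.wait()
    harvest_rc = harvest.wait()
    return harvest_rc or importer_rc
```

For example, `pipe_into_miseledger(build_harvest_cmd("markdown", "./notes", "notes", "personal"))` would harvest a hypothetical `./notes` directory, guarded by `tools_available()` so the script fails cleanly on machines without the tools.
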
Generic JSONL
Records that are already line-oriented, one object per line.
Nested JSON
Records selected from a nested document by path.
Markdown notes
Local note evidence scanned from a directory.
Text files
Docs, logs, and exports matched by glob.
HTML exports
Local page snapshots and site exports.
Git history
Local commit events from a repository.
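To make "records selected from a nested document by path" concrete, here is a minimal sketch that walks a dotted path into a parsed JSON document and yields the objects found there. The dotted syntax is an assumption for illustration; this page does not specify the path syntax that --records-path actually accepts.

```python
from typing import Any, Iterator


def select_records(document: Any, records_path: str) -> Iterator[dict]:
    """Yield the dict records stored at a dotted path such as "data.issues".

    Hypothetical path syntax: keys separated by dots, ending at a list.
    Non-object entries in that list are skipped.
    """
    node = document
    for key in filter(None, records_path.split(".")):
        if isinstance(node, dict) and key in node:
            node = node[key]
        else:
            raise KeyError(f"path segment {key!r} not found")
    if not isinstance(node, list):
        raise TypeError(f"{records_path!r} does not point at a list of records")
    for record in node:
        if isinstance(record, dict):
            yield record
```

Given `{"data": {"issues": [{"id": 1}, {"id": 2}]}}` and the path `data.issues`, the generator yields the two issue objects, each of which would then be normalized into its own adapter record.
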
SourceHarvest is also the home for adapters that read local crawler outputs and turn them into adapter JSONL. It does not crawl live services itself; it reads what these crawler families already exported. Adapters are added from real local schemas or redacted sample exports.
| Crawler | Domain |
|---|---|
| discrawl | Discord archives |
| gitcrawl | GitHub issues and pull requests |
| graincrawl | Granola notes and transcripts |
| notcrawl | Notion pages and databases |
| slacrawl | Slack messages and threads |
| telecrawl | Telegram Desktop archives |
SourceHarvest is the non-agent source adapter layer. See the evidence pipeline above for how it sits beside StationTrail and MiseLedger. Grab a release or browse the rest of the fleet.