v2.4.1 — Now with artifact diffing

Git-style versioning for ML experiments. Track hyperparameters, metric curves, and artifact hashes across every training run — searchable in milliseconds.

$ pip install track
track · bert-sentiment-project · 5 runs
↑ sort: f1 · 2026-02-24 18:27

| run_id | lr | batch | epochs | val_loss | F1 | GPU % | status |
|---|---|---|---|---|---|---|---|
| bert-finetune-v3 (★ BEST) | 3e-5 | 32 | 20/20 | 0.118 | 0.924 | - | complete |
| bert-finetune-v4 | 2e-5 | 32 | 12/20 | 0.142 | 0.891 | 94% | running |
| bert-finetune-v1 | 2e-5 | 32 | 20/20 | 0.189 | 0.878 | - | complete |
| bert-finetune-v2 | 5e-5 | 64 | 20/20 | 0.203 | 0.867 | - | complete |
| bert-base-sweep | 1e-4 | 16 | 6/20 | 0.441 | 0.712 | - | failed |

5 runs · 1 running · 3 complete · 1 failed
⌘K to search all runs
▤ comparison matrix

Don't take our word for it.

Verify every claim row by row. Each ✓ is backed by a code snippet below. No marketing copy — just receipts.

Legend: ✓ Full support · ◐ Partial · Not available

FEATURE · Track · MLflow · W&B · Neptune

Git-style experiment versioning
Diff any two runs like a git diff: param changes, metric deltas, artifact hashes

Artifact hash tracking
SHA-256 hash of every model checkpoint, dataset slice, and config file

Self-hosted (open source)
Run on your own infra; no data leaves your VPC

CLI-first workflow
Single pip install, zero config, works in any training script in 2 lines

Team collaboration & access control
Role-based permissions, shared experiment namespaces, comment threads on runs

CI/CD model registry integration
Promote model versions to staging/production via API or GitHub Actions

Free tier (self-hosted unlimited)
No run limits, no seat limits when self-hosted

Real-time metric streaming
Sub-second metric updates during training, no polling required

◐ = available with paid plan or significant configuration overhead · last verified 2026-02-24
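
To make the "Artifact hash tracking" row concrete, here is a minimal Python sketch of hashing an artifact with SHA-256 using only the standard library. It is an illustration, not track's internal implementation; the function names `sha256_of_file` and `artifact_changed` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read 1 MiB at a time so large checkpoints never sit fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def artifact_changed(old_path, new_path):
    """Return True when the two files have different SHA-256 digests."""
    return sha256_of_file(old_path) != sha256_of_file(new_path)
```

Comparing digests instead of file contents is what lets a diff report that a checkpoint changed without shipping the checkpoint itself.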

⌥ receipts for every claim

The table checks out.

ROW 01 · git-style versioning

Diff any two runs like `git diff`

See exactly what changed between run_c1e9b7 and run_a8f3d2 — every param, every metric delta, every artifact hash.

< 50ms
to retrieve full run diff, p99
Experiment Versioning
terminal · track v2.4.1
# Reproduce a result from 6 months ago
$ track diff run_c1e9b7 run_a8f3d2
# Output:
+ learning_rate: 2e-5 → 3e-5
+ val_loss: 0.142 → 0.118 (−16.9%)
+ f1_score: 0.891 → 0.924 (+3.7%)
− batch_size: 32 → 32 (unchanged)
# Artifact diff:
~ model.pt: sha256 a3f8... → b7c2...
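
For a sense of what a run diff involves, here is a minimal Python sketch that compares two runs stored as plain dictionaries and reports parameter changes and relative metric deltas. It is not track's implementation: `diff_runs` and the dictionary layout are assumptions made for illustration, and the numbers are taken from the sample output above.

```python
def diff_runs(old, new):
    """Compare two runs given as {"params": {...}, "metrics": {...}} dicts."""
    lines = []
    # Parameters: mark changed values with "~" and unchanged ones with a space.
    for key in sorted(set(old["params"]) | set(new["params"])):
        a, b = old["params"].get(key), new["params"].get(key)
        marker = " " if a == b else "~"
        lines.append(f"{marker} {key}: {a} -> {b}")
    # Metrics: report the relative change in percent (nan if the base is 0).
    for key in sorted(set(old["metrics"]) & set(new["metrics"])):
        a, b = old["metrics"][key], new["metrics"][key]
        pct = (b - a) / a * 100 if a else float("nan")
        lines.append(f"~ {key}: {a} -> {b} ({pct:+.1f}%)")
    return lines


run_a = {"params": {"learning_rate": 2e-5, "batch_size": 32},
         "metrics": {"val_loss": 0.142, "f1_score": 0.891}}
run_b = {"params": {"learning_rate": 3e-5, "batch_size": 32},
         "metrics": {"val_loss": 0.118, "f1_score": 0.924}}

for line in diff_runs(run_a, run_b):
    print(line)
```

The relative deltas it computes match the percentages in the sample: 0.142 to 0.118 is a 16.9% drop in val_loss, and 0.891 to 0.924 is a 3.7% gain in F1.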
ROW 04 · CLI-first workflow

2 lines. Any training script. Zero config.

Drop into any PyTorch, TensorFlow, or JAX script. No YAML files, no dashboard login, no SDK wrapper classes.

2 lines
to instrument any training script
CLI Speed
train.py · track v2.4.1
# Your existing training script
import torch
import track
track.init("bert-finetune-v4")
# Inside your training loop:
track.log({"val_loss": val_loss, "f1": f1_score})
# That's it. Every epoch is versioned.
ROW 06 · CI/CD model registry

Promote models to production via API

Wire your model registry into GitHub Actions, GitLab CI, or any HTTP client. One API call promotes a run to staging or production.

1 API call
to promote model to production
CI/CD Integration
promote.yml · track v2.4.1
# .github/workflows/promote.yml
name: Promote Best Model
on: [push]
jobs:
  promote:  # job name and runner are illustrative
    runs-on: ubuntu-latest
    steps:
      - name: Install track
        run: pip install track
      - name: Promote to production
        run: |
          track registry promote \
            --run run_c1e9b7 \
            --stage production \
            --min-f1 0.92
# Fails CI if F1 < threshold
✓ Model promoted: bert-finetune-v3@production
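
The threshold gate is simple to reason about. The Python sketch below shows the idea behind "fails CI if F1 < threshold": return a non-zero exit code when the metric is below the minimum. It is a hedged illustration only; `gate_exit_code` is a hypothetical helper, not part of track, and it assumes a run exactly at the threshold passes.

```python
def gate_exit_code(f1, min_f1=0.92):
    """Return 0 when F1 meets the threshold, 1 otherwise.

    CI systems treat a non-zero exit code as a failed step, so returning 1
    blocks the promotion.
    """
    if f1 < min_f1:
        print(f"F1 {f1:.3f} is below threshold {min_f1:.2f}; not promoting")
        return 1
    print(f"F1 {f1:.3f} meets threshold {min_f1:.2f}; safe to promote")
    return 0


# F1 values taken from the dashboard at the top of the page.
best_run = gate_exit_code(0.924)     # passes the 0.92 gate
running_run = gate_exit_code(0.891)  # blocked by the 0.92 gate
```

In a real pipeline the return value would be passed to `sys.exit` so the CI step fails when the gate does.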
✦ from the terminal logs of real engineers

The relief of total recall.

8-month-old run reproduced in 3s

I reproduced a NeurIPS result from 8 months ago at 1:47 AM the night before the rebuttal deadline. track diff found the exact learning rate schedule in 3 seconds. That's the whole product pitch right there.

Dr. Priya Chandrasekaran
Research Scientist · DeepMind London
0.2ms overhead per training step

We had 47 spreadsheet tabs of hyperparameter grids. I spent 4 hours migrating to track on a Friday afternoon. By Monday, the whole team was using it. The CLI is stupid fast — `track log` adds 0.2ms per step.

Marcus Okonkwo
Senior ML Engineer · Cohere · Toronto
3 regressions caught before production last quarter

Our MLOps pipeline promotes models to production via `track registry promote`. Failed experiments now block CI automatically if val_loss regresses. We caught 3 silent regressions last quarter that would have shipped.

Sofia Bergström
MLOps Lead · Klarna · Stockholm
↓ install the CLI

Zero config.
Total recall.

One package install. Works with PyTorch, TensorFlow, JAX, and any Python training script. Your first run is tracked in under 60 seconds.

$ pip install track

Python 3.8+ · MIT License · 2.3MB install · no system dependencies

✓ Self-hosted free forever · ✓ No run limits · ✓ MIT licensed · ✓ Works offline