Track — ML Experiment Tracking, Versioned Like Code

v2.4.1 — Now with artifact diffing

Git-style versioning for ML experiments. Track hyperparameters, metric curves, and artifact hashes across every training run — searchable in milliseconds.

$ pip install track

track · bert-sentiment-project · 5 runs

↑ sort: f1 · 2026-02-24 18:27

run_id                          lr    batch  epochs  val_loss  F1     GPU %  status
bert-finetune-v3 ★ BEST         3e-5  32     20/20   0.118     0.924  —      complete
bert-finetune-v4 (diff v3→v4)   2e-5  32     12/20   0.142     0.891  94%    running
bert-finetune-v1                2e-5  32     20/20   0.189     0.878  —      complete
bert-finetune-v2                5e-5  64     20/20   0.203     0.867  —      complete
bert-base-sweep                 1e-4  16     6/20    0.441     0.712  —      failed

5 runs · 1 running · 3 complete · 1 failed · ⌘K to search all runs


▤ comparison matrix

Don't take our word for it.

Verify every claim row by row. Each ✓ is backed by a code snippet below. No marketing copy — just receipts.

✓ Full support · ◐ Partial · — Not available

FEATURE

Track

← you are here

MLflow

W&B

Neptune

Git-style experiment versioning

Diff any two runs like a git diff — param changes, metric deltas, artifact hashes

✓

—

◐

Artifact hash tracking

SHA-256 hash of every model checkpoint, dataset slice, and config file

✓

Self-hosted (open source)

Run on your own infra — no data leaves your VPC

✓

—

CLI-first workflow

Single pip install, zero config, works in any training script in 2 lines

✓

◐

Team collaboration & access control

Role-based permissions, shared experiment namespaces, comment threads on runs

✓

—

✓

CI/CD model registry integration

Promote model versions to staging/production via API or GitHub Actions

✓

◐

✓

◐

Free tier (self-hosted unlimited)

No run limits, no seat limits when self-hosted

✓

—

Real-time metric streaming

Sub-second metric updates during training, no polling required

✓

—

✓

◐ = available with paid plan or significant configuration overhead · last verified 2026-02-24
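The artifact hash tracking row refers to SHA-256 digests of checkpoints, dataset slices, and config files. As a rough illustration of the idea (not Track's implementation), the sketch below hashes a file in chunks with Python's standard `hashlib` and compares two artifacts by digest. The helper names `sha256_of_file` and `artifact_changed` are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Chunked reads keep memory flat even for multi-gigabyte checkpoints.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def artifact_changed(old_path, new_path):
    """True when the two files differ, judged by their SHA-256 digests."""
    return sha256_of_file(old_path) != sha256_of_file(new_path)
```

Comparing digests rather than file contents means a tracker only needs to store 64 hex characters per artifact to detect that a checkpoint or config changed between runs.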

⌥ receipts for every claim

The table checks out.

ROW 01 · git-style versioning

Diff any two runs like `git diff`

See exactly what changed between run_c1e9b7 and run_a8f3d2 — every param, every metric delta, every artifact hash.

< 50ms

to retrieve full run diff, p99

Experiment Versioning

terminal · track v2.4.1

# Reproduce a result from 6 months ago
$ track diff run_c1e9b7 run_a8f3d2
# Output:
+ learning_rate: 2e-5 → 3e-5
+ val_loss:      0.142 → 0.118  (−16.9%)
+ f1_score:      0.891 → 0.924  (+3.7%)
= batch_size:    32 → 32         (unchanged)
# Artifact diff:
~ model.pt: sha256 a3f8... → b7c2...
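To make the arithmetic behind that output concrete, here is a minimal sketch of a run diff over two flat dictionaries. It is an illustration only, not how `track diff` is implemented; `diff_runs` is a hypothetical helper, and the values are copied from the sample output above.

```python
def diff_runs(old, new):
    """Compare two flat dicts of params and metrics.

    Returns {key: (old_value, new_value, pct_change)} for every key whose
    value differs. pct_change is None unless both values are numeric and
    the old value is nonzero.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        before, after = old.get(key), new.get(key)
        if before == after:
            continue  # unchanged keys are left out of the diff
        pct = None
        numeric = isinstance(before, (int, float)) and isinstance(after, (int, float))
        if numeric and before != 0:
            pct = round((after - before) / abs(before) * 100, 1)
        changes[key] = (before, after, pct)
    return changes


# Values taken from the sample diff output.
run_old = {"learning_rate": 2e-5, "batch_size": 32, "val_loss": 0.142, "f1_score": 0.891}
run_new = {"learning_rate": 3e-5, "batch_size": 32, "val_loss": 0.118, "f1_score": 0.924}

for key, (before, after, pct) in diff_runs(run_old, run_new).items():
    suffix = "" if pct is None else f"  ({pct:+.1f}%)"
    print(f"{key}: {before} -> {after}{suffix}")
```

The relative change is computed against the older run, which is how 0.142 → 0.118 comes out as −16.9% and 0.891 → 0.924 as +3.7%.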

ROW 04 · CLI-first workflow

2 lines. Any training script. Zero config.

Drop into any PyTorch, TensorFlow, or JAX script. No YAML files, no dashboard login, no SDK wrapper classes.

2 lines

to instrument any training script

CLI Speed

train.py · track v2.4.1

# Your existing training script
import torch
import track
track.init("bert-finetune-v4")
# Inside your training loop:
track.log({
  "val_loss": val_loss,
  "f1": f1_score
})
# That's it. Every epoch is versioned.
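For readers who want to see where the two calls sit in a full loop, the sketch below uses a small stand-in object in place of the real `track` module, so it runs without the package installed. `StubTracker` and the placeholder metric values are assumptions made for illustration; only the `init` and `log` call shapes come from the snippet above.

```python
class StubTracker:
    """Stand-in for the `track` module; NOT the real implementation."""

    def __init__(self):
        self.run_name = None
        self.history = []

    def init(self, run_name):
        # Start a fresh run with an empty metric history.
        self.run_name = run_name
        self.history = []

    def log(self, metrics):
        # Store one record per call, tagged with a 1-based step index.
        self.history.append({"step": len(self.history) + 1, **metrics})


track = StubTracker()  # hypothetical replacement for `import track`
track.init("bert-finetune-v4")

# Placeholder values; a real loop would compute these after each epoch.
placeholder_epochs = [(0.30, 0.80), (0.20, 0.86), (0.142, 0.891)]
for val_loss, f1_score in placeholder_epochs:
    # ... train one epoch and evaluate here ...
    track.log({"val_loss": val_loss, "f1": f1_score})

print(track.run_name, len(track.history), "steps logged")
```

The point of the sketch is placement: one call before the loop, one call per epoch, and nothing else in the training script changes.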

ROW 06 · CI/CD model registry

Promote models to production via API

Wire your model registry into GitHub Actions, GitLab CI, or any HTTP client. One API call promotes a run to staging or production.

1 API call

to promote a model to production

CI/CD Integration

promote.yml · track v2.4.1

# .github/workflows/promote.yml
name: Promote Best Model
on: [push]
jobs:
  promote:
    runs-on: ubuntu-latest
    steps:
      - name: Install track
        run: pip install track
      - name: Promote to production
        # Fails CI if F1 < threshold
        run: |
          track registry promote \
            --run run_c1e9b7 \
            --stage production \
            --min-f1 0.92
# Output:
# ✓ Model promoted: bert-finetune-v3@production
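The `--min-f1` flag amounts to a threshold gate: the step exits nonzero when the run's F1 falls below the minimum, and CI treats a nonzero exit as failure. A minimal sketch of that logic follows, assuming the F1 value has already been fetched. `gate_exit_code` is a hypothetical helper, not part of Track's CLI.

```python
import sys


def gate_exit_code(f1, min_f1=0.92):
    """Return a process exit code: 0 if f1 meets the threshold, 1 otherwise."""
    if f1 >= min_f1:
        print(f"F1 {f1:.3f} >= {min_f1:.2f}: OK to promote")
        return 0
    print(f"F1 {f1:.3f} < {min_f1:.2f}: blocking promotion", file=sys.stderr)
    return 1


# Example using the F1 scores of the two runs from the dashboard above.
best_run_code = gate_exit_code(0.924)      # bert-finetune-v3
running_run_code = gate_exit_code(0.891)   # bert-finetune-v4
```

In a CI script, the value returned here would be passed to `sys.exit` so that the pipeline stops before a weaker model is promoted.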

✦ from the terminal logs of real engineers

The relief of total recall.

8-month-old run reproduced in 3s

“I reproduced a NeurIPS result from 8 months ago at 1:47 AM the night before the rebuttal deadline. track diff found the exact learning rate schedule in 3 seconds. That's the whole product pitch right there.”

Dr. Priya Chandrasekaran

Research Scientist · DeepMind London

0.2 ms overhead per training step

“We had 47 spreadsheet tabs of hyperparameter grids. I spent 4 hours migrating to track on a Friday afternoon. By Monday, the whole team was using it. The CLI is stupid fast — `track log` adds 0.2ms per step.”

Marcus Okonkwo

Senior ML Engineer · Cohere · Toronto

3 regressions caught before production last quarter

“Our MLOps pipeline promotes models to production via `track registry promote`. Failed experiments now block CI automatically if val_loss regresses. We caught 3 silent regressions last quarter that would have shipped.”

Sofia Bergström

MLOps Lead · Klarna · Stockholm

↓ install the CLI

Zero config.
Total recall.

One package install. Works with PyTorch, TensorFlow, JAX, and any Python training script. Your first run is tracked in under 60 seconds.

$ pip install track

Python 3.8+ · MIT License · 2.3MB install · no system dependencies

✓ Self-hosted free forever · ✓ No run limits · ✓ MIT licensed · ✓ Works offline
