Statistical Training

Fit a cuvis-ai pipeline with StatisticalTrainer: it accumulates background moments (mean, covariance, histograms) in a single pass over the data, with no gradient steps.

Goal

Produce a saved, ready-to-run pipeline whose statistical nodes have been initialised from data. The resulting pipeline can be replayed with restore-pipeline.

Prerequisites

  • A pipeline with at least one statistical node (RX, PCA, NormalizeFromStats, …).
  • A datamodule that produces unlabelled training data (typically SingleCu3sDataModule or MultiCu3sDataModule).
  • Optional: read the Concepts → Training page if you want the conceptual model behind the trainer.

Recipe

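from cuvis_ai_core.trainer import StatisticalTrainer
from cuvis_ai_core.pipeline import Pipeline
from cuvis_ai_core.data.datamodule import SingleCu3sDataModule

# Load a pipeline definition that contains at least one statistical node (here: RX).
pipeline = Pipeline.from_yaml("configs/pipeline/anomaly/rx/rx_statistical.yaml")
# Datamodule that serves unlabelled training data from a single .cu3s file.
datamodule = SingleCu3sDataModule(cu3s_file_path="data/Lentils/Demo_000.cu3s")

# Single pass over the data: statistics are accumulated, no gradient steps are taken.
trainer = StatisticalTrainer()
trainer.fit(pipeline=pipeline, datamodule=datamodule)

# Persist the fitted pipeline so it can be replayed later with restore-pipeline.
pipeline.save("artifacts/rx_statistical_fitted.yaml")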

What happens under the hood

  1. The trainer collects every node whose execution_stages includes STATISTICAL.
  2. For each batch, it calls statistical_initialization(batch) on every collected node.
  3. After the pass, each node finalises its accumulated stats (covariance inversion, normalisation, etc.).
  4. Saving the fitted pipeline (pipeline.save in the recipe) writes a YAML with TRAINABLE_BUFFERS populated.
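
Steps 1 to 3 can be sketched in plain Python. This is an illustrative toy, not the cuvis-ai implementation: MomentNode, PassThroughNode, fit_statistical and the finalize method name are hypothetical, and only the statistical_initialization(batch) call and the STATISTICAL stage name are taken from this page. The node keeps running sums per batch and turns them into a mean and a sample covariance once the pass is over.

```python
STATISTICAL = "STATISTICAL"  # stage name from this page; a plain string here for illustration


class MomentNode:
    """Hypothetical statistical node: accumulates per-channel mean and covariance."""

    execution_stages = {STATISTICAL}

    def __init__(self, channels):
        self.channels = channels
        self.n = 0
        self._sum = [0.0] * channels
        self._outer = [[0.0] * channels for _ in range(channels)]
        self.mean = None
        self.cov = None

    def statistical_initialization(self, batch):
        # batch: iterable of pixels, each a sequence of `channels` numbers.
        for x in batch:
            self.n += 1
            for i in range(self.channels):
                self._sum[i] += x[i]
                for j in range(self.channels):
                    self._outer[i][j] += x[i] * x[j]

    def finalize(self):
        # Assumed name for the "finalise accumulated stats" step.
        c, n = self.channels, self.n
        self.mean = [s / n for s in self._sum]
        self.cov = [
            [(self._outer[i][j] - n * self.mean[i] * self.mean[j]) / (n - 1)
             for j in range(c)]
            for i in range(c)
        ]


class PassThroughNode:
    """Hypothetical node with no statistical stage; the trainer loop skips it."""

    execution_stages = set()


def fit_statistical(nodes, batches):
    # Step 1: collect the nodes that take part in the statistical stage.
    stat_nodes = [n for n in nodes if STATISTICAL in n.execution_stages]
    # Step 2: a single pass over the data, no gradient steps.
    for batch in batches:
        for node in stat_nodes:
            node.statistical_initialization(batch)
    # Step 3: finalise the accumulated statistics.
    for node in stat_nodes:
        node.finalize()
    return stat_nodes


# Two batches of 2-channel pixels.
batches = [[(1.0, 2.0), (3.0, 6.0)], [(5.0, 10.0)]]
rx_like = MomentNode(channels=2)
fitted = fit_statistical([PassThroughNode(), rx_like], batches)
```

After the call, rx_like.mean and rx_like.cov hold the fitted statistics. A real RX node would additionally invert the covariance during finalisation, as step 3 notes; that is left out here to keep the sketch dependency-free.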

Common variations