
Status: Needs Review

This page has not been reviewed for accuracy and completeness. Content may be outdated or contain errors.


Training API

Training-related components including losses and metrics.

Overview

Training functionality in CUVIS.AI is provided through loss functions, metrics, and monitoring nodes that integrate with PyTorch Lightning.
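
For orientation, here is a minimal sketch of constructing a loss node and a metric node with the parameters documented below. The import paths follow the source file locations shown on this page (cuvis_ai/node/losses.py and cuvis_ai/node/metrics.py); how the nodes are wired into a graph depends on your pipeline setup and is omitted here.

# Minimal sketch, not a complete training setup.
from cuvis_ai.node.losses import AnomalyBCEWithLogits
from cuvis_ai.node.metrics import AnomalyDetectionMetrics

bce_loss = AnomalyBCEWithLogits(weight=1.0, pos_weight=5.0)  # up-weight rare anomalies
metrics = AnomalyDetectionMetrics()  # runs only in val/test stages by default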


Loss Functions

losses

Loss nodes for training pipeline (port-based architecture).

LossNode

LossNode(**kwargs)

Bases: Node

Base class for loss nodes that restricts execution to training stages.

Loss nodes should not execute during inference - only during train, val, and test.

Source code in cuvis_ai/node/losses.py
def __init__(self, **kwargs) -> None:
    # Loss nodes always execute in train/val/test; execution_stages cannot be overridden
    assert "execution_stages" not in kwargs, (
        "Loss nodes can only execute in train, val, and test stages."
    )

    super().__init__(
        execution_stages={
            ExecutionStage.TRAIN,
            ExecutionStage.VAL,
            ExecutionStage.TEST,
        },
        **kwargs,
    )

OrthogonalityLoss

OrthogonalityLoss(weight=1.0, **kwargs)

Bases: LossNode

Orthogonality regularization loss for TrainablePCA.

Encourages PCA components to remain orthonormal during training. Loss = weight * ||W @ W.T - I||^2_F

Parameters:

weight : float, default 1.0
    Weight for orthogonality loss.
Source code in cuvis_ai/node/losses.py
def __init__(self, weight: float = 1.0, **kwargs) -> None:
    self.weight = weight

    super().__init__(
        weight=weight,
        **kwargs,
    )
forward
forward(components, **_)

Compute weighted orthogonality loss from PCA components.

Parameters:

components : Tensor, required
    PCA components matrix [n_components, n_features].

Returns:

dict[str, Tensor]
    Dictionary with "loss" key containing weighted loss.

Source code in cuvis_ai/node/losses.py
def forward(self, components: Tensor, **_: Any) -> dict[str, Tensor]:
    """Compute weighted orthogonality loss from PCA components.

    Parameters
    ----------
    components : Tensor
        PCA components matrix [n_components, n_features]

    Returns
    -------
    dict[str, Tensor]
        Dictionary with "loss" key containing weighted loss
    """
    # Compute gram matrix: W @ W.T
    gram = components @ components.T

    # Target: identity matrix
    n_components = components.shape[0]
    eye = torch.eye(
        n_components,
        device=components.device,
        dtype=components.dtype,
    )

    # Frobenius norm of difference
    orth_loss = torch.sum((gram - eye) ** 2)

    return {"loss": self.weight * orth_loss}

AnomalyBCEWithLogits

AnomalyBCEWithLogits(
    weight=1.0, pos_weight=None, reduction="mean", **kwargs
)

Bases: LossNode

Binary cross-entropy loss for anomaly detection with logits.

Computes BCE loss between predicted anomaly scores and ground truth masks. Uses BCEWithLogitsLoss for numerical stability.

Parameters:

weight : float, default 1.0
    Overall weight for this loss component.
pos_weight : float, optional, default None
    Weight for the positive (anomaly) class to handle class imbalance.
reduction : str, default 'mean'
    Reduction method: 'mean', 'sum', or 'none'.
Source code in cuvis_ai/node/losses.py
def __init__(
    self,
    weight: float = 1.0,
    pos_weight: float | None = None,
    reduction: str = "mean",
    **kwargs,
) -> None:
    self.weight = weight
    self.pos_weight = pos_weight
    self.reduction = reduction

    super().__init__(
        weight=weight,
        pos_weight=pos_weight,
        reduction=reduction,
        **kwargs,
    )

    # Create loss function
    if pos_weight is not None:
        pos_weight_tensor = torch.tensor([pos_weight])
        self.register_buffer("_pos_weight", pos_weight_tensor)
        self.loss_fn = nn.BCEWithLogitsLoss(
            pos_weight=self._pos_weight,
            reduction=reduction,
        )
    else:
        self.loss_fn = nn.BCEWithLogitsLoss(reduction=reduction)
forward
forward(predictions, targets, **_)

Compute weighted BCE loss.

Parameters:

predictions : Tensor, required
    Predicted scores [B, H, W, 1].
targets : Tensor, required
    Ground truth masks [B, H, W, 1].

Returns:

dict[str, Tensor]
    Dictionary with "loss" key containing scalar loss.

Source code in cuvis_ai/node/losses.py
def forward(self, predictions: Tensor, targets: Tensor, **_: Any) -> dict[str, Tensor]:
    """Compute weighted BCE loss.

    Parameters
    ----------
    predictions : Tensor
        Predicted scores [B, H, W, 1]
    targets : Tensor
        Ground truth masks [B, H, W, 1]

    Returns
    -------
    dict[str, Tensor]
        Dictionary with "loss" key containing scalar loss
    """
    # Squeeze channel dimension to [B, H, W] for BCEWithLogitsLoss
    if predictions.dim() == 4 and predictions.shape[-1] == 1:
        predictions = predictions.squeeze(-1)

    if targets.dim() == 4 and targets.shape[-1] == 1:
        targets = targets.squeeze(-1)

    # Convert labels to float
    targets = targets.float()

    # Compute loss
    loss = self.loss_fn(predictions, targets)

    # Apply weight
    weighted_loss = self.weight * loss

    return {"loss": weighted_loss}

MSEReconstructionLoss

MSEReconstructionLoss(
    weight=1.0, reduction="mean", **kwargs
)

Bases: LossNode

Mean squared error reconstruction loss.

Computes MSE between reconstruction and target. Useful for autoencoder-style architectures.

Parameters:

weight : float, default 1.0
    Weight for this loss component.
reduction : str, default 'mean'
    Reduction method: 'mean', 'sum', or 'none'.
Source code in cuvis_ai/node/losses.py
def __init__(self, weight: float = 1.0, reduction: str = "mean", **kwargs) -> None:
    self.weight = weight
    self.reduction = reduction
    # Extract Node base parameters from kwargs to avoid duplication
    super().__init__(
        weight=weight,
        reduction=reduction,
        **kwargs,
    )
    self.loss_fn = nn.MSELoss(reduction=reduction)
forward
forward(reconstruction, target, **_)

Compute MSE reconstruction loss.

Parameters:

reconstruction : Tensor, required
    Reconstructed data.
target : Tensor, required
    Target for reconstruction.
**_ : Any
    Additional arguments (e.g., context); ignored but accepted for compatibility.

Returns:

dict[str, Tensor]
    Dictionary with "loss" key containing scalar loss.

Source code in cuvis_ai/node/losses.py
def forward(self, reconstruction: Tensor, target: Tensor, **_: Any) -> dict[str, Tensor]:
    """Compute MSE reconstruction loss.

    Parameters
    ----------
    reconstruction : Tensor
        Reconstructed data
    target : Tensor
        Target for reconstruction
    **_ : Any
        Additional arguments (e.g., context) - ignored but accepted for compatibility

    Returns
    -------
    dict[str, Tensor]
        Dictionary with "loss" key containing scalar loss
    """
    # Ensure consistent shapes
    if target.shape != reconstruction.shape:
        raise ValueError(
            f"Shape mismatch: reconstruction {reconstruction.shape} vs target {target.shape}"
        )

    # Compute loss
    loss = self.loss_fn(reconstruction, target)

    # Apply weight
    return {"loss": self.weight * loss}

DistinctnessLoss

DistinctnessLoss(weight=0.1, eps=1e-06, **kwargs)

Bases: LossNode

Repulsion loss encouraging different selectors to choose different bands.

This loss is designed for band/channel selector nodes that output a 2D weight matrix [output_channels, input_channels]. It computes the mean pairwise cosine similarity between all pairs of selector weight vectors and penalizes high similarity:

L_repel = (1 / N_pairs) * sum_{i < j} cos(w_i, w_j)

Minimizing this loss encourages selectors to focus on different bands, preventing the common failure mode where all channels collapse onto the same band.

Parameters:

weight : float, default 0.1
    Overall weight for this loss component.
eps : float, default 1e-6
    Small constant for numerical stability when normalizing.
Source code in cuvis_ai/node/losses.py
def __init__(self, weight: float = 0.1, eps: float = 1e-6, **kwargs) -> None:
    self.weight = float(weight)
    self.eps = float(eps)

    super().__init__(weight=self.weight, eps=self.eps, **kwargs)
forward
forward(selection_weights, **_)

Compute mean pairwise cosine similarity penalty.

Parameters:

selection_weights : Tensor, required
    Weight matrix of shape [output_channels, input_channels].

Returns:

dict[str, Tensor]
    Dictionary with a single key "loss" containing the scalar loss.

Source code in cuvis_ai/node/losses.py
def forward(self, selection_weights: Tensor, **_: Any) -> dict[str, Tensor]:
    """Compute mean pairwise cosine similarity penalty.

    Parameters
    ----------
    selection_weights : Tensor
        Weight matrix of shape [output_channels, input_channels].

    Returns
    -------
    dict[str, Tensor]
        Dictionary with a single key ``"loss"`` containing the scalar loss.
    """
    # Normalize each selector vector to unit length
    w = selection_weights
    w_norm = F.normalize(w, p=2, dim=-1, eps=self.eps)  # [C, T]

    num_channels = w_norm.shape[0]
    if num_channels < 2:
        # Nothing to compare - no repulsion needed
        return {"loss": torch.zeros((), device=w_norm.device, dtype=w_norm.dtype)}

    # Compute all pairwise cosine similarities using matrix multiplication (optimized)
    similarity_matrix = w_norm @ w_norm.T  # [C, C] matrix of cosine similarities

    # Extract upper triangular part (i < j pairs), excluding diagonal
    upper_tri = torch.triu(similarity_matrix, diagonal=1)

    # Compute mean of non-zero elements (i < j pairs)
    mean_cos = upper_tri[upper_tri != 0].mean()

    # Minimize mean cosine similarity (repulsion)
    loss = self.weight * mean_cos
    return {"loss": loss}

SelectorEntropyRegularizer

SelectorEntropyRegularizer(
    weight=0.01, target_entropy=None, eps=1e-06, **kwargs
)

Bases: LossNode

Entropy regularization for SoftChannelSelector.

Encourages exploration by penalizing low-entropy (over-confident) selections. Computes entropy from selection weights and applies regularization.

Higher entropy corresponds to a more uniform selection (encouraged early in training); lower entropy corresponds to a more peaked selection (which emerges naturally as training progresses).

Parameters:

weight : float, default 0.01
    Weight for entropy regularization. A positive weight encourages exploration
    (maximizes entropy); a negative weight encourages exploitation (minimizes entropy).
target_entropy : float, optional, default None
    Target entropy for regularization. If set, the loss becomes the squared error
    (entropy - target)^2.
eps : float, default 1e-6
    Small constant for numerical stability.
Source code in cuvis_ai/node/losses.py
def __init__(
    self,
    weight: float = 0.01,
    target_entropy: float | None = None,
    eps: float = 1e-6,
    **kwargs,
) -> None:
    self.weight = weight
    self.target_entropy = target_entropy
    self.eps = eps

    super().__init__(
        weight=weight,
        target_entropy=target_entropy,
        eps=eps,
        **kwargs,
    )
forward
forward(weights, **_)

Compute entropy regularization loss from selection weights.

Parameters:

weights : Tensor, required
    Channel selection weights [n_channels].

Returns:

dict[str, Tensor]
    Dictionary with "loss" key containing regularization loss.

Source code in cuvis_ai/node/losses.py
def forward(self, weights: Tensor, **_: Any) -> dict[str, Tensor]:
    """Compute entropy regularization loss from selection weights.

    Parameters
    ----------
    weights : Tensor
        Channel selection weights [n_channels]

    Returns
    -------
    dict[str, Tensor]
        Dictionary with "loss" key containing regularization loss
    """
    # Normalize weights to probabilities
    probs = weights / (weights.sum() + self.eps)

    # Compute entropy: -sum(p * log(p))
    entropy = -(probs * torch.log(probs + self.eps)).sum()

    # Compute loss
    if self.target_entropy is not None:
        # Target-based regularization: minimize distance to target
        loss = (entropy - self.target_entropy) ** 2
    else:
        # Simple regularization:
        # maximize (positive weight) or minimize (negative weight) entropy
        loss = -entropy

    # Apply weight
    return {"loss": self.weight * loss}

SelectorDiversityRegularizer

SelectorDiversityRegularizer(weight=0.01, **kwargs)

Bases: LossNode

Diversity regularization for SoftChannelSelector.

Encourages diverse channel selection by penalizing concentration on few channels. Uses negative variance to encourage spread (higher variance = more diverse).

Parameters:

weight : float, default 0.01
    Weight for diversity regularization.
Source code in cuvis_ai/node/losses.py
def __init__(self, weight: float = 0.01, **kwargs) -> None:
    self.weight = weight
    super().__init__(
        weight=weight,
        **kwargs,
    )
forward
forward(weights, **_)

Compute weighted diversity loss from selection weights.

Parameters:

weights : Tensor, required
    Channel selection weights [n_channels].

Returns:

dict[str, Tensor]
    Dictionary with "loss" key containing weighted loss.

Source code in cuvis_ai/node/losses.py
def forward(self, weights: Tensor, **_: Any) -> dict[str, Tensor]:
    """Compute weighted diversity loss from selection weights.

    Parameters
    ----------
    weights : Tensor
        Channel selection weights [n_channels]

    Returns
    -------
    dict[str, Tensor]
        Dictionary with "loss" key containing weighted loss
    """
    # Compute variance of weights (high variance = diverse selection)
    mean_weight = weights.mean()
    variance = ((weights - mean_weight) ** 2).mean()

    # Return negative variance (minimizing loss = maximizing variance = maximizing diversity)
    diversity_loss = -variance

    return {"loss": self.weight * diversity_loss}

DeepSVDDSoftBoundaryLoss

DeepSVDDSoftBoundaryLoss(nu=0.05, weight=1.0, **kwargs)

Bases: LossNode

Soft-boundary Deep SVDD objective operating on BHWD embeddings.

Source code in cuvis_ai/node/losses.py
def __init__(self, nu: float = 0.05, weight: float = 1.0, **kwargs) -> None:
    if not (0.0 < nu < 1.0):
        raise ValueError("nu must be in (0, 1)")
    self.nu = float(nu)
    self.weight = float(weight)

    super().__init__(nu=self.nu, weight=self.weight, **kwargs)

    self.r_unconstrained = nn.Parameter(torch.tensor(0.0))
forward
forward(embeddings, center, **_)

Compute Deep SVDD soft-boundary loss.

The loss consists of the hypersphere radius R² plus a slack penalty for points outside the hypersphere. The radius R is learned via an unconstrained parameter with softplus activation.

Parameters:

embeddings : Tensor, required
    Embedded feature representations [B, H, W, D] from the network.
center : Tensor, required
    Center of the hypersphere [D] computed during initialization.
**_ : Any
    Additional unused keyword arguments.

Returns:

dict[str, Tensor]
    Dictionary with "loss" key containing the scalar loss value.

Notes

The loss formula is: loss = weight * (R² + (1/ν) * mean(ReLU(dist - R²))) where dist is the squared distance from embeddings to the center.

Source code in cuvis_ai/node/losses.py
def forward(self, embeddings: Tensor, center: Tensor, **_: Any) -> dict[str, Tensor]:
    """Compute Deep SVDD soft-boundary loss.

    The loss consists of the hypersphere radius R² plus a slack penalty
    for points outside the hypersphere. The radius R is learned via
    an unconstrained parameter with softplus activation.

    Parameters
    ----------
    embeddings : Tensor
        Embedded feature representations [B, H, W, D] from the network.
    center : Tensor
        Center of the hypersphere [D] computed during initialization.
    **_ : Any
        Additional unused keyword arguments.

    Returns
    -------
    dict[str, Tensor]
        Dictionary with "loss" key containing the scalar loss value.

    Notes
    -----
    The loss formula is: loss = weight * (R² + (1/ν) * mean(ReLU(dist - R²)))
    where dist is the squared distance from embeddings to the center.
    """
    B, H, W, D = embeddings.shape
    z = embeddings.reshape(B * H * W, D)
    R = torch.nn.functional.softplus(self.r_unconstrained, beta=10.0)
    dist = torch.sum((z - center.view(1, -1)) ** 2, dim=1)
    slack = torch.relu(dist - R**2)
    base_loss = R**2 + (1.0 / self.nu) * slack.mean()
    loss = self.weight * base_loss

    return {"loss": loss}

IoULoss

IoULoss(
    weight=1.0,
    smooth=1e-06,
    normalize_method="sigmoid",
    **kwargs,
)

Bases: LossNode

Differentiable IoU (Intersection over Union) loss.

Computes: 1 - (|A ∩ B| + smooth) / (|A ∪ B| + smooth). Works directly on continuous scores (not binary decisions), preserving gradients.

The scores are normalized to [0, 1] range using sigmoid or clamp before computing IoU, ensuring differentiability.

Parameters:

weight : float, default 1.0
    Overall weight for this loss component.
smooth : float, default 1e-6
    Small constant for numerical stability.
normalize_method : {'sigmoid', 'clamp', 'minmax'}, default 'sigmoid'
    Method to normalize predictions to the [0, 1] range:
    'sigmoid' applies a sigmoid activation (good for unbounded scores),
    'clamp' clamps to [0, 1] (good for scores already in a reasonable range),
    'minmax' applies per-batch min-max normalization (good for varying score ranges).

Examples:

>>> iou_loss = IoULoss(weight=1.0, smooth=1e-6)
>>> # Use with AdaClip scores directly (no thresholding needed)
>>> loss = iou_loss.forward(predictions=adaclip_scores, targets=ground_truth_mask)
Source code in cuvis_ai/node/losses.py
def __init__(
    self,
    weight: float = 1.0,
    smooth: float = 1e-6,
    normalize_method: str = "sigmoid",
    **kwargs,
) -> None:
    self.weight = weight
    self.smooth = smooth
    self.normalize_method = normalize_method

    if normalize_method not in ["sigmoid", "clamp", "minmax"]:
        raise ValueError(
            f"normalize_method must be one of ['sigmoid', 'clamp', 'minmax'], got {normalize_method}"
        )

    super().__init__(
        weight=weight,
        smooth=smooth,
        normalize_method=normalize_method,
        **kwargs,
    )
forward
forward(predictions, targets, **_)

Compute differentiable IoU loss.

Parameters:

predictions : Tensor, required
    Predicted anomaly scores [B, H, W, 1] (any real values).
targets : Tensor, required
    Ground truth binary masks [B, H, W, 1].

Returns:

dict[str, Tensor]
    Dictionary with "loss" key containing scalar IoU loss.

Source code in cuvis_ai/node/losses.py
def forward(self, predictions: Tensor, targets: Tensor, **_: Any) -> dict[str, Tensor]:
    """Compute differentiable IoU loss.

    Parameters
    ----------
    predictions : Tensor
        Predicted anomaly scores [B, H, W, 1] (any real values)
    targets : Tensor
        Ground truth binary masks [B, H, W, 1]

    Returns
    -------
    dict[str, Tensor]
        Dictionary with "loss" key containing scalar IoU loss
    """
    # Normalize predictions to [0, 1] range based on method
    if self.normalize_method == "sigmoid":
        # Sigmoid: good for unbounded scores (e.g., logits)
        pred = torch.sigmoid(predictions)
    elif self.normalize_method == "clamp":
        # Clamp: good for scores already in reasonable range
        pred = torch.clamp(predictions, 0.0, 1.0)
    elif self.normalize_method == "minmax":
        # Min-max normalization per batch
        pred_min = predictions.min()
        pred_max = predictions.max()
        if pred_max > pred_min:
            pred = (predictions - pred_min) / (pred_max - pred_min + self.smooth)
        else:
            pred = torch.ones_like(predictions) * 0.5
    else:
        raise ValueError(f"Unknown normalize_method: {self.normalize_method}")

    # Convert targets to float
    target = targets.float()

    # Flatten for computation
    pred_flat = pred.view(-1)  # [B*H*W]
    target_flat = target.view(-1)  # [B*H*W]

    # Compute IoU: intersection / union
    # intersection = |A ∩ B| = sum(pred * target)
    # union = |A ∪ B| = sum(pred) + sum(target) - intersection
    intersection = (pred_flat * target_flat).sum()
    union = pred_flat.sum() + target_flat.sum() - intersection

    # IoU coefficient
    iou = (intersection + self.smooth) / (union + self.smooth)

    # IoU loss: 1 - IoU (minimize loss = maximize IoU)
    loss = 1.0 - iou

    return {"loss": self.weight * loss}

Metrics

metrics

Metric nodes for training pipeline (port-based architecture).

ExplainedVarianceMetric

ExplainedVarianceMetric(execution_stages=None, **kwargs)

Bases: Node

Track explained variance ratio for PCA components.

Executes only during validation and test stages.

Source code in cuvis_ai/node/metrics.py
def __init__(
    self,
    execution_stages: set[ExecutionStage] | None = None,
    **kwargs,
) -> None:
    name, execution_stages = Node.consume_base_kwargs(
        kwargs, execution_stages or {ExecutionStage.VAL, ExecutionStage.TEST}
    )
    super().__init__(
        name=name,
        execution_stages=execution_stages,
        **kwargs,
    )
forward
forward(explained_variance_ratio, context)

Compute explained variance metrics.

Parameters:

explained_variance_ratio : Tensor, required
    Explained variance ratios from PCA node.
context : Context, required
    Execution context with stage, epoch, batch_idx.

Returns:

dict[str, Any]
    Dictionary with "metrics" key containing list of Metric objects.

Source code in cuvis_ai/node/metrics.py
def forward(self, explained_variance_ratio: Tensor, context: Context) -> dict[str, Any]:
    """Compute explained variance metrics.

    Parameters
    ----------
    explained_variance_ratio : Tensor
        Explained variance ratios from PCA node
    context : Context
        Execution context with stage, epoch, batch_idx

    Returns
    -------
    dict[str, Any]
        Dictionary with "metrics" key containing list of Metric objects
    """
    metrics = []

    # Per-component variance
    for i, ratio in enumerate(explained_variance_ratio):
        metrics.append(
            Metric(
                name=f"explained_variance_pc{i + 1}",
                value=ratio.item(),
                stage=context.stage,
                epoch=context.epoch,
                batch_idx=context.batch_idx,
            )
        )

    # Total variance explained
    metrics.append(
        Metric(
            name="total_explained_variance",
            value=explained_variance_ratio.sum().item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        )
    )

    # Cumulative variance
    cumulative = torch.cumsum(explained_variance_ratio, dim=0)
    for i, cum_var in enumerate(cumulative):
        metrics.append(
            Metric(
                name=f"cumulative_variance_pc{i + 1}",
                value=cum_var.item(),
                stage=context.stage,
                epoch=context.epoch,
                batch_idx=context.batch_idx,
            )
        )

    return {"metrics": metrics}

AnomalyDetectionMetrics

AnomalyDetectionMetrics(execution_stages=None, **kwargs)

Bases: Node

Compute anomaly detection metrics (precision, recall, F1, etc.).

Uses torchmetrics for GPU-optimized, robust metric computation. Expects both decisions and targets to be binary masks. Executes only during validation and test stages.

Source code in cuvis_ai/node/metrics.py
def __init__(
    self,
    execution_stages: set[ExecutionStage] | None = None,
    **kwargs,
) -> None:
    name, execution_stages = Node.consume_base_kwargs(
        kwargs, execution_stages or {ExecutionStage.VAL, ExecutionStage.TEST}
    )
    super().__init__(
        name=name,
        execution_stages=execution_stages,
        **kwargs,
    )

    # Initialize torchmetrics for binary classification
    # These are stateless (compute per-batch) since we don't call update()
    self.precision_metric = BinaryPrecision()
    self.recall_metric = BinaryRecall()
    self.f1_metric = BinaryF1Score()
    self.iou_metric = BinaryJaccardIndex()
    self.average_precision_metric = BinaryAveragePrecision()
forward
forward(decisions, targets, context, logits=None)

Compute anomaly detection metrics using torchmetrics.

Parameters:

decisions : Tensor, required
    Binary anomaly decisions [B, H, W, 1].
targets : Tensor, required
    Ground truth binary masks [B, H, W, 1].
context : Context, required
    Execution context with stage, epoch, batch_idx.
logits : Tensor, optional, default None
    Raw scores used to compute average precision when provided.

Returns:

dict[str, Any]
    Dictionary with "metrics" key containing list of Metric objects.

Source code in cuvis_ai/node/metrics.py
def forward(
    self,
    decisions: Tensor,
    targets: Tensor,
    context: Context,
    logits: Tensor | None = None,
) -> dict[str, Any]:
    """Compute anomaly detection metrics using torchmetrics.

    Parameters
    ----------
    decisions : Tensor
        Binary anomaly decisions [B, H, W, 1]
    targets : Tensor
        Ground truth binary masks [B, H, W, 1]
    context : Context
        Execution context with stage, epoch, batch_idx

    Returns
    -------
    dict[str, Any]
        Dictionary with "metrics" key containing list of Metric objects
    """
    # Ensure consistent shapes and flatten spatial dimensions
    decisions = decisions.squeeze(-1)  # [B, H, W]
    targets = targets.squeeze(-1)  # [B, H, W]

    # Flatten to [N] where N = B*H*W for torchmetrics
    preds_flat = decisions.flatten()  # [B*H*W]
    targets_flat = targets.flatten()  # [B*H*W]

    # Compute metrics using torchmetrics (they handle edge cases robustly)
    precision = self.precision_metric(preds_flat, targets_flat)
    recall = self.recall_metric(preds_flat, targets_flat)
    f1 = self.f1_metric(preds_flat, targets_flat)
    iou = self.iou_metric(preds_flat, targets_flat)

    metrics = [
        Metric(
            name="precision",
            value=precision.item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="recall",
            value=recall.item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="f1_score",
            value=f1.item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="iou",
            value=iou.item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
    ]

    if logits is not None:
        raw_scores = logits.squeeze(-1).flatten().float()
        probs_for_ap = torch.sigmoid(raw_scores)
        average_precision = self.average_precision_metric(probs_for_ap, targets_flat)

        metrics.append(
            Metric(
                name="average_precision",
                value=average_precision.item(),
                stage=context.stage,
                epoch=context.epoch,
                batch_idx=context.batch_idx,
            )
        )

    return {"metrics": metrics}

ScoreStatisticsMetric

ScoreStatisticsMetric(execution_stages=None, **kwargs)

Bases: Node

Compute statistical properties of score distributions.

Tracks mean, std, min, max, median, and quantiles of scores. Executes only during validation and test stages.

Source code in cuvis_ai/node/metrics.py
def __init__(
    self,
    execution_stages: set[ExecutionStage] | None = None,
    **kwargs,
) -> None:
    name, execution_stages = Node.consume_base_kwargs(
        kwargs, execution_stages or {ExecutionStage.VAL, ExecutionStage.TEST}
    )
    super().__init__(
        name=name,
        execution_stages=execution_stages,
        **kwargs,
    )
forward
forward(scores, context)

Compute score statistics.

Parameters:

scores : Tensor, required
    Score values [B, H, W].
context : Context, required
    Execution context with stage, epoch, batch_idx.

Returns:

dict[str, Any]
    Dictionary with "metrics" key containing list of Metric objects.

Source code in cuvis_ai/node/metrics.py
def forward(self, scores: Tensor, context: Context) -> dict[str, Any]:
    """Compute score statistics.

    Parameters
    ----------
    scores : Tensor
        Score values [B, H, W]
    context : Context
        Execution context with stage, epoch, batch_idx

    Returns
    -------
    dict[str, Any]
        Dictionary with "metrics" key containing list of Metric objects
    """
    # Flatten scores
    scores_flat = scores.reshape(-1)

    metrics = [
        Metric(
            name="scores/mean",
            value=scores_flat.mean().item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="scores/std",
            value=scores_flat.std().item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="scores/min",
            value=scores_flat.min().item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="scores/max",
            value=scores_flat.max().item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="scores/median",
            value=scores_flat.median().item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="scores/q25",
            value=torch.quantile(scores_flat, 0.25).item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="scores/q75",
            value=torch.quantile(scores_flat, 0.75).item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="scores/q95",
            value=torch.quantile(scores_flat, 0.95).item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="scores/q99",
            value=torch.quantile(scores_flat, 0.99).item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
    ]

    return {"metrics": metrics}

ComponentOrthogonalityMetric

ComponentOrthogonalityMetric(
    execution_stages=None, **kwargs
)

Bases: Node

Track orthogonality of PCA components during training.

Measures how close the component matrix is to being orthonormal. Executes only during validation and test stages.

Source code in cuvis_ai/node/metrics.py
def __init__(
    self,
    execution_stages: set[ExecutionStage] | None = None,
    **kwargs,
) -> None:
    name, execution_stages = Node.consume_base_kwargs(
        kwargs, execution_stages or {ExecutionStage.VAL, ExecutionStage.TEST}
    )
    super().__init__(
        name=name,
        execution_stages=execution_stages,
        **kwargs,
    )
forward
forward(components, context)

Compute component orthogonality metrics.

Parameters:

components : Tensor, required
    PCA components matrix [n_components, n_features].
context : Context, required
    Execution context with stage, epoch, batch_idx.

Returns:

dict[str, Any]
    Dictionary with "metrics" key containing list of Metric objects.

Source code in cuvis_ai/node/metrics.py
def forward(self, components: Tensor, context: Context) -> dict[str, Any]:
    """Compute component orthogonality metrics.

    Parameters
    ----------
    components : Tensor
        PCA components matrix [n_components, n_features]
    context : Context
        Execution context with stage, epoch, batch_idx

    Returns
    -------
    dict[str, Any]
        Dictionary with "metrics" key containing list of Metric objects
    """
    # Compute gram matrix: W @ W.T
    gram = components @ components.T
    n = components.shape[0]

    # Target: identity matrix
    eye = torch.eye(n, device=components.device, dtype=components.dtype)

    # Frobenius norm of difference
    orth_error = torch.norm(gram - eye, p="fro").item()

    # Average absolute deviation from identity
    avg_off_diagonal = (gram - eye).abs().mean().item()

    # Diagonal elements (should be close to 1)
    diagonal = torch.diagonal(gram)
    diagonal_mean = diagonal.mean().item()
    diagonal_std = diagonal.std().item()

    metrics = [
        Metric(
            name="orthogonality_error",
            value=orth_error,
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="avg_off_diagonal",
            value=avg_off_diagonal,
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="diagonal_mean",
            value=diagonal_mean,
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="diagonal_std",
            value=diagonal_std,
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
    ]

    return {"metrics": metrics}

SelectorEntropyMetric

SelectorEntropyMetric(
    eps=1e-06, execution_stages=None, **kwargs
)

Bases: Node

Track entropy of channel selection distribution.

Measures the uncertainty/diversity in channel selection weights. Higher entropy indicates more uniform selection (less confident). Lower entropy indicates more peaked selection (more confident).

Executes only during validation and test stages.

Source code in cuvis_ai/node/metrics.py
def __init__(
    self,
    eps: float = 1e-6,
    execution_stages: set[ExecutionStage] | None = None,
    **kwargs,
) -> None:
    self.eps = eps
    name, execution_stages = Node.consume_base_kwargs(
        kwargs, execution_stages or {ExecutionStage.VAL, ExecutionStage.TEST}
    )
    super().__init__(
        name=name,
        execution_stages=execution_stages,
        eps=eps,
        **kwargs,
    )
forward
forward(weights, context)

Compute entropy of selection weights.

Parameters:

weights : Tensor, required
    Channel selection weights [n_channels].
context : Context, required
    Execution context with stage, epoch, batch_idx.

Returns:

dict[str, Any]
    Dictionary with "metrics" key containing list of Metric objects.

Source code in cuvis_ai/node/metrics.py
def forward(self, weights: Tensor, context: Context) -> dict[str, Any]:
    """Compute entropy of selection weights.

    Parameters
    ----------
    weights : Tensor
        Channel selection weights [n_channels]
    context : Context
        Execution context with stage, epoch, batch_idx

    Returns
    -------
    dict[str, Any]
        Dictionary with "metrics" key containing list of Metric objects
    """
    # Normalize weights to probabilities
    probs = weights / (weights.sum() + self.eps)

    # Compute entropy: -sum(p * log(p))
    entropy = -(probs * torch.log(probs + self.eps)).sum()

    metrics = [
        Metric(
            name="selector/entropy",
            value=entropy.item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
    ]

    return {"metrics": metrics}

SelectorDiversityMetric

SelectorDiversityMetric(execution_stages=None, **kwargs)

Bases: Node

Track diversity of channel selection.

Measures how spread out the selection weights are across channels. Uses Gini coefficient - lower values indicate more diverse selection.

Executes only during validation and test stages.

Source code in cuvis_ai/node/metrics.py
def __init__(
    self,
    execution_stages: set[ExecutionStage] | None = None,
    **kwargs,
) -> None:
    name, execution_stages = Node.consume_base_kwargs(
        kwargs, execution_stages or {ExecutionStage.VAL, ExecutionStage.TEST}
    )
    super().__init__(
        name=name,
        execution_stages=execution_stages,
        **kwargs,
    )
forward
forward(weights, context)

Compute diversity metrics for selection weights.

Parameters:

weights : Tensor, required
    Channel selection weights [n_channels].
context : Context, required
    Execution context with stage, epoch, batch_idx.

Returns:

dict[str, Any]
    Dictionary with "metrics" key containing list of Metric objects.

Source code in cuvis_ai/node/metrics.py
def forward(self, weights: Tensor, context: Context) -> dict[str, Any]:
    """Compute diversity metrics for selection weights.

    Parameters
    ----------
    weights : Tensor
        Channel selection weights [n_channels]
    context : Context
        Execution context with stage, epoch, batch_idx

    Returns
    -------
    dict[str, Any]
        Dictionary with "metrics" key containing list of Metric objects
    """
    # Compute variance (measure of spread)
    mean_weight = weights.mean()
    variance = ((weights - mean_weight) ** 2).mean()

    # Compute Gini coefficient (0 = perfect equality, 1 = perfect inequality)
    # Lower Gini = more diverse selection
    sorted_weights, _ = torch.sort(weights)
    n = len(sorted_weights)
    index = torch.arange(1, n + 1, device=weights.device, dtype=weights.dtype)
    gini = (2 * (sorted_weights * index).sum()) / (n * sorted_weights.sum()) - (n + 1) / n

    metrics = [
        Metric(
            name="weight_variance",
            value=variance.item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="gini_coefficient",
            value=gini.item(),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
    ]

    return {"metrics": metrics}

AnomalyPixelStatisticsMetric

AnomalyPixelStatisticsMetric(
    execution_stages=None, **kwargs
)

Bases: Node

Compute anomaly pixel statistics from binary decisions.

Calculates total pixels, anomalous pixels count, and anomaly percentage. Useful for monitoring the proportion of detected anomalies in batches. Executes only during validation and test stages.

Source code in cuvis_ai/node/metrics.py
def __init__(
    self,
    execution_stages: set[ExecutionStage] | None = None,
    **kwargs,
) -> None:
    name, execution_stages = Node.consume_base_kwargs(
        kwargs, execution_stages or {ExecutionStage.VAL, ExecutionStage.TEST}
    )
    super().__init__(
        name=name,
        execution_stages=execution_stages,
        **kwargs,
    )
forward
forward(decisions, context)

Compute anomaly pixel statistics.

Parameters:

decisions : Tensor, required
    Binary anomaly decisions [B, H, W, 1].
context : Context, required
    Execution context with stage, epoch, batch_idx.

Returns:

dict[str, Any]
    Dictionary with "metrics" key containing list of Metric objects.

Source code in cuvis_ai/node/metrics.py
def forward(self, decisions: Tensor, context: Context) -> dict[str, Any]:
    """Compute anomaly pixel statistics.

    Parameters
    ----------
    decisions : Tensor
        Binary anomaly decisions [B, H, W, 1]
    context : Context
        Execution context with stage, epoch, batch_idx

    Returns
    -------
    dict[str, Any]
        Dictionary with "metrics" key containing list of Metric objects
    """
    # Calculate statistics
    total_pixels = decisions.numel()
    anomalous_pixels = int(decisions.sum().item())
    anomaly_percentage = (anomalous_pixels / total_pixels) * 100 if total_pixels > 0 else 0.0

    metrics = [
        Metric(
            name="anomaly/total_pixels",
            value=float(total_pixels),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="anomaly/anomalous_pixels",
            value=float(anomalous_pixels),
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
        Metric(
            name="anomaly/anomaly_percentage",
            value=anomaly_percentage,
            stage=context.stage,
            epoch=context.epoch,
            batch_idx=context.batch_idx,
        ),
    ]

    return {"metrics": metrics}

Monitoring

monitor

TensorBoard Monitoring Nodes.

This module provides nodes for logging artifacts and metrics to TensorBoard during pipeline execution. The monitoring nodes are sink nodes that accept artifacts (visualizations) and metrics from upstream nodes and write them to TensorBoard logs for visualization and analysis.

The primary use case is logging training and validation metrics, along with visualizations like heatmaps, RGB renderings, and PCA plots during model training.

See Also

cuvis_ai.node.visualizations : Nodes that generate artifacts for monitoring

TensorBoardMonitorNode

TensorBoardMonitorNode(
    output_dir="./runs",
    run_name=None,
    comment="",
    flush_secs=120,
    **kwargs,
)

Bases: Node

TensorBoard monitoring node for logging artifacts and metrics.

This is a SINK node that logs visualizations (artifacts) and metrics to TensorBoard. Accepts optional inputs for artifacts and metrics, allowing predecessors to be filtered by execution_stage without causing errors.

Executes during all stages (ALWAYS).

Parameters:

output_dir : str, default "./runs"
    Directory for TensorBoard logs.
run_name : str, optional, default None
    Run name used when resolving the log directory for this run.
comment : str, default ""
    Comment to append to the log directory name.
flush_secs : int, default 120
    How often to flush pending events to disk, in seconds.

Examples:

>>> heatmap_viz = AnomalyHeatmap(cmap='hot', up_to=10)
>>> tensorboard_node = TensorBoardMonitorNode(output_dir="./runs")
>>> graph.connect(
...     (heatmap_viz.artifacts, tensorboard_node.artifacts),
... )
Source code in cuvis_ai/node/monitor.py
def __init__(
    self,
    output_dir: str = "./runs",
    run_name: str | None = None,
    comment: str = "",
    flush_secs: int = 120,
    **kwargs,
) -> None:
    self.output_dir = Path(output_dir)
    self.run_name = run_name
    self.comment = comment
    self.flush_secs = flush_secs
    self._writer = None
    self._tensorboard_available = False

    super().__init__(
        execution_stages={ExecutionStage.ALWAYS},
        output_dir=str(output_dir),
        run_name=run_name,
        comment=comment,
        flush_secs=flush_secs,
        **kwargs,
    )

    # Check if tensorboard is available

    self._SummaryWriter = SummaryWriter

    # Determine the log directory with run name
    self.log_dir = self._resolve_log_dir()

    # Initialize TensorBoard writer
    self.log_dir.mkdir(parents=True, exist_ok=True)
    self._writer = self._SummaryWriter(
        log_dir=str(self.log_dir),
        comment=self.comment,
        flush_secs=self.flush_secs,
    )
    logger.info(f"TensorBoard writer initialized: {self.log_dir}")
    logger.info(f"To view visualizations, run: uv run tensorboard --logdir={self.output_dir}")
forward
forward(artifacts=None, metrics=None, context=None)

Log artifacts and metrics to TensorBoard.

Parameters:

artifacts : list[Artifact], optional, default None
    List of artifacts to log.
metrics : list[Metric], optional, default None
    List of metrics to log.
context : Context, optional, default None
    Execution context with stage, epoch, batch_idx, global_step.

Returns:

dict
    Empty dict (sink node has no outputs).

Source code in cuvis_ai/node/monitor.py
def forward(
    self,
    artifacts: list[Artifact] | None = None,
    metrics: list[Metric] | None = None,
    context: Context | None = None,
) -> dict:
    """Log artifacts and metrics to TensorBoard.

    Parameters
    ----------
    context : Context
        Execution context with stage, epoch, batch_idx, global_step
    artifacts : list[Artifact], optional
        List of artifacts to log (default: None)
    metrics : list[Metric], optional
        List of metrics to log (default: None)

    Returns
    -------
    dict
        Empty dict (sink node has no outputs)
    """
    if context is None:
        context = Context()

    stage = context.stage.value
    step = context.global_step

    # Flatten artifacts if it's a list of lists (variadic port)
    if artifacts is not None:
        if (
            isinstance(artifacts, list)
            and len(artifacts) > 0
            and isinstance(artifacts[0], list)
        ):
            artifacts = [item for sublist in artifacts for item in sublist]

    # Log artifacts
    if artifacts is not None:
        for artifact in artifacts:
            self._log_artifact(artifact, stage, step)
        logger.debug(f"Logged {len(artifacts)} artifacts to TensorBoard at step {step}")

    # Flatten metrics if variadic input provided
    if (
        metrics is not None
        and isinstance(metrics, list)
        and metrics
        and isinstance(metrics[0], list)
    ):
        metrics = [item for sublist in metrics for item in sublist]

    # Log metrics
    if metrics is not None:
        for metric in metrics:
            self._log_metric(metric, stage, step)
        logger.debug(f"Logged {len(metrics)} metrics to TensorBoard at step {step}")

    return {}
log
log(name, value, step)

Log a scalar value to TensorBoard.

This method provides a simple interface for external trainers to log metrics directly, complementing the port-based logging. Used by GradientTrainer to log train/val losses to the same TensorBoard directory as graph metrics and artifacts.

Parameters:

name : str, required
    Name/tag for the scalar (e.g., "train/loss", "val/accuracy").
value : float, required
    Scalar value to log.
step : int, required
    Global step number.

Examples:

>>> tensorboard_node = TensorBoardMonitorNode(output_dir="./runs")
>>> # From external trainer
>>> tensorboard_node.log("train/loss", 0.5, step=100)
Source code in cuvis_ai/node/monitor.py
def log(self, name: str, value: float, step: int) -> None:
    """Log a scalar value to TensorBoard.

    This method provides a simple interface for external trainers
    to log metrics directly, complementing the port-based logging.
    Used by GradientTrainer to log train/val losses to the same
    TensorBoard directory as graph metrics and artifacts.

    Parameters
    ----------
    name : str
        Name/tag for the scalar (e.g., "train/loss", "val/accuracy")
    value : float
        Scalar value to log
    step : int
        Global step number

    Examples
    --------
    >>> tensorboard_node = TensorBoardMonitorNode(output_dir="./runs")
    >>> # From external trainer
    >>> tensorboard_node.log("train/loss", 0.5, step=100)
    """
    self._writer.add_scalar(name, value, step)