Status: Needs Review

This page has not been reviewed for accuracy and completeness. Content may be outdated or contain errors.


Data API

Data loading and preprocessing nodes.

Overview

Data loading functionality in CUVIS.AI is provided through data nodes. See the complete node documentation below.


Data Nodes

data

Data loading nodes for hyperspectral anomaly detection pipelines.

This module provides specialized data nodes that convert multi-class segmentation datasets into binary anomaly detection tasks. Data nodes handle type conversions, label mapping, and format transformations required for pipeline processing.

LentilsAnomalyDataNode

LentilsAnomalyDataNode(
    normal_class_ids, anomaly_class_ids=None, **kwargs
)

Bases: Node

Data node for Lentils anomaly detection dataset with binary label mapping.

Converts multi-class Lentils segmentation data to binary anomaly detection format. Maps specified class IDs to normal (0) or anomaly (1) labels, and handles type conversions from uint16 to float32 for hyperspectral cubes.

Parameters:

- normal_class_ids (list[int], required): List of class IDs to treat as normal background (e.g., [0, 1] for unlabeled and black lentils).
- anomaly_class_ids (list[int] | None, default None): List of class IDs to treat as anomalies. If None, all classes not in normal_class_ids are treated as anomalies.
- **kwargs (dict, default {}): Additional arguments passed to the Node base class.

Attributes:

- _binary_mapper (BinaryAnomalyLabelMapper): Internal label mapper for converting multi-class masks to binary masks.

Examples:

>>> from cuvis_ai.node.data import LentilsAnomalyDataNode
>>> from cuvis_ai_core.data.datasets import SingleCu3sDataModule
>>>
>>> # Create datamodule for Lentils dataset
>>> datamodule = SingleCu3sDataModule(
...     data_dir="data/lentils",
...     batch_size=4,
... )
>>>
>>> # Create data node with normal class specification
>>> data_node = LentilsAnomalyDataNode(
...     normal_class_ids=[0, 1],  # Unlabeled and black lentils are normal
... )
>>>
>>> # Use in a pipeline (assumes pipeline, normalizer, and metrics already exist)
>>> pipeline.add_node(data_node)
>>> pipeline.connect(
...     (data_node.cube, normalizer.data),
...     (data_node.mask, metrics.targets),
... )

See Also

- BinaryAnomalyLabelMapper: Label mapping utility used internally
- SingleCu3sDataModule: DataModule for loading CU3S hyperspectral data
- docs/tutorials/rx-statistical.md: Complete example with LentilsAnomalyDataNode

Notes

The node performs the following transformations:

- Converts the hyperspectral cube from uint16 to float32
- Maps the multi-class mask [B, H, W] to a binary mask [B, H, W, 1]
- Extracts wavelengths from the first batch element (assumes consistent wavelengths across the batch)
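
The three transformations can be traced on a toy batch. This is a hedged sketch that uses NumPy arrays as stand-ins for the torch tensors the node receives; the array values, the wavelength numbers, and the choice of classes 0 and 1 as normal are illustrative assumptions, and the mask logic assumes the default behavior where every non-normal class is an anomaly.

```python
import numpy as np

# Toy batch: B=2 images of 3x3 pixels with C=4 spectral channels.
cube = np.arange(2 * 3 * 3 * 4, dtype=np.uint16).reshape(2, 3, 3, 4)
mask = np.array([[[0, 1, 2], [0, 0, 3], [1, 1, 0]]] * 2, dtype=np.int32)  # (B, H, W)
wavelengths = np.tile(
    np.array([450, 550, 650, 750], dtype=np.int32), (2, 1)
)  # (B, C)

# 1. Cast the cube from uint16 to float32; the shape is unchanged.
cube_f32 = cube.astype(np.float32)

# 2. Map class IDs to a binary mask and add a trailing channel axis.
#    Classes 0 and 1 are normal here; every other class is an anomaly.
normal_class_ids = [0, 1]
is_anomaly = np.logical_not(np.isin(mask, normal_class_ids))  # (B, H, W)
binary_mask = is_anomaly[..., np.newaxis]  # (B, H, W, 1)

# 3. Keep the wavelengths of the first batch element only.
wavelengths_1d = wavelengths[0]  # (C,)
```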

Source code in cuvis_ai/node/data.py
def __init__(
    self, normal_class_ids: list[int], anomaly_class_ids: list[int] | None = None, **kwargs
) -> None:
    super().__init__(
        normal_class_ids=normal_class_ids, anomaly_class_ids=anomaly_class_ids, **kwargs
    )

    self._binary_mapper = BinaryAnomalyLabelMapper(  # could have been used as a node as well
        normal_class_ids=normal_class_ids,
        anomaly_class_ids=anomaly_class_ids,
    )
forward
forward(cube, mask=None, wavelengths=None, **_)

Process hyperspectral cube and convert labels to binary anomaly format.

Parameters:

- cube (Tensor, required): Input hyperspectral cube, shape (B, H, W, C), dtype uint16.
- mask (Tensor | None, default None): Multi-class segmentation mask, shape (B, H, W), dtype int32. If None, no mask is included in the output.
- wavelengths (Tensor | None, default None): Wavelengths for each channel, shape (B, C), dtype int32. If None, wavelengths are not included in the output.

Returns:

Type: dict[str, Tensor | ndarray]

Dictionary containing:

- "cube" (torch.Tensor): Converted hyperspectral cube, shape (B, H, W, C), dtype float32.
- "mask" (torch.Tensor, optional): Binary anomaly mask, shape (B, H, W, 1), dtype bool. Only included if an input mask is provided.
- "wavelengths" (np.ndarray, optional): Wavelength array, shape (C,), dtype int32. Only included if input wavelengths are provided.
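
Because "mask" and "wavelengths" are optional keys, downstream code should not index them unconditionally. The sketch below shows one way a consumer might validate a result against the shapes documented above. check_forward_output is a hypothetical helper, not part of cuvis_ai, and NumPy arrays stand in for the returned tensors.

```python
import numpy as np


def check_forward_output(result: dict) -> list[str]:
    """Validate a forward()-style result and return the keys it contains.

    Hypothetical helper based only on the documented shapes.
    """
    cube = result["cube"]  # always present
    if len(cube.shape) != 4:
        raise ValueError(f"cube must be (B, H, W, C), got {tuple(cube.shape)}")
    b, h, w, c = cube.shape

    mask = result.get("mask")  # present only if a mask was passed in
    if mask is not None and tuple(mask.shape) != (b, h, w, 1):
        raise ValueError(f"mask must be {(b, h, w, 1)}, got {tuple(mask.shape)}")

    wavelengths = result.get("wavelengths")  # present only if passed in
    if wavelengths is not None and tuple(wavelengths.shape) != (c,):
        raise ValueError(
            f"wavelengths must be {(c,)}, got {tuple(wavelengths.shape)}"
        )

    return [key for key in ("cube", "mask", "wavelengths") if key in result]


# NumPy stand-ins for the values the node would return.
cube_only = {"cube": np.zeros((2, 3, 3, 4), dtype=np.float32)}
full = {
    "cube": np.zeros((2, 3, 3, 4), dtype=np.float32),
    "mask": np.zeros((2, 3, 3, 1), dtype=bool),
    "wavelengths": np.arange(4, dtype=np.int32),
}
present_minimal = check_forward_output(cube_only)
present_full = check_forward_output(full)
```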

Source code in cuvis_ai/node/data.py
def forward(
    self,
    cube: torch.Tensor,
    mask: torch.Tensor | None = None,
    wavelengths: torch.Tensor | None = None,
    **_: Any,
) -> dict[str, torch.Tensor | np.ndarray]:
    """Process hyperspectral cube and convert labels to binary anomaly format.

    Parameters
    ----------
    cube : torch.Tensor
        Input hyperspectral cube, shape (B, H, W, C), dtype uint16
    mask : torch.Tensor | None, optional
        Multi-class segmentation mask, shape (B, H, W), dtype int32.
        If None, no mask is included in the output (default: None)
    wavelengths : torch.Tensor | None, optional
        Wavelengths for each channel, shape (B, C), dtype int32.
        If None, wavelengths are not included in output (default: None)

    Returns
    -------
    dict[str, torch.Tensor | np.ndarray]
        Dictionary containing:
        - "cube" : torch.Tensor
            Converted hyperspectral cube, shape (B, H, W, C), dtype float32
        - "mask" : torch.Tensor (optional)
            Binary anomaly mask, shape (B, H, W, 1), dtype bool.
            Only included if input mask is provided.
        - "wavelengths" : np.ndarray (optional)
            Wavelength array, shape (C,), dtype int32.
            Only included if input wavelengths are provided.
    """
    result: dict[str, torch.Tensor | np.ndarray] = {"cube": cube.to(torch.float32)}

    # Wavelengths passthrough: take the first batch element (input B x C -> output C).
    # Consistency of wavelengths across batch elements is not checked here.
    if wavelengths is not None:
        result["wavelengths"] = wavelengths[0].cpu().numpy()

    if mask is not None:
        # Add a trailing channel dimension for the mapper: BHW -> BHW1
        mask_4d = mask.unsqueeze(-1)

        # Always apply binary mapper
        mapped = self._binary_mapper.forward(
            cube=cube,
            mask=mask_4d,
            **_,  # Pass through additional kwargs
        )
        result["mask"] = mapped["mask"]  # Already (B, H, W, 1) bool

    return result
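
The comment in forward notes that wavelength consistency across batch elements could be checked but is not. One possible check is sketched below in NumPy. first_wavelengths_checked is a hypothetical helper and the wavelength values are made up; the node itself simply takes wavelengths[0].

```python
import numpy as np


def first_wavelengths_checked(wavelengths: np.ndarray) -> np.ndarray:
    """Return row 0 of a (B, C) array after verifying all rows match it."""
    if wavelengths.ndim != 2:
        raise ValueError(f"expected shape (B, C), got {wavelengths.shape}")
    # Broadcasting compares every row against the first row.
    if not (wavelengths == wavelengths[0]).all():
        raise ValueError("wavelengths differ between batch elements")
    return wavelengths[0]


# Two batch elements sharing the same three (illustrative) wavelengths.
consistent = np.tile(np.array([450, 550, 650], dtype=np.int32), (2, 1))
checked = first_wavelengths_checked(consistent)  # shape (C,)
```

Raising on a mismatch is a design choice for this sketch; logging a warning and falling back to the first element would match the current passthrough behavior more closely.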