# Classification Modules

The classification module provides data loaders for the UCR/UEA Time Series Classification Archive, a standard benchmark collection for time series classification research.

For the full API of all classification data modules, see the {doc}`api/modules` reference.

## UCR Classification Data Module

The UCR archive contains **univariate, equal-length** time series classification datasets stored in ARFF format. Each dataset directory provides `TRAIN.arff` and `TEST.arff` files with feature columns and a target label column.

```python
from pathlib import Path
from chronocratic.datasets import UCRClassificationDataModule

module = UCRClassificationDataModule(
    dataset_folder_path=Path("data/FogiDataset1"),
    target_column_name="class",
    scale_data=True,
    batch_size=32,
)
module.prepare_data()
module.setup()
train_loader = module.train_dataloader()
```

**Key details:**

- `dataset_folder_path` points to the directory containing the `.arff` files. The module auto-discovers `{dataset_name}_TRAIN.arff` and `{dataset_name}_TEST.arff`.
- `target_column_name` specifies the label column name in the ARFF files.
- Sequence length is derived from the number of feature columns.
- Handles variable-length series automatically via padding.

See the {py:class}`~chronocratic.datasets.modules.UCRClassificationDataModule` API reference for all constructor parameters.

## UEA Classification Data Module

The UEA archive contains **multivariate and/or variable-length** time series classification datasets stored in nested ARFF format. These datasets have multiple dimensions per timestep and may have different sequence lengths per sample.
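As a rough illustration of what "multivariate and variable-length" means in practice, the sketch below models two samples as plain nested Python lists. The values and the `describe` helper are invented for illustration only; they are not part of chronocratic and do not reflect how the module stores data internally.

```python
# Each sample is a list of dimensions; each dimension is a list of values.
# All numbers are made up for illustration.
samples = [
    # 2 dimensions, 3 timesteps
    [[0.1, 0.2, 0.3], [1.0, 1.1, 1.2]],
    # 2 dimensions, 5 timesteps
    [[0.5, 0.6, 0.7, 0.8, 0.9], [2.0, 2.1, 2.2, 2.3, 2.4]],
]


def describe(sample):
    """Return (number of dimensions, number of timesteps) for one sample."""
    n_dims = len(sample)
    n_steps = len(sample[0])
    return n_dims, n_steps


# Per-sample shapes differ in length, so the data cannot be a single flat table.
shapes = [describe(s) for s in samples]
max_len = max(steps for _, steps in shapes)
```

By contrast, a univariate, equal-length dataset fits a plain table with one row per sample and one column per timestep.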
```python
from pathlib import Path
from chronocratic.datasets import UEAClassificationDataModule

module = UEAClassificationDataModule(
    dataset_folder_path=Path("data/ArrowHead"),
    target_column_name="class",
    scale_data=True,
    batch_size=32,
)
module.prepare_data()
module.setup()
train_loader = module.train_dataloader()
```

**Key details:**

- Uses `scipy.io.arff.loadarff` directly for reading nested ARFF format.
- Automatically encodes string labels via `sklearn.preprocessing.LabelEncoder`.
- Data form is `NESTED`, meaning each sample may have variable length and multiple dimensions.
- Sequence length and feature count are derived from the data at load time.

See the {py:class}`~chronocratic.datasets.modules.UEAClassificationDataModule` API reference for all constructor parameters.

## Dataset Classes

Under the hood, the data modules use these PyTorch Dataset classes:

- {py:class}`~chronocratic.datasets.datatypes.UCRClassificationUnivariateDataset`
- {py:class}`~chronocratic.datasets.datatypes.UEAClassificationMultivariateDataset`

See the {doc}`api/datatypes` reference for full class documentation.

## Loader Mode

Classification modules support multiple loader modes via {py:class}`~chronocratic.datasets.enums.ClassificationLoaderMode`:

- **SAMPLE_ONLY** -- Returns only the input sample tensor (no labels)
- **SAMPLE_LABEL** -- Returns the input sample tensor and its label (default)

Set this on the `train_dataloader()`, `val_dataloader()`, and `test_dataloader()` calls via the `mode` keyword argument.
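To make the difference between the two modes concrete, here is a minimal sketch using stand-in types. `LoaderMode` and `iterate` are hypothetical names introduced only for this illustration; they mimic the behavior described above with plain lists and are not the library's implementation.

```python
from enum import Enum, auto


class LoaderMode(Enum):
    """Hypothetical stand-in for ClassificationLoaderMode."""

    SAMPLE_ONLY = auto()
    SAMPLE_LABEL = auto()


def iterate(samples, labels, mode=LoaderMode.SAMPLE_LABEL):
    """Yield items shaped according to the selected mode."""
    for sample, label in zip(samples, labels):
        if mode is LoaderMode.SAMPLE_ONLY:
            yield sample  # input only, label dropped
        else:
            yield sample, label  # (input, label) pair


# Toy data, invented for illustration.
samples = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
labels = [0, 1]

with_labels = list(iterate(samples, labels))
without_labels = list(iterate(samples, labels, mode=LoaderMode.SAMPLE_ONLY))
```

In this sketch the default mode yields `(sample, label)` pairs, while `SAMPLE_ONLY` yields bare samples, which is the shape a label-free training loop would consume. With the real modules, the library's own enum is passed through the `mode` keyword.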
```python
from chronocratic.datasets.enums import ClassificationLoaderMode

# Without labels
train_loader = module.train_dataloader(mode=ClassificationLoaderMode.SAMPLE_ONLY)
```

## Splitting Strategy

Control how the archive's train/test split is handled via {py:class}`~chronocratic.datasets.enums.ClassificationSplitMode`:

- **AS_DEFINED** -- Keep the original train/test split from the archive
- **MANUAL** -- Re-split the combined data with a custom `test_size` fraction

Set this on the module constructor via the `splitting_strategy` keyword argument.

## Scaling

Data scaling is configured via {py:class}`~chronocratic.datasets.enums.ScalingMethod`:

- **NONE** -- No scaling applied
- **MINMAX** -- Scales to a specified range (default 0-1)
- **STANDARD** -- Standardizes to zero mean and unit variance

## Next Steps

- See the {doc}`forecasting` guide for forecasting datasets.
- See the {doc}`api/modules` reference for the full API of classification data modules.
- See the {doc}`api/enums` reference for all enum options.