DataModule API Reference#

DataModule classes are defined in chronocratic.datasets.modules and re-exported from the package root. They provide PyTorch Lightning LightningDataModule implementations for time series datasets.

Forecasting Modules#

Forecasting modules share a common interface centered around windowing (seq_len, forecast_horizon), mode selection, and data scaling.

Classification Modules#

Classification modules deal with labeled time series (univariate or multivariate). They require a dataset folder path and the name of the target column, plus a splitting strategy.

Base Modules#

Base classes provide shared functionality and are not meant to be instantiated directly.

LightningDataModule classes for time series datasets.

class chronocratic.datasets.modules.BaseClassificationTimeSeriesDataModule(*, dataset_folder_path: Path, batch_size: int = 32, valid_size: float = 0.1, shuffle: bool = False, scale_data: bool = True, data_scaling_method: ScalingMethod = ScalingMethod.MINMAX, data_scaling_range: tuple[float, float] = (0, 1), target_column_name: str, splitting_strategy: ClassificationSplitMode = ClassificationSplitMode.AS_DEFINED, test_size: float = 0.5, num_workers: int = 0, data_form: DataForm = DataForm.REGULAR)#

Bases: BaseTimeSeriesDataModule

Base LightningDataModule for classification time series datasets.

Extends BaseTimeSeriesDataModule with label handling, target column separation, and variable-length sequence processing. Used by UCR and UEA classification modules.

The constructor accepts target_column_name as an explicit parameter, uses ClassificationSplitMode enum for splitting, and relies on the inherited setup() which calls create_data_scaler().

Parameters:
  • dataset_folder_path – Path to the dataset folder containing ARFF/CSV files.

  • batch_size – Batch size for dataloaders.

  • valid_size – Fraction of training data reserved for validation.

  • shuffle – Whether to shuffle the training dataloader.

  • scale_data – Whether to apply data scaling.

  • data_scaling_method – Scaling algorithm, typed as ScalingMethod.

  • data_scaling_range – Target (min, max) range for ScalingMethod.MINMAX.

  • target_column_name – Name of the target/label column in the data.

  • splitting_strategy – How to split train/test data, typed as ClassificationSplitMode.

  • test_size – Fraction reserved as test set (used with ClassificationSplitMode.MANUAL).

  • num_workers – Number of DataLoader worker processes.

  • data_form – Data shape category for scaling, typed as DataForm.

property num_classes: int | None#

Number of distinct classes (read-only).

property train_data_labels: Any#

Training data labels.

property test_data_labels: Any#

Test data labels.

property valid_data_labels: Any#

Validation data labels.

property all_data_labels: Series#

Concatenation of all label splits.

setup(stage: str | None = None) None#

Load cached splits and apply data scaling.

Calls _load_cached_data() to populate in-memory data arrays from the cache written by prepare_data(), then delegates to the base class for scaling.

Parameters:

stage – Lightning stage identifier. Defaults to None (equivalent to "fit").

Raises:

ValueError – If stage is not one of {'fit', 'validate', 'test', 'predict', None}.

reset() None#

Reset classification state while preserving the cache key.

abstractmethod train_dataloader(**kwargs: Any) DataLoader#

Return the training DataLoader.

abstractmethod val_dataloader(**kwargs: Any) DataLoader | None#

Return the validation DataLoader.

abstractmethod test_dataloader(**kwargs: Any) DataLoader#

Return the test DataLoader.

class chronocratic.datasets.modules.BaseForecastingTimeSeriesDataModule(*, batch_size: int = 32, seq_len: int = 128, valid_size: float = 0.1, test_size: float = 0.5, shuffle: bool = False, scale_data: bool = True, data_scaling_method: ScalingMethod = ScalingMethod.MINMAX, data_scaling_range: tuple[float, float] = (0, 1), num_workers: int = 0, mode: ForecastingMode = ForecastingMode.UNIVARIATE, forecast_horizon: int | None = None, step: int | None = None)#

Bases: BaseTimeSeriesDataModule

Base LightningDataModule for forecasting time series datasets.

Extends BaseTimeSeriesDataModule with dataset-intrinsic time slicing, sklearn-based scaling, and cyclical time feature extraction. Overrides setup() entirely to handle forecasting-specific scaling (fit on train slice only).

Supports three loader modes, grouped by output format:

  • RAW_SERIES (default): Returns raw time series samples via TensorDataset. Preserves existing behavior.

  • INPUT_TARGET / INPUT_ONLY: Returns sliding-window datasets built by _build_sliding_dataset().

Note

forecast_horizon and step are dataset-level parameters applied at dataloader time. They do NOT affect the cache key; only seq_len, mode, and the scaling parameters contribute to it.

Subclasses implement _set_data_slices() to define train/val/test boundaries.

Parameters:
  • batch_size – Batch size for dataloaders.

  • seq_len – Input window length for sliding windows.

  • valid_size – Fraction of data reserved for validation.

  • test_size – Fraction reserved as test set.

  • shuffle – Whether to shuffle the training dataloader.

  • scale_data – Whether to apply data scaling.

  • data_scaling_method – Scaling algorithm, typed as ScalingMethod.

  • data_scaling_range – Target (min, max) range for ScalingMethod.MINMAX.

  • num_workers – Number of DataLoader worker processes.

  • mode – Forecasting mode (univariate or multivariate), typed as ForecastingMode.

  • forecast_horizon – Number of future steps to predict. Used only when loader_mode is INPUT_TARGET or INPUT_ONLY in dataloader calls. Does not affect cache key.

  • step – Stride between consecutive sliding windows. Defaults to seq_len when not provided. Does not affect cache key.
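
To make the interaction of seq_len, forecast_horizon, and step concrete, here is a minimal sketch of sliding-window pairing over a one-dimensional series. It is not the library's _build_sliding_dataset(); the function name sliding_windows and the list-based representation are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence, Tuple


def sliding_windows(
    series: Sequence[float],
    seq_len: int,
    forecast_horizon: int,
    step: Optional[int] = None,
) -> List[Tuple[List[float], List[float]]]:
    """Hypothetical helper: return (input, target) window pairs."""
    # Mirror the documented default: the stride falls back to seq_len.
    stride = seq_len if step is None else step
    pairs = []
    # Last start index that still leaves room for input plus target.
    last_start = len(series) - seq_len - forecast_horizon
    for start in range(0, last_start + 1, stride):
        split = start + seq_len
        inputs = list(series[start:split])
        targets = list(series[split:split + forecast_horizon])
        pairs.append((inputs, targets))
    return pairs


series = list(range(10))
pairs = sliding_windows(series, seq_len=4, forecast_horizon=2)
overlapping = sliding_windows(series, seq_len=4, forecast_horizon=2, step=1)
```

With the default stride the input windows do not overlap, so pairs holds two windows; passing step=1 produces one window per time step and yields five.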

property train_slice: slice | None#

Training data slice boundaries.

property valid_slice: slice | None#

Validation data slice boundaries.

property test_slice: slice | None#

Test data slice boundaries.

property full_data: ndarray | None#

Full data array.

Returns _full_data_scaled after setup has scaled data, otherwise _full_data_raw.

property num_time_series_features: int | None#

Number of cyclical time features extracted.

setup(stage: str | None = None) None#

Scale data, extract time features, and split into train/val/test.

The forecasting branch uses sklearn scalers directly (not create_data_scaler()) because forecasting data has a different shape (features x timesteps). Fits scaler on train slice only to prevent data leakage.

Stage branching:

  • fit/None: Fit scalers, transform data, split into slices.

  • test/predict: Reuse cached fitted scalers to transform.

  • validate: No data mutation; mark stage as complete.

Idempotency guard: Repeated calls for the same stage are no-ops via _setup_completed_stages sentinel.

When scale_data is False, scaling and time feature extraction are skipped entirely to preserve raw values.
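
The leakage argument above can be illustrated with a small pure-Python stand-in for min-max scaling. This is a sketch under stated assumptions, not the module's code: the helpers fit_minmax and apply_minmax are hypothetical, and the real implementation uses sklearn scalers.

```python
from typing import List, Sequence, Tuple


def fit_minmax(train: Sequence[float]) -> Tuple[float, float]:
    """Learn the minimum and maximum from the training slice only."""
    return min(train), max(train)


def apply_minmax(
    values: Sequence[float],
    lo: float,
    hi: float,
    target_range: Tuple[float, float] = (0.0, 1.0),
) -> List[float]:
    """Rescale values with statistics that were fitted elsewhere."""
    a, b = target_range
    span = (hi - lo) or 1.0  # guard against a constant training series
    return [a + (v - lo) * (b - a) / span for v in values]


series = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
train_slice, test_slice = slice(0, 4), slice(4, 6)

# Fit on the training slice, then reuse the same statistics for the test slice.
lo, hi = fit_minmax(series[train_slice])
train_scaled = apply_minmax(series[train_slice], lo, hi)
test_scaled = apply_minmax(series[test_slice], lo, hi)
```

Because the statistics come from the training slice alone, test values may fall outside the target range. That is expected: it shows that no information from the test period influenced the scaler.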

Parameters:

stage – Lightning stage identifier.

Raises:

ValueError – If stage is not one of {'fit', 'validate', 'test', 'predict', None}.

reset() None#

Reset forecasting state, including the scaling flag.

Restores _cache_key from the original init params since it is deterministic and required for cache-based setup().

class chronocratic.datasets.modules.BaseTimeSeriesDataModule(*, batch_size: int, seq_len: int | None, valid_size: float, test_size: float, shuffle: bool, scale_data: bool, data_scaling_method: ScalingMethod = ScalingMethod.MINMAX, data_scaling_range: tuple[float, float] = (0, 1), num_workers: int = 0, data_form: DataForm = DataForm.REGULAR, cache_dir: Path | None = None)#

Bases: LightningDataModule, ABC

Shared base for all time series LightningDataModules.

Handles batch size, scaling, and dataloader construction. Subclasses implement dataset-specific prepare_data() for file validation and data loading.

Parameters:
  • batch_size – Batch size for dataloaders.

  • seq_len – Sequence length. None for classification (computed from data), int for forecasting (user-provided).

  • valid_size – Fraction of training data reserved for validation.

  • test_size – Fraction reserved as test set.

  • shuffle – Whether to shuffle the training dataloader.

  • scale_data – Whether to apply data scaling.

  • data_scaling_method – Scaling algorithm, typed as ScalingMethod.

  • data_scaling_range – Target (min, max) range for ScalingMethod.MINMAX.

  • num_workers – Number of DataLoader worker processes.

  • data_form – Data shape category for scaling, typed as DataForm.

  • cache_dir – Custom cache directory. None uses the default ~/.cache/tsdatasets/<dataset_name>.

prepare_data_per_node: bool = True#

property name: str | None#

Dataset name.

property sequence_length: int | None#

Sequence length (read-only).

property num_features: int | None#

Number of features (read-only).

property train_data_samples: ndarray | DataFrame | None#

Training data samples.

property test_data_samples: ndarray | DataFrame | None#

Test data samples.

property valid_data_samples: ndarray | DataFrame | None#

Validation data samples.

property all_data_samples: ndarray | DataFrame#

Concatenation of all data splits.

prepare_data() None#

Validate file paths and perform lightweight checks.

Concrete wrapper that drives the template:

  1. Check idempotency sentinel (skip if already called).

  2. Call _do_prepare_data() (abstract — subclass I/O).

  3. Call _finalize_prepare_data() (hook — no-op default, forecasting overrides to set slices).

  4. Set sentinel.

prepare_data() does NOT load or split data. That happens in setup().
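
The four steps above follow the template-method pattern. The sketch below shows the control flow only; the class names PrepareTemplate and DemoModule and the sentinel attribute _prepared are hypothetical and do not come from the library.

```python
from abc import ABC, abstractmethod


class PrepareTemplate(ABC):
    """Hypothetical stand-in for the prepare_data() template."""

    def __init__(self) -> None:
        self._prepared = False  # idempotency sentinel (attribute name is an assumption)
        self.calls = []

    def prepare_data(self) -> None:
        if self._prepared:  # step 1: skip if already called
            return
        self._do_prepare_data()  # step 2: subclass I/O
        self._finalize_prepare_data()  # step 3: optional hook
        self._prepared = True  # step 4: set the sentinel

    @abstractmethod
    def _do_prepare_data(self) -> None:
        ...

    def _finalize_prepare_data(self) -> None:
        """Hook with a no-op default."""


class DemoModule(PrepareTemplate):
    def _do_prepare_data(self) -> None:
        self.calls.append("validate files")


module = DemoModule()
module.prepare_data()
module.prepare_data()  # the second call is skipped by the sentinel
```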

prepare_dimensions() tuple[int | None, int | None]#

Return (n_features, sequence_len) from cached attrs or metadata.

Short-circuits if _num_features is already populated (e.g. after setup()). Otherwise attempts to read metadata.json from the cache directory so that dimensions are available without loading any arrays — the DDP-safe flow.

If _cache_key is not yet set or metadata cannot be found, falls back to _compute_dimensions() for backward compatibility with subclasses that set _num_features in _do_prepare_data().

Returns:

Tuple of (n_features, sequence_len). Values may be None if dimensions have not yet been computed.

Raises:
  • FileNotFoundError – If metadata file does not exist and _cache_key is set.

  • ValueError – If metadata schema version does not match CACHE_SCHEMA_VERSION.
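
As a rough illustration of the metadata path, the following sketch reads dimensions from a metadata.json file and checks a schema version. The function read_dimensions, the JSON key names, and the value assigned to CACHE_SCHEMA_VERSION are all assumptions; the actual file layout is not documented here.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional, Tuple

CACHE_SCHEMA_VERSION = 1  # assumed value; the real constant is not shown in this reference


def read_dimensions(cache_dir: Path) -> Tuple[Optional[int], Optional[int]]:
    """Hypothetical reader returning (n_features, sequence_len)."""
    metadata_path = cache_dir / "metadata.json"
    if not metadata_path.exists():
        raise FileNotFoundError(f"No metadata at {metadata_path}")
    metadata = json.loads(metadata_path.read_text())
    # The key names below are illustrative assumptions.
    if metadata.get("schema_version") != CACHE_SCHEMA_VERSION:
        raise ValueError("Cache schema version mismatch")
    return metadata.get("n_features"), metadata.get("sequence_len")


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = Path(tmp)
    payload = {"schema_version": 1, "n_features": 7, "sequence_len": 128}
    (cache_dir / "metadata.json").write_text(json.dumps(payload))
    dims = read_dimensions(cache_dir)
```

Only a small JSON file is read and no arrays are loaded, which is what makes this flow cheap enough to run on every process.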

setup(stage: str | None = None) None#

Apply data scaling via create_data_scaler().

The classification branch uses create_data_scaler() from utilities. The forecasting branch overrides this method entirely with sklearn direct scaling.

Stage branching:

  • fit/None: Fit scaler on all splits.

  • test/predict: Scale test data only (reuse cached scaler).

  • validate: No data mutation; mark stage as complete.

Idempotency guard: Repeated calls for the same stage are no-ops via _setup_completed_stages sentinel.
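
The stage validation and idempotency guard can be sketched as follows. The class StageGuard and its work_log attribute are hypothetical; only the _setup_completed_stages name and the set of accepted stages are taken from the description above.

```python
from typing import Optional

VALID_STAGES = {"fit", "validate", "test", "predict", None}


class StageGuard:
    """Hypothetical sketch of stage validation plus an idempotency guard."""

    def __init__(self) -> None:
        self._setup_completed_stages = set()
        self.work_log = []  # records which stages actually did work

    def setup(self, stage: Optional[str] = None) -> None:
        if stage not in VALID_STAGES:
            raise ValueError(f"Unknown stage: {stage!r}")
        key = "fit" if stage is None else stage  # None is treated as "fit"
        if key in self._setup_completed_stages:
            return  # a repeated call for the same stage is a no-op
        if key != "validate":  # validate performs no data mutation
            self.work_log.append(key)
        self._setup_completed_stages.add(key)


guard = StageGuard()
for stage in (None, "fit", "validate", "test", "test"):
    guard.setup(stage)
```

In this run only the first fit-equivalent call and the first test call do any work; the rest return early.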

Parameters:

stage – Lightning stage identifier. Defaults to None (equivalent to "fit").

Raises:

ValueError – If stage is not one of {'fit', 'validate', 'test', 'predict', None}.

reset() None#

Clear lifecycle sentinels to allow re-use of this DataModule.

Resets the setup stage tracking and prepare_data sentinel so that subsequent calls to setup() or prepare_data() will re-execute their logic. Also clears all cache-related attributes (_full_data_raw, _time_index, _full_data_scaled, scaler caches, data samples, _cache_key) to prevent stale state across resets.

Useful for hyperparameter sweeps or re-training scenarios that reuse the same DataModule instance.

class chronocratic.datasets.modules.ETTDataModule(*, dataset_file_path: Path, variant: str, seq_len: int = 128, mode: ForecastingMode = ForecastingMode.UNIVARIATE, batch_size: int = 32, valid_size: float = 0.1, test_size: float = 0.5, shuffle: bool = False, scale_data: bool = True, data_scaling_method: ScalingMethod = ScalingMethod.MINMAX, data_scaling_range: tuple[float, float] = (0, 1), num_workers: int = 0, forecast_horizon: int = 96, step: int | None = None)#

Bases: BaseForecastingTimeSeriesDataModule

LightningDataModule for ETT forecasting datasets.

Supports ETTh1, ETTh2 (hourly) and ETTm1, ETTm2 (15-min). Uses standard 16-month / 4-month / 4-month splits based on variant.

Accepts explicit variant parameter rather than auto-detecting from the filename.

Data shape reference

ETT is a single multivariate time series. All four variants have 7 features; the raw CSV shape differs only in the number of rows.

| Variant | Raw CSV Shape | Post-Transform | Notes |
| --- | --- | --- | --- |
| ETTh1, ETTh2 | (17420, 7) | (1, 17420, 7) | Hourly, 12 months |
| ETTm1, ETTm2 | (69680, 7) | (1, 69680, 7) | 15-min, 12 months |

For univariate mode, only the OT column is retained (shape becomes (1, T, 2) after adding time features).

Parameters:
  • dataset_file_path – Path to the CSV file.

  • variant – ETT dataset variant ("ETTh1", "ETTh2", "ETTm1", "ETTm2").

  • seq_len – Input window length.

  • mode – UNIVARIATE or MULTIVARIATE.

  • batch_size – Batch size.

  • valid_size – Validation fraction (unused, fixed by dataset).

  • test_size – Test fraction (unused, fixed by dataset).

  • shuffle – Whether to shuffle training data.

  • scale_data – Whether to scale features.

  • data_scaling_method – Scaling algorithm.

  • data_scaling_range – Target min-max range.

  • num_workers – DataLoader worker count.

Raises:

ValueError – If variant is not one of the four valid ETT variants.

train_dataloader(*, loader_mode: ForecastingLoaderMode = ForecastingLoaderMode.RAW_SERIES, shuffle: bool | None = None, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader#

Build the training DataLoader.

Parameters:
  • loader_mode – Per-call mode controlling output format. RAW_SERIES yields full series (existing behavior). INPUT_TARGET yields (input, target) sliding-window pairs. INPUT_ONLY yields input windows without targets.

  • shuffle – Whether to shuffle. When None, falls back to the shuffle value passed to the constructor.

  • strict_batch_size – If True, pad the last batch.

  • extra_args – Additional keyword arguments for DataLoader.

Returns:

Configured DataLoader for training.

val_dataloader(*, loader_mode: ForecastingLoaderMode = ForecastingLoaderMode.RAW_SERIES, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader | None#

Build the validation DataLoader.

Returns None when valid_size is 0.0.

Returns:

Configured DataLoader for validation, or None.

test_dataloader(*, loader_mode: ForecastingLoaderMode = ForecastingLoaderMode.RAW_SERIES, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader#

Build the test DataLoader.

Returns:

Configured DataLoader for testing.

class chronocratic.datasets.modules.ElectricityLoadModule(*, dataset_file_path: Path, seq_len: int = 128, mode: ForecastingMode = ForecastingMode.UNIVARIATE, batch_size: int = 32, valid_size: float = 0.1, test_size: float = 0.5, shuffle: bool = False, scale_data: bool = True, data_scaling_method: ScalingMethod = ScalingMethod.MINMAX, data_scaling_range: tuple[float, float] = (0, 1), num_workers: int = 0, forecast_horizon: int = 24, step: int | None = None)#

Bases: BaseForecastingTimeSeriesDataModule

LightningDataModule for electricity load forecasting.

Reads semicolon-delimited CSV with comma decimals, resamples to hourly, and applies 60/20/20 fractional splits.
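
For readers unfamiliar with this file format, the sketch below parses a semicolon-delimited sample with comma decimals using only the standard library. The sample rows, the "timestamp" column name, and the helper parse_row are made up for illustration; hourly resampling is not shown.

```python
import csv
import io

# Made-up sample in the same dialect: ';' separates fields, ',' is the decimal mark.
raw = (
    "timestamp;MT_001;MT_002\n"
    "2012-01-01 00:15:00;1,5;20,25\n"
    "2012-01-01 00:30:00;2,5;21,75\n"
)


def parse_row(row):
    """Convert comma-decimal strings to floats, leaving the timestamp as text."""
    timestamp, *values = row
    return timestamp, [float(v.replace(",", ".")) for v in values]


reader = csv.reader(io.StringIO(raw), delimiter=";")
header = next(reader)
rows = [parse_row(row) for row in reader]
```

With pandas, the same dialect is usually handled by passing sep=";" and decimal="," to read_csv.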

Data shape reference

Electricity contains 370 independent power clients (from 2012-2014). Each client is treated as a separate time series, not as a feature.

| Dataset | Raw CSV Shape | Post-Transform | Notes |
| --- | --- | --- | --- |
| Electricity | (27340, 370) | (370, 27340, 1) | Hourly, 3 years |

The data transform uses transpose + expand_dims(axis=-1) to produce (370, 27340, 1). This matches the TS2Vec/CoST/AutoTCL reference: np.expand_dims(data.T, -1) with comment “Each variable is an instance rather than a feature”.
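
The effect of this transform on the array layout can be seen in a pure-Python analogue that needs no NumPy. The helpers instances_from_columns and shape are hypothetical, and the toy data stands in for the real (27340, 370) table.

```python
def instances_from_columns(rows):
    """Pure-Python analogue of np.expand_dims(data.T, -1)."""
    # zip(*rows) transposes (T, N) to (N, T); wrapping each value adds the feature axis.
    return [[[value] for value in column] for column in zip(*rows)]


def shape(nested):
    """Return the shape of a regular (non-ragged) nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


# Toy data: 4 time steps for 3 clients.
rows = [
    [1.0, 10.0, 100.0],
    [2.0, 20.0, 200.0],
    [3.0, 30.0, 300.0],
    [4.0, 40.0, 400.0],
]
instances = instances_from_columns(rows)
```

Each client column becomes its own series with a single feature, so the toy shape goes from (4, 3) to (3, 4, 1).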

For univariate mode, only client MT_001 is retained.

Parameters:
  • dataset_file_path – Path to the CSV file.

  • seq_len – Input window length.

  • mode – UNIVARIATE or MULTIVARIATE.

  • batch_size – Batch size.

  • valid_size – Validation fraction (unused, fixed 60/20/20).

  • test_size – Test fraction (unused, fixed 60/20/20).

  • shuffle – Whether to shuffle training data.

  • scale_data – Whether to scale features.

  • data_scaling_method – Scaling algorithm.

  • data_scaling_range – Target min-max range.

  • num_workers – DataLoader worker count.

train_dataloader(*, loader_mode: ForecastingLoaderMode = ForecastingLoaderMode.RAW_SERIES, shuffle: bool | None = None, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader#

Build the training DataLoader.

Parameters:
  • loader_mode – Per-call mode controlling output format. RAW_SERIES yields full series (existing behavior). INPUT_TARGET yields (input, target) sliding-window pairs. INPUT_ONLY yields input windows without targets.

  • shuffle – Whether to shuffle. When None, falls back to the shuffle value passed to the constructor.

  • strict_batch_size – If True, pad the last batch.

  • extra_args – Additional keyword arguments for DataLoader.

Returns:

Configured DataLoader for training.

val_dataloader(*, loader_mode: ForecastingLoaderMode = ForecastingLoaderMode.RAW_SERIES, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader | None#

Build the validation DataLoader.

test_dataloader(*, loader_mode: ForecastingLoaderMode = ForecastingLoaderMode.RAW_SERIES, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader#

Build the test DataLoader.

class chronocratic.datasets.modules.UCRClassificationDataModule(*, dataset_folder_path: Path, target_column_name: str, batch_size: int = 32, valid_size: float = 0.1, shuffle: bool = False, scale_data: bool = True, data_scaling_method: ScalingMethod = ScalingMethod.MINMAX, data_scaling_range: tuple[float, float] = (0, 1), splitting_strategy: ClassificationSplitMode = ClassificationSplitMode.AS_DEFINED, test_size: float = 0.5, num_workers: int = 0)#

Bases: BaseClassificationTimeSeriesDataModule

LightningDataModule for UCR univariate classification datasets.

Reads train/test ARFF files, applies optional manual re-splitting, creates a validation split, and handles variable-length series.

Accepts dataset_folder_path (Path) and target_column_name as explicit constructor parameters. No JSON config files. ARFF file patterns are hardcoded: {dataset_name}_TRAIN.arff and {dataset_name}_TEST.arff.

data_form is hardcoded as DataForm.REGULAR.
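
The file patterns can be resolved as in the sketch below. It assumes that dataset_name is the name of the dataset folder, which this reference does not state; the helper arff_paths is hypothetical.

```python
from pathlib import Path


def arff_paths(dataset_folder_path: Path):
    """Hypothetical helper; assumes the dataset name equals the folder name."""
    dataset_name = dataset_folder_path.name
    train = dataset_folder_path / f"{dataset_name}_TRAIN.arff"
    test = dataset_folder_path / f"{dataset_name}_TEST.arff"
    return train, test


# ECG200 is used purely as an example folder name.
train_path, test_path = arff_paths(Path("data") / "ECG200")
```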

Parameters:
  • dataset_folder_path – Path to the dataset ARFF directory.

  • target_column_name – Name of the target/label column in the ARFF files.

  • batch_size – Batch size for dataloaders.

  • valid_size – Fraction of training data for validation.

  • shuffle – Whether to shuffle training data.

  • scale_data – Whether to scale features.

  • data_scaling_method – Scaling algorithm, typed as ScalingMethod.

  • data_scaling_range – Target (min, max) range for ScalingMethod.MINMAX.

  • splitting_strategy – AS_DEFINED or MANUAL splitting, typed as ClassificationSplitMode.

  • test_size – Test set fraction for MANUAL splitting.

  • num_workers – DataLoader worker count.

train_dataloader(*, mode: ClassificationLoaderMode = ClassificationLoaderMode.SAMPLE_LABEL, shuffle: bool | None = None, strict_batch_size: bool = True, extra_args: dict[str, Any] | None = None) DataLoader#

Build the training DataLoader.

Parameters:
  • mode – Dataset mode (with/without labels, forecasting).

  • shuffle – Whether to shuffle. When None, falls back to the shuffle value passed to the constructor.

  • strict_batch_size – If True, pad the last batch via custom_collate_fn().

  • extra_args – Additional keyword arguments forwarded to the DataLoader constructor.

Returns:

Configured DataLoader for training.
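
To show what padding the last batch means in practice, here is a small sketch. How custom_collate_fn() actually pads is not documented here; cycling through the final batch's own samples is an assumption chosen for illustration, and the function batches is hypothetical.

```python
def batches(samples, batch_size, strict_batch_size=False):
    """Hypothetical batching helper with optional last-batch padding."""
    out = [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]
    if strict_batch_size and out and len(out[-1]) < batch_size:
        last = out[-1]
        # Assumed padding strategy: repeat the batch's own samples until it is full.
        out[-1] = [last[i % len(last)] for i in range(batch_size)]
    return out


samples = list(range(7))
loose = batches(samples, batch_size=3)
strict = batches(samples, batch_size=3, strict_batch_size=True)
```

Every batch in strict has exactly batch_size elements, which matters for models or losses that assume a fixed batch dimension.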

val_dataloader(*, mode: ClassificationLoaderMode = ClassificationLoaderMode.SAMPLE_LABEL, strict_batch_size: bool = True, extra_args: dict[str, Any] | None = None) DataLoader | None#

Build the validation DataLoader.

Returns None when valid_size is 0.0.

Parameters:
  • mode – Dataset mode (with/without labels, forecasting).

  • strict_batch_size – If True, pad the last batch via custom_collate_fn().

  • extra_args – Additional keyword arguments forwarded to the DataLoader constructor.

Returns:

Configured DataLoader for validation, or None.

test_dataloader(*, mode: ClassificationLoaderMode = ClassificationLoaderMode.SAMPLE_LABEL, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader#

Build the test DataLoader.

Parameters:
  • mode – Dataset mode (with/without labels, forecasting).

  • strict_batch_size – If True, pad the last batch via custom_collate_fn().

  • extra_args – Additional keyword arguments forwarded to the DataLoader constructor.

Returns:

Configured DataLoader for testing.

class chronocratic.datasets.modules.UEAClassificationDataModule(*, dataset_folder_path: Path, target_column_name: str, batch_size: int = 32, valid_size: float = 0.1, shuffle: bool = False, scale_data: bool = True, data_scaling_method: ScalingMethod = ScalingMethod.MINMAX, data_scaling_range: tuple[float, float] = (0, 1), splitting_strategy: ClassificationSplitMode = ClassificationSplitMode.AS_DEFINED, test_size: float = 0.5, num_workers: int = 0)#

Bases: BaseClassificationTimeSeriesDataModule

LightningDataModule for UEA multivariate classification datasets.

Reads multi-dimensional nested ARFF files using raw scipy.io.arff.loadarff(), decodes byte values, encodes labels with sklearn.preprocessing.LabelEncoder, and manages splits with variable-length handling.
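
The decoding and label-encoding steps can be approximated in plain Python as shown below. This is a sketch, not the module's code: decode and encode_labels are hypothetical helpers, the labels are made up, and the integer mapping follows the sorted class order that LabelEncoder uses.

```python
def decode(value):
    """Decode bytes (the form in which nominal ARFF values are loaded) to str."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


def encode_labels(labels):
    """Map labels to integers by sorted class order."""
    classes = sorted(set(labels))
    index = {label: i for i, label in enumerate(classes)}
    return [index[label] for label in labels], classes


raw_labels = [b"walking", b"standing", b"walking", b"running"]
encoded, classes = encode_labels([decode(v) for v in raw_labels])
```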

data_form is hardcoded as DataForm.NESTED. ARFF file patterns are hardcoded: {dataset_name}_TRAIN.arff and {dataset_name}_TEST.arff.

Parameters:
  • dataset_folder_path – Path to the dataset ARFF directory.

  • target_column_name – Name of the target/label column in the ARFF files.

  • batch_size – Batch size for dataloaders.

  • valid_size – Fraction of training data for validation.

  • shuffle – Whether to shuffle training data.

  • scale_data – Whether to scale features.

  • data_scaling_method – Scaling algorithm, typed as ScalingMethod.

  • data_scaling_range – Target (min, max) range for ScalingMethod.MINMAX.

  • splitting_strategy – AS_DEFINED or MANUAL splitting, typed as ClassificationSplitMode.

  • test_size – Test set fraction for MANUAL splitting.

  • num_workers – DataLoader worker count.

train_dataloader(*, mode: ClassificationLoaderMode = ClassificationLoaderMode.SAMPLE_LABEL, shuffle: bool | None = None, strict_batch_size: bool = True, extra_args: dict[str, Any] | None = None) DataLoader#

Build the training DataLoader.

Parameters:
  • mode – Dataset mode (with/without labels, forecasting).

  • shuffle – Whether to shuffle. When None, falls back to the shuffle value passed to the constructor.

  • strict_batch_size – If True, pad the last batch via custom_collate_fn().

  • extra_args – Additional keyword arguments forwarded to the DataLoader constructor.

Returns:

Configured DataLoader for training.

val_dataloader(*, mode: ClassificationLoaderMode = ClassificationLoaderMode.SAMPLE_LABEL, strict_batch_size: bool = True, extra_args: dict[str, Any] | None = None) DataLoader | None#

Build the validation DataLoader.

Returns None when valid_size is 0.0.

Parameters:
  • mode – Dataset mode (with/without labels, forecasting).

  • strict_batch_size – If True, pad the last batch via custom_collate_fn().

  • extra_args – Additional keyword arguments forwarded to the DataLoader constructor.

Returns:

Configured DataLoader for validation, or None.

test_dataloader(*, mode: ClassificationLoaderMode = ClassificationLoaderMode.SAMPLE_LABEL, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader#

Build the test DataLoader.

Parameters:
  • mode – Dataset mode (with/without labels, forecasting).

  • strict_batch_size – If True, pad the last batch via custom_collate_fn().

  • extra_args – Additional keyword arguments forwarded to the DataLoader constructor.

Returns:

Configured DataLoader for testing.

class chronocratic.datasets.modules.WeatherModule(*, dataset_file_path: Path, seq_len: int = 128, mode: ForecastingMode = ForecastingMode.UNIVARIATE, batch_size: int = 32, valid_size: float = 0.1, test_size: float = 0.5, shuffle: bool = False, scale_data: bool = True, data_scaling_method: ScalingMethod = ScalingMethod.MINMAX, data_scaling_range: tuple[float, float] = (0, 1), num_workers: int = 0, forecast_horizon: int = 96, step: int | None = None)#

Bases: BaseForecastingTimeSeriesDataModule

LightningDataModule for weather forecasting.

Reads CSV with standard format (comma-separated, period decimals), applies 60/20/20 fractional splits.
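
A 60/20/20 fractional split over a time-ordered series can be expressed as three contiguous slices. The sketch below is illustrative only: fractional_slices is a hypothetical helper, and truncating with int() is an assumed rounding convention that may differ from the module's.

```python
def fractional_slices(n_timesteps, fractions=(0.6, 0.2, 0.2)):
    """Hypothetical helper: contiguous train/valid/test slices in time order."""
    train_end = int(n_timesteps * fractions[0])
    valid_end = train_end + int(n_timesteps * fractions[1])
    # The test slice takes whatever remains, so no time step is dropped.
    return (
        slice(0, train_end),
        slice(train_end, valid_end),
        slice(valid_end, n_timesteps),
    )


# 52696 is the number of rows listed for the Weather dataset.
train_slice, valid_slice, test_slice = fractional_slices(52696)
```

Keeping the slices contiguous and in chronological order means the validation and test periods always lie after the training period.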

Data shape reference

Weather is a single multivariate time series with 22 features.

| Dataset | Raw CSV Shape | Post-Transform | Notes |
| --- | --- | --- | --- |
| Weather | (52696, 22) | (1, 52696, 22) | Hourly, 7 years |

For univariate mode, only the last column (WetBulbCelsius) is retained. The data transform uses expand_dims(axis=0), producing shape (1, T, F).

Parameters:
  • dataset_file_path – Path to the CSV file.

  • seq_len – Input window length.

  • mode – UNIVARIATE or MULTIVARIATE.

  • batch_size – Batch size.

  • valid_size – Validation fraction (unused, fixed 60/20/20).

  • test_size – Test fraction (unused, fixed 60/20/20).

  • shuffle – Whether to shuffle training data.

  • scale_data – Whether to scale features.

  • data_scaling_method – Scaling algorithm.

  • data_scaling_range – Target min-max range.

  • num_workers – DataLoader worker count.

train_dataloader(*, loader_mode: ForecastingLoaderMode = ForecastingLoaderMode.RAW_SERIES, shuffle: bool | None = None, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader#

Build the training DataLoader.

Parameters:
  • loader_mode – Per-call mode controlling output format. RAW_SERIES yields full series (existing behavior). INPUT_TARGET yields (input, target) sliding-window pairs. INPUT_ONLY yields input windows without targets.

  • shuffle – Whether to shuffle. When None, falls back to the shuffle value passed to the constructor.

  • strict_batch_size – If True, pad the last batch.

  • extra_args – Additional keyword arguments for DataLoader.

Returns:

Configured DataLoader for training.

val_dataloader(*, loader_mode: ForecastingLoaderMode = ForecastingLoaderMode.RAW_SERIES, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader | None#

Build the validation DataLoader.

test_dataloader(*, loader_mode: ForecastingLoaderMode = ForecastingLoaderMode.RAW_SERIES, strict_batch_size: bool = False, extra_args: dict[str, Any] | None = None) DataLoader#

Build the test DataLoader.