# Main Loops

## backtest_score

```python
backtest_score(trained_pipelines: TrainedPipelineCard, X: pd.DataFrame | None, y: pd.Series, splitter: Splitter, backend: BackendType | Backend | str = BackendType.no, events: EventDataFrame | None = None, silent: bool = False, return_artifacts: bool = False, krisi_args: dict | None = None) -> tuple[ScoreCard, OutOfSamplePredictions] | tuple[ScoreCard, OutOfSamplePredictions, Artifact]
```

Runs the backtest, then scores the results. Requires `krisi` to be installed.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `trained_pipelines` | `TrainedPipelineCard` | The fitted pipelines, for all folds. | *required* |
| `X` | `DataFrame \| None` | Exogenous data. | *required* |
| `y` | `Series` | Endogenous data (target). | *required* |
| `splitter` | `Splitter` | Defines how the folds should be constructed. | *required* |
| `backend` | `BackendType \| Backend \| str` | The library/service to use for parallelization / distributed computing. | `no` |
| `events` | `EventDataFrame \| None` | Events that should be passed into the pipeline. | `None` |
| `silent` | `bool` | Whether the pipeline should print to the console. | `False` |
| `return_artifacts` | `bool` | Whether to return the artifacts of the training process. | `False` |
| `krisi_args` | `dict \| None` | Arguments that will be passed into krisi's `score` function. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `ScoreCard` | A ScoreCard from krisi. |
| `OutOfSamplePredictions` | Predictions for all folds, concatenated. |
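Conceptually, `backtest_score` is a backtest followed by scoring of the concatenated out-of-sample predictions. A minimal, library-free sketch of that second step (`make_scorecard` is a hypothetical stand-in for krisi's `score` function):

```python
# Sketch: scoring concatenated out-of-sample predictions, as backtest_score
# does after the backtest. `make_scorecard` is a hypothetical stand-in for
# krisi's `score` function.

def make_scorecard(y_true: list[float], y_pred: list[float]) -> dict:
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "mse": sum(e * e for e in errors) / n,
    }

# Out-of-sample predictions from two folds, concatenated in time order.
fold_1_preds = [1.0, 2.0]
fold_2_preds = [3.0, 5.0]
oos_predictions = fold_1_preds + fold_2_preds

y = [1.0, 2.0, 3.0, 4.0]
scorecard = make_scorecard(y, oos_predictions)
print(scorecard)  # {'mae': 0.25, 'mse': 0.25}
```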

## train_backtest

```python
train_backtest(pipeline: Pipeline | PipelineCard, X: pd.DataFrame | None, y: pd.Series, splitter: Splitter, backend: BackendType | Backend | str = BackendType.no, events: EventDataFrame | None = None, silent: bool = False) -> tuple[OutOfSamplePredictions, TrainedPipelineCard, Artifact, InSamplePredictions]
```

Runs training, then a backtest.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `pipeline` | `Pipeline \| PipelineCard` | The pipeline to be fitted. | *required* |
| `X` | `DataFrame \| None` | Exogenous data. | *required* |
| `y` | `Series` | Endogenous data (target). | *required* |
| `splitter` | `Splitter` | Defines how the folds should be constructed. | *required* |
| `backend` | `BackendType \| Backend \| str` | The library/service to use for parallelization / distributed computing. | `no` |
| `events` | `EventDataFrame \| None` | Events that should be passed into the pipeline. | `None` |
| `silent` | `bool` | Whether the pipeline should print to the console. | `False` |
| `return_artifacts` | | Whether to return the artifacts of the process. | *required* |
| `return_insample` | | Whether to return the in-sample predictions of the training process. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `OutOfSamplePredictions` | Predictions for all folds, concatenated. |
| `TrainedPipelineCard` | The fitted pipelines, for all folds. |

## train_evaluate

```python
train_evaluate(pipeline: Pipeline | PipelineCard, X: pd.DataFrame | None, y: pd.Series, splitter: Splitter, backend: BackendType | Backend | str = BackendType.no, events: EventDataFrame | None = None, silent: bool = False, krisi_args: dict | None = None) -> tuple[ScoreCard, OutOfSamplePredictions, TrainedPipelineCard, Artifact, ScoreCard]
```

Runs training and a backtest, then scores the results. Requires `krisi` to be installed.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `pipeline` | `Pipeline \| PipelineCard` | The pipeline to be fitted. | *required* |
| `X` | `DataFrame \| None` | Exogenous data. | *required* |
| `y` | `Series` | Endogenous data (target). | *required* |
| `splitter` | `Splitter` | Defines how the folds should be constructed. | *required* |
| `backend` | `BackendType \| Backend \| str` | The library/service to use for parallelization / distributed computing. | `no` |
| `events` | `EventDataFrame \| None` | Events that should be passed into the pipeline. | `None` |
| `silent` | `bool` | Whether the pipeline should print to the console. | `False` |
| `krisi_args` | `dict \| None` | Arguments that will be passed into krisi's `score` function. | `None` |

Returns:

| Type | Description |
| --- | --- |
| `ScoreCard` | A ScoreCard from krisi. |
| `OutOfSamplePredictions` | Predictions for all folds, concatenated. |
| `TrainedPipelineCard` | The fitted pipelines, for all folds. |

## train

```python
train(pipelinecard: PipelineCard | Pipeline, X: pd.DataFrame | None, y: pd.Series, splitter: Splitter, events: EventDataFrame | None = None, backend: BackendType | Backend | str = BackendType.no, silent: bool = False, for_deployment: bool = False) -> tuple[TrainedPipelineCard, Artifact, InSamplePredictions]
```

Trains a pipeline on a given dataset, for all folds returned by the Splitter.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `pipelinecard` | `PipelineCard \| Pipeline` | The pipeline to be fitted. | *required* |
| `X` | `DataFrame \| None` | Exogenous data. | *required* |
| `y` | `Series` | Endogenous data (target). | *required* |
| `splitter` | `Splitter` | Defines how the folds should be constructed. | *required* |
| `backend` | `BackendType \| Backend \| str` | The library/service to use for parallelization / distributed computing. | `no` |
| `events` | `EventDataFrame \| None` | Events that should be passed into the pipeline. | `None` |
| `silent` | `bool` | Whether the pipeline should print to the console. | `False` |
| `return_artifacts` | | Whether to return the artifacts of the training process. | *required* |
| `return_insample` | | Whether to return the in-sample predictions of the training process. | *required* |
| `for_deployment` | `bool` | Whether the pipeline is being trained for deployment, meaning it'll only have the last fold. | `False` |

Returns:

| Type | Description |
| --- | --- |
| `TrainedPipelineCard` | The fitted pipelines, for all folds. |

## backtest

```python
backtest(trained_pipelinecard: TrainedPipelineCard, X: pd.DataFrame | None, y: pd.Series, splitter: Splitter, backend: BackendType | Backend | str = BackendType.no, events: EventDataFrame | None = None, silent: bool = False, mutate: bool = False, return_artifacts: bool = False) -> OutOfSamplePredictions | tuple[OutOfSamplePredictions, Artifact]
```

Runs a backtest on a TrainedPipelineCard and the given data.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `trained_pipelinecard` | `TrainedPipelineCard` | The fitted pipelines, for all folds. | *required* |
| `X` | `DataFrame \| None` | Exogenous data. | *required* |
| `y` | `Series` | Endogenous data (target). | *required* |
| `splitter` | `Splitter` | Defines how the folds should be constructed. | *required* |
| `backend` | `BackendType \| Backend \| str` | The library/service to use for parallelization / distributed computing. | `no` |
| `sample_weights` | | Weights assigned to each sample/timestamp, passed into models that support them. | *required* |
| `events` | `EventDataFrame \| None` | Events that should be passed into the pipeline. | `None` |
| `silent` | `bool` | Whether the pipeline should print to the console. | `False` |
| `mutate` | `bool` | Whether the trained pipelines should be mutated. This is discouraged. | `False` |
| `return_artifacts` | `bool` | Whether to return the artifacts of the backtesting process. | `False` |

Returns:

| Type | Description |
| --- | --- |
| `OutOfSamplePredictions` | Predictions for all folds, concatenated. |
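The train/backtest pair above can be illustrated with a toy, library-free fold loop. `MeanModel` and `expanding_window_splits` are hypothetical stand-ins for a real pipeline and Splitter; the point is the shape of the loop: one model is trained per fold, each predicts only on its own out-of-sample window, and the predictions are concatenated.

```python
# Toy expanding-window fold loop: train one model per fold, then
# "backtest" by predicting on each fold's out-of-sample window.

def expanding_window_splits(n: int, initial: int, step: int):
    """Yield (train_indices, test_indices) pairs -- a stand-in Splitter."""
    start = initial
    while start < n:
        yield list(range(0, start)), list(range(start, min(start + step, n)))
        start += step

class MeanModel:
    """A stand-in pipeline: predicts the mean of its training window."""
    def fit(self, y_train):
        self.mean = sum(y_train) / len(y_train)
    def predict(self, test_indices):
        return [self.mean] * len(test_indices)

y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# "train": fit one copy of the pipeline per fold.
trained = []
for train_idx, test_idx in expanding_window_splits(len(y), initial=2, step=2):
    model = MeanModel()
    model.fit([y[i] for i in train_idx])
    trained.append((model, test_idx))

# "backtest": out-of-sample predictions for all folds, concatenated.
oos = [p for model, test_idx in trained for p in model.predict(test_idx)]
print(oos)  # [1.5, 1.5, 2.5, 2.5]
```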

## BackendType

Bases: `ParsableEnum`

Members:

| Name | Description |
| --- | --- |
| `no` | Uses sequential processing. This is the default. |
| `ray` | Uses `ray` as a backend. Call `ray.init()` before using this backend. |
| `pathos` | Uses `pathos.multiprocessing` as a backend (via `p_tqdm`). |
| `thread` | Uses threading as a backend (via `tqdm.contrib.concurrent.thread_map`). |
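The backend choices can be sketched with a minimal dispatcher. The names below are hypothetical stand-ins (only `no` and `thread` are modeled, using the standard library instead of `tqdm`/`ray`/`pathos`), but the shape mirrors the table: the same work function is mapped either sequentially or via a pool, and plain strings are parsed into the enum, as a ParsableEnum allows.

```python
from concurrent.futures import ThreadPoolExecutor
from enum import Enum

class BackendType(str, Enum):  # simplified stand-in for the real enum
    no = "no"          # sequential processing (the default)
    thread = "thread"  # thread pool, akin to tqdm's thread_map

def run_folds(work, folds, backend="no"):
    backend = BackendType(backend)  # "parses" plain strings into the enum
    if backend is BackendType.thread:
        with ThreadPoolExecutor() as pool:
            return list(pool.map(work, folds))  # order-preserving
    return [work(fold) for fold in folds]  # BackendType.no: sequential

results = run_folds(lambda fold: fold * 2, [1, 2, 3], backend="thread")
print(results)  # [2, 4, 6]
```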

# classes

## InSamplePredictions `module-attribute`

```python
InSamplePredictions = DataFrame
```

The backtest's resulting in-sample output.

## OutOfSamplePredictions `module-attribute`

```python
OutOfSamplePredictions = DataFrame
```

The backtest's resulting out-of-sample output.

## Pipeline `module-attribute`

```python
Pipeline = Block | Sequence[Block]
```

A single Block, or a sequence of Blocks that are executed sequentially.

## Pipelines `module-attribute`

```python
Pipelines = Sequence[Pipeline]
```

Multiple, independent Pipelines.

## TrainedPipelines `module-attribute`

```python
TrainedPipelines = list[Series]
```

A list of trained Pipelines, to be used for backtesting.

## Composite

Bases: `Block`, `Clonable`, `ABC`

A Composite contains other transformations.

## Optimizer

Bases: `Block`, `Clonable`, `ABC`

### get_candidates `abstractmethod`

```python
get_candidates(only_traversal: bool) -> list[Pipeline]
```

Called iteratively until it returns an empty list, which ends the candidate-evaluation loop.
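That contract can be sketched as a driver loop. Everything here except the termination condition (an empty list from `get_candidates` ends the loop) is a hypothetical illustration:

```python
# Sketch of the Optimizer evaluation loop implied by `get_candidates`:
# keep asking for candidate pipelines until an empty list comes back.

class GridOptimizer:
    """Hypothetical Optimizer: serves parameter candidates in batches."""
    def __init__(self, grid):
        self._batches = [grid[i:i + 2] for i in range(0, len(grid), 2)]
    def get_candidates(self, only_traversal: bool):
        return self._batches.pop(0) if self._batches else []

def evaluate_candidates(optimizer, score):
    results = {}
    while True:
        candidates = optimizer.get_candidates(only_traversal=False)
        if len(candidates) == 0:  # empty list ends the loop
            break
        for candidate in candidates:
            results[candidate] = score(candidate)
    return results

opt = GridOptimizer(grid=[1, 2, 3])
print(evaluate_candidates(opt, score=lambda c: c * 10))  # {1: 10, 2: 20, 3: 30}
```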

## Transformation

Bases: `Block`, `ABC`

A transformation is a single step in a pipeline.

### fit `abstractmethod`

```python
fit(X: pd.DataFrame, y: pd.Series, sample_weights: pd.Series | None = None, raw_y: pd.Series | None = None) -> Artifact | None
```

Called once, on the initial training window.

### update `abstractmethod`

```python
update(X: pd.DataFrame, y: pd.Series, sample_weights: pd.Series | None = None, raw_y: pd.Series | None = None) -> Artifact | None
```

Called on subsequent windows to update the model.
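The fit/update split can be illustrated with a toy, library-free stand-in that tracks a running mean: `fit` is called once on the initial training window, and `update` folds in each subsequent window.

```python
# Toy illustration of the Transformation contract: `fit` once on the
# initial window, `update` on every subsequent window.

class RunningMean:
    def fit(self, y, sample_weights=None):
        self.total, self.count = sum(y), len(y)
    def update(self, y, sample_weights=None):
        self.total += sum(y)
        self.count += len(y)
    @property
    def mean(self):
        return self.total / self.count

t = RunningMean()
t.fit([1.0, 2.0, 3.0])  # initial training window
t.update([4.0])         # subsequent window
t.update([5.0])         # another subsequent window
print(t.mean)  # 3.0
```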

## Tunable

Bases: `ABC`

### clone_with_params

```python
clone_with_params(parameters: dict, clone_children: Callable | None = None) -> Tunable
```

The default implementation only works for Transformations, and only when `parameters` exactly match the init parameters.

### get_params

```python
get_params() -> dict
```

The default implementation assumes that:

1. All init parameters are stored on the object as properties, under the same name/key.
2. The init parameters are not modified or converted in a way that would prevent the object from being reconstructed from them.
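Those two assumptions can be made concrete with a hypothetical Transformation: attributes mirror `__init__` exactly, so `get_params` can read them back and `clone_with_params` can rebuild the object from them.

```python
# Sketch of the default Tunable behaviour: init parameters are stored
# unmodified under the same attribute names, so the object can be
# reconstructed from get_params(). `Lag` is a hypothetical Transformation.

class Lag:
    def __init__(self, window: int, fill_value: float = 0.0):
        self.window = window          # stored under the same name, unmodified
        self.fill_value = fill_value

    def get_params(self) -> dict:
        # works only because attributes mirror __init__ exactly
        return {"window": self.window, "fill_value": self.fill_value}

    def clone_with_params(self, parameters: dict) -> "Lag":
        return type(self)(**parameters)  # requires params to match __init__

original = Lag(window=3)
clone = original.clone_with_params({**original.get_params(), "window": 5})
print(clone.window, clone.fill_value)  # 5 0.0
```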

# scoring

# utils

## traverse

```python
traverse(pipeline: Pipeline | list[Pipeline])
```

Iterates over a pipeline and yields each transformation. CAUTION: it does not "unroll" an Optimizer's candidates.
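A library-free sketch of such a traversal, assuming the documented Pipeline shape (a single Block or a sequence of Blocks, where Composites contain child pipelines); the `Composite` class and string "transformations" here are hypothetical stand-ins:

```python
# Sketch of `traverse`: walk a (possibly nested) pipeline and yield each
# leaf transformation.

class Composite:
    def __init__(self, *children):
        self.children = children  # child pipelines

def traverse(pipeline):
    if isinstance(pipeline, (list, tuple)):
        for block in pipeline:
            yield from traverse(block)
    elif isinstance(pipeline, Composite):
        for child in pipeline.children:
            yield from traverse(child)
    else:
        yield pipeline  # a leaf transformation

pipeline = ["scale", Composite("model_a", ["model_b", "model_c"])]
print(list(traverse(pipeline)))  # ['scale', 'model_a', 'model_b', 'model_c']
```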