Transformations

columns

AddColumnSuffix

Bases: Transformation

Add suffix to column names.

Parameters:

Name Type Description Default
suffix str

Suffix to add to all column names.

required

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import AddColumnSuffix
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data()
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = AddColumnSuffix("_2")
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> preds.head()
                     sine_2
2021-12-31 15:40:00 -0.0000
2021-12-31 15:41:00  0.0126
2021-12-31 15:42:00  0.0251
2021-12-31 15:43:00  0.0377
2021-12-31 15:44:00  0.0502

DropColumns

Bases: Transformation, Tunable

Drops a single or multiple columns.

Parameters:

Name Type Description Default
columns (list[str], str)

The column or columns to drop.

required
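
DropColumns has no example above, so here is a minimal sketch of the behaviour in plain pandas. It is not fold's implementation; the helper name drop_columns and the sample column names are hypothetical and only illustrate accepting either a single column name or a list of names.

```python
import pandas as pd


def drop_columns(X: pd.DataFrame, columns) -> pd.DataFrame:
    # Accept a single column name or a list of names, as the parameter table describes.
    to_drop = [columns] if isinstance(columns, str) else list(columns)
    # DataFrame.drop returns a new frame; the input is left untouched.
    return X.drop(columns=to_drop)


X = pd.DataFrame({"sine": [0.0, 0.1], "cosine": [1.0, 0.9], "noise": [0.3, 0.2]})
single = drop_columns(X, "noise")
multiple = drop_columns(X, ["cosine", "noise"])
```

In a fold pipeline the equivalent step would presumably be written as DropColumns("noise") or DropColumns(["cosine", "noise"]).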

OnlyPredictions

Bases: Transformation

Drops all columns except the output model(s)' predictions.

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import OnlyPredictions
>>> from fold.models.dummy import DummyClassifier
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data()
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = [DummyClassifier(1, [0, 1], [0.5, 0.5]), OnlyPredictions()]
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> preds.head()
                     predictions_DummyClassifier
2021-12-31 15:40:00                            1
2021-12-31 15:41:00                            1
2021-12-31 15:42:00                            1
2021-12-31 15:43:00                            1
2021-12-31 15:44:00                            1

OnlyProbabilities

Bases: Transformation

Drops all columns except the output model(s)' probabilities.

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import OnlyProbabilities
>>> from fold.models.dummy import DummyClassifier
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data()
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = [DummyClassifier(1, [0, 1], [0.5, 0.5]), OnlyProbabilities()]
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> preds.head()
                     probabilities_DummyClassifier_0  probabilities_DummyClassifier_1
2021-12-31 15:40:00                              0.5                              0.5
2021-12-31 15:41:00                              0.5                              0.5
2021-12-31 15:42:00                              0.5                              0.5
2021-12-31 15:43:00                              0.5                              0.5
2021-12-31 15:44:00                              0.5                              0.5

RenameColumns

Bases: Transformation

Renames columns.

Parameters:

Name Type Description Default
columns_mapper dict

A dictionary containing the old column names as keys and the new column names as values.

required

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import RenameColumns
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data()
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = RenameColumns({"sine": "sine_renamed"})
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> preds.head()
                     sine_renamed
2021-12-31 15:40:00       -0.0000
2021-12-31 15:41:00        0.0126
2021-12-31 15:42:00        0.0251
2021-12-31 15:43:00        0.0377
2021-12-31 15:44:00        0.0502

SelectColumns

Bases: Transformation, Tunable

Selects a single or multiple columns, drops the rest.

Parameters:

Name Type Description Default
columns Union[list[str], str]

The column or columns to select (dropping the rest).

required
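
As a rough illustration of what selecting columns amounts to, the sketch below uses plain pandas. The helper name select_columns and the sample data are hypothetical; fold's own implementation may differ.

```python
import pandas as pd


def select_columns(X: pd.DataFrame, columns) -> pd.DataFrame:
    # Normalise a single column name into a one-element list.
    to_keep = [columns] if isinstance(columns, str) else list(columns)
    # Indexing with a list always yields a DataFrame, even for one column.
    return X[to_keep]


X = pd.DataFrame({"sine": [0.0, 0.1], "cosine": [1.0, 0.9], "noise": [0.3, 0.2]})
one = select_columns(X, "sine")
two = select_columns(X, ["sine", "cosine"])
```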

dev

Breakpoint

Bases: Transformation

A transformation that stops execution at the specified point.

difference

Difference

Bases: Transformation, Tunable

Takes the returns (percentage change between the current and a prior element).

Parameters:

Name Type Description Default
log_returns bool

If True, computes the log returns instead of the simple returns.

False

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import Difference
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data(freq="min")
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = Difference()
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> X["sine"].loc[preds.index].head()
2021-12-31 15:40:00   -0.0000
2021-12-31 15:41:00    0.0126
2021-12-31 15:42:00    0.0251
2021-12-31 15:43:00    0.0377
2021-12-31 15:44:00    0.0502
Freq: T, Name: sine, dtype: float64
>>> preds["sine"].head()
2021-12-31 15:40:00   -1.000000
2021-12-31 15:41:00        -inf
2021-12-31 15:42:00    0.992063
2021-12-31 15:43:00    0.501992
2021-12-31 15:44:00    0.331565
Freq: T, Name: sine, dtype: float64
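
The difference between simple and log returns can be sketched in plain pandas and numpy. This uses the conventional definitions with a lag of one element; it is an assumption that fold's log_returns flag follows the same formula, and the sample prices are made up.

```python
import numpy as np
import pandas as pd

prices = pd.Series([100.0, 110.0, 99.0])

# Simple returns: percentage change relative to the prior element.
simple_returns = prices / prices.shift(1) - 1

# Log returns: natural logarithm of the ratio to the prior element.
log_returns = np.log(prices / prices.shift(1))
```

The first element has no prior value, so both series start with NaN. A prior element equal to zero makes the ratio infinite, which is consistent with the -inf visible in the example output above.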

features

AddFeatures

Bases: Transformation, Tunable

Applies a function to one or more columns.

Parameters:

Name Type Description Default
column_func ColumnFunction | list[ColumnFunction]

A tuple of a column or list of columns and a function to apply to them.

required
fillna bool

Fill NaNs in the resulting DataFrame

False
name str | None

Name of the transformation.

None
params_to_try dict | None

Dictionary of parameters to try when tuning.

None

Returns:

Type Description
tuple[pd.DataFrame, Artifact | None] Returns the transformed DataFrame concatenated with the original DataFrame.

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import AddFeatures
>>> from fold.models.dummy import DummyClassifier
>>> from fold.utils.tests import generate_sine_wave_data
>>> import numpy as np
>>> X, y  = generate_sine_wave_data()
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = AddFeatures([("sine", np.square)])
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> preds.head()
                       sine  sine~square
2021-12-31 15:40:00 -0.0000     0.000000
2021-12-31 15:41:00  0.0126     0.000159
2021-12-31 15:42:00  0.0251     0.000630
2021-12-31 15:43:00  0.0377     0.001421
2021-12-31 15:44:00  0.0502     0.002520
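
The effect of a (column, function) tuple can be approximated in plain pandas as follows. The helper add_features is hypothetical, and the "column~function_name" naming is only inferred from the example output above, not taken from fold's source.

```python
import numpy as np
import pandas as pd


def add_features(X: pd.DataFrame, column_func) -> pd.DataFrame:
    # column_func: list of (column, function) tuples.
    new_columns = {
        f"{column}~{func.__name__}": func(X[column]) for column, func in column_func
    }
    # Keep the original columns and append the derived ones.
    return pd.concat([X, pd.DataFrame(new_columns, index=X.index)], axis="columns")


X = pd.DataFrame({"sine": [0.0, 0.5, -1.0]})
out = add_features(X, [("sine", np.square)])
```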

AddWindowFeatures

Bases: Transformation, Tunable

Creates rolling window features on the specified columns. Equivalent to adding a new column by running: df[column].rolling(window).function().

Parameters:

Name Type Description Default
column_window_func (ColumnWindowFunction, list[ColumnWindowFunction])

A list of tuples, where each tuple contains the column name, the window size and the function to apply. The function can be a predefined function (see PredefinedFunction) or a Callable (with a single parameter).

required
fillna bool

Fill NaNs in the resulting DataFrame

False

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import AddWindowFeatures
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data()
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = AddWindowFeatures(("sine", 10, "mean"))
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> preds.head()
                       sine  sine~mean_10
2021-12-31 15:40:00 -0.0000      -0.05649
2021-12-31 15:41:00  0.0126      -0.04394
2021-12-31 15:42:00  0.0251      -0.03139
2021-12-31 15:43:00  0.0377      -0.01883
2021-12-31 15:44:00  0.0502      -0.00628
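
The df[column].rolling(window).function() equivalence mentioned above can be spelled out in plain pandas. The helper add_window_feature is hypothetical, and the "column~function_window" naming is inferred from the example output rather than from fold's source.

```python
import pandas as pd


def add_window_feature(X: pd.DataFrame, column: str, window: int, function) -> pd.DataFrame:
    rolled = X[column].rolling(window)
    if isinstance(function, str):
        # A string is looked up as a method on the rolling object, e.g. "mean".
        feature = getattr(rolled, function)()
        name = function
    else:
        # A callable receives each window and must return a single value.
        feature = rolled.apply(function)
        name = function.__name__
    return X.assign(**{f"{column}~{name}_{window}": feature})


X = pd.DataFrame({"sine": [1.0, 2.0, 3.0, 4.0]})
out = add_window_feature(X, "sine", 2, "mean")
```

The first window - 1 rows have no complete window and are NaN in this sketch, which is the kind of gap the fillna parameter is meant to address.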

function

ApplyFunction

Bases: Transformation, Tunable

Wraps an arbitrary function that will run at inference.

holidays

AddExchangeHolidayFeatures

Bases: Transformation, Tunable

Adds holiday features for given exchange(s) as new column(s). It uses the pattern "holiday_{exchange}" for naming the columns.

Parameters:

Name Type Description Default
exchange_codes list[str] | str

List of exchange codes (e.g. NYSE) for which to add holiday features.

required
labeling str | LabelingMethod
  • holiday_binary: Workdays = 0 | National Holidays = 1
  • weekday_weekend_holiday: Workdays = 0 | Weekends = 1 | National Holidays = 2
weekday_weekend_holiday

AddHolidayFeatures

Bases: Transformation, Tunable

Adds holiday features for given region(s) as new column(s). It uses the pattern "holiday_{country_code}" for naming the columns.

Parameters:

Name Type Description Default
country_codes list[str] | str

List of country codes (e.g. US, DE) for which to add holiday features.

required
labeling str | LabelingMethod
  • holiday_binary: Workdays = 0 | National Holidays = 1
  • weekday_weekend_holiday: Workdays = 0 | Weekends = 1 | National Holidays = 2
  • weekday_weekend_uniqueholiday: Workdays = 0 | Weekends = 1 | National Holidays = Unique int (>1)
  • weekday_weekend_uniqueholiday_string: Workdays = 0 | Weekends = 1 | National Holidays = string
weekday_weekend_holiday

LabelingMethod

Bases: ParsableEnum

Parameters:

Name Type Description Default
holiday_binary
  • Workdays = 0
  • National Holidays = 1
required
weekday_weekend_holiday
  • Workdays = 0
  • Weekends = 1
  • National Holidays = 2
required
weekday_weekend_uniqueholiday
  • Workdays = 0
  • Weekends = 1
  • National Holidays = Unique int (>1)
required
weekday_weekend_uniqueholiday_string
  • Workdays = 0
  • Weekends = 1
  • National Holidays = string
required
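
To make the first two labeling schemes concrete, here is a small standard-library sketch. The function label_day and the holiday set are hypothetical, and giving holidays precedence over weekends is an assumption, not something the text above states.

```python
from datetime import date

# Hypothetical holiday calendar, used only for this illustration.
holidays = {date(2021, 12, 27)}


def label_day(day: date, holidays: set, labeling: str = "weekday_weekend_holiday") -> int:
    is_holiday = day in holidays
    if labeling == "holiday_binary":
        # Workdays = 0, holidays = 1 (weekends are not distinguished).
        return 1 if is_holiday else 0
    if labeling == "weekday_weekend_holiday":
        # Assumption: a holiday wins over a weekend if both apply.
        if is_holiday:
            return 2
        return 1 if day.weekday() >= 5 else 0
    raise ValueError(f"Unsupported labeling in this sketch: {labeling}")


sunday = label_day(date(2021, 12, 26), holidays)
holiday = label_day(date(2021, 12, 27), holidays)
workday = label_day(date(2021, 12, 28), holidays)
```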

lags

AddLagsX

Bases: Transformation, Tunable

Adds past values of X for the desired column(s).

Parameters:

Name Type Description Default
columns_and_lags (list[ColumnAndLag], ColumnAndLag)

A tuple (or a list of tuples) of the column name and a single or a list of lags to add as features.

required

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import AddLagsX
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data()
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = AddLagsX([("sine", 1), ("sine", [2,3])])
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> preds.head()
                       sine  sine~lag_1  sine~lag_2  sine~lag_3
2021-12-31 15:40:00 -0.0000     -0.0126     -0.0251     -0.0377
2021-12-31 15:41:00  0.0126     -0.0000     -0.0126     -0.0251
2021-12-31 15:42:00  0.0251      0.0126     -0.0000     -0.0126
2021-12-31 15:43:00  0.0377      0.0251      0.0126     -0.0000
2021-12-31 15:44:00  0.0502      0.0377      0.0251      0.0126
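
Adding lags is essentially a shift along the index. The plain-pandas sketch below mirrors the columns_and_lags shape from the parameter table; the helper add_lags is hypothetical and the "column~lag_n" naming is inferred from the example output.

```python
import pandas as pd


def add_lags(X: pd.DataFrame, columns_and_lags) -> pd.DataFrame:
    # Accept a single (column, lags) tuple or a list of them.
    if isinstance(columns_and_lags, tuple):
        columns_and_lags = [columns_and_lags]
    out = X.copy()
    for column, lags in columns_and_lags:
        # lags may be a single int or a list of ints.
        for lag in [lags] if isinstance(lags, int) else lags:
            out[f"{column}~lag_{lag}"] = X[column].shift(lag)
    return out


X = pd.DataFrame({"sine": [1.0, 2.0, 3.0, 4.0]})
out = add_lags(X, [("sine", 1), ("sine", [2, 3])])
```

In this sketch the first rows of each lagged column are NaN because no earlier values exist for them.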

math

MultiplyBy

Bases: InvertibleTransformation, Tunable

Multiplies the data by a constant.

TakeLog

Bases: InvertibleTransformation, Tunable

Takes the logarithm of the data.

Parameters:

Name Type Description Default
base (int, str)

The base of the logarithm, by default "e". Valid values are "e", np.e, "10", 10, "2", 2.

'e'

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import TakeLog
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data(freq="min")
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = TakeLog()
>>> X["sine"].head()
2021-12-31 07:20:00    0.0000
2021-12-31 07:21:00    0.0126
2021-12-31 07:22:00    0.0251
2021-12-31 07:23:00    0.0377
2021-12-31 07:24:00    0.0502
Freq: T, Name: sine, dtype: float64
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> preds["sine"].head()
2021-12-31 15:40:00        -inf
2021-12-31 15:41:00   -4.374058
2021-12-31 15:42:00   -3.684887
2021-12-31 15:43:00   -3.278095
2021-12-31 15:44:00   -2.991740
Freq: T, Name: sine, dtype: float64
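
Because TakeLog derives from InvertibleTransformation, each base needs a matching inverse. The numpy sketch below shows one way to pair them for the base values listed in the parameter table; the function names and the lookup table are hypothetical and not fold's API.

```python
import numpy as np

# (forward, inverse) function pairs for the bases listed in the parameter table.
_FUNCTIONS = {
    "e": (np.log, np.exp),
    "10": (np.log10, lambda v: np.power(10.0, v)),
    "2": (np.log2, np.exp2),
}


def _key(base) -> str:
    # Map "e"/np.e, "10"/10 and "2"/2 onto the same lookup key.
    return "e" if base in ("e", np.e) else str(base)


def take_log(values, base="e"):
    return _FUNCTIONS[_key(base)][0](np.asarray(values, dtype=float))


def inverse_log(values, base="e"):
    return _FUNCTIONS[_key(base)][1](np.asarray(values, dtype=float))


x = np.array([1.0, 10.0, 100.0])
logged = take_log(x, base=10)
restored = inverse_log(logged, base=10)
```

The logarithm of zero is -inf, which is why the first row of the example output above is -inf.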

TurnPositive

Bases: InvertibleTransformation

Adds a constant to the data, varying by column, so that all values are positive. It identifies the constant during training, and applies it during inference (and backtesting). Therefore there's no guarantee that the data will be positive during inference (and backtesting).

It cannot be updated after the initial training, as that would change the underlying distribution of the data.

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import TurnPositive
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data(freq="min")
>>> X, y  = X - 1, y - 1
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = TurnPositive()
>>> X["sine"].head()
2021-12-31 07:20:00   -1.0000
2021-12-31 07:21:00   -0.9874
2021-12-31 07:22:00   -0.9749
2021-12-31 07:23:00   -0.9623
2021-12-31 07:24:00   -0.9498
Freq: T, Name: sine, dtype: float64
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> preds["sine"].head()
2021-12-31 15:40:00    2.0000
2021-12-31 15:41:00    2.0126
2021-12-31 15:42:00    2.0251
2021-12-31 15:43:00    2.0377
2021-12-31 15:44:00    2.0502
Freq: T, Name: sine, dtype: float64
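
The caveat about inference can be illustrated with a small pandas sketch. The class TurnPositiveSketch is hypothetical, and the rule used for the constant (shift so that the training minimum maps to 1, leaving already-positive columns unchanged) is an assumption inferred from the example output above, not fold's documented formula.

```python
import pandas as pd


class TurnPositiveSketch:
    def fit(self, X: pd.DataFrame) -> "TurnPositiveSketch":
        min_values = X.min(axis=0)
        # Per-column constant: shift non-positive columns so the training minimum
        # becomes 1 (assumption); columns that are already positive get 0.
        self.constant = (1 - min_values).where(min_values <= 0, 0.0)
        return self

    def transform(self, X: pd.DataFrame) -> pd.DataFrame:
        # The constant is fixed after fit; it is not recomputed on new data.
        return X + self.constant

    def inverse_transform(self, X: pd.DataFrame) -> pd.DataFrame:
        return X - self.constant


train = pd.DataFrame({"a": [-2.0, 0.0], "b": [1.0, 3.0]})
unseen = pd.DataFrame({"a": [-4.0], "b": [2.0]})

scaler = TurnPositiveSketch().fit(train)
train_out = scaler.transform(train)
unseen_out = scaler.transform(unseen)
```

Here the unseen value -4.0 lies below the training minimum, so it stays negative after the shift, which is the situation the description warns about.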

scaling

MinMaxScaler

Bases: WrapInvertibleSKLearnTransformation

Transform features by scaling each feature to a given range.

A wrapper around SKLearn's MinMaxScaler. Capable of further updates after the initial fit.

Parameters:

Name Type Description Default
feature_range tuple(min, max)

Desired range of transformed data.

(0, 1)
clip bool

Set to True to clip transformed values of held-out data to provided feature range.

False

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import MinMaxScaler
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data()
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = MinMaxScaler()
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> X["sine"].loc[preds.index].head()
2021-12-31 15:40:00   -0.0000
2021-12-31 15:41:00    0.0126
2021-12-31 15:42:00    0.0251
2021-12-31 15:43:00    0.0377
2021-12-31 15:44:00    0.0502
Freq: T, Name: sine, dtype: float64
>>> preds["sine"].head()
2021-12-31 15:40:00    0.50000
2021-12-31 15:41:00    0.50630
2021-12-31 15:42:00    0.51255
2021-12-31 15:43:00    0.51885
2021-12-31 15:44:00    0.52510
Freq: T, Name: sine, dtype: float64
References

SKLearn's MinMaxScaler documentation
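
For readers who want to see the arithmetic behind feature_range and clip, the following numpy sketch applies the usual min-max formula with statistics taken from the training data only. The function names are hypothetical; this is a plain re-statement of the formula, not the wrapper's code.

```python
import numpy as np


def fit_min_max(train: np.ndarray):
    # Per-feature minimum and maximum, learned from the training data only.
    return train.min(axis=0), train.max(axis=0)


def min_max_transform(X, data_min, data_max, feature_range=(0, 1), clip=False):
    low, high = feature_range
    scaled = (X - data_min) / (data_max - data_min) * (high - low) + low
    # With clip=True, held-out values outside the training range are capped.
    return np.clip(scaled, low, high) if clip else scaled


train = np.array([[-1.0], [1.0]])
held_out = np.array([[0.0], [2.0]])

data_min, data_max = fit_min_max(train)
unclipped = min_max_transform(held_out, data_min, data_max)
clipped = min_max_transform(held_out, data_min, data_max, clip=True)
```

A value of 0.0 on data ranging from -1 to 1 maps to 0.5, matching the first row of the example output above.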

StandardScaler

Bases: WrapInvertibleSKLearnTransformation

Standardize features by removing the mean and scaling to unit variance.

A wrapper around SKLearn's StandardScaler. Capable of further updates after the initial fit.

Examples:

>>> from fold.loop import train_backtest
>>> from fold.splitters import SlidingWindowSplitter
>>> from fold.transformations import StandardScaler
>>> from fold.utils.tests import generate_sine_wave_data
>>> X, y  = generate_sine_wave_data()
>>> splitter = SlidingWindowSplitter(train_window=0.5, step=0.2)
>>> pipeline = StandardScaler()
>>> X["sine"].head()
2021-12-31 07:20:00    0.0000
2021-12-31 07:21:00    0.0126
2021-12-31 07:22:00    0.0251
2021-12-31 07:23:00    0.0377
2021-12-31 07:24:00    0.0502
Freq: T, Name: sine, dtype: float64
>>> preds, trained_pipeline, _, _ = train_backtest(pipeline, X, y, splitter)
>>> preds["sine"].head()
2021-12-31 15:40:00   -0.000000
2021-12-31 15:41:00    0.017819
2021-12-31 15:42:00    0.035497
2021-12-31 15:43:00    0.053316
2021-12-31 15:44:00    0.070994
Freq: T, Name: sine, dtype: float64
References

SKLearn's StandardScaler documentation
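
"Capable of further updates after the initial fit" implies that the mean and variance can be refreshed batch by batch. The numpy sketch below shows one standard way to merge batch statistics (the pairwise mean and sum-of-squares update); the class RunningStandardScaler is hypothetical and only illustrates the idea, it is not how fold or SKLearn is necessarily implemented.

```python
import numpy as np


class RunningStandardScaler:
    def __init__(self):
        self.n = 0        # number of rows seen so far
        self.mean = 0.0   # running per-feature mean
        self.m2 = 0.0     # running per-feature sum of squared deviations

    def update(self, batch) -> "RunningStandardScaler":
        batch = np.asarray(batch, dtype=float)
        n_b = batch.shape[0]
        mean_b = batch.mean(axis=0)
        m2_b = ((batch - mean_b) ** 2).sum(axis=0)
        delta = mean_b - self.mean
        total = self.n + n_b
        # Merge the batch statistics into the running statistics.
        self.m2 = self.m2 + m2_b + delta ** 2 * self.n * n_b / total
        self.mean = self.mean + delta * n_b / total
        self.n = total
        return self

    def transform(self, X):
        # Population standard deviation (divide by n, not n - 1).
        std = np.sqrt(self.m2 / self.n)
        return (np.asarray(X, dtype=float) - self.mean) / std


first = np.array([[1.0], [2.0], [3.0]])
second = np.array([[10.0], [20.0]])

scaler = RunningStandardScaler().update(first).update(second)
everything = np.vstack([first, second])
scaled = scaler.transform(everything)
```

After both updates the running statistics equal those computed on all rows at once, so the scaled data has zero mean and unit variance.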

sklearn

WrapSKLearnFeatureSelector

Bases: FeatureSelector, Tunable

Wraps an SKLearn Feature Selector class and stores the selected columns in the selected_features property. There's no need to use it directly; fold automatically wraps all sklearn feature selectors into this class.

WrapSKLearnTransformation

Bases: Transformation, Tunable

Wraps an SKLearn Transformation. There's no need to use it directly; fold automatically wraps all sklearn transformations into this class.