Evaluate
krisi.evaluate.compare
compare
compare(scorecards: List[ScoreCard], metric_keys: Optional[List[str]] = None, sort_by: Optional[str] = None, dataframe: bool = True) -> Union[pd.DataFrame, str]
Creates a table where each column is a metric and each row is a scorecard with its corresponding results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scorecards | List[ScoreCard] | ScoreCards to compare. | required |
| metric_keys | Optional[List[str]] | List of metrics to display. If not set, all metrics evaluated on the first scorecard are returned. Results are sorted by the first element of this list if sort_by is not specified. | None |
| sort_by | Optional[str] | Metric key to sort the results by. | None |
| dataframe | bool | Whether to return a pd.DataFrame or a formatted string. | True |
Returns:
| Type | Description |
|---|---|
| Union[DataFrame, str] | A comparison table, either in DataFrame or string format. |
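Example (illustrative): a minimal sketch of comparing two scorecards produced by score (documented below). The synthetic data, the model names, and the metric keys "mae" and "rmse" are assumptions made for illustration, not part of this reference.

```python
# Illustrative sketch: two synthetic "models" with different noise levels.
import numpy as np

from krisi.evaluate import compare, score

rng = np.random.default_rng(0)
y = rng.normal(size=300)

# Build one scorecard per hypothetical model (names are illustrative).
scorecards = [
    score(y=y, predictions=y + rng.normal(scale=noise, size=300), model_name=name)
    for name, noise in [("low_noise_model", 0.1), ("high_noise_model", 0.5)]
]

# dataframe=True (the default) returns a pd.DataFrame with one row per scorecard.
# The metric keys below are assumed to be among the evaluated default metrics.
comparison = compare(scorecards, metric_keys=["mae", "rmse"], sort_by="mae")
print(comparison)
```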
        
krisi.evaluate.score
score
score(y: Targets, predictions: Predictions, probabilities: Optional[Probabilities] = None, sample_weight: Optional[Weights] = None, model_name: Optional[str] = None, dataset_name: Optional[str] = None, project_name: Optional[str] = None, default_metrics: Optional[Union[List[Metric], Metric]] = None, custom_metrics: Optional[Union[List[Metric], Metric]] = None, dataset_type: Optional[Union[DatasetType, str]] = None, sample_type: Union[str, SampleTypes] = SampleTypes.outofsample, calculation: Union[Calculation, str] = Calculation.single, rolling_args: Optional[Dict[str, Any]] = None, raise_exceptions: bool = False, benchmark_models: Optional[Union[Model, List[Model]]] = None, num_benchmark_iter: int = 100, **kwargs) -> ScoreCard
Creates a ScoreCard based on the passed-in arguments, evaluates it, and returns the ScoreCard.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| y | Targets | True targets against which the metrics are evaluated. | required |
| predictions | Predictions | The single-point predictions against which the metrics are evaluated. | required |
| model_name | Optional[str] | The name of the model that generated the predictions. Used for identifying scorecards. | None |
| dataset_name | Optional[str] | The name of the dataset from which the targets and predictions originate. | None |
| project_name | Optional[str] | The name of the project. Used for reporting and for saving to a directory (e.g. multiple scorecards). | None |
| default_metrics | Optional[Union[List[Metric], Metric]] | Default metrics that get evaluated. | None |
| custom_metrics | Optional[Union[List[Metric], Metric]] | Custom metrics that get evaluated. If specified, these are evaluated after the default metrics. | None |
| dataset_type | Optional[Union[DatasetType, str]] | Whether the task was binary/multi-label classification or regression. If not set, it is inferred from the targets. | None |
| sample_type | Union[str, SampleTypes] | Whether to evaluate on in-sample or out-of-sample predictions. | outofsample |
| calculation | Union[Calculation, str] | Whether metrics should be evaluated over the whole series (single) or over rolling windows (rolling). | single |
| rolling_args | Optional[Dict[str, Any]] | Arguments to be passed on to the rolling calculation, e.g. the window size. | None |
Returns:
| Type | Description |
|---|---|
| ScoreCard | The evaluated ScoreCard. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the Calculation type is incorrectly specified. |
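Example (illustrative): a minimal sketch of evaluating single-point predictions with score. The synthetic targets and predictions, as well as the model and dataset names, are assumptions made for illustration.

```python
# Illustrative sketch: random targets and predictions stand in for real model output.
import numpy as np

from krisi.evaluate import score

rng = np.random.default_rng(1)
y = rng.normal(size=500)                           # true targets
predictions = y + rng.normal(scale=0.2, size=500)  # predictions from a hypothetical model

scorecard = score(
    y=y,
    predictions=predictions,
    model_name="noisy_identity",      # illustrative model name
    dataset_name="synthetic_series",  # illustrative dataset name
)
# `scorecard` is the evaluated ScoreCard; it can be passed to compare()
# together with other scorecards to contrast models.
```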