Evaluate
krisi.evaluate.compare
compare
compare(scorecards: List[ScoreCard], metric_keys: Optional[List[str]] = None, sort_by: Optional[str] = None, dataframe: bool = True) -> Union[pd.DataFrame, str]
Creates a table where each column is a metric and each row is a scorecard with its corresponding results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| scorecards | List[ScoreCard] | ScoreCards to compare. | required |
| metric_keys | Optional[List[str]] | List of metrics to display. If not set, all metrics evaluated on the first scorecard are returned. Results are sorted by the first element of this list if sort_by is not specified. | None |
| sort_by | Optional[str] | Metric key to sort the results by. | None |
| dataframe | bool | Whether to return a pd.DataFrame or a formatted string. | True |
Returns:
| Type | Description |
|---|---|
| Union[DataFrame, str] | A comparison table, either in DataFrame or string format. |
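Example (illustrative): a minimal sketch of comparing two scorecards produced by score (documented below). The synthetic data, the model names, and the metric keys "mae" and "rmse" are assumptions made for illustration, not part of this reference.

```python
# Illustrative sketch: two synthetic "models" with different noise levels.
import numpy as np

from krisi.evaluate import compare, score

rng = np.random.default_rng(0)
y = rng.normal(size=300)

# Build one scorecard per hypothetical model (names are illustrative).
scorecards = [
    score(y=y, predictions=y + rng.normal(scale=noise, size=300), model_name=name)
    for name, noise in [("low_noise_model", 0.1), ("high_noise_model", 0.5)]
]

# dataframe=True (the default) returns a pd.DataFrame with one row per scorecard.
# The metric keys below are assumed to be among the evaluated default metrics.
comparison = compare(scorecards, metric_keys=["mae", "rmse"], sort_by="mae")
print(comparison)
```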
        
krisi.evaluate.score
score
score(y: Targets, predictions: Predictions, probabilities: Optional[Probabilities] = None, sample_weight: Optional[Weights] = None, model_name: Optional[str] = None, dataset_name: Optional[str] = None, project_name: Optional[str] = None, default_metrics: Optional[Union[List[Metric], Metric]] = None, custom_metrics: Optional[Union[List[Metric], Metric]] = None, dataset_type: Optional[Union[DatasetType, str]] = None, sample_type: Union[str, SampleTypes] = SampleTypes.outofsample, calculation: Union[Calculation, str] = Calculation.single, rolling_args: Optional[Dict[str, Any]] = None, raise_exceptions: bool = False, benchmark_models: Optional[Union[Model, List[Model]]] = None, num_benchmark_iter: int = 100, **kwargs) -> ScoreCard
Creates a ScoreCard based on the passed-in arguments, evaluates it, and returns the ScoreCard.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| y | Targets | True targets against which the metrics are evaluated. | required |
| predictions | Predictions | The single-point predictions against which the metrics are evaluated. | required |
| model_name | Optional[str] | The name of the model that generated the predictions. Used for identifying scorecards. | None |
| dataset_name | Optional[str] | The name of the dataset from which the targets and predictions originate. | None |
| project_name | Optional[str] | The name of the project. Used for reporting and for saving to a directory (e.g. multiple scorecards). | None |
| default_metrics | Optional[Union[List[Metric], Metric]] | Default metrics that get evaluated. | None |
| custom_metrics | Optional[Union[List[Metric], Metric]] | Custom metrics that get evaluated. If specified, these are evaluated after the default metrics. | None |
| dataset_type | Optional[Union[DatasetType, str]] | Whether the task was binary/multi-label classification or regression. If not set, it is inferred from the targets. | None |
| sample_type | Union[str, SampleTypes] | Whether to evaluate on in-sample or out-of-sample predictions. | outofsample |
| calculation | Union[Calculation, str] | Whether metrics should be evaluated over the whole series (single) or over rolling windows (rolling). | single |
| rolling_args | Optional[Dict[str, Any]] | Arguments to be passed on to the rolling calculation, e.g. the window size. | None |
Returns:
| Type | Description |
|---|---|
| ScoreCard | The evaluated ScoreCard. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the Calculation type is incorrectly specified. |
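Example (illustrative): a minimal sketch of evaluating single-point predictions with score. The synthetic targets and predictions, as well as the model and dataset names, are assumptions made for illustration.

```python
# Illustrative sketch: random targets and predictions stand in for real model output.
import numpy as np

from krisi.evaluate import score

rng = np.random.default_rng(1)
y = rng.normal(size=500)                           # true targets
predictions = y + rng.normal(scale=0.2, size=500)  # predictions from a hypothetical model

scorecard = score(
    y=y,
    predictions=predictions,
    model_name="noisy_identity",      # illustrative model name
    dataset_name="synthetic_series",  # illustrative dataset name
)
# `scorecard` is the evaluated ScoreCard; it can be passed to compare()
# together with other scorecards to contrast models.
```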