
Metrics

Classes:

Name Description
ConfusionMatrix

Result object returned by rapidstats.metrics.confusion_matrix

Functions:

Name Description
adverse_impact_ratio

Computes the Adverse Impact Ratio (AIR), the ratio of negative predictions for the protected class to those for the control class.

adverse_impact_ratio_at_thresholds

Computes the Adverse Impact Ratio (AIR) at each threshold of y_score.

average_precision

Computes Average Precision.

brier_loss

Computes the Brier loss (smaller is better), the mean squared difference between the predicted scores and the ground truth target.

confusion_matrix

Computes confusion matrix metrics (TP, FP, TN, FN, TPR, Fbeta, etc.).

confusion_matrix_at_thresholds

Computes the confusion matrix at each threshold.

max_ks

Performs the two-sample Kolmogorov-Smirnov test on the predicted scores of the ground truth positive and ground truth negative classes.

mean

Computes the mean of the input array.

mean_squared_error

Computes Mean Squared Error (MSE).

predicted_positive_ratio_at_thresholds

Computes the Predicted Positive Ratio (PPR) at each threshold, the ratio of predicted positives to the total.

r2

Computes R2.

roc_auc

Computes Area Under the Receiver Operating Characteristic Curve.

root_mean_squared_error

Computes Root Mean Squared Error (RMSE).

ConfusionMatrix dataclass

Result object returned by rapidstats.metrics.confusion_matrix

Attributes:

Name Type Description
tn float

↑Count of True Negatives; y_true == False and y_pred == False

fp float

↓Count of False Positives; y_true == False and y_pred == True

fn float

↓Count of False Negatives; y_true == True and y_pred == False

tp float

↑Count of True Positives; y_true == True, y_pred == True

tpr float

↑True Positive Rate, Recall, Sensitivity; Probability that an actual positive will be predicted positive; \( \frac{TP}{TP + FN} \)

fpr float

↓False Positive Rate, Type I Error; Probability that an actual negative will be predicted positive; \( \frac{FP}{FP + TN} \)

fnr float

↓False Negative Rate, Type II Error; Probability an actual positive will be predicted negative; \( \frac{FN}{TP + FN} \)

tnr float

↑True Negative Rate, Specificity; Probability an actual negative will be predicted negative; \( \frac{TN}{FP + TN} \)

prevalence float

Prevalence; Proportion of positive classes; \( \frac{TP + FN}{TN + FP + FN + TP} \)

prevalence_threshold float

Prevalence Threshold; \( \frac{\sqrt{TPR \times FPR} - FPR}{TPR - FPR} \)

informedness float

↑Informedness, Youden's J; \( TPR + TNR - 1 \)

precision float

↑Precision, Positive Predictive Value (PPV); Probability a predicted positive is actually positive; \( \frac{TP}{TP + FP} \)

false_omission_rate float

↓False Omission Rate (FOR); Proportion of predicted negatives that were wrong; \( \frac{FN}{FN + TN} \)

plr float

↑Positive Likelihood Ratio, LR+; \( \frac{TPR}{FPR} \)

nlr float

Negative Likelihood Ratio, LR-; \( \frac{FNR}{TNR} \)

acc float

↑Accuracy (ACC); Probability of a correct prediction; \( \frac{TP + TN}{TN + FP + FN + TP} \)

balanced_accuracy float

↑Balanced Accuracy (BA); \( \frac{TPR + TNR}{2} \)

fbeta float

↑\( F_{\beta} \); Weighted harmonic mean of Precision and Recall; \( \frac{(1 + \beta^2) \times PPV \times TPR}{(\beta^2 \times PPV) + TPR} \)

folkes_mallows_index float

↑Fowlkes-Mallows Index (FM); \( \sqrt{PPV \times TPR} \)

mcc float

↑Matthews Correlation Coefficient (MCC), Yule Phi Coefficient; \( \sqrt{TPR \times TNR \times PPV \times NPV} - \sqrt{FNR \times FPR \times FOR \times FDR} \)

threat_score float

↑Threat Score (TS), Critical Success Index (CSI), Jaccard Index; \( \frac{TP}{TP + FN + FP} \)

markedness float

Markedness (MP), deltaP; \( PPV + NPV - 1 \)

fdr float

↓False Discovery Rate, Proportion of predicted positives that are wrong; \( \frac{FP}{TP + FP} \)

npv float

↑Negative Predictive Value (NPV); Proportion of predicted negatives that are correct; \( \frac{TN}{FN + TN} \)

dor float

Diagnostic Odds Ratio; \( \frac{LR+}{LR-} \)

ppr float

Predicted Positive Ratio; Proportion that are predicted positive; \( \frac{TP + FP}{TN + FP + FN + TP} \)

pnr float

Predicted Negative Ratio; Proportion that are predicted negative; \( \frac{TN + FN}{TN + FP + FN + TP} \)

Methods:

Name Description
to_polars

Convert the dataclass to a long Polars DataFrame with columns metric and value.

Source code in python/rapidstats/metrics.py
@dataclasses.dataclass
class ConfusionMatrix:
    r"""Result object returned by `rapidstats.metrics.confusion_matrix`

    Attributes
    ----------
    tn : float
        ↑Count of True Negatives; y_true == False and y_pred == False
    fp : float
        ↓Count of False Positives; y_true == False and y_pred == True
    fn : float
        ↓Count of False Negatives; y_true == True and y_pred == False
    tp : float
        ↑Count of True Positives; y_true == True, y_pred == True
    tpr : float
        ↑True Positive Rate, Recall, Sensitivity; Probability that an actual positive
        will be predicted positive; \( \frac{TP}{TP + FN} \)
    fpr : float
        ↓False Positive Rate, Type I Error; Probability that an actual negative will
        be predicted positive; \( \frac{FP}{FP + TN} \)
    fnr : float
        ↓False Negative Rate, Type II Error; Probability an actual positive will be
        predicted negative; \( \frac{FN}{TP + FN} \)
    tnr : float
        ↑True Negative Rate, Specificity; Probability an actual negative will be
        predicted negative; \( \frac{TN}{FP + TN} \)
    prevalence : float
        Prevalence; Proportion of positive classes; \( \frac{TP + FN}{TN + FP + FN + TP} \)
    prevalence_threshold : float
        Prevalence Threshold; \( \frac{\sqrt{TPR \times FPR} - FPR}{TPR - FPR} \)
    informedness : float
        ↑Informedness, Youden's J; \( TPR + TNR - 1 \)
    precision : float
        ↑Precision, Positive Predictive Value (PPV); Probability a predicted positive is
        actually positive; \( \frac{TP}{TP + FP} \)
    false_omission_rate : float
        ↓False Omission Rate (FOR); Proportion of predicted negatives that were wrong;
        \( \frac{FN}{FN + TN} \)
    plr : float
        ↑Positive Likelihood Ratio, LR+; \( \frac{TPR}{FPR} \)
    nlr : float
        Negative Likelihood Ratio, LR-; \( \frac{FNR}{TNR} \)
    acc : float
        ↑Accuracy (ACC); Probability of a correct prediction; \( \frac{TP + TN}{TN + FP + FN + TP} \)
    balanced_accuracy : float
        ↑Balanced Accuracy (BA); \( \frac{TPR + TNR}{2} \)
    fbeta : float
        ↑\( F_{\beta} \); Weighted harmonic mean of Precision and Recall; \( \frac{(1 + \beta^2) \times PPV \times TPR}{(\beta^2 \times PPV) + TPR} \)
    folkes_mallows_index : float
        ↑Fowlkes-Mallows Index (FM); \( \sqrt{PPV \times TPR} \)
    mcc : float
        ↑Matthews Correlation Coefficient (MCC), Yule Phi Coefficient; \( \sqrt{TPR \times TNR \times PPV \times NPV} - \sqrt{FNR \times FPR \times FOR \times FDR} \)
    threat_score : float
        ↑Threat Score (TS), Critical Success Index (CSI), Jaccard Index; \( \frac{TP}{TP + FN + FP} \)
    markedness : float
        Markedness (MP), deltaP; \( PPV + NPV - 1 \)
    fdr : float
        ↓False Discovery Rate, Proportion of predicted positives that are wrong; \( \frac{FP}{TP + FP} \)
    npv : float
        ↑Negative Predictive Value (NPV); Proportion of predicted negatives that are correct; \( \frac{TN}{FN + TN} \)
    dor : float
        Diagnostic Odds Ratio; \( \frac{LR+}{LR-} \)
    ppr : float
        Predicted Positive Ratio; Proportion that are predicted positive; \( \frac{TP + FP}{TN + FP + FN + TP} \)
    pnr : float
        Predicted Negative Ratio; Proportion that are predicted negative; \( \frac{TN + FN}{TN + FP + FN + TP} \)
    """

    tn: float
    fp: float
    fn: float
    tp: float
    tpr: float
    fpr: float
    fnr: float
    tnr: float
    prevalence: float
    prevalence_threshold: float
    informedness: float
    precision: float
    false_omission_rate: float
    plr: float
    nlr: float
    acc: float
    balanced_accuracy: float
    fbeta: float
    folkes_mallows_index: float
    mcc: float
    threat_score: float
    markedness: float
    fdr: float
    npv: float
    dor: float
    ppr: float
    pnr: float

    def to_polars(self) -> pl.DataFrame:
        """Convert the dataclass to a long Polars DataFrame with columns `metric` and
        `value`.

        Returns
        -------
        pl.DataFrame
            DataFrame with columns `metric` and `value`
        """
        dct = self.__dict__

        return pl.DataFrame({"metric": dct.keys(), "value": dct.values()})

to_polars()

Convert the dataclass to a long Polars DataFrame with columns metric and value.

Returns:

Type Description
DataFrame

DataFrame with columns metric and value

Source code in python/rapidstats/metrics.py
def to_polars(self) -> pl.DataFrame:
    """Convert the dataclass to a long Polars DataFrame with columns `metric` and
    `value`.

    Returns
    -------
    pl.DataFrame
        DataFrame with columns `metric` and `value`
    """
    dct = self.__dict__

    return pl.DataFrame({"metric": dct.keys(), "value": dct.values()})
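
For orientation, a minimal usage sketch (illustrative inputs; it relies only on the constructor path and attributes documented above):

``` py
from rapidstats.metrics import confusion_matrix

# Toy labels and predictions (illustrative values only).
y_true = [True, True, False, False, True]
y_pred = [True, False, False, True, True]

cm = confusion_matrix(y_true, y_pred)  # returns a ConfusionMatrix dataclass

# Individual metrics are plain attributes ...
print(cm.tp, cm.fp, cm.tpr)

# ... and to_polars() reshapes everything into a long DataFrame
# with columns `metric` and `value`.
print(cm.to_polars())
```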

adverse_impact_ratio(y_pred, protected, control, sample_weight=None)

Computes the Adverse Impact Ratio (AIR), which is the ratio of negative predictions for the protected class to those for the control class. The ideal ratio is 1. For example, in an underwriting context, a ratio of 1 means the model is equally likely to approve protected and unprotected applicants, given that the model score is the probability of bad.

Parameters:

Name Type Description Default
y_pred ArrayLike

Predicted negative

required
protected ArrayLike

An array of booleans identifying the protected class

required
control ArrayLike

An array of booleans identifying the control class

required
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None

Returns:

Type Description
float

Adverse Impact Ratio (AIR)

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def adverse_impact_ratio(
    y_pred: ArrayLike,
    protected: ArrayLike,
    control: ArrayLike,
    sample_weight: Optional[ArrayLike] = None,
) -> float:
    """Computes the Adverse Impact Ratio (AIR), which is the ratio of negative
    predictions for the protected class and the control class. The ideal ratio is 1.
    For example, in an underwriting context, this means that the model is equally as
    likely to approve protected applicants as it is unprotected applicants, given that
    the model score is probability of bad.

    Parameters
    ----------
    y_pred : ArrayLike
        Predicted negative
    protected : ArrayLike
        An array of booleans identifying the protected class
    control : ArrayLike
        An array of booleans identifying the control class
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0

    Returns
    -------
    float
        Adverse Impact Ratio (AIR)

    Added in version 0.1.0
    ----------------------
    """
    return _adverse_impact_ratio(
        pl.DataFrame(
            {
                "y_pred": y_pred,
                "protected": protected,
                "control": control,
                "sample_weight": 1.0 if sample_weight is None else sample_weight,
            }
        )
        .with_columns(pl.col("y_pred", "protected", "control").cast(pl.Boolean))
        .with_columns(pl.col("y_pred").cast(pl.Float64))
    )
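
A short illustrative call, using made-up boolean arrays of the shape the parameters above describe:

``` py
from rapidstats.metrics import adverse_impact_ratio

# y_pred flags predicted negatives (e.g. declines); protected/control
# flag group membership for each row.
y_pred    = [True, False, True, False, True, False]
protected = [True, True, True, False, False, False]
control   = [False, False, False, True, True, True]

air = adverse_impact_ratio(y_pred, protected, control)
# Per the definition above: 2/3 of protected rows and 1/3 of control rows
# are predicted negative, so the ratio is (2/3) / (1/3) = 2.0.
print(air)
```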

adverse_impact_ratio_at_thresholds(y_score, protected, control, sample_weight=None, thresholds=None, strategy='auto')

Computes the Adverse Impact Ratio (AIR) at each threshold of y_score. See rapidstats.metrics.adverse_impact_ratio for more details. When the strategy is cum_sum, computes

for t in y_score:
    is_predicted_negative = y_score < t
    adverse_impact_ratio(is_predicted_negative, protected, control)

Parameters:

Name Type Description Default
y_score ArrayLike

Predicted scores

required
protected ArrayLike

An array of booleans identifying the protected class

required
control ArrayLike

An array of booleans identifying the control class

required
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None
thresholds Optional[list[float]]

The thresholds to compute is_predicted_negative at, i.e. y_score < t. If None, uses every score present in y_score, by default None

None
strategy LoopStrategy

Computation method, by default "auto"

'auto'

Returns:

Type Description
DataFrame

A DataFrame of threshold and air

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def adverse_impact_ratio_at_thresholds(
    y_score: ArrayLike,
    protected: ArrayLike,
    control: ArrayLike,
    sample_weight: Optional[ArrayLike] = None,
    thresholds: Optional[list[float]] = None,
    strategy: LoopStrategy = "auto",
) -> pl.DataFrame:
    """Computes the Adverse Impact Ratio (AIR) at each threshold of `y_score`. See
    [rapidstats.metrics.adverse_impact_ratio][] for more details. When the `strategy` is
    `cum_sum`, computes


    ``` py
    for t in y_score:
        is_predicted_negative = y_score < t
        adverse_impact_ratio(is_predicted_negative, protected, control)
    ```

    Parameters
    ----------
    y_score : ArrayLike
        Predicted scores
    protected : ArrayLike
        An array of booleans identifying the protected class
    control : ArrayLike
        An array of booleans identifying the control class
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0
    thresholds : Optional[list[float]], optional
        The thresholds to compute `is_predicted_negative` at, i.e. y_score < t. If None,
        uses every score present in `y_score`, by default None
    strategy : LoopStrategy, optional
        Computation method, by default "auto"

    Returns
    -------
    pl.DataFrame
        A DataFrame of `threshold` and `air`

    Added in version 0.1.0
    ----------------------
    """
    has_sample_weight = sample_weight is not None
    df = pl.DataFrame(
        {
            "y_score": y_score,
            "protected": protected,
            "control": control,
            "sample_weight": sample_weight if has_sample_weight else 1.0,
        }
    ).with_columns(
        pl.col("protected", "control").cast(pl.Boolean),
        pl.col("y_score", "sample_weight").cast(pl.Float64),
    )

    strategy = _set_loop_strategy(thresholds, strategy)

    if strategy == "loop":

        def _air(t):
            return {
                "threshold": t,
                "air": _adverse_impact_ratio(
                    df.select(
                        pl.col("y_score").lt(t).cast(pl.Float64).alias("y_pred"),
                        "protected",
                        "control",
                        "sample_weight",
                    )
                ),
            }

        airs = _run_concurrent(_air, set(thresholds or y_score))

        res = pl.LazyFrame(airs)
    elif strategy == "cum_sum":
        res = _air_at_thresholds_core(
            df, thresholds, has_sample_weight=has_sample_weight
        )

    return res.pipe(_fill_infinite, None).fill_nan(None).collect()
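
And a sketch of the per-threshold variant (illustrative inputs): with thresholds=None it evaluates every distinct score and returns a DataFrame of threshold and air, as documented above.

``` py
from rapidstats.metrics import adverse_impact_ratio_at_thresholds

y_score   = [0.10, 0.35, 0.40, 0.20, 0.80, 0.65]
protected = [True, True, True, True, False, False]
control   = [False, False, False, False, True, True]

# One AIR per distinct score used as a threshold.
air_df = adverse_impact_ratio_at_thresholds(y_score, protected, control)
print(air_df.sort("threshold"))
```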

average_precision(y_true, y_score, sample_weight=None)

Computes Average Precision.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None

Returns:

Type Description
float

Average Precision (AP)

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def average_precision(
    y_true: ArrayLike, y_score: ArrayLike, sample_weight: Optional[ArrayLike] = None
) -> float:
    """Computes Average Precision.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0

    Returns
    -------
    float
        Average Precision (AP)

    Added in version 0.1.0
    ----------------------
    """
    return (
        _y_true_y_score_to_df(y_true, y_score, sample_weight)
        .lazy()
        .rename({"y_score": "threshold"})
        .pipe(_base_confusion_matrix_at_thresholds)
        .pipe(_full_confusion_matrix_from_base)
        .select("threshold", "precision", "tpr")
        .drop_nulls()
        .unique("threshold")
        .sort("threshold")
        .select(_ap_from_pr_curve(pl.col("precision"), pl.col("tpr")).alias("ap"))
        .collect()["ap"]
        .item()
    )
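
A minimal call sketch (illustrative scores; sample_weight is left at its default):

``` py
from rapidstats.metrics import average_precision

y_true  = [1, 0, 1, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.3, 0.2]

print(average_precision(y_true, y_score))
```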

brier_loss(y_true, y_score)

Computes the Brier loss (smaller is better). The Brier loss measures the mean squared difference between the predicted scores and the ground truth target. Calculated as:

\[ \frac{1}{N} \sum_{i=1}^N (yt_i - ys_i)^2 \]

where \( yt \) is y_true and \( ys \) is y_score.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required

Returns:

Type Description
float

Brier loss

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def brier_loss(y_true: ArrayLike, y_score: ArrayLike) -> float:
    r"""Computes the Brier loss (smaller is better). The Brier loss measures the mean
    squared difference between the predicted scores and the ground truth target.
    Calculated as:

    \[ \frac{1}{N} \sum_{i=1}^N (yt_i - ys_i)^2 \]

    where \( yt \) is `y_true` and \( ys \) is `y_score`.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores

    Returns
    -------
    float
        Brier loss

    Added in version 0.1.0
    ----------------------
    """
    df = _y_true_y_score_to_df(y_true, y_score)

    return _brier_loss(df)
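
Since the loss is just the mean squared difference in the formula above, an independent NumPy sketch of the same formula (not the library's implementation) looks like:

``` py
import numpy as np

def brier_loss_reference(y_true, y_score) -> float:
    """Mean squared difference between labels and scores, per the formula above."""
    yt = np.asarray(y_true, dtype=float)
    ys = np.asarray(y_score, dtype=float)
    return float(np.mean((yt - ys) ** 2))

# e.g. ((0.1)^2 + (0.2)^2 + (0.4)^2 + (0.4)^2) / 4 = 0.0925
print(brier_loss_reference([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]))
```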

confusion_matrix(y_true, y_pred, beta=1.0, sample_weight=None)

Computes confusion matrix metrics (TP, FP, TN, FN, TPR, Fbeta, etc.).

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_pred ArrayLike

Predicted target

required
beta float

\( \beta \) to use in \( F_\beta \), by default 1

1.0
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None

Returns:

Type Description
ConfusionMatrix

Dataclass of confusion matrix metrics

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def confusion_matrix(
    y_true: ArrayLike,
    y_pred: ArrayLike,
    beta: float = 1.0,
    sample_weight: Optional[ArrayLike] = None,
) -> ConfusionMatrix:
    r"""Computes confusion matrix metrics (TP, FP, TN, FN, TPR, Fbeta, etc.).

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_pred : ArrayLike
        Predicted target
    beta : float, optional
        \( \beta \) to use in \( F_\beta \), by default 1
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0

    Returns
    -------
    ConfusionMatrix
        Dataclass of confusion matrix metrics

    Added in version 0.1.0
    ----------------------
    """
    df = _y_true_y_pred_to_df(y_true, y_pred, sample_weight).with_columns(
        pl.col("y_true").cast(pl.UInt8)
    )

    return ConfusionMatrix(*_confusion_matrix(df, beta))
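
A sketch of the optional arguments (illustrative inputs): beta shapes the fbeta attribute and sample_weight re-weights rows, as documented above.

``` py
from rapidstats.metrics import confusion_matrix

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]

cm = confusion_matrix(
    y_true,
    y_pred,
    beta=2.0,                          # report F2 instead of F1
    sample_weight=[1, 1, 2, 1, 1, 2],  # weight rows unequally
)
print(cm.precision, cm.tpr, cm.fbeta)
```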

confusion_matrix_at_thresholds(y_true, y_score, thresholds=None, metrics=DefaultConfusionMatrixMetrics, strategy='auto', beta=1.0, sample_weight=None)

Computes the confusion matrix at each threshold. When the strategy is "cum_sum", computes

for t in y_score:
    y_pred = y_score >= t
    confusion_matrix(y_true, y_pred)

using fast DataFrame operations.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required
thresholds Optional[list[float]]

The thresholds to compute y_pred at, i.e. y_score >= t. If None, uses every score present in y_score, by default None

None
metrics Iterable[ConfusionMatrixMetric]

The metrics to compute, by default DefaultConfusionMatrixMetrics

DefaultConfusionMatrixMetrics
strategy LoopStrategy

Computation method, by default "auto"

'auto'
beta float

\( \beta \) to use in \( F_\beta \), by default 1

1.0
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None

Returns:

Type Description
DataFrame

A Polars DataFrame of threshold, metric, and value

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def confusion_matrix_at_thresholds(
    y_true: ArrayLike,
    y_score: ArrayLike,
    thresholds: Optional[list[float]] = None,
    metrics: Iterable[ConfusionMatrixMetric] = DefaultConfusionMatrixMetrics,
    strategy: LoopStrategy = "auto",
    beta: float = 1.0,
    sample_weight: Optional[ArrayLike] = None,
) -> pl.DataFrame:
    r"""Computes the confusion matrix at each threshold. When the `strategy` is
    "cum_sum", computes

    ``` py
    for t in y_score:
        y_pred = y_score >= t
        confusion_matrix(y_true, y_pred)
    ```

    using fast DataFrame operations.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores
    thresholds : Optional[list[float]], optional
        The thresholds to compute `y_pred` at, i.e. y_score >= t. If None,
        uses every score present in `y_score`, by default None
    metrics : Iterable[ConfusionMatrixMetric], optional
        The metrics to compute, by default DefaultConfusionMatrixMetrics
    strategy : LoopStrategy, optional
        Computation method, by default "auto"
    beta : float, optional
        \( \beta \) to use in \( F_\beta \), by default 1
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0

    Returns
    -------
    pl.DataFrame
        A Polars DataFrame of `threshold`, `metric`, and `value`

    Added in version 0.1.0
    ----------------------
    """
    strategy = _set_loop_strategy(thresholds, strategy)

    if strategy == "loop":
        df = _y_true_y_score_to_df(y_true, y_score)

        def _cm(t):
            return (
                confusion_matrix(
                    df["y_true"],
                    df["y_score"].ge(t),
                    beta=beta,
                    sample_weight=sample_weight,
                )
                .to_polars()
                .with_columns(pl.lit(t).alias("threshold"))
            )

        cms: list[pl.DataFrame] = _run_concurrent(_cm, set(thresholds or y_score))

        return pl.concat(cms, how="vertical").fill_nan(None)
    elif strategy == "cum_sum":
        return (
            pl.LazyFrame(
                {
                    "y_true": y_true,
                    "threshold": y_score,
                    "sample_weight": 1.0 if sample_weight is None else sample_weight,
                }
            )
            .with_columns(pl.col("y_true").cast(pl.Boolean))
            .drop_nulls()
            .pipe(_base_confusion_matrix_at_thresholds)
            .pipe(_full_confusion_matrix_from_base, beta=beta)
            .select("threshold", *metrics)
            .unique("threshold")
            .pipe(_map_to_thresholds, thresholds)
            .drop("_threshold_actual", strict=False)
            .unpivot(index="threshold")
            .rename({"variable": "metric"})
            .collect()
        )
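
A sketch restricted to a handful of thresholds (illustrative inputs); the result is the long (threshold, metric, value) DataFrame described above:

``` py
from rapidstats.metrics import confusion_matrix_at_thresholds

y_true  = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.3, 0.6, 0.4, 0.8, 0.1]

cm_df = confusion_matrix_at_thresholds(
    y_true,
    y_score,
    thresholds=[0.25, 0.5, 0.75],
)
print(cm_df.sort("threshold", "metric"))
```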

max_ks(y_true, y_score)

Performs the two-sample Kolmogorov-Smirnov test on the predicted scores of the ground truth positive and ground truth negative classes. The KS test measures the highest distance between two CDFs, so the Max-KS metric measures how well the model separates two classes. In pseudocode:

df = Frame(y_true, y_score)
class0 = df.filter(~y_true)["y_score"]
class1 = df.filter(y_true)["y_score"]

ks(class0, class1)

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required

Returns:

Type Description
float

Max-KS

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def max_ks(y_true: ArrayLike, y_score: ArrayLike) -> float:
    """Performs the two-sample Kolmogorov-Smirnov test on the predicted scores of the
    ground truth positive and ground truth negative classes. The KS test measures the
    highest distance between two CDFs, so the Max-KS metric measures how well the model
    separates two classes. In pseudocode:

    ``` py
    df = Frame(y_true, y_score)
    class0 = df.filter(~y_true)["y_score"]
    class1 = df.filter(y_true)["y_score"]

    ks(class0, class1)
    ```

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores

    Returns
    -------
    float
        Max-KS

    Added in version 0.1.0
    ----------------------
    """
    df = _y_true_y_score_to_df(y_true, y_score)

    return _max_ks(df)
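
A minimal call (illustrative inputs), plus a hedged cross-check: the statistic should agree with a generic two-sample KS test applied to the per-class score samples (the SciPy call below is a sanity check, not part of rapidstats):

``` py
from rapidstats.metrics import max_ks

y_true  = [1, 1, 1, 0, 0, 0]
y_score = [0.80, 0.70, 0.40, 0.55, 0.20, 0.10]

print(max_ks(y_true, y_score))

# Optional cross-check, assuming SciPy is installed:
# from scipy.stats import ks_2samp
# class1 = [s for t, s in zip(y_true, y_score) if t]
# class0 = [s for t, s in zip(y_true, y_score) if not t]
# print(ks_2samp(class1, class0).statistic)
```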

mean(y)

Computes the mean of the input array.

Parameters:

Name Type Description Default
y ArrayLike

A 1D-array of numbers

required

Returns:

Type Description
float

Mean

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def mean(y: ArrayLike) -> float:
    """Computes the mean of the input array.

    Parameters
    ----------
    y : ArrayLike
        A 1D-array of numbers

    Returns
    -------
    float
        Mean

    Added in version 0.1.0
    ----------------------
    """
    return _mean(pl.DataFrame({"y": y}))

mean_squared_error(y_true, y_score)

Computes Mean Squared Error (MSE) as

\[ \frac{1}{N} \sum_{i=1}^{N} (yt_i - ys_i)^2 \]

where \( yt \) is y_true and \( ys \) is y_score.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required

Returns:

Type Description
float

Mean Squared Error (MSE)

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def mean_squared_error(y_true: ArrayLike, y_score: ArrayLike) -> float:
    r"""Computes Mean Squared Error (MSE) as

    \[ \frac{1}{N} \sum_{i=1}^{N} (yt_i - ys_i)^2 \]

    where \( yt \) is `y_true` and \( ys \) is `y_score`.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores

    Returns
    -------
    float
        Mean Squared Error (MSE)

    Added in version 0.1.0
    ----------------------
    """
    return _mean_squared_error(_regression_to_df(y_true, y_score))

predicted_positive_ratio_at_thresholds(y_score, sample_weight=None, thresholds=None, strategy='auto')

Computes the Predicted Positive Ratio (PPR) at each threshold, where the PPR is the ratio of predicted positive to the total, and a positive is defined as y_score >= threshold.

Parameters:

Name Type Description Default
y_score ArrayLike

Predicted scores

required
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None
thresholds Optional[list[float]]

The thresholds to compute y_pred at, i.e. y_score >= t. If None, uses every score present in y_score, by default None

None
strategy LoopStrategy

Computation method, by default "auto"

'auto'

Returns:

Type Description
DataFrame

A DataFrame of threshold and ppr

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def predicted_positive_ratio_at_thresholds(
    y_score: ArrayLike,
    sample_weight: Optional[ArrayLike] = None,
    thresholds: Optional[list[float]] = None,
    strategy: LoopStrategy = "auto",
) -> pl.DataFrame:
    """Computes the Predicted Positive Ratio (PPR) at each threshold, where the PPR is
    the ratio of predicted positive to the total, and a positive is defined as
    `y_score` >= threshold.

    Parameters
    ----------
    y_score : ArrayLike
        Predicted scores
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0
    thresholds : Optional[list[float]], optional
        The thresholds to compute `y_pred` at, i.e. y_score >= t. If None,
        uses every score present in `y_score`, by default None
    strategy : LoopStrategy, optional
        Computation method, by default "auto"

    Returns
    -------
    pl.DataFrame
        A DataFrame of `threshold` and `ppr`

    Added in version 0.1.0
    ----------------------
    """
    lf = pl.LazyFrame(
        {
            "y_score": y_score,
            "sample_weight": 1.0 if sample_weight is None else sample_weight,
        }
    ).drop_nulls()

    strategy = _set_loop_strategy(y_score, strategy)

    if strategy == "loop":
        df = lf.collect()

        def _ppr(t: float) -> float:
            return {
                "threshold": t,
                "ppr": _weighted_mean(df["y_score"].ge(t), df["sample_weight"]),
            }

        return pl.DataFrame(_run_concurrent(_ppr, set(thresholds or y_score)))
    elif strategy == "cum_sum":

        def _cumulative_ppr(lf: pl.LazyFrame, has_sample_weight: bool):
            if not has_sample_weight:
                return lf.with_row_index(
                    "cumulative_predicted_positive", offset=1
                ).with_columns(
                    pl.col("cumulative_predicted_positive")
                    .truediv(pl.len())
                    .alias("ppr")
                )
            else:
                return lf.with_columns(
                    pl.col("sample_weight")
                    .cum_sum()
                    .alias("cumulative_predicted_positive")
                ).with_columns(
                    pl.col("cumulative_predicted_positive")
                    .truediv(pl.col("sample_weight").sum())
                    .alias("ppr")
                )

        return (
            lf.sort("y_score", descending=True)
            .pipe(_cumulative_ppr, sample_weight is not None)
            .rename({"y_score": "threshold"})
            .select("threshold", "ppr")
            .unique("threshold")
            .pipe(_map_to_thresholds, thresholds)
            .drop("_threshold_actual", strict=False)
            .collect()
        )
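
A small sketch (illustrative scores); with the defaults, every distinct score becomes a threshold and ppr is the fraction of scores at or above it:

``` py
from rapidstats.metrics import predicted_positive_ratio_at_thresholds

y_score = [0.10, 0.40, 0.40, 0.65, 0.80]

ppr_df = predicted_positive_ratio_at_thresholds(y_score)
# e.g. at threshold 0.65, two of five scores are >= 0.65, so ppr = 0.4.
print(ppr_df.sort("threshold"))
```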

r2(y_true, y_score)

Computes R2 as

\[ 1 - \frac{\sum{(y_i - \hat{y}_i)^2}}{\sum{(y_i - \bar{y})^2}} \]

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required

Returns:

Type Description
float

R2

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def r2(y_true: ArrayLike, y_score: ArrayLike) -> float:
    r"""Computes R2 as

    \[
        1 - \frac{\sum{(y_i - \hat{y}_i)^2}}{\sum{(y_i - \bar{y})^2}}
    \]

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores

    Returns
    -------
    float
        R2

    Added in version 0.1.0
    ----------------------
    """
    return _r2(_regression_to_df(y_true, y_score))
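
An independent NumPy sketch of the same formula (not the library's code), useful as a sanity check:

``` py
import numpy as np

def r2_reference(y_true, y_score) -> float:
    """1 - SS_res / SS_tot, matching the formula above."""
    y  = np.asarray(y_true, dtype=float)
    yh = np.asarray(y_score, dtype=float)
    ss_res = np.sum((y - yh) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

print(r2_reference([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0]))
```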

roc_auc(y_true, y_score, sample_weight=None)

Computes Area Under the Receiver Operating Characteristic Curve.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None

Returns:

Type Description
float

ROC-AUC

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def roc_auc(
    y_true: ArrayLike, y_score: ArrayLike, sample_weight: Optional[ArrayLike] = None
) -> float:
    """Computes Area Under the Receiver Operating Characteristic Curve.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0

    Returns
    -------
    float
        ROC-AUC

    Added in version 0.1.0
    ----------------------
    """
    df = _y_true_y_score_to_df(y_true, y_score, sample_weight).with_columns(
        pl.col("y_true").cast(pl.Float64)
    )

    return _roc_auc(df)
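
A minimal call sketch (illustrative inputs); 1.0 corresponds to a perfect ranking of positives above negatives and 0.5 to a random one:

``` py
from rapidstats.metrics import roc_auc

y_true  = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.3, 0.6, 0.4, 0.8, 0.1]

print(roc_auc(y_true, y_score))
```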

root_mean_squared_error(y_true, y_score)

Computes Root Mean Squared Error (RMSE) as

\[ \sqrt{\frac{1}{N} \sum_{i=1}^{N} (yt_i - ys_i)^2} \]

where \( yt \) is y_true and \( ys \) is y_score.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required

Returns:

Type Description
float

Root Mean Squared Error (RMSE)

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def root_mean_squared_error(y_true: ArrayLike, y_score: ArrayLike) -> float:
    r"""Computes Root Mean Squared Error (RMSE) as

    \[ \sqrt{\frac{1}{N} \sum_{i=1}^{N} (yt_i - ys_i)^2} \]

    where \( yt \) is `y_true` and \( ys \) is `y_score`.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores

    Returns
    -------
    float
        Root Mean Squared Error (RMSE)

    Added in version 0.1.0
    ----------------------
    """
    return _root_mean_squared_error(_regression_to_df(y_true, y_score))
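
Because the two formulas above differ only by the square root, RMSE is simply the square root of MSE; a small sketch (illustrative inputs) making that relation explicit:

``` py
import math
from rapidstats.metrics import mean_squared_error, root_mean_squared_error

y_true  = [3.0, -0.5, 2.0, 7.0]
y_score = [2.5, 0.0, 2.0, 8.0]

mse  = mean_squared_error(y_true, y_score)
rmse = root_mean_squared_error(y_true, y_score)

# RMSE = sqrt(MSE) by definition of the two formulas above.
assert math.isclose(rmse, math.sqrt(mse))
print(mse, rmse)
```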