
Metrics

Classes:

Name Description
ConfusionMatrix

Result object returned by rapidstats.metrics.confusion_matrix

Functions:

Name Description
adverse_impact_ratio

Computes the Adverse Impact Ratio (AIR), the ratio of negative predictions for the protected class to those for the control class.

adverse_impact_ratio_at_thresholds

Computes the Adverse Impact Ratio (AIR) at each threshold of y_score.

average_precision

Computes Average Precision.

brier_loss

Computes the Brier loss (smaller is better), the mean squared difference between the predicted scores and the ground truth target.

confusion_matrix

Computes confusion matrix metrics (TP, FP, TN, FN, TPR, Fbeta, etc.).

confusion_matrix_at_thresholds

Computes the confusion matrix at each threshold.

max_ks

Performs the two-sample Kolmogorov-Smirnov test on the predicted scores of the ground truth positive and ground truth negative classes.

mean

Computes the mean of the input array.

mean_squared_error

Computes Mean Squared Error (MSE).

predicted_positive_ratio_at_thresholds

Computes the Predicted Positive Ratio (PPR) at each threshold, the ratio of predicted positives to the total.

r2

Computes R2.

roc_auc

Computes Area Under the Receiver Operating Characteristic Curve.

root_mean_squared_error

Computes Root Mean Squared Error (RMSE).

ConfusionMatrix dataclass

Result object returned by rapidstats.metrics.confusion_matrix

Attributes:

Name Type Description
tn float

↑Count of True Negatives; y_true == False and y_pred == False

fp float

↓Count of False Positives; y_true == False and y_pred == True

fn float

↓Count of False Negatives; y_true == True and y_pred == False

tp float

↑Count of True Positives; y_true == True, y_pred == True

tpr float

↑True Positive Rate, Recall, Sensitivity; Probability that an actual positive will be predicted positive; \( \frac{TP}{TP + FN} \)

fpr float

↓False Positive Rate, Type I Error; Probability that an actual negative will be predicted positive; \( \frac{FP}{FP + TN} \)

fnr float

↓False Negative Rate, Type II Error; Probability an actual positive will be predicted negative; \( \frac{FN}{TP + FN} \)

tnr float

↑True Negative Rate, Specificity; Probability an actual negative will be predicted negative; \( \frac{TN}{FP + TN} \)

prevalence float

Prevalence; Proportion of positive classes; \( \frac{TP + FN}{TN + FP + FN + TP} \)

prevalence_threshold float

Prevalence Threshold; \( \frac{\sqrt{TPR \times FPR} - FPR}{TPR - FPR} \)

informedness float

↑Informedness, Youden's J; \( TPR + TNR - 1 \)

precision float

↑Precision, Positive Predictive Value (PPV); Probability a predicted positive is actually positive; \( \frac{TP}{TP + FP} \)

false_omission_rate float

↓False Omission Rate (FOR); Proportion of predicted negatives that were wrong; \( \frac{FN}{FN + TN} \)

plr float

↑Positive Likelihood Ratio, LR+; \( \frac{TPR}{FPR} \)

nlr float

Negative Likelihood Ratio, LR-; \( \frac{FNR}{TNR} \)

acc float

↑Accuracy (ACC); Probability of a correct prediction; \( \frac{TP + TN}{TN + FP + FN + TP} \)

balanced_accuracy float

↑Balanced Accuracy (BA); \( \frac{TPR + TNR}{2} \)

fbeta float

↑\( F_{\beta} \); Weighted harmonic mean of Precision and Recall; \( \frac{(1 + \beta^2) \times PPV \times TPR}{(\beta^2 \times PPV) + TPR} \)

folkes_mallows_index float

↑Fowlkes-Mallows Index (FM); \( \sqrt{PPV \times TPR} \)

mcc float

↑Matthews Correlation Coefficient (MCC), Yule Phi Coefficient; \( \sqrt{TPR \times TNR \times PPV \times NPV} - \sqrt{FNR \times FPR \times FOR \times FDR} \)

threat_score float

↑Threat Score (TS), Critical Success Index (CSI), Jaccard Index; \( \frac{TP}{TP + FN + FP} \)

markedness float

Markedness (MP), deltaP; \( PPV + NPV - 1 \)

fdr float

↓False Discovery Rate, Proportion of predicted positives that are wrong; \( \frac{FP}{TP + FP} \)

npv float

↑Negative Predictive Value (NPV); Proportion of predicted negatives that are correct; \( \frac{TN}{FN + TN} \)

dor float

Diagnostic Odds Ratio; \( \frac{LR+}{LR-} \)

ppr float

Predicted Positive Ratio; Proportion that are predicted positive; \( \frac{TP + FP}{TN + FP + FN + TP} \)

pnr float

Predicted Negative Ratio; Proportion that are predicted negative; \( \frac{TN + FN}{TN + FP + FN + TP} \)

Methods:

Name Description
to_polars

Convert the dataclass to a long Polars DataFrame with columns metric and value.

Source code in python/rapidstats/metrics.py
@dataclasses.dataclass
class ConfusionMatrix:
    r"""Result object returned by `rapidstats.metrics.confusion_matrix`

    Attributes
    ----------
    tn : float
        ↑Count of True Negatives; y_true == False and y_pred == False
    fp : float
        ↓Count of False Positives; y_true == False and y_pred == True
    fn : float
        ↓Count of False Negatives; y_true == True and y_pred == False
    tp : float
        ↑Count of True Positives; y_true == True, y_pred == True
    tpr : float
        ↑True Positive Rate, Recall, Sensitivity; Probability that an actual positive
        will be predicted positive; \( \frac{TP}{TP + FN} \)
    fpr : float
        ↓False Positive Rate, Type I Error; Probability that an actual negative will
        be predicted positive; \( \frac{FP}{FP + TN} \)
    fnr : float
        ↓False Negative Rate, Type II Error; Probability an actual positive will be
        predicted negative; \( \frac{FN}{TP + FN} \)
    tnr : float
        ↑True Negative Rate, Specificity; Probability an actual negative will be
        predicted negative; \( \frac{TN}{FP + TN} \)
    prevalence : float
        Prevalence; Proportion of positive classes; \( \frac{TP + FN}{TN + FP + FN + TP} \)
    prevalence_threshold : float
        Prevalence Threshold; \( \frac{\sqrt{TPR \times FPR} - FPR}{TPR - FPR} \)
    informedness : float
        ↑Informedness, Youden's J; \( TPR + TNR - 1 \)
    precision : float
        ↑Precision, Positive Predictive Value (PPV); Probability a predicted positive is
        actually positive; \( \frac{TP}{TP + FP} \)
    false_omission_rate : float
        ↓False Omission Rate (FOR); Proportion of predicted negatives that were wrong;
        \( \frac{FN}{FN + TN} \)
    plr : float
        ↑Positive Likelihood Ratio, LR+; \( \frac{TPR}{FPR} \)
    nlr : float
        Negative Likelihood Ratio, LR-; \( \frac{FNR}{TNR} \)
    acc : float
        ↑Accuracy (ACC); Probability of a correct prediction; \( \frac{TP + TN}{TN + FP + FN + TP} \)
    balanced_accuracy : float
        ↑Balanced Accuracy (BA); \( \frac{TPR + TNR}{2} \)
    fbeta : float
        ↑\( F_{\beta} \); Weighted harmonic mean of Precision and Recall; \( \frac{(1 + \beta^2) \times PPV \times TPR}{(\beta^2 \times PPV) + TPR} \)
    folkes_mallows_index : float
        ↑Fowlkes-Mallows Index (FM); \( \sqrt{PPV \times TPR} \)
    mcc : float
        ↑Matthews Correlation Coefficient (MCC), Yule Phi Coefficient; \( \sqrt{TPR \times TNR \times PPV \times NPV} - \sqrt{FNR \times FPR \times FOR \times FDR} \)
    threat_score : float
        ↑Threat Score (TS), Critical Success Index (CSI), Jaccard Index; \( \frac{TP}{TP + FN + FP} \)
    markedness : float
        Markedness (MP), deltaP; \( PPV + NPV - 1 \)
    fdr : float
        ↓False Discovery Rate, Proportion of predicted positives that are wrong; \( \frac{FP}{TP + FP} \)
    npv : float
        ↑Negative Predictive Value (NPV); Proportion of predicted negatives that are correct; \( \frac{TN}{FN + TN} \)
    dor : float
        Diagnostic Odds Ratio; \( \frac{LR+}{LR-} \)
    ppr : float
        Predicted Positive Ratio; Proportion that are predicted positive; \( \frac{TP + FP}{TN + FP + FN + TP} \)
    pnr : float
        Predicted Negative Ratio; Proportion that are predicted negative; \( \frac{TN + FN}{TN + FP + FN + TP} \)
    """

    tn: float
    fp: float
    fn: float
    tp: float
    tpr: float
    fpr: float
    fnr: float
    tnr: float
    prevalence: float
    prevalence_threshold: float
    informedness: float
    precision: float
    false_omission_rate: float
    plr: float
    nlr: float
    acc: float
    balanced_accuracy: float
    fbeta: float
    folkes_mallows_index: float
    mcc: float
    threat_score: float
    markedness: float
    fdr: float
    npv: float
    dor: float
    ppr: float
    pnr: float

    def to_polars(self) -> pl.DataFrame:
        """Convert the dataclass to a long Polars DataFrame with columns `metric` and
        `value`.

        Returns
        -------
        pl.DataFrame
            DataFrame with columns `metric` and `value`
        """
        dct = self.__dict__

        return pl.DataFrame({"metric": dct.keys(), "value": dct.values()})

to_polars()

Convert the dataclass to a long Polars DataFrame with columns metric and value.

Returns:

Type Description
DataFrame

DataFrame with columns metric and value

Source code in python/rapidstats/metrics.py
def to_polars(self) -> pl.DataFrame:
    """Convert the dataclass to a long Polars DataFrame with columns `metric` and
    `value`.

    Returns
    -------
    pl.DataFrame
        DataFrame with columns `metric` and `value`
    """
    dct = self.__dict__

    return pl.DataFrame({"metric": dct.keys(), "value": dct.values()})
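
For orientation, a minimal usage sketch (illustrative inputs; it relies only on the constructor path and attributes documented above):

``` py
from rapidstats.metrics import confusion_matrix

# Toy labels and predictions (illustrative values only).
y_true = [True, True, False, False, True]
y_pred = [True, False, False, True, True]

cm = confusion_matrix(y_true, y_pred)  # returns a ConfusionMatrix dataclass

# Individual metrics are plain attributes ...
print(cm.tp, cm.fp, cm.tpr)

# ... and to_polars() reshapes everything into a long DataFrame
# with columns `metric` and `value`.
print(cm.to_polars())
```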

adverse_impact_ratio(y_pred, protected, control, sample_weight=None)

Computes the Adverse Impact Ratio (AIR), which is the ratio of negative predictions for the protected class to those for the control class. The ideal ratio is 1. For example, in an underwriting context, a ratio of 1 means the model is equally likely to approve protected and unprotected applicants, given that the model score is the probability of bad.

Parameters:

Name Type Description Default
y_pred ArrayLike

Predicted negative

required
protected ArrayLike

An array of booleans identifying the protected class

required
control ArrayLike

An array of booleans identifying the control class

required
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None

Returns:

Type Description
float

Adverse Impact Ratio (AIR)

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def adverse_impact_ratio(
    y_pred: ArrayLike,
    protected: ArrayLike,
    control: ArrayLike,
    sample_weight: Optional[ArrayLike] = None,
) -> float:
    """Computes the Adverse Impact Ratio (AIR), which is the ratio of negative
    predictions for the protected class and the control class. The ideal ratio is 1.
    For example, in an underwriting context, this means that the model is equally as
    likely to approve protected applicants as it is unprotected applicants, given that
    the model score is probability of bad.

    Parameters
    ----------
    y_pred : ArrayLike
        Predicted negative
    protected : ArrayLike
        An array of booleans identifying the protected class
    control : ArrayLike
        An array of booleans identifying the control class
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0

    Returns
    -------
    float
        Adverse Impact Ratio (AIR)

    Added in version 0.1.0
    ----------------------
    """
    return _adverse_impact_ratio(
        pl.DataFrame(
            {
                "y_pred": y_pred,
                "protected": protected,
                "control": control,
                "sample_weight": 1.0 if sample_weight is None else sample_weight,
            }
        )
        .with_columns(pl.col("y_pred", "protected", "control").cast(pl.Boolean))
        .with_columns(pl.col("y_pred").cast(pl.Float64))
    )
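
A short illustrative call, using made-up boolean arrays of the shape the parameters above describe:

``` py
from rapidstats.metrics import adverse_impact_ratio

# y_pred flags predicted negatives (e.g. declines); protected/control
# flag group membership for each row.
y_pred    = [True, False, True, False, True, False]
protected = [True, True, True, False, False, False]
control   = [False, False, False, True, True, True]

air = adverse_impact_ratio(y_pred, protected, control)
# Per the definition above: 2/3 of protected rows and 1/3 of control rows
# are predicted negative, so the ratio is (2/3) / (1/3) = 2.0.
print(air)
```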

adverse_impact_ratio_at_thresholds(y_score, protected, control, sample_weight=None, thresholds=None, strategy='auto')

Computes the Adverse Impact Ratio (AIR) at each threshold of y_score. See rapidstats.metrics.adverse_impact_ratio for more details. When the strategy is cum_sum, computes

for t in y_score:
    is_predicted_negative = y_score < t
    adverse_impact_ratio(is_predicted_negative, protected, control)

Parameters:

Name Type Description Default
y_score ArrayLike

Predicted scores

required
protected ArrayLike

An array of booleans identifying the protected class

required
control ArrayLike

An array of booleans identifying the control class

required
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None
thresholds Optional[list[float]]

The thresholds to compute is_predicted_negative at, i.e. y_score < t. If None, uses every score present in y_score, by default None

None
strategy LoopStrategy

Computation method, by default "auto"

'auto'

Returns:

Type Description
DataFrame

A DataFrame of threshold and air

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def adverse_impact_ratio_at_thresholds(
    y_score: ArrayLike,
    protected: ArrayLike,
    control: ArrayLike,
    sample_weight: Optional[ArrayLike] = None,
    thresholds: Optional[list[float]] = None,
    strategy: LoopStrategy = "auto",
) -> pl.DataFrame:
    """Computes the Adverse Impact Ratio (AIR) at each threshold of `y_score`. See
    [rapidstats.metrics.adverse_impact_ratio][] for more details. When the `strategy` is
    `cum_sum`, computes


    ``` py
    for t in y_score:
        is_predicted_negative = y_score < t
        adverse_impact_ratio(is_predicted_negative, protected, control)
    ```

    Parameters
    ----------
    y_score : ArrayLike
        Predicted scores
    protected : ArrayLike
        An array of booleans identifying the protected class
    control : ArrayLike
        An array of booleans identifying the control class
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0
    thresholds : Optional[list[float]], optional
        The thresholds to compute `is_predicted_negative` at, i.e. y_score < t. If None,
        uses every score present in `y_score`, by default None
    strategy : LoopStrategy, optional
        Computation method, by default "auto"

    Returns
    -------
    pl.DataFrame
        A DataFrame of `threshold` and `air`

    Added in version 0.1.0
    ----------------------
    """
    has_sample_weight = sample_weight is not None
    df = pl.DataFrame(
        {
            "y_score": y_score,
            "protected": protected,
            "control": control,
            "sample_weight": sample_weight if has_sample_weight else 1.0,
        }
    ).with_columns(
        pl.col("protected", "control").cast(pl.Boolean),
        pl.col("y_score", "sample_weight").cast(pl.Float64),
    )

    strategy = _set_loop_strategy(thresholds, strategy)

    if strategy == "loop":

        def _air(t):
            return {
                "threshold": t,
                "air": _adverse_impact_ratio(
                    df.select(
                        pl.col("y_score").lt(t).cast(pl.Float64).alias("y_pred"),
                        "protected",
                        "control",
                        "sample_weight",
                    )
                ),
            }

        airs = _run_concurrent(_air, set(thresholds or y_score))

        res = pl.LazyFrame(airs)
    elif strategy == "cum_sum":
        res = _air_at_thresholds_core(
            df, thresholds, has_sample_weight=has_sample_weight
        )

    return res.pipe(_fill_infinite, None).fill_nan(None).collect()
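
And a sketch of the per-threshold variant (illustrative inputs): with thresholds=None it evaluates every distinct score and returns a DataFrame of threshold and air, as documented above.

``` py
from rapidstats.metrics import adverse_impact_ratio_at_thresholds

y_score   = [0.10, 0.35, 0.40, 0.20, 0.80, 0.65]
protected = [True, True, True, True, False, False]
control   = [False, False, False, False, True, True]

# One AIR per distinct score used as a threshold.
air_df = adverse_impact_ratio_at_thresholds(y_score, protected, control)
print(air_df.sort("threshold"))
```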

average_precision(y_true, y_score, sample_weight=None)

Computes Average Precision.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None

Returns:

Type Description
float

Average Precision (AP)

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def average_precision(
    y_true: ArrayLike, y_score: ArrayLike, sample_weight: Optional[ArrayLike] = None
) -> float:
    """Computes Average Precision.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0

    Returns
    -------
    float
        Average Precision (AP)

    Added in version 0.1.0
    ----------------------
    """
    return (
        _y_true_y_score_to_df(y_true, y_score, sample_weight)
        .lazy()
        .rename({"y_score": "threshold"})
        .pipe(_base_confusion_matrix_at_thresholds)
        .pipe(_full_confusion_matrix_from_base)
        .select("threshold", "precision", "tpr")
        .drop_nulls()
        .unique("threshold")
        .sort("threshold")
        .select(_ap_from_pr_curve(pl.col("precision"), pl.col("tpr")).alias("ap"))
        .collect()["ap"]
        .item()
    )
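
A minimal call sketch (illustrative scores; sample_weight is left at its default):

``` py
from rapidstats.metrics import average_precision

y_true  = [1, 0, 1, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.3, 0.2]

print(average_precision(y_true, y_score))
```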

brier_loss(y_true, y_score)

Computes the Brier loss (smaller is better). The Brier loss measures the mean squared difference between the predicted scores and the ground truth target. Calculated as:

\[ \frac{1}{N} \sum_{i=1}^N (yt_i - ys_i)^2 \]

where \( yt \) is y_true and \( ys \) is y_score.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required

Returns:

Type Description
float

Brier loss

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def brier_loss(y_true: ArrayLike, y_score: ArrayLike) -> float:
    r"""Computes the Brier loss (smaller is better). The Brier loss measures the mean
    squared difference between the predicted scores and the ground truth target.
    Calculated as:

    \[ \frac{1}{N} \sum_{i=1}^N (yt_i - ys_i)^2 \]

    where \( yt \) is `y_true` and \( ys \) is `y_score`.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores

    Returns
    -------
    float
        Brier loss

    Added in version 0.1.0
    ----------------------
    """
    df = _y_true_y_score_to_df(y_true, y_score)

    return _brier_loss(df)
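
Since the loss is just the mean squared difference in the formula above, an independent NumPy sketch of the same formula (not the library's implementation) looks like:

``` py
import numpy as np

def brier_loss_reference(y_true, y_score) -> float:
    """Mean squared difference between labels and scores, per the formula above."""
    yt = np.asarray(y_true, dtype=float)
    ys = np.asarray(y_score, dtype=float)
    return float(np.mean((yt - ys) ** 2))

# e.g. ((0.1)^2 + (0.2)^2 + (0.4)^2 + (0.4)^2) / 4 = 0.0925
print(brier_loss_reference([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.4]))
```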

confusion_matrix(y_true, y_pred, beta=1.0, sample_weight=None)

Computes confusion matrix metrics (TP, FP, TN, FN, TPR, Fbeta, etc.).

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_pred ArrayLike

Predicted target

required
beta float

\( \beta \) to use in \( F_\beta \), by default 1

1.0
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None

Returns:

Type Description
ConfusionMatrix

Dataclass of confusion matrix metrics

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def confusion_matrix(
    y_true: ArrayLike,
    y_pred: ArrayLike,
    beta: float = 1.0,
    sample_weight: Optional[ArrayLike] = None,
) -> ConfusionMatrix:
    r"""Computes confusion matrix metrics (TP, FP, TN, FN, TPR, Fbeta, etc.).

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_pred : ArrayLike
        Predicted target
    beta : float, optional
        \( \beta \) to use in \( F_\beta \), by default 1
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0

    Returns
    -------
    ConfusionMatrix
        Dataclass of confusion matrix metrics

    Added in version 0.1.0
    ----------------------
    """
    df = _y_true_y_pred_to_df(y_true, y_pred, sample_weight).with_columns(
        pl.col("y_true").cast(pl.UInt8)
    )

    return ConfusionMatrix(*_confusion_matrix(df, beta))
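
A sketch of the optional arguments (illustrative inputs): beta shapes the fbeta attribute and sample_weight re-weights rows, as documented above.

``` py
from rapidstats.metrics import confusion_matrix

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]

cm = confusion_matrix(
    y_true,
    y_pred,
    beta=2.0,                          # report F2 instead of F1
    sample_weight=[1, 1, 2, 1, 1, 2],  # weight rows unequally
)
print(cm.precision, cm.tpr, cm.fbeta)
```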

confusion_matrix_at_thresholds(y_true, y_score, thresholds=None, metrics=DefaultConfusionMatrixMetrics, strategy='auto', beta=1.0, sample_weight=None)

Computes the confusion matrix at each threshold. When the strategy is "cum_sum", computes

for t in y_score:
    y_pred = y_score >= t
    confusion_matrix(y_true, y_pred)

using fast DataFrame operations.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required
thresholds Optional[list[float]]

The thresholds to compute y_pred at, i.e. y_score >= t. If None, uses every score present in y_score, by default None

None
metrics Iterable[ConfusionMatrixMetric]

The metrics to compute, by default DefaultConfusionMatrixMetrics

DefaultConfusionMatrixMetrics
strategy LoopStrategy

Computation method, by default "auto"

'auto'
beta float

\( \beta \) to use in \( F_\beta \), by default 1

1.0
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None

Returns:

Type Description
DataFrame

A Polars DataFrame of threshold, metric, and value

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def confusion_matrix_at_thresholds(
    y_true: ArrayLike,
    y_score: ArrayLike,
    thresholds: Optional[list[float]] = None,
    metrics: Iterable[ConfusionMatrixMetric] = DefaultConfusionMatrixMetrics,
    strategy: LoopStrategy = "auto",
    beta: float = 1.0,
    sample_weight: Optional[ArrayLike] = None,
) -> pl.DataFrame:
    r"""Computes the confusion matrix at each threshold. When the `strategy` is
    "cum_sum", computes

    ``` py
    for t in y_score:
        y_pred = y_score >= t
        confusion_matrix(y_true, y_pred)
    ```

    using fast DataFrame operations.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores
    thresholds : Optional[list[float]], optional
        The thresholds to compute `y_pred` at, i.e. y_score >= t. If None,
        uses every score present in `y_score`, by default None
    metrics : Iterable[ConfusionMatrixMetric], optional
        The metrics to compute, by default DefaultConfusionMatrixMetrics
    strategy : LoopStrategy, optional
        Computation method, by default "auto"
    beta : float, optional
        \( \beta \) to use in \( F_\beta \), by default 1
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0

    Returns
    -------
    pl.DataFrame
        A Polars DataFrame of `threshold`, `metric`, and `value`

    Added in version 0.1.0
    ----------------------
    """
    strategy = _set_loop_strategy(thresholds, strategy)

    if strategy == "loop":
        df = _y_true_y_score_to_df(y_true, y_score)

        def _cm(t):
            return (
                confusion_matrix(
                    df["y_true"],
                    df["y_score"].ge(t),
                    beta=beta,
                    sample_weight=sample_weight,
                )
                .to_polars()
                .with_columns(pl.lit(t).alias("threshold"))
            )

        cms: list[pl.DataFrame] = _run_concurrent(_cm, set(thresholds or y_score))

        return pl.concat(cms, how="vertical").fill_nan(None)
    elif strategy == "cum_sum":
        return (
            pl.LazyFrame(
                {
                    "y_true": y_true,
                    "threshold": y_score,
                    "sample_weight": 1.0 if sample_weight is None else sample_weight,
                }
            )
            .with_columns(pl.col("y_true").cast(pl.Boolean))
            .drop_nulls()
            .pipe(_base_confusion_matrix_at_thresholds)
            .pipe(_full_confusion_matrix_from_base, beta=beta)
            .select("threshold", *metrics)
            .unique("threshold")
            .pipe(_map_to_thresholds, thresholds)
            .drop("_threshold_actual", strict=False)
            .unpivot(index="threshold")
            .rename({"variable": "metric"})
            .collect()
        )
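
A sketch restricted to a handful of thresholds (illustrative inputs); the result is the long (threshold, metric, value) DataFrame described above:

``` py
from rapidstats.metrics import confusion_matrix_at_thresholds

y_true  = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.3, 0.6, 0.4, 0.8, 0.1]

cm_df = confusion_matrix_at_thresholds(
    y_true,
    y_score,
    thresholds=[0.25, 0.5, 0.75],
)
print(cm_df.sort("threshold", "metric"))
```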

max_ks(y_true, y_score)

Performs the two-sample Kolmogorov-Smirnov test on the predicted scores of the ground truth positive and ground truth negative classes. The KS test measures the highest distance between two CDFs, so the Max-KS metric measures how well the model separates two classes. In pseudocode:

df = Frame(y_true, y_score)
class0 = df.filter(~y_true)["y_score"]
class1 = df.filter(y_true)["y_score"]

ks(class0, class1)

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required

Returns:

Type Description
float

Max-KS

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def max_ks(y_true: ArrayLike, y_score: ArrayLike) -> float:
    """Performs the two-sample Kolmogorov-Smirnov test on the predicted scores of the
    ground truth positive and ground truth negative classes. The KS test measures the
    highest distance between two CDFs, so the Max-KS metric measures how well the model
    separates two classes. In pseudocode:

    ``` py
    df = Frame(y_true, y_score)
    class0 = df.filter(~y_true)["y_score"]
    class1 = df.filter(y_true)["y_score"]

    ks(class0, class1)
    ```

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores

    Returns
    -------
    float
        Max-KS

    Added in version 0.1.0
    ----------------------
    """
    df = _y_true_y_score_to_df(y_true, y_score)

    return _max_ks(df)
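
A minimal call (illustrative inputs), plus a hedged cross-check: the statistic should agree with a generic two-sample KS test applied to the per-class score samples (the SciPy call below is a sanity check, not part of rapidstats):

``` py
from rapidstats.metrics import max_ks

y_true  = [1, 1, 1, 0, 0, 0]
y_score = [0.80, 0.70, 0.40, 0.55, 0.20, 0.10]

print(max_ks(y_true, y_score))

# Optional cross-check, assuming SciPy is installed:
# from scipy.stats import ks_2samp
# class1 = [s for t, s in zip(y_true, y_score) if t]
# class0 = [s for t, s in zip(y_true, y_score) if not t]
# print(ks_2samp(class1, class0).statistic)
```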

mean(y)

Computes the mean of the input array.

Parameters:

Name Type Description Default
y ArrayLike

A 1D-array of numbers

required

Returns:

Type Description
float

Mean

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def mean(y: ArrayLike) -> float:
    """Computes the mean of the input array.

    Parameters
    ----------
    y : ArrayLike
        A 1D-array of numbers

    Returns
    -------
    float
        Mean

    Added in version 0.1.0
    ----------------------
    """
    return _mean(pl.DataFrame({"y": y}))

mean_squared_error(y_true, y_score)

Computes Mean Squared Error (MSE) as

\[ \frac{1}{N} \sum_{i=1}^{N} (yt_i - ys_i)^2 \]

where \( yt \) is y_true and \( ys \) is y_score.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required

Returns:

Type Description
float

Mean Squared Error (MSE)

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def mean_squared_error(y_true: ArrayLike, y_score: ArrayLike) -> float:
    r"""Computes Mean Squared Error (MSE) as

    \[ \frac{1}{N} \sum_{i=1}^{N} (yt_i - ys_i)^2 \]

    where \( yt \) is `y_true` and \( ys \) is `y_score`.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores

    Returns
    -------
    float
        Mean Squared Error (MSE)

    Added in version 0.1.0
    ----------------------
    """
    return _mean_squared_error(_regression_to_df(y_true, y_score))

predicted_positive_ratio_at_thresholds(y_score, sample_weight=None, thresholds=None, strategy='auto')

Computes the Predicted Positive Ratio (PPR) at each threshold, where the PPR is the ratio of predicted positive to the total, and a positive is defined as y_score >= threshold.

Parameters:

Name Type Description Default
y_score ArrayLike

Predicted scores

required
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None
thresholds Optional[list[float]]

The thresholds to compute y_pred at, i.e. y_score >= t. If None, uses every score present in y_score, by default None

None
strategy LoopStrategy

Computation method, by default "auto"

'auto'

Returns:

Type Description
DataFrame

A DataFrame of threshold and ppr

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def predicted_positive_ratio_at_thresholds(
    y_score: ArrayLike,
    sample_weight: Optional[ArrayLike] = None,
    thresholds: Optional[list[float]] = None,
    strategy: LoopStrategy = "auto",
) -> pl.DataFrame:
    """Computes the Predicted Positive Ratio (PPR) at each threshold, where the PPR is
    the ratio of predicted positive to the total, and a positive is defined as
    `y_score` >= threshold.

    Parameters
    ----------
    y_score : ArrayLike
        Predicted scores
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0
    thresholds : Optional[list[float]], optional
        The thresholds to compute `y_pred` at, i.e. y_score >= t. If None,
        uses every score present in `y_score`, by default None
    strategy : LoopStrategy, optional
        Computation method, by default "auto"

    Returns
    -------
    pl.DataFrame
        A DataFrame of `threshold` and `ppr`

    Added in version 0.1.0
    ----------------------
    """
    lf = pl.LazyFrame(
        {
            "y_score": y_score,
            "sample_weight": 1.0 if sample_weight is None else sample_weight,
        }
    ).drop_nulls()

    strategy = _set_loop_strategy(y_score, strategy)

    if strategy == "loop":
        df = lf.collect()

        def _ppr(t: float) -> float:
            return {
                "threshold": t,
                "ppr": _weighted_mean(df["y_score"].ge(t), df["sample_weight"]),
            }

        return pl.DataFrame(_run_concurrent(_ppr, set(thresholds or y_score)))
    elif strategy == "cum_sum":

        def _cumulative_ppr(lf: pl.LazyFrame, has_sample_weight: bool):
            if not has_sample_weight:
                return lf.with_row_index(
                    "cumulative_predicted_positive", offset=1
                ).with_columns(
                    pl.col("cumulative_predicted_positive")
                    .truediv(pl.len())
                    .alias("ppr")
                )
            else:
                return lf.with_columns(
                    pl.col("sample_weight")
                    .cum_sum()
                    .alias("cumulative_predicted_positive")
                ).with_columns(
                    pl.col("cumulative_predicted_positive")
                    .truediv(pl.col("sample_weight").sum())
                    .alias("ppr")
                )

        return (
            lf.sort("y_score", descending=True)
            .pipe(_cumulative_ppr, sample_weight is not None)
            .rename({"y_score": "threshold"})
            .select("threshold", "ppr")
            .unique("threshold")
            .pipe(_map_to_thresholds, thresholds)
            .drop("_threshold_actual", strict=False)
            .collect()
        )
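
A small sketch (illustrative scores); with the defaults, every distinct score becomes a threshold and ppr is the fraction of scores at or above it:

``` py
from rapidstats.metrics import predicted_positive_ratio_at_thresholds

y_score = [0.10, 0.40, 0.40, 0.65, 0.80]

ppr_df = predicted_positive_ratio_at_thresholds(y_score)
# e.g. at threshold 0.65, two of five scores are >= 0.65, so ppr = 0.4.
print(ppr_df.sort("threshold"))
```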

r2(y_true, y_score)

Computes R2 as

\[ 1 - \frac{\sum{(y_i - \hat{y}_i)^2}}{\sum{(y_i - \bar{y})^2}} \]

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required

Returns:

Type Description
float

R2

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def r2(y_true: ArrayLike, y_score: ArrayLike) -> float:
    r"""Computes R2 as

    \[
        1 - \frac{\sum{(y_i - \hat{y}_i)^2}}{\sum{(y_i - \bar{y})^2}}
    \]

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores

    Returns
    -------
    float
        R2

    Added in version 0.1.0
    ----------------------
    """
    return _r2(_regression_to_df(y_true, y_score))
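
An independent NumPy sketch of the same formula (not the library's code), useful as a sanity check:

``` py
import numpy as np

def r2_reference(y_true, y_score) -> float:
    """1 - SS_res / SS_tot, matching the formula above."""
    y  = np.asarray(y_true, dtype=float)
    yh = np.asarray(y_score, dtype=float)
    ss_res = np.sum((y - yh) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

print(r2_reference([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0]))
```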

roc_auc(y_true, y_score, sample_weight=None)

Computes Area Under the Receiver Operating Characteristic Curve.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required
sample_weight Optional[ArrayLike]

Sample weights, set to 1 if None

Version

Added 0.2.0

None

Returns:

Type Description
float

ROC-AUC

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def roc_auc(
    y_true: ArrayLike, y_score: ArrayLike, sample_weight: Optional[ArrayLike] = None
) -> float:
    """Computes Area Under the Receiver Operating Characteristic Curve.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores
    sample_weight: Optional[ArrayLike], optional
        Sample weights, set to 1 if None

        !!! Version
            Added 0.2.0

    Returns
    -------
    float
        ROC-AUC

    Added in version 0.1.0
    ----------------------
    """
    df = _y_true_y_score_to_df(y_true, y_score, sample_weight).with_columns(
        pl.col("y_true").cast(pl.Float64)
    )

    return _roc_auc(df)
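
A minimal call sketch (illustrative inputs); 1.0 corresponds to a perfect ranking of positives above negatives and 0.5 to a random one:

``` py
from rapidstats.metrics import roc_auc

y_true  = [1, 0, 1, 1, 0, 0]
y_score = [0.9, 0.3, 0.6, 0.4, 0.8, 0.1]

print(roc_auc(y_true, y_score))
```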

root_mean_squared_error(y_true, y_score)

Computes Root Mean Squared Error (RMSE) as

\[ \sqrt{\frac{1}{N} \sum_{i=1}^{N} (yt_i - ys_i)^2} \]

where \( yt \) is y_true and \( ys \) is y_score.

Parameters:

Name Type Description Default
y_true ArrayLike

Ground truth target

required
y_score ArrayLike

Predicted scores

required

Returns:

Type Description
float

Root Mean Squared Error (RMSE)

Added in version 0.1.0
Source code in python/rapidstats/metrics.py
def root_mean_squared_error(y_true: ArrayLike, y_score: ArrayLike) -> float:
    r"""Computes Root Mean Squared Error (RMSE) as

    \[ \sqrt{\frac{1}{N} \sum_{i=1}^{N} (yt_i - ys_i)^2} \]

    where \( yt \) is `y_true` and \( ys \) is `y_score`.

    Parameters
    ----------
    y_true : ArrayLike
        Ground truth target
    y_score : ArrayLike
        Predicted scores

    Returns
    -------
    float
        Root Mean Squared Error (RMSE)

    Added in version 0.1.0
    ----------------------
    """
    return _root_mean_squared_error(_regression_to_df(y_true, y_score))
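
Because the two formulas above differ only by the square root, RMSE is simply the square root of MSE; a small sketch (illustrative inputs) making that relation explicit:

``` py
import math
from rapidstats.metrics import mean_squared_error, root_mean_squared_error

y_true  = [3.0, -0.5, 2.0, 7.0]
y_score = [2.5, 0.0, 2.0, 8.0]

mse  = mean_squared_error(y_true, y_score)
rmse = root_mean_squared_error(y_true, y_score)

# RMSE = sqrt(MSE) by definition of the two formulas above.
assert math.isclose(rmse, math.sqrt(mse))
print(mse, rmse)
```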