Error if a bin has only a single class in it #10

@zacps

If a bin contains only a single class label, the logistic regression fails to fit (the solver requires at least two distinct class labels).
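
For reference, the failure can be reproduced outside the library with a few lines. This is a minimal sketch, not the library's code: the `logits` and `labels` arrays are made-up data standing in for one bin, and the exact error text may differ between scikit-learn versions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical bin: three samples that all carry the label 0.
logits = np.array([[-2.0], [-1.0], [0.5]])
labels = np.array([0, 0, 0])

clf = LogisticRegression(C=1e10, solver="lbfgs")
try:
    clf.fit(logits, labels)
    error = None
except ValueError as exc:
    # scikit-learn rejects training data that contains only one class.
    error = exc

print(error)
```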

I'm not sure if this is the 'right' fix but it worked as a quick-n-dirty workaround:

import numpy as np
from sklearn.linear_model import LogisticRegression

def get_platt_scaler(model_probs, labels):
    # A very large C makes the fit effectively unregularized.
    clf = LogisticRegression(C=1e10, solver='lbfgs')
    eps = 1e-12
    # Convert probabilities to logits, clipping to avoid log(0).
    model_probs = model_probs.astype(dtype=np.float64)
    model_probs = np.expand_dims(model_probs, axis=-1)
    model_probs = np.clip(model_probs, eps, 1 - eps)
    model_probs = np.log(model_probs / (1 - model_probs))
    unique_labels = np.unique(labels) # +
    # Skip the fit when the bin holds a single class.
    if unique_labels.shape[0] != 1: # +
        clf.fit(model_probs, labels)
    def calibrator(probs):
        x = np.array(probs, dtype=np.float64)
        x = np.clip(x, eps, 1 - eps)
        x = np.log(x / (1 - x))
        # Without a fitted model, the logits pass through unchanged.
        if unique_labels.shape[0] != 1: # +
            x = x * clf.coef_[0] + clf.intercept_
        output = 1 / (1 + np.exp(-x))
        return output
    return calibrator
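
One consequence of the workaround worth spelling out: when the fit is skipped, the calibrator only applies logit followed by sigmoid, so it returns its input unchanged (up to the `eps` clipping). The sketch below checks that numerically. `identity_fallback` is a hypothetical helper written for illustration that mirrors only the no-fit path, and the sample probabilities are made up.

```python
import numpy as np

def identity_fallback(probs, eps=1e-12):
    """Hypothetical helper: the path the calibrator takes when no model was fit."""
    x = np.clip(np.asarray(probs, dtype=np.float64), eps, 1 - eps)
    logits = np.log(x / (1 - x))        # probability -> logit
    return 1 / (1 + np.exp(-logits))    # logit -> probability

probs = np.array([0.0, 0.1, 0.5, 0.9, 1.0])
out = identity_fallback(probs)

# The round trip should reproduce the inputs; 0 and 1 shift by about eps.
print(np.max(np.abs(out - probs)))
```

So a single-class bin is left uncalibrated rather than mapped to a constant. Whether that is the desired behaviour is a design choice for the maintainers.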
