Scikit-learn Model

class python.model.sklearn.SklearnModel(file_name)

Bases: python.model.base.BaseModel

Class that handles the loaded model.

This class can handle models that respect the scikit-learn API. This includes sklearn.pipeline.Pipeline.

The data coming from a request is validated using the metadata stored with the model. The data passed to predict, predict_proba, explain and preprocess should be a dictionary that contains one key per feature, or a list of such dictionaries (records). Example: {'feature1': 5, 'feature2': 'A', 'feature3': 10}
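The input format described above can be sketched as a small check. This is a minimal illustration, not the class's actual validation code: the function name validate_records and the FEATURES list are hypothetical, and the sketch assumes the stored metadata boils down to a list of feature names.

```python
from typing import Any, Dict, List, Union

Record = Dict[str, Any]


def validate_records(
    data: Union[Record, List[Record]], feature_names: List[str]
) -> List[Record]:
    """Return the input as a list of records, checking that no feature is missing."""
    # Accept either a single record or a list of records.
    records = [data] if isinstance(data, dict) else list(data)
    expected = set(feature_names)
    for position, record in enumerate(records):
        if not isinstance(record, dict):
            raise ValueError(f"Record {position} is not a dictionary")
        missing = expected - set(record)
        if missing:
            raise ValueError(f"Record {position} is missing {sorted(missing)}")
    return records


# Hypothetical feature list, standing in for the metadata stored with the model.
FEATURES = ["feature1", "feature2", "feature3"]

single = validate_records({"feature1": 5, "feature2": "A", "feature3": 10}, FEATURES)
batch = validate_records(
    [
        {"feature1": 5, "feature2": "A", "feature3": 10},
        {"feature1": 7, "feature2": "B", "feature3": 3},
    ],
    FEATURES,
)
```

Normalising a single record to a one-element list lets the rest of the code treat both accepted shapes the same way.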

Parameters

file_name (str) – File path of the serialized model. It must be a file that can be loaded using joblib.

explain(features, samples=None)

Explain the prediction of a model.

Explanation function that returns the SHAP value for each feature. The returned object contains one value per feature of the model.

If samples is not given, then the explanations are the raw output of the trees, which varies by model (for binary classification in XGBoost this is the log odds ratio). Conversely, if samples is given, then the explanations are the output of the model transformed into probability space (note that this means the SHAP values now sum to the probability output of the model). See the SHAP documentation for details.
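To make the difference between the two output spaces concrete, the arithmetic can be sketched as follows. The numbers are made up for illustration and do not come from any real model; the sketch only relies on the additivity property of SHAP values (base value plus contributions equals the model output) and assumes a binary classifier whose raw output is in log-odds.

```python
import math


def sigmoid(x: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Made-up values for illustration only; not the output of a real explainer.
base_log_odds = -0.4
raw_shap = {"feature1": 0.9, "feature2": -0.2, "feature3": 0.5}

# Without samples: contributions add up in log-odds space.
log_odds = base_log_odds + sum(raw_shap.values())

# The total must be passed through the sigmoid to be read as a probability.
probability = sigmoid(log_odds)

print(f"log-odds: {log_odds:.3f}, probability: {probability:.3f}")
```

When samples is given, no such conversion is needed: the base value plus the contributions is already expressed as a probability.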

Parameters
  • features (dict) – Record to be used as input data to explain the model. The expected object must contain one key per feature. Example: {'feature1': 5, 'feature2': 'A', 'feature3': 10}

  • samples (dict) – Records to be used as a sample pool for the explanations. It must have the same structure as the features parameter. According to the SHAP documentation, anywhere from 100 to 1000 random background samples are good sizes to use.

Returns

Explanations.

Return type

dict

Raises
  • RuntimeError – If the model is not ready.

  • ValueError – If the model's predictor doesn't support SHAP explanations, if the model is not already loaded, or if the explainer outputs an unknown object.

family = 'SKLEARN_MODEL'

predict(features)

Make a prediction.

Prediction function that returns the predicted class. The returned value is an integer when the class names are not specified in the model's metadata.

Parameters

features (dict) – Record to be used as input data to make predictions. The expected object must contain one key per feature. Example: {'feature1': 5, 'feature2': 'A', 'feature3': 10}

Returns

Predicted class.

Return type

int or str

Raises

RuntimeError – If the model is not ready.
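The int-or-str return type can be pictured with a short sketch. It assumes the underlying predictor yields an integer class index and that the metadata, when it names the classes, provides them as an ordered list; the helper decode_class is hypothetical and is not part of SklearnModel.

```python
from typing import List, Optional, Union


def decode_class(
    index: int, class_names: Optional[List[str]] = None
) -> Union[int, str]:
    """Return the class name for an index, or the index itself if no names exist."""
    if class_names is None:
        # No class names in the metadata: the integer is returned as-is.
        return index
    return class_names[index]


without_names = decode_class(1)
with_names = decode_class(1, ["rejected", "approved"])
```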

predict_proba(features)

Make a prediction.

Prediction function that returns the probability of each class. The returned object contains one value per class. The keys of the dictionary are the classes of the model.

Parameters

features (dict) – Record to be used as input data to make predictions. The expected object must contain one key per feature. Example: {'feature1': 5, 'feature2': 'A', 'feature3': 10}

Returns

Predicted class probabilities.

Return type

dict

Raises

RuntimeError – If the model isn’t ready or the task isn’t classification.
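The shape of the returned dictionary can be sketched as below. The helper label_probabilities and the example values are hypothetical; the sketch assumes the predictor produces one probability per class, in the same order as the model's class list.

```python
from typing import Dict, Hashable, Sequence


def label_probabilities(
    probabilities: Sequence[float], classes: Sequence[Hashable]
) -> Dict[Hashable, float]:
    """Pair each class with its probability, keyed by class."""
    if len(probabilities) != len(classes):
        raise ValueError("Expected one probability per class")
    return {label: float(p) for label, p in zip(classes, probabilities)}


# Made-up probabilities for a two-class model.
result = label_probabilities([0.2, 0.8], ["A", "B"])

# The most likely class is the key with the highest value.
most_likely = max(result, key=result.get)
```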

preprocess(features)

Preprocess data.

This function is used before prediction or interpretation.

Parameters

features (dict) – The expected object must contain one key per feature. Example: {'feature1': 5, 'feature2': 'A', 'feature3': 10}

Returns

Processed data if a preprocessing function was defined in the model's metadata. The format must be the same as the input.

Return type

dict

Raises

RuntimeError – If the model is not ready.
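The same-format requirement on the preprocessing function can be sketched as follows. The names apply_preprocessing and double_feature1 are hypothetical, and returning the input unchanged when no function is defined is an assumption of this sketch, not documented behaviour.

```python
from typing import Any, Callable, Dict, Optional

Record = Dict[str, Any]


def apply_preprocessing(
    features: Record, function: Optional[Callable[[Record], Record]] = None
) -> Record:
    """Apply an optional preprocessing function and check the output format."""
    if function is None:
        # Assumption: with no preprocessing function, the record passes through.
        return features
    # Pass a copy so the caller's record is not modified in place.
    processed = function(dict(features))
    if not isinstance(processed, dict) or set(processed) != set(features):
        raise ValueError("Preprocessing must return a record with the same keys")
    return processed


def double_feature1(record: Record) -> Record:
    """Hypothetical preprocessing step that rescales one feature."""
    updated = dict(record)
    updated["feature1"] = updated["feature1"] * 2
    return updated


record = {"feature1": 5, "feature2": "A", "feature3": 10}
unchanged = apply_preprocessing(record)
doubled = apply_preprocessing(record, double_feature1)
```

Because the output keeps one key per feature, it can be handed to the prediction or explanation step exactly like the original record.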