scoring_methods

cosmo_utils.ml.ml_utils.scoring_methods(truth_arr, feat_arr=None, pred_arr=None, model=None, score_method='perc', threshold=0.1, perc=0.68)

Determines the overall score for the given arrays, i.e. how well the predicted array matches the truth array.

Parameters:

truth_arr : numpy.ndarray or array-like, shape (n_samples, n_outcomes)

Array consisting of the true values for the n_samples observations. The dimensions of truth_arr are n_samples by n_outcomes, where n_samples is the number of observations, and n_outcomes the number of predicted outcomes.

feat_arr : numpy.ndarray, array-like, or NoneType, shape (n_samples, n_features)

Array consisting of the feature values used to compute the predictions. The dimensions of feat_arr are n_samples by n_features, where n_samples is the number of observations, and n_features the number of features used. This variable is set to None by default.

pred_arr : numpy.ndarray, array-like, or NoneType, shape (n_samples, n_outcomes)

Array of predicted values from feat_arr. If model is None, this variable must be an array-like object. If model is not None, this variable is not used, and the predictions are instead calculated using the model object. This variable is set to None by default.

model : scikit-learn model object or NoneType

Model used to estimate the score if score_method == 'model_score'. This variable is set to None by default.

score_method : {‘perc’, ‘threshold’, ‘model_score’, ‘r2’} str, optional

Type of scoring to use when determining how well an algorithm is performing.

Options:
  • ‘perc’ : Use percentage and rank-ordering of the values.
  • ‘threshold’ : Score based on differences of threshold or less from the true value.
  • ‘model_score’ : Out-of-the-box method from sklearn to determine success.
  • ‘r2’ : R-squared statistic for error calculation.

threshold : float, optional

Value to use when calculating the error within threshold value from the truth. This variable is set to 0.1 by default.

perc : float, optional

Value used when determining the score within some perc percentile value, taken from the range [0, 1]. This variable is set to 0.68 by default.
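As a rough illustration of the array-based score_method options described above, the sketch below shows one plausible reading of ‘threshold’, ‘perc’, and ‘r2’ using only NumPy. The helper names (threshold_score, perc_score, r2_score_sketch) are hypothetical and are not part of cosmo_utils; in particular, treating ‘perc’ as the absolute error at the perc quantile is an assumption, and the library's actual implementation may differ.

```python
import numpy as np


def _abs_error(truth_arr, pred_arr):
    """Element-wise absolute error between predictions and truth."""
    truth = np.asarray(truth_arr, dtype=float)
    pred = np.asarray(pred_arr, dtype=float)
    return np.abs(pred - truth)


def threshold_score(truth_arr, pred_arr, threshold=0.1):
    """Assumed reading of 'threshold': fraction of predictions whose
    absolute error is `threshold` or less."""
    err = _abs_error(truth_arr, pred_arr)
    return float(np.mean(err <= threshold))


def perc_score(truth_arr, pred_arr, perc=0.68):
    """Assumed reading of 'perc': absolute error at the `perc` quantile
    of the rank-ordered errors (perc is given in [0, 1])."""
    err = _abs_error(truth_arr, pred_arr)
    return float(np.percentile(err, 100.0 * perc))


def r2_score_sketch(truth_arr, pred_arr):
    """Standard R-squared: 1 - SS_res / SS_tot."""
    truth = np.asarray(truth_arr, dtype=float)
    pred = np.asarray(pred_arr, dtype=float)
    ss_res = np.sum((truth - pred) ** 2)
    ss_tot = np.sum((truth - truth.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Small made-up example: three of four predictions are within 0.1 of the truth.
truth = np.array([1.0, 2.0, 3.0, 4.0])
pred = np.array([1.05, 2.5, 2.95, 4.0])

print(threshold_score(truth, pred, threshold=0.1))
print(perc_score(truth, pred, perc=0.5))
print(r2_score_sketch(truth, pred))
```

Under this reading, a higher ‘threshold’ or ‘r2’ score indicates a better fit, whereas a lower ‘perc’ score does, since it measures an error size rather than a success fraction.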

Returns:

method_score : float

Overall score measuring how well pred_arr predicts truth_arr.
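The interplay between feat_arr, pred_arr, and model described under Parameters can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: MeanModel is a hypothetical stand-in that mimics the predict and score methods of a scikit-learn regressor, and resolve_predictions is a hypothetical helper showing how predictions might be chosen.

```python
import numpy as np


class MeanModel:
    """Hypothetical stand-in for a fitted scikit-learn style regressor.

    It always predicts the mean of the training targets.
    """

    def fit(self, feat_arr, truth_arr):
        self.mean_ = float(np.mean(truth_arr))
        return self

    def predict(self, feat_arr):
        # One prediction per row of feat_arr.
        return np.full(len(feat_arr), self.mean_)

    def score(self, feat_arr, truth_arr):
        # R-squared, which is what scikit-learn regressors return from score().
        truth = np.asarray(truth_arr, dtype=float)
        pred = self.predict(feat_arr)
        ss_res = np.sum((truth - pred) ** 2)
        ss_tot = np.sum((truth - truth.mean()) ** 2)
        return float(1.0 - ss_res / ss_tot)


def resolve_predictions(feat_arr=None, pred_arr=None, model=None):
    """Pick the predictions to score (assumed behaviour).

    If a model is given, predictions come from model.predict(feat_arr) and
    any pred_arr is ignored; otherwise pred_arr must be supplied.
    """
    if model is not None:
        if feat_arr is None:
            raise ValueError("feat_arr is required when a model is given")
        return np.asarray(model.predict(feat_arr))
    if pred_arr is None:
        raise ValueError("pred_arr is required when no model is given")
    return np.asarray(pred_arr)


feat = np.arange(4, dtype=float).reshape(-1, 1)
truth = np.array([1.0, 2.0, 3.0, 4.0])
model = MeanModel().fit(feat, truth)

# With a model: pred_arr is ignored and predictions come from the model.
from_model = resolve_predictions(feat_arr=feat, pred_arr=[9, 9, 9, 9], model=model)

# Without a model: the caller's pred_arr is used as-is.
from_array = resolve_predictions(pred_arr=[1.1, 2.0, 2.9, 4.2])

# For 'model_score', the model's own score method is called on the features.
model_score = model.score(feat, truth)
```

In this sketch the ‘model_score’ path needs feat_arr and model but no pred_arr, while the other methods need only truth_arr plus either pred_arr or a model with feat_arr.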

Notes

For more information on how to pre-process your data, see http://scikit-learn.org/stable/modules/model_evaluation.html.