scoring_methods

cosmo_utils.ml.ml_utils.scoring_methods(truth_arr, feat_arr=None, pred_arr=None, model=None, score_method='perc', threshold=0.1, perc=0.68)

Determines the overall score for the given arrays, i.e. the predicted array and the truth array.

Parameters:
truth_arr : numpy.ndarray or array-like, shape (n_samples, n_outcomes)
    Array consisting of the true values for the n_samples observations. The dimensions of truth_arr are n_samples by n_outcomes, where n_samples is the number of observations and n_outcomes the number of predicted outcomes.

feat_arr : numpy.ndarray, array-like, or NoneType, shape (n_samples, n_features)
    Array consisting of the features used to compute the predictions. The dimensions of feat_arr are n_samples by n_features, where n_samples is the number of observations and n_features the number of features used. This variable is set to None by default.

pred_arr : numpy.ndarray, array-like, or NoneType, shape (n_samples, n_outcomes)
    Array of values predicted from feat_arr. If model == None, this variable must be an array-like object. If model != None, this variable is not used, and the predictions are instead calculated using the model object. This variable is set to None by default.

model : scikit-learn model object or NoneType
    Model used to estimate the score if score_method == 'model_score'. This variable is set to None by default.

score_method : {'perc', 'threshold', 'model_score', 'r2'} str, optional
    Type of scoring to use when determining how well an algorithm is performing.
    Options:
    - 'perc' : Use percentage and rank-ordering of the values.
    - 'threshold' : Score based on differences of threshold or less from the true value.
    - 'model_score' : Out-of-the-box method from sklearn to determine success.
    - 'r2' : R-squared statistic for error calculation.
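As a rough illustration of the 'threshold' option, one plausible reading of "differences of threshold or less from the true value" is the fraction of predictions whose absolute error is within threshold. The helper below is a hypothetical sketch of that idea, not the cosmo_utils implementation:

```python
import numpy as np

def threshold_score(truth_arr, pred_arr, threshold=0.1):
    """Fraction of predictions within `threshold` of the true values.

    Hypothetical sketch of the 'threshold' scoring idea; the actual
    cosmo_utils implementation may differ.
    """
    truth_arr = np.asarray(truth_arr, dtype=float)
    pred_arr = np.asarray(pred_arr, dtype=float)
    diffs = np.abs(pred_arr - truth_arr)        # absolute errors
    return float(np.mean(diffs <= threshold))   # fraction of "successes"

truth = np.array([1.0, 2.0, 3.0, 4.0])
pred = np.array([1.05, 2.5, 2.95, 4.2])
print(threshold_score(truth, pred, threshold=0.1))  # 0.5: two of four within 0.1
```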
threshold : float, optional
    Threshold value used when score_method == 'threshold'. This variable is set to 0.1 by default.

perc : float, optional
    Percentage value used when score_method == 'perc'. This variable is set to 0.68 by default.
Returns:
method_score : float
    Overall score of how well pred_arr predicts truth_arr.

Notes
For more information on model evaluation and scoring, see http://scikit-learn.org/stable/modules/model_evaluation.html.
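The 'r2' option corresponds to the standard coefficient of determination. A minimal sketch of that statistic (the helper name is ours, not the library's) computes R² = 1 − SS_res / SS_tot directly with numpy:

```python
import numpy as np

def r2_score_manual(truth_arr, pred_arr):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    truth_arr = np.asarray(truth_arr, dtype=float)
    pred_arr = np.asarray(pred_arr, dtype=float)
    ss_res = np.sum((truth_arr - pred_arr) ** 2)          # residual sum of squares
    ss_tot = np.sum((truth_arr - truth_arr.mean()) ** 2)  # total sum of squares
    return float(1.0 - ss_res / ss_tot)

truth = np.array([1.0, 2.0, 3.0, 4.0])
print(r2_score_manual(truth, truth))  # 1.0 for a perfect prediction
```

For a single outcome this matches sklearn.metrics.r2_score, which is what an 'r2'-style scoring option in a scikit-learn workflow would typically use.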