data_preprocessing(feat_arr, pre_opt='min_max', reshape=False)

Preprocess the input data to clean it and make it more suitable for machine learning algorithms.


Parameters

feat_arr : numpy.ndarray, list, pandas.DataFrame

Array of feature values, used to train an ML algorithm.

pre_opt : {‘min_max’, ‘standard’, ‘normalize’, ‘no’} str, optional

Type of preprocessing to apply to feat_arr.

  • ‘min_max’ : Rescales feat_arr to values between 0 and 1
  • ‘standard’ : Uses the StandardScaler method (zero mean and unit variance per feature)
  • ‘normalize’ : Uses the Normalizer method (each sample rescaled to unit norm)
  • ‘no’ : No preprocessing on feat_arr
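
The effect of each option can be illustrated with a small NumPy-only sketch. It assumes that the option names correspond to scikit-learn's MinMaxScaler, StandardScaler and Normalizer with their default settings, and that feat_arr is two-dimensional with one row per sample. The function name sketch_preprocess is hypothetical and is not part of this package; the actual implementation may differ.

```python
import numpy as np


def sketch_preprocess(feat_arr, pre_opt="min_max"):
    """Hypothetical stand-in that mimics the documented options (2D input assumed)."""
    x = np.asarray(feat_arr, dtype=float)
    if pre_opt == "min_max":
        # Map each column linearly so its minimum becomes 0 and its maximum 1.
        lo, hi = x.min(axis=0), x.max(axis=0)
        span = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero
        return (x - lo) / span
    if pre_opt == "standard":
        # Subtract the column mean and divide by the column standard deviation.
        std = x.std(axis=0)
        std = np.where(std > 0, std, 1.0)
        return (x - x.mean(axis=0)) / std
    if pre_opt == "normalize":
        # Divide each row (sample) by its Euclidean norm.
        norms = np.linalg.norm(x, axis=1, keepdims=True)
        norms = np.where(norms > 0, norms, 1.0)
        return x / norms
    if pre_opt == "no":
        return x
    raise ValueError(f"Unknown pre_opt: {pre_opt!r}")


feat_arr = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaled_min_max = sketch_preprocess(feat_arr, "min_max")
scaled_standard = sketch_preprocess(feat_arr, "standard")
scaled_normalize = sketch_preprocess(feat_arr, "normalize")
```

Note that ‘min_max’ and ‘standard’ act on columns (features), while ‘normalize’ acts on rows (samples), so the choice depends on whether features or samples should be put on a common scale.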

reshape : bool, optional

If True, reshapes feat_arr into a 1D array when its shape is (ncols, 1), where ncols is the number of columns. Default is False.
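
As a rough illustration of the reshape step, the sketch below flattens a single-column 2D array into a 1D array with NumPy. Whether the function itself uses reshape, ravel or another call is not stated here, so treat this as an assumption about the behavior rather than the actual code.

```python
import numpy as np

# A single-feature array stored as one column: shape (3, 1).
column = np.array([[0.0], [0.5], [1.0]])

# Flatten only when the second dimension has length 1.
if column.ndim == 2 and column.shape[1] == 1:
    flat = column.reshape(-1)
else:
    flat = column
```

After this step flat has shape (3,), which is the form many estimators expect for a single target or feature vector.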


Returns

feat_arr_scaled : numpy.ndarray

Rescaled version of feat_arr based on the choice of pre_opt.


For more information on how to pre-process your data, see ``_.