Data Normalization In Python

Most data files that we download online have problems such as:

  1. Missing values in some of their columns (features/attributes).

  2. NaN entries in some cells.

  3. Categorical data, such as city names, stored as strings, which most models cannot use directly for prediction.

All of the above points cause errors that we need to resolve before training a model: missing values are filled in or dropped, and categorical columns are encoded as numbers. Normalization, or feature scaling, is then applied to the numeric data. It matters when features span very different ranges, for example one column in single digits and another in the tens of thousands. Many machine learning models, especially distance-based and gradient-based ones, predict better when all features are on a comparable scale. So we scale the data into a small, common range, such as centered around zero or between 0 and 1.
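As a minimal sketch of the cleaning step, assuming pandas is available, the snippet below fills a missing value with the column mean and one-hot encodes a string column. The column names `age` and `city` and all the values are made up for illustration; mean imputation and one-hot encoding are just one possible choice for each problem.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: a numeric column with one gap and a categorical column.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Delhi", "Mumbai", "Delhi", "Agra"],
})

# Fill the missing numeric value with the mean of the known values.
df["age"] = df["age"].fillna(df["age"].mean())

# One-hot encode the string column so that every feature is numeric.
encoded = pd.get_dummies(df, columns=["city"])

print(encoded)
```

After this step the table has no NaN entries and no string columns, so it can be passed on to a scaler.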


The sklearn (scikit-learn) library makes this task easier.

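  • StandardScaler() standardizes each feature to mean = 0 and std = 1 by subtracting the column mean and dividing by the column standard deviation.

  • MinMaxScaler() rescales each feature linearly into the range 0 to 1 using (x - min) / (max - min), where min and max are taken per column.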
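The two scalers can be compared with a short sketch like the one below. The feature matrix `X` is invented for illustration (two columns on very different scales); only `StandardScaler`, `MinMaxScaler`, and `fit_transform` come from scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical feature matrix: column 0 is small, column 1 is large.
X = np.array([
    [1.0, 20000.0],
    [2.0, 50000.0],
    [3.0, 80000.0],
])

# Standardize: each column ends up with mean 0 and standard deviation 1.
X_std = StandardScaler().fit_transform(X)

# Min-max scale: each column is mapped linearly onto the range [0, 1].
X_minmax = MinMaxScaler().fit_transform(X)

print("standardized mean:", X_std.mean(axis=0))
print("standardized std: ", X_std.std(axis=0))
print("min-max scaled:\n", X_minmax)
```

In a real project the scaler would normally be fitted on the training data only and then reused with `transform` on the test data, so that both are scaled with the same statistics.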



  • CREATED BY ANMOL VARSHNEY & PALAK GUPTA