How to impute missing values and encode the target variable using sklearn in Python?


This example imputes the missing values of a feature variable with the imputer from sklearn and encodes the target variable with the label encoder from sklearn, in Python.


Input: Employee data set.

Output: imputed feature variable and encoded target variable.


   Import the necessary libraries.

   Load the data set (sample).

   Create the imputer from sklearn and define the method of filling (mean or median).

   Fit the imputer on the feature variable and transform it.

   Create the label encoder from sklearn.

   Fit the label encoder on the target variable and transform it.

   Print the results.

Sample Code

#import necessary libraries
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder

#load the data (read_csv already returns a data frame)
df = pd.read_csv("/home/soft50/soft50/Sathish/practice/Employee.csv")

#Original data
print("Original data\n\n", df)

#Check missing values
print("Missing values count in each column\n")
print(df.isnull().sum())

#feature variable (X) and target variable (y)
X = df[['salary']]
y = df['dpmnt']

#impute missing values with the column mean
#(Imputer was removed in scikit-learn 0.22; SimpleImputer replaces it and takes no axis argument)
imp = SimpleImputer(strategy='mean')
imp_X = imp.fit_transform(X)
print("After Imputing\n", imp_X)

#Encode the target variable
enc = LabelEncoder()
y = enc.fit_transform(y)
print("After Encoding\n\n", y)
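
The sample code depends on a local Employee.csv file, so it cannot be run as-is elsewhere. The following is a minimal, self-contained sketch of the same steps on a small in-memory data frame. The salary and dpmnt values are made up for illustration and are not taken from the original data set. It assumes a recent scikit-learn in which SimpleImputer is available from sklearn.impute, and it also shows the median option and how to map the encoded labels back to the original names.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for Employee.csv; all values are invented for illustration
df = pd.DataFrame({
    "salary": [30000.0, np.nan, 50000.0, 100000.0],
    "dpmnt": ["sales", "hr", "it", "hr"],
})

X = df[["salary"]]   # feature variable (2-D, as the imputer expects)
y = df["dpmnt"]      # target variable

# Count the missing values in each column before imputing
missing_counts = df.isnull().sum()
print("Missing values count in each column\n", missing_counts)

# Mean strategy: the NaN is replaced by the mean of the observed salaries
mean_X = SimpleImputer(strategy="mean").fit_transform(X)
print("After imputing (mean)\n", mean_X)

# Median strategy: the NaN is replaced by the median of the observed salaries
median_X = SimpleImputer(strategy="median").fit_transform(X)
print("After imputing (median)\n", median_X)

# Encode the target; classes are sorted, then numbered from 0
enc = LabelEncoder()
y_encoded = enc.fit_transform(y)
print("After encoding\n", y_encoded)
print("Classes\n", enc.classes_)

# Recover the original labels from the integer codes
labels = enc.inverse_transform(y_encoded)
print("Decoded labels\n", labels)
```

With these made-up values the mean fills the gap with 60000 and the median with 50000, which shows why the choice of strategy matters when a column contains outliers. The median is usually the more robust of the two for skewed salary data.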
