Sample code for encoding the target variable

How to impute missing values and encode the target variable using sklearn in Python?

Description

Impute the missing values of a feature and encode the target variable using an imputer and LabelEncoder from sklearn in Python.

Input

Employee data set.

Output

Imputed feature variable
Encoded target variable

Process

Import necessary libraries.

Load the data set (sample).

Create the imputer from sklearn and define the method of filling (mean or median).

Fit the imputer on the feature variable and transform it.

Create the label encoder from sklearn.

Fit the label encoder on the target variable and transform it.

Print the results.
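The steps above can be sketched on a small in-memory table before running them on the real file. This is only an illustration: the four rows below are hypothetical stand-ins for the Employee data set, and only the column names salary and dpmnt come from the sample code. It assumes scikit-learn 0.20 or later, where SimpleImputer is available.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in for Employee.csv (values invented for illustration)
df = pd.DataFrame({
    "salary": [30000.0, np.nan, 50000.0, 40000.0],
    "dpmnt": ["HR", "IT", "HR", "Sales"],
})

X = df[["salary"]]   # feature with one missing value
y = df["dpmnt"]      # categorical target

# Fill the missing salary with the mean of the observed salaries
imputer = SimpleImputer(strategy="mean")
imp_X = imputer.fit_transform(X)

# Map each department name to an integer code (classes are sorted alphabetically)
encoder = LabelEncoder()
enc_y = encoder.fit_transform(y)

print(imp_X.ravel())  # the missing entry becomes 40000.0
print(enc_y)          # [0 1 0 2]
```

The missing salary is replaced by 40000.0, the mean of the three observed values, and the departments HR, IT and Sales are encoded as 0, 1 and 2.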

Sample Code

# Import necessary libraries
import pandas as pd

# Load the data (read_csv already returns a DataFrame)
df = pd.read_csv("/home/soft50/soft50/Sathish/practice/Employee.csv")

# Original data
print("Original data\n\n", df)
print("\n")

# Check missing values
print("Missing values count in each column\n")
print(df.isnull().sum())

# Feature to impute and target to encode
X = df[["salary"]]
y = df["dpmnt"]

# Impute missing values with the column mean
# (SimpleImputer requires scikit-learn 0.20 or later)
from sklearn.impute import SimpleImputer
imp = SimpleImputer(strategy="mean")
imp_X = imp.fit_transform(X)
print("\n")
print("After Imputing\n", imp_X)

# Encode the target variable
from sklearn.preprocessing import LabelEncoder
enc = LabelEncoder()
y = enc.fit_transform(y)
print("\n")
print("After Encoding\n\n", y)
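Two follow-up points may be worth checking once the code runs: how the median strategy differs from the mean, and how to read the integer codes back as labels. The sketch below uses hypothetical values (not taken from the Employee data set) with one large salary, so the two fill values differ visibly.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder

# Hypothetical values chosen so that mean and median differ
salary = np.array([[30000.0], [np.nan], [32000.0], [90000.0]])
dept = ["IT", "HR", "IT", "Sales"]

# Same imputer class, two filling strategies
mean_filled = SimpleImputer(strategy="mean").fit_transform(salary)
median_filled = SimpleImputer(strategy="median").fit_transform(salary)
print("Mean fill:  ", mean_filled[1, 0])
print("Median fill:", median_filled[1, 0])

# Encode the labels, then inspect and reverse the mapping
enc = LabelEncoder()
codes = enc.fit_transform(dept)
mapping = {str(label): code for code, label in enumerate(enc.classes_)}
decoded = enc.inverse_transform(codes)
print("Label to code:", mapping)
print("Decoded:", list(decoded))
```

The median is less sensitive to the single large salary than the mean, which is one reason to prefer it for skewed columns. The classes_ attribute and inverse_transform make it possible to report results with the original department names rather than the integer codes.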
