#5, First Floor, 4th Street , Dr. Subbarayan Nagar Kodambakkam, Chennai-600 024 pro@slogix.in

How to implement multiple linear regression using the statsmodels library in Python?
Description

To build a multiple linear regression model in Python using the statsmodels library.

Process

   Import the necessary libraries.

   Import the OLS model from statsmodels.

   Plot scatter diagrams to check linearity.

   Assign the independent variables (X).

   Assign the dependent variable (Y).

   Build the regression model.

   Fit X and Y.

Sample Code

#import libraries

import statsmodels.api as sm

import pandas as pd

import matplotlib.pyplot as plt

#read the data set

data=pd.read_csv('/home/soft27/soft27/Sathish/Pythonfiles/Employee.csv')

#creating data frame (optional, since read_csv already returns a DataFrame)

df=pd.DataFrame(data)

print(df)

#plotting the scatter diagram for independent variable 1

plt.scatter(df['rating'], df['salary'], color='red')

plt.title('rating vs salary', fontsize=14)

plt.xlabel('rating', fontsize=14)

plt.ylabel('salary', fontsize=14)

plt.grid(True)

plt.show()

#plotting the scatter diagram for independent variable 2

plt.scatter(df['bonus'], df['salary'], color='green')

plt.title('bonus vs salary', fontsize=14)

plt.xlabel('bonus', fontsize=14)

plt.ylabel('salary', fontsize=14)

plt.grid(True)

plt.show()

#assigning the independent variables

X = df[['rating','bonus']]

#assigning the dependent variable

Y = df['salary']

#add a constant (intercept) column to the independent variables

X = sm.add_constant(X)

#fit the variables into the linear model

model = sm.OLS(Y, X).fit()

#print the model summary, which includes the intercept and regression coefficients

print_model = model.summary()

print(print_model)
