How to Implement Random Forest in R?

Description

To implement a Random Forest classifier on the readingSkills dataset using R.

What is Random Forest?

    • It is a supervised learning algorithm
    • It is an ensemble algorithm
    • It consists of many decision trees and outputs the class that is the mode of the classes output by individual trees
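The "mode of the classes" idea can be illustrated with a minimal base R sketch. The vector tree_votes below is hypothetical data invented for the illustration (it is not output from a real forest); it stands for the class predicted by each of five trees for a single observation.

```r
# Hypothetical class votes from five individual trees for one observation
tree_votes <- c("yes", "no", "yes", "yes", "no")

# Count the votes per class
vote_counts <- table(tree_votes)

# The forest's prediction is the most frequent class (the mode)
forest_prediction <- names(vote_counts)[which.max(vote_counts)]

print(vote_counts)
print(forest_prediction)
```

Here three of the five trees vote "yes", so the majority class is "yes".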

Data set :

    • readingSkills – sample dataset shipped with the party package

R Packages :

    • party – Provides the readingSkills dataset
    • caret – Classification And REgression Training; provides confusionMatrix()
    • randomForest – For Random Forest Algorithm

R Function:

    • str(data frame) – To display the internal structure of a data frame
    • summary(data frame) – Displays the minimum, 1st quartile, median, mean, 3rd quartile and maximum of each numeric or integer column in the data frame, and the count per level for each factor column.
    • sample(x=,size= ) – To draw a random sample from a data set

x – vector of one or more elements from which to choose

size – a non-negative integer giving the number of items to choose.

    • randomForest(formula, data=, ntree=, mtry=) – To implement random forest algorithm

data – data frame

ntree – Number of trees to grow

mtry – Number of variables randomly sampled as candidates at each split

    • predict(object, newdata=) – To predict an outcome value on the basis of one or more predictor variables

object – fitted model

newdata – New data

    • confusionMatrix(data=, reference=) – To compute the confusion matrix and related statistics from the predicted classes (data) and the true classes (reference)
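As a small illustration of how sample() can be used to split rows into training and test sets, here is a sketch in base R. The data frame toy and the 70/30 ratio are assumptions made for the example; any data frame can be split the same way.

```r
# Hypothetical toy data frame with 10 rows
toy <- data.frame(id = 1:10, value = (1:10)^2)

# Fix the random seed so the split is reproducible
set.seed(123)

# Draw 70% of the row numbers at random, without replacement
train_idx <- sample(x = nrow(toy), size = floor(0.7 * nrow(toy)))

# Rows that were drawn go to train, the remaining rows go to test
toy_train <- toy[train_idx, ]
toy_test <- toy[-train_idx, ]

print(nrow(toy_train))
print(nrow(toy_test))
```

When x is a single positive number, sample() draws from 1:x, which is why passing nrow(toy) yields row indices. Negative indexing with -train_idx then selects every row that was not sampled.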

#Import data

#install.packages("party")
library("party")

my_input<-readingSkills
View(my_input)

str(my_input)
summary(my_input)

#Check Missing Values

sum(is.na(my_input))

#Splitting into train and test

table(my_input$nativeSpeaker)
set.seed(123)
training<-sample(nrow(my_input),0.7*nrow(my_input))
train<-my_input[training,]
test<-my_input[-training,]

summary(train)
summary(test)

#Random Forest Model on train

#install.packages("randomForest")
library("randomForest")

#readingSkills has 3 predictors (age, shoeSize, score), so mtry must be between 1 and 3
random_in<-randomForest(nativeSpeaker~.,data=train,ntree=500,mtry=2)
print(random_in)

#Prediction on test

predict_in<-predict(random_in,test)
table(predict_in,test$nativeSpeaker)

mean(predict_in == test$nativeSpeaker)

#Confusion Matrix

#install.packages("caret")
library("caret")

confusionMatrix(predict_in,test$nativeSpeaker)

#Check variable importance

importance(random_in)
varImpPlot(random_in)
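The script reports accuracy with mean() on the comparison of predicted and actual classes and also prints a table of the two. The sketch below, in base R, shows how the two views relate: accuracy is the sum of the diagonal of the table divided by the total count. The vectors predicted and actual are hypothetical values invented for the illustration, not results from the model above.

```r
# Hypothetical predicted and actual classes for six observations
predicted <- factor(c("yes", "no", "yes", "yes", "no", "no"),
                    levels = c("no", "yes"))
actual <- factor(c("yes", "no", "no", "yes", "no", "yes"),
                 levels = c("no", "yes"))

# Cross-tabulate predictions (rows) against true classes (columns)
conf_table <- table(Predicted = predicted, Actual = actual)
print(conf_table)

# Correct predictions sit on the diagonal of the table
accuracy_from_table <- sum(diag(conf_table)) / sum(conf_table)

# Same value computed directly, as done in the script
accuracy_direct <- mean(predicted == actual)

print(accuracy_from_table)
print(accuracy_direct)
```

Both calculations give the same number because each counts the observations where the predicted class equals the actual class.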
