How to implement Random Forest in R (Machine Learning)

Process

What is Random Forest?

It is a supervised learning algorithm
It is an ensemble algorithm
It consists of many decision trees and, for classification, outputs the class that is the mode (the most frequent value) of the classes predicted by the individual trees
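The voting step described above can be sketched in base R. This is only an illustration: the tree votes are made-up values and majority_vote is a hypothetical helper, not part of the randomForest package.

```r
# Hypothetical class votes from five trees (columns) for three observations (rows)
tree_votes <- matrix(
  c("yes", "no",  "yes", "yes", "no",
    "no",  "no",  "yes", "no",  "no",
    "yes", "yes", "yes", "no",  "yes"),
  nrow = 3, byrow = TRUE
)

# Hypothetical helper: return the most frequent class among one row of votes
majority_vote <- function(votes) {
  counts <- table(votes)
  names(counts)[which.max(counts)]
}

# The forest's prediction for each observation is the mode of its tree votes
forest_prediction <- apply(tree_votes, 1, majority_vote)
print(forest_prediction)
```

The first and third observations get "yes" and the second gets "no", because those are the classes most trees voted for.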

Data set :

readingSkills – sample dataset that ships with the R package party

R Packages :

party – Provides the readingSkills dataset
caret – Classification And REgression Training; provides confusionMatrix()
randomForest – For Random Forest Algorithm

R Functions:

str(data frame) – To display the internal structure of a data frame
summary(data frame) – Displays the minimum, 1st quartile, median, mean, 3rd quartile and maximum of each numeric or integer column in the data frame, and the count of each level for factor columns.
sample(x=,size= ) – To take sample data set from the whole data set

x – vector of one or more elements from which to choose

size – a non-negative integer giving the number of items to choose.
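As a small sketch of how sample() is typically used for a train/test split, the following uses the built-in iris data frame (an assumption for illustration; the sample code on this page uses readingSkills instead). The 70/30 ratio and the seed value are arbitrary choices.

```r
# Fix the random seed so the same rows are drawn on every run
set.seed(123)

n <- nrow(iris)                                   # 150 rows in the built-in iris data
train_idx <- sample(x = n, size = round(0.7 * n)) # row numbers drawn without replacement

train <- iris[train_idx, ]    # rows selected for training
test  <- iris[-train_idx, ]   # all remaining rows

nrow(train)
nrow(test)
```

Passing a single number as x makes sample() draw from 1 to that number, which is why the result can be used directly as row indices.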

randomForest(formula, data=, ntree=, mtry=) – To implement random forest algorithm

data – data frame

ntree – Number of trees to grow

mtry – Number of variables randomly sampled as candidates at each split
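To make the role of mtry more concrete, here is a rough base R sketch of what happens at a single split. It does not call randomForest itself; the predictor names are those of readingSkills, and the square-root rule shown is the package's default for classification when mtry is not supplied.

```r
# Predictors available in readingSkills (nativeSpeaker is the response)
predictors <- c("age", "shoeSize", "score")
p <- length(predictors)

# Default mtry for classification in randomForest: floor(sqrt(p))
default_mtry <- floor(sqrt(p))

# At each split, only a random subset of mtry predictors is considered
set.seed(1)
candidates <- sample(predictors, size = default_mtry)

default_mtry
candidates
```

Since mtry is a count of predictors, it cannot usefully exceed p; with three predictors the valid values are 1, 2 and 3.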

predict(object, newdata=) – To predict an outcome value on the basis of one or more predictor variables

object – fitted model

newdata – New data

confusionMatrix(data=, reference=) – To compute the confusion matrix (from the caret package)

data – factor of predicted classes

reference – factor of actual classes
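What a confusion matrix contains can be sketched with base R alone. The predicted and actual labels below are made-up values for illustration; caret's confusionMatrix() reports the same cross-tabulation together with additional statistics.

```r
# Hypothetical predicted and actual classes for six observations
predicted <- factor(c("yes", "yes", "no", "no", "yes", "no"), levels = c("no", "yes"))
actual    <- factor(c("yes", "no",  "no", "no", "yes", "yes"), levels = c("no", "yes"))

# Cross-tabulate predictions (rows) against actual classes (columns)
cm <- table(Predicted = predicted, Actual = actual)
print(cm)

# Accuracy = correctly classified observations / all observations
accuracy <- sum(diag(cm)) / sum(cm)
accuracy
```

The diagonal cells hold the correct predictions (4 of 6 here), and the off-diagonal cells hold the two kinds of misclassification.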

Sample Code

#Import data (readingSkills ships with the party package)

#install.packages("party")
library("party")

my_input<-readingSkills
View(my_input)

str(my_input)
summary(my_input)

#Check Missing Values

sum(is.na(my_input))

#Splitting into train and test

table(my_input$nativeSpeaker)
set.seed(123)
training<-sample(nrow(my_input),0.7*nrow(my_input))
train<-my_input[training,]
test<-my_input[-training,]

summary(train)
summary(test)

#Random Forest Model on train

#install.packages("randomForest")
library("randomForest")

#mtry must not exceed the number of predictors (3 here: age, shoeSize, score)
random_in<-randomForest(nativeSpeaker~.,data=train,ntree=500,mtry=2)
print(random_in)

#Prediction on test

predict_in<-predict(random_in,test)
table(predict_in,test$nativeSpeaker)

mean(predict_in == test$nativeSpeaker)

#Confusion Matrix (confusionMatrix comes from the caret package)

#install.packages("caret")
library("caret")

confusionMatrix(predict_in,test$nativeSpeaker)

#To check variable importance

importance(random_in)
varImpPlot(random_in)
