#5, First Floor, 4th Street , Dr. Subbarayan Nagar Kodambakkam, Chennai-600 024 pro@slogix.in

How to manipulate data using dplyr package in R?
Description

To manipulate data with the dplyr package in R: filtering rows, adding columns, summarising, grouping, sampling, counting, and sorting.

Process
Package and Functions:
  • R Package : dplyr
  • R Function : filter(data, condition) -- returns all the rows that satisfy the condition
  • R Function : mutate(data, name = value) -- adds new variables (columns) to the data
  • R Function : summarise(data, name = expression) -- summarises multiple values into a single value
  • R Function : group_by(data, variable) -- groups the data by one or more variables
  • R Function : sample_n(data, size = ) -- selects a given number of random rows from the data
  • R Function : sample_frac(data, size = ) -- selects a random fraction of the rows
  • R Function : count(data, variable) -- counts the number of observations for each distinct value of the variable
  • R Function : arrange(data, variable) -- sorts the rows by one or more variables
  • Pipe Operator : %>% -- chains function calls together, passing each result to the next call
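
As a minimal sketch of what chaining with the pipe means: `x %>% f(y)` is evaluated as `f(x, y)`, so each step receives the previous result as its first argument. The data frame `toy` and its columns are invented here for illustration and are not part of the original data set.

```r
library(dplyr)

# Hypothetical toy data, used only for illustration
toy <- data.frame(id = 1:4, salary = c(500, 700, 650, 900))

# Nested call: read from the inside out
nested <- arrange(filter(toy, salary > 600), desc(salary))

# Piped call: read from top to bottom
piped <- toy %>%
  filter(salary > 600) %>%
  arrange(desc(salary))

# Both forms perform the same steps, so the results match
identical(nested, piped)
```

The piped form tends to be easier to follow once more than two steps are involved, because the order of the code matches the order in which the operations run.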
Sample Code

#Data Manipulation using dplyr package

#Loading required Packages

#install.packages("dplyr")
library("dplyr")

#Input data set
input<-read.csv("my_input.csv")
View(input)

#Function : filter
filter(input, salary > 600)
filter(input, salary > 600 & dept=="IT")

#Function : mutate
#The new column needs one value per row (8 rows assumed here)
mutate(input, place=c("London","Orissa","TamilNadu","Assam","Russia","China","Netherland","Pakistan"))

#Function : summarise
#Helper returning the most frequent value (note: this masks base R's mode())
mode<-function(f){
uniq<-unique(f)
uniq[which.max(tabulate(match(f,uniq)))]
}
summarise(input,mean(salary),mode(dept))

#Function : group_by
summarise(group_by(input,dept),mean(salary))

#Function : sample_n
sample_n(input,size = 3)

#Function : sample_frac
sample_frac(input,size = 0.3)

#Function : count
count(input,dept)

#Function : arrange
arrange(input,desc(salary))

#Operator : pipe
#Without pipe operator
filt<-filter(input,dept!="IT")
grp<-group_by(filt,dept)
summ<-summarise(grp,mean(salary))

#With pipe operator
input %>%
filter(dept!="IT") %>%
group_by(dept) %>%
summarise(mean(salary))
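
Because `my_input.csv` is not included with this page, the following is a self-contained sketch that builds a small stand-in data frame so the calls above can be tried directly. The column names `salary` and `dept` come from the sample code; the `name` column, the eight rows, and all values are made-up assumptions.

```r
library(dplyr)

# Hypothetical stand-in for my_input.csv (values invented for illustration)
input <- data.frame(
  name   = c("A", "B", "C", "D", "E", "F", "G", "H"),
  dept   = c("IT", "HR", "IT", "Finance", "HR", "IT", "Finance", "HR"),
  salary = c(550, 620, 700, 480, 610, 900, 650, 590)
)

# Rows with salary above 600 in the IT department
it_high <- filter(input, salary > 600 & dept == "IT")

# Mean salary per department, excluding IT, highest first
by_dept <- input %>%
  filter(dept != "IT") %>%
  group_by(dept) %>%
  summarise(mean_salary = mean(salary)) %>%
  arrange(desc(mean_salary))

# Number of rows per department
dept_counts <- count(input, dept)

print(it_high)
print(by_dept)
print(dept_counts)
```

Naming the summary column (`mean_salary = mean(salary)`) is optional, but it gives the result a column name that later steps such as `arrange()` can refer to.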
