To stem words for text processing in R
text_tokens(text, stemmer) – To stem the words
library("corpus")
Load the necessary libraries
Load the data
For the Snowball stemmer, call the text_tokens function
For the Hunspell stemmer, define a function that looks up the term in the dictionary
If there are no stems, use the original term
If there are multiple stems, use the last one
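library("corpus")
# Sample input (illustrative)
text <- "love loving lovingly loved lover lovely love"
# Snowball stemmer
text_tokens(text, stemmer = "en") # English stemmer
# Hunspell stemmer (requires the hunspell package)
stem_hunspell <- function(term) {
  # look up the term in the dictionary
  stems <- hunspell::hunspell_stem(term)[[1]]
  if (length(stems) == 0) { # if there are no stems, use the original term
    stem <- term
  } else { # if there are multiple stems, use the last one
    stem <- stems[[length(stems)]]
  }
  stem
}
text_tokens(text, stemmer = stem_hunspell)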
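As the Hunspell example suggests, the stemmer argument of text_tokens is not limited to language codes: it also accepts a function that maps a single term to its stem. The sketch below shows the mechanism with a deliberately naive rule that strips a trailing "s". The function name stem_plural_s and the sample sentence are hypothetical, chosen only for illustration, and the rule is not meant as a usable stemmer.

```r
library("corpus")

# Hypothetical toy stemmer: strips one trailing "s" from a term.
# Illustration only; a real stemmer needs far more careful rules.
stem_plural_s <- function(term) {
  sub("s$", "", term)
}

# text_tokens calls the function on each term and keeps the returned stem
tokens <- text_tokens("cats chase dogs", stemmer = stem_plural_s)
print(tokens)
```

Because the function receives one term at a time and returns one string, any lookup table or rule set can be plugged in the same way.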