#5, First Floor, 4th Street , Dr. Subbarayan Nagar Kodambakkam, Chennai-600 024 pro@slogix.in

How to tokenize words and sentences using NLTK in Python
Description

To write Python code that tokenizes text data into words and sentences using NLTK.

Input

Sample text

Output

Tokenized words
Tokenized sentences

Process

  Import the nltk library.

  Import word_tokenize() and sent_tokenize() from nltk.tokenize.

  Take sample text data.

  Pass the sample text to word_tokenize() and sent_tokenize() as a string.

  Tokenize the text data into words and sentences.

Sample Code

# Requires the NLTK Punkt tokenizer data; if it is missing, run
# nltk.download("punkt") once (newer NLTK releases use "punkt_tab").
from nltk.tokenize import sent_tokenize, word_tokenize

# Sample text
sample_text = "Python is a scripting language. Also used as a general purpose language"
print("Original text")
print(sample_text, "\n")

# Split the text into word and punctuation tokens
word_token = word_tokenize(sample_text)
print("After word tokenizing\n", word_token, "\n")

# Split the text into sentences
sent_token = sent_tokenize(sample_text)
print("After sentence tokenizing\n", sent_token)
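To get a feel for what the two tokenizer calls in the sample code are doing, the sketch below approximates them with plain regular expressions from the standard library. This is only an illustration: simple_word_tokenize() and simple_sent_tokenize() are hypothetical helper names, not part of NLTK, and the regex rules are a deliberate simplification rather than the algorithm NLTK actually uses.

```python
import re


def simple_sent_tokenize(text):
    """Naive sentence split: break after '.', '!' or '?' followed by whitespace.

    Hypothetical helper for illustration; not NLTK's sentence tokenizer.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def simple_word_tokenize(text):
    """Naive word split: runs of word characters, or single punctuation marks.

    Hypothetical helper for illustration; not NLTK's word tokenizer.
    """
    return re.findall(r"\w+|[^\w\s]", text)


sample_text = "Python is a scripting language. Also used as a general purpose language"

print("After word tokenizing\n", simple_word_tokenize(sample_text), "\n")
print("After sentence tokenizing\n", simple_sent_tokenize(sample_text))
```

The naive rule breaks down on abbreviations: simple_sent_tokenize("Dr. Smith arrived.") splits after "Dr." and returns two pieces. Handling cases like this is one reason to prefer sent_tokenize(), which relies on a pre-trained Punkt model instead of a fixed pattern.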
