Experiment 4
Implement N-Gram (Bigram) model.
Objective: To understand how n-grams, and bigrams (n = 2) in particular, are generated from a tokenized sentence using NLTK.
Prerequisites
Install NLTK
Open your terminal or command prompt and run:
pip install nltk
Perform
- Open your text editor or IDE (IDLE, VS Code, etc.).
- Create a new file named exp2.py.
- Paste the code below.
- Run the script.
Code
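from nltk.util import ngrams  # yields each run of n consecutive items

# Read a sentence and split it into tokens on whitespace
sentence = input("Enter the sentence: ")
tokens = sentence.split()

# Read the n-gram size (enter 2 for bigrams)
n_grams = int(input("Enter 'n': "))

# Build the n-grams as a list of tuples and print them
output = list(ngrams(tokens, n_grams))
print(output)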
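Going further
The script above only lists the n-grams. As a rough sketch of how those bigrams could be turned into a simple bigram model, the example below counts them and estimates P(next word | current word) by relative frequency. It uses only the Python standard library, so it also shows roughly what the n-gram step does internally. The helper names make_ngrams and bigram_probabilities and the sample sentence are illustrative assumptions, not part of NLTK or of the original experiment.

```python
from collections import Counter


def make_ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple (hypothetical helper)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bigram_probabilities(tokens):
    """Estimate P(second | first) as count(first, second) / count(first starts a bigram)."""
    bigram_counts = Counter(make_ngrams(tokens, 2))
    # How often each word appears as the first word of a bigram
    first_word_counts = Counter(first for first, _ in bigram_counts.elements())
    return {
        (first, second): count / first_word_counts[first]
        for (first, second), count in bigram_counts.items()
    }


# Sample sentence chosen for illustration only
tokens = "the cat sat on the mat".split()

bigrams = make_ngrams(tokens, 2)
print(bigrams)

probs = bigram_probabilities(tokens)
# "the" starts two bigrams ("the cat", "the mat"), so each gets probability 0.5
print(probs[("the", "cat")])
```

Here no smoothing is applied, so any bigram that never occurs in the input simply has no entry; a fuller model would need to handle unseen word pairs.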