
Morphological Analysis

Morphology types, Regular expressions & finite automata

LECTURE 5: MORPHOLOGICAL ANALYSIS – SURVEY OF ENGLISH MORPHOLOGY

Q1. Explain Morphology, morphemes, and word formation in Natural Language Processing.

Morphology deals with the internal structure of words and the way complex words are formed using smaller meaningful units called morphemes. It studies both the structural (syntactic) and semantic aspects of words.

A morpheme is the smallest meaning-bearing unit in a language. Morphemes may represent entities, actions, or grammatical relationships. Morphology enables humans and machines to recognize words and understand their meanings in discourse.

Words are formed from a base form (the stem; the dictionary form of a word is often called its lemma) combined with affixes such as prefixes, suffixes, or infixes. For example, unbelievably consists of the stem believe, the prefix un-, and the suffixes -able and -ly.

Thus, morphology plays a fundamental role in understanding how words are constructed and interpreted in natural language.


Q2. Explain morphological parsing and its importance in NLP.

Morphological parsing is the task of identifying and analyzing the morphemes inside a word. It determines how a word is decomposed into its constituent morphemes.

For example, hands (hand + -s), foxes (fox + -es), and children (child + -ren, an irregular plural) each combine a stem with a morpheme that encodes grammatical number. Morphological parsing identifies these internal components.

Morphological parsing is important for several NLP applications such as machine translation, information retrieval, syntactic parsing, and text simplification. By breaking words into meaningful units, it improves the accuracy of higher-level language processing tasks.
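
As a rough illustration of the parsing task described above, the following Python sketch maps a surface noun to a stem plus a feature tag. The tiny lexicon, the irregular-form table, and the tag names (+N, +SG, +PL) are assumptions made for this example only; a real parser would use a full lexicon and spelling rules.

```python
# Toy morphological parser for nouns: surface word -> (stem, features).
# STEMS, IRREGULAR and the tag names are illustrative assumptions.
STEMS = {"hand", "fox", "child", "cat", "mouse"}
IRREGULAR = {"children": "child", "mice": "mouse"}


def parse(word):
    """Return (stem, features) for a known noun, or None if unanalyzable."""
    if word in IRREGULAR:
        return IRREGULAR[word], "+N +PL"
    if word in STEMS:
        return word, "+N +SG"
    # Try the longer suffix first so that "foxes" yields "fox", not "foxe".
    for suffix in ("es", "s"):
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in STEMS:
            return stem, "+N +PL"
    return None


for w in ("hands", "foxes", "children"):
    print(w, "->", parse(w))
```

The irregular table is checked first because forms such as children cannot be recovered by stripping a regular suffix.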


Q3. Explain words and morphemes with examples.

In formal languages, words are treated as arbitrary strings. In contrast, in natural languages, words are composed of meaningful subunits called morphemes.

Morphemes are abstract units that denote entities or relationships. They are classified into:

  • Stems, which carry the core meaning of a word
  • Affixes, which add grammatical or semantic information such as tense, number, or polarity

Examples:

  • cats = cat (stem) + -s (suffix)
  • undo = un- (prefix) + do (stem)

This distinction highlights the importance of morphemes in natural language understanding.


Q4. Why is morphological analysis needed in NLP?

Morphological analysis is required to understand how words are composed from smaller meaning-bearing units and how grammatical information is encoded within them.

It is used in applications such as spelling correction, hyphenation algorithms, part-of-speech analysis, text-to-speech systems, and grapheme-to-phoneme conversion. It also helps resolve pronunciation ambiguities: knowing that hothouse splits into hot + house tells a text-to-speech system that the letters t and h belong to different morphemes and should not be read as the single sound "th".

Thus, morphological analysis is essential for accurate word-level processing in NLP systems.


Q5. Explain word structure, morphology–syntax interaction, and word creation systems.

Words are orthographic tokens usually separated by whitespace. However, in languages such as Chinese and Japanese, word boundaries are not marked in writing, while in languages like Turkish, a single word built from many morphemes may express what English needs an entire sentence to say.

Morphology studies word structure, whereas syntax studies sentence structure, and the two interact closely. Languages with rich morphology tend to have freer word order and less rigid syntax. For example, Hindi allows multiple word orders due to extensive morphological marking.

Word creation systems include concatenative morphology, which attaches affixes to stems and is common in English, and non-concatenative morphology, such as infixation in Tagalog and templatic morphology in Arabic and Hebrew.
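
The two non-concatenative processes mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not a linguistic model: the function names are hypothetical, the Arabic forms use a rough Latin transliteration of the root k-t-b, and the Tagalog rule assumes a stem that begins with a single consonant.

```python
def apply_template(root, template):
    """Templatic morphology: fill each 'C' slot with the next root consonant."""
    consonants = iter(root)
    return "".join(next(consonants) if ch == "C" else ch for ch in template)


def infix_um(stem):
    """Infixation: insert -um- after the first consonant of the stem.

    Assumes the stem starts with exactly one consonant (e.g. "sulat").
    """
    return stem[0] + "um" + stem[1:]


# Root k-t-b combined with different vowel templates (transliterated).
for template in ("CaCaC", "CiCaaC", "CaaCiC"):
    print(template, "->", apply_template("ktb", template))

print("sulat ->", infix_um("sulat"))
```

In both cases the affix material is interleaved with the stem rather than attached at an edge, which is what distinguishes these processes from the concatenative pattern common in English.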


LECTURE 6: INFLECTIONAL AND DERIVATIONAL MORPHOLOGY

Q6. Explain free and bound morphemes and types of morphology.

Morphemes are classified as free or bound. Free morphemes can stand alone as independent words, such as girl, boy, and weak. Bound morphemes cannot stand alone and must attach to other morphemes, such as -er, -s, and -ling.

Based on how morphemes are used, morphology is classified into inflection, derivation, compounding, and cliticization. These processes explain how words change form and function in a language.


Q7. Differentiate between inflectional and derivational morphology with examples.

Inflectional morphology modifies words to express grammatical features such as tense, number, person, aspect, and case without changing the word category. Example: perform → performs → performed

Inflection may be regular or irregular. Regular plurals include cats and tables, while irregular plurals include mouse → mice and child → children. Suppletion occurs when there is no phonological similarity between forms, such as go → went.
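
The split between regular and irregular plurals can be sketched as a lookup table consulted before a default rule. The table contents and the simplified -es spelling rule below are assumptions for illustration; English has further cases (for example nouns ending in -y) that this sketch ignores.

```python
# Irregular plurals must be listed explicitly; regular ones follow a rule.
IRREGULAR_PLURALS = {"mouse": "mice", "child": "children"}


def pluralize(noun):
    """Return the plural of a noun using a lookup table plus a default rule."""
    if noun in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[noun]
    # Simplified spelling rule: sibilant-final nouns take -es.
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    return noun + "s"


for n in ("cat", "table", "fox", "mouse", "child"):
    print(n, "->", pluralize(n))
```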

Derivational morphology creates new words by adding prefixes or suffixes and often changes word class or meaning. Examples include delight → delightful and like → unlikeableness.

Thus, inflection adapts words grammatically, while derivation creates new lexical items.

Feature     | Inflectional Morphology      | Derivational Morphology
Word class  | Unchanged (Noun → Noun)      | Often changes (Verb → Noun)
Meaning     | Grammatical (tense, number)  | New, distinct word meaning
Examples    | cats, walked                 | teacher, unhappy

LECTURE 7: REGULAR EXPRESSIONS

Q8. Explain Regular Expressions and their operators.

A Regular Expression (RE) is a pattern, written in a formal notation, that specifies a set of strings. Regular expressions are widely used in search, text processing, and pattern matching.

Basic constructs include character classes ([abc]), ranges ([a-z]), negation ([^a]), optional characters (?), and the wildcard (.). The repetition operators * (zero or more), + (one or more), and ? (zero or one) specify how many times the preceding pattern may occur.

Shorthand character classes such as \d, \s, and \w match digits, whitespace, and alphanumeric (word) characters respectively. Tools such as UNIX grep use regular expressions extensively for efficient text processing.
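
The constructs listed above can be tried out with Python's standard re module. The sample sentence and variable names are made up for this illustration.

```python
import re

text = "The colour of 3 cats and 12 dogs; color matters."

# ? makes the preceding character optional: matches "color" and "colour".
optional = re.findall(r"colou?r", text)

# A range inside a character class, repeated one or more times with +.
numbers = re.findall(r"[0-9]+", text)

# Shorthand classes: digits, one whitespace character, then word characters.
counted = re.findall(r"\d+\s\w+", text)

# Negated class: any character that is not a letter, digit, or whitespace.
punctuation = re.findall(r"[^a-zA-Z0-9\s]", text)

# Wildcard: "c", any single character, then "t".
wildcard = re.findall(r"c.t", text)

# * allows zero or more repetitions of the preceding character.
star = re.fullmatch(r"ba*", "baaa") is not None

print(optional, numbers, counted, punctuation, wildcard, star)
```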


LECTURE 8: FINITE STATE AUTOMATA AND TRANSDUCERS

Q9. Explain Finite State Automata (FSA) and Finite State Transducers (FST).

Regular expressions define regular languages, which are recognized by Finite State Automata (FSA). An FSA processes input strings and either accepts or rejects them based on state transitions.

Formally, an FSA is a 5-tuple (Q, Σ, q0, F, δ): a finite set of states Q, an input alphabet Σ, a start state q0, a set of final states F, and a transition function δ that maps a state and an input symbol to the next state.
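
The five components can be written down directly as Python data. The sketch below is a deterministic recognizer for a hypothetical language c a+ t ("cat", "caat", ...); the state names and the language itself are chosen only for illustration.

```python
def accepts(transitions, start, finals, string):
    """Run a deterministic FSA; accept if it ends in a final state."""
    state = start
    for symbol in string:
        # A missing transition means the automaton rejects the input.
        if (state, symbol) not in transitions:
            return False
        state = transitions[(state, symbol)]
    return state in finals


# Transition function as a dict: (state, symbol) -> next state.
TRANSITIONS = {
    ("q0", "c"): "q1",
    ("q1", "a"): "q2",
    ("q2", "a"): "q2",  # loop: one or more "a"
    ("q2", "t"): "q3",
}
START, FINALS = "q0", {"q3"}

for s in ("cat", "caaat", "ct", "cats"):
    print(s, accepts(TRANSITIONS, START, FINALS, s))
```

The set of states and the alphabet are implicit in the keys of TRANSITIONS; they could be listed explicitly if validation were needed.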

A Finite State Transducer (FST) extends an FSA by mapping input strings to output strings. FSTs represent relations rather than simple acceptance and are widely used in morphological parsing, word generation, and lexical analysis.

FSA (recognizer): accepts "cat"

FST (transducer): maps "cat" ↔ "c-a-t"
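
To make the contrast with a recognizer concrete, here is a minimal deterministic transducer in Python that emits an output symbol on every transition. It maps a lexical form such as cat +N +PL to the surface form cats. The state names, the tag symbols, and the single-word lexicon are assumptions for this sketch.

```python
def transduce(transitions, start, finals, symbols):
    """Run a deterministic FST; return the output string or None if rejected."""
    state, output = start, []
    for symbol in symbols:
        if (state, symbol) not in transitions:
            return None
        state, emitted = transitions[(state, symbol)]
        output.append(emitted)
    return "".join(output) if state in finals else None


# (state, input symbol) -> (next state, output string); "" is an empty output.
FST = {
    ("q0", "c"): ("q1", "c"),
    ("q1", "a"): ("q2", "a"),
    ("q2", "t"): ("q3", "t"),
    ("q3", "+N"): ("q4", ""),
    ("q4", "+SG"): ("q5", ""),
    ("q4", "+PL"): ("q5", "s"),
}

print(transduce(FST, "q0", {"q5"}, ["c", "a", "t", "+N", "+PL"]))
print(transduce(FST, "q0", {"q5"}, ["c", "a", "t", "+N", "+SG"]))
```

Unlike the accept/reject answer of an FSA, the result here is a string, which is why an FST is said to represent a relation between two languages.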


Q10. Explain FST operations and applications in NLP.

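Two important operations on FSTs are inversion and composition. Inversion swaps the input and output labels of a transducer, turning a generator into a parser and vice versa. Composition chains two transducers so that the output of the first becomes the input of the second, yielding a single transducer for the combined mapping.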
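
Both operations are easiest to see when a transducer is viewed as the relation it denotes, that is, a set of (input, output) pairs. The Python sketch below works on such finite relations rather than on state machines, so it illustrates the semantics of inversion and composition, not how FST toolkits implement them. The intermediate notation cat^s and the relation names are assumptions.

```python
def invert(relation):
    """Swap input and output of every pair in the relation."""
    return {(out, inp) for inp, out in relation}


def compose(first, second):
    """Pair a with c whenever first maps a to b and second maps b to c."""
    return {(a, c) for a, b in first for b2, c in second if b == b2}


# Lexical level -> intermediate level (stem plus morpheme boundary).
GENERATE = {("cat+N+PL", "cat^s"), ("fox+N+PL", "fox^s")}
# Intermediate level -> surface spelling (e-insertion after "x").
SPELL = {("cat^s", "cats"), ("fox^s", "foxes")}

lexical_to_surface = compose(GENERATE, SPELL)
surface_to_lexical = invert(lexical_to_surface)

print(sorted(lexical_to_surface))
print(sorted(surface_to_lexical))
```

Composing first and inverting afterwards gives a surface-to-lexical mapping, which is the direction needed for morphological parsing.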

FSTs are used for morphological parsing of languages such as English and Spanish. Other applications include spelling correction, speech recognition, machine translation, and probabilistic modeling using weighted FSTs.

Thus, FSTs form the computational foundation of word-level analysis in NLP.

