MCA Syllabus

Natural Language Processing:
Code: CS 802F
Contact: 3L
Credits: 3
Allotted Hrs: 39L

Introduction to NLP [2L]:
Definition, issues and strategies, application domain, tools for NLP, Linguistic organisation of NLP, NLP vs PLP.
Word Classes [13L]:
Review of Regular Expressions, CFG and different parsing techniques 1L
Morphology: Inflectional, derivational, parsing and parsing with FST, Combinational Rules 3L

Phonology: Speech sounds, phonetic transcription, phoneme and phonological rules, optimality theory, machine learning of phonological rules, phonological aspects of prosody and speech synthesis. 4L
Pronunciation, Spelling and N-grams: Spelling errors, detection and elimination using probabilistic models, pronunciation variation (lexical, allophonic, dialect), decision tree model, counting words in corpora, simple N-grams, smoothing (Add-One, Witten-Bell, Good-Turing), N-grams for spelling and pronunciation. 5L
Syntax [7L]:
POS Tagging: Tagsets, concept of HMM tagger, rule based and stochastic POST, algorithm for HMM tagging, transformation based tagging 4L
Sentence level construction & unification: Noun phrase, co-ordination, sub-categorization, concept of feature structure and unification. 3L

Semantics [9L]:
Representing Meaning: Unambiguous representation, canonical form, expressiveness, meaning structure of language, basics of FOPC 2L
Semantic Analysis: Syntax driven, attachment & integration, robustness 2L
Lexical Semantics: Lexemes (homonymy, polysemy, synonymy, hyponymy), WordNet, internal structure of words, metaphor and metonymy and their computational approaches 3L
Word Sense Disambiguation: Selectional restriction based, machine learning based and dictionary based approaches. 2L

Pragmatics [8L]:
Discourse: Reference resolution and phenomena, syntactic and semantic constraints on coreference, pronoun resolution algorithm, text coherence, discourse structure 4L
Dialogues: Turns and utterances, grounding, dialogue acts and structures 1L
Natural Language Generation: Introduction to language generation, architecture, discourse planning (text schemata, rhetorical relations). 3L

Text Book:
1. D. Jurafsky & J. H. Martin, “Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition”, Pearson Education.
Reference Books:
1. James Allen, “Natural Language Understanding”, 2nd ed., Benjamin/Cummings, 1995.
2. A. Bharati, Vineet Chaitanya and Rajeev Sangal, “Natural Language Processing: A Paninian Perspective”, Prentice Hall India, Eastern Economy Edition, 1995.
3. Eugene Charniak, “Statistical Language Learning”, MIT Press, 1993.
4. Christopher Manning and Hinrich Schütze, “Foundations of Statistical Natural Language Processing”, MIT Press, 1999.
