Morphology

•  Morphology is concerned with the construction of words and the meaning of their components.
•  A morpheme is the smallest meaning-bearing unit of a word.
•  Words are composed of stems and affixes.
•  Affixes consist of: prefixes, suffixes, infixes, circumfixes

Examples

Root      Morphological variants
walk      walks, walked, walking
noise     noisy, noisily
icon      iconic
order     reorder, orderly
active    hyperactive, proactive
elect     reelect, reelection
view      preview, previewer, previewers

Inflectional vs. Derivational

•  Inflectional: variants that have essentially the same meaning and the same general part-of-speech.
   Ex: kick, kicks, kicked, kicking
•  Derivational: variants that represent a different concept. The part-of-speech may be the same or different.
   Ex: friend, friendly, friendliness, friendship

Stemming

•  Some systems use stemming to approximately match terms with variants in documents.
•  Most stemmers just chop off common prefixes and suffixes.

Example: assassination would be stemmed to assassinat, which would match:
assassination, assassinations, assassinate, assassinated, assassinating

Limitations of Stemming

Blindly stripping affixes can produce strange results!

Word       Stripped form
preempt    empt
news       new
pretend    tend
hardly     hard
glasses    glass

TRUE STORIES: Mrs, Easter, atomic
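The behavior described on the stemming slides can be reproduced with a few lines of Python. This is a minimal sketch, not any particular published stemmer: the affix lists, the `min_stem` cutoff, and the function name `naive_stem` are assumptions chosen only so that the slide's examples come out as shown.

```python
# Illustrative affix lists (assumed); suffixes are tried in the order given,
# so longer suffixes such as "ions" come before shorter ones such as "s".
PREFIXES = ("pre",)
SUFFIXES = ("ions", "ing", "ion", "ed", "es", "ly", "e", "s")


def naive_stem(word, min_stem=3):
    """Strip at most one prefix and one suffix, with no dictionary check."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_stem:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            word = word[:-len(suffix)]
            break
    return word


# All five variants collapse to the same stem, which is what makes matching work.
for w in ["assassination", "assassinations", "assassinate",
          "assassinated", "assassinating"]:
    print(w, "->", naive_stem(w))

# The same blind stripping damages words that only look affixed.
for w in ["preempt", "news", "pretend", "hardly", "glasses"]:
    print(w, "->", naive_stem(w))
```

Because the function never consults a dictionary, it cannot tell that "pre" in "pretend" is not a prefix, which is the limitation the slide points out.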

Sample Morphology Rules

(- means none)

Derived Word   Affix   POS (Derived)   Properties      Root Word   Repl Chars   POS (Root)
cats           s       noun            plural          cat         -            noun
sings          s       verb            present tense   sing        -            verb
walked         ed      verb            past tense      walk        -            verb
hairy          y       adj             -               hair        -            noun
noisy          y       adj             -               noise       e            noun
unfair         un      adj             -               fair        -            adj
antiwar        anti    adj             -               war         -            noun
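One way to hold rules like the sample ones is as plain records, with a helper that strips the affix and appends the replacement characters. The sketch below is an assumption about representation, not something the slides specify: the `Rule` type, the `kind` field (prefix vs. suffix, inferred from the example words), and `candidate_root` are all hypothetical names.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    kind: str         # "prefix" or "suffix" (inferred from the examples)
    affix: str
    repl: str         # replacement chars appended after a suffix is removed
    root_pos: str
    derived_pos: str
    props: str


# The seven sample rows, in the same order as the table.
RULES = [
    Rule("suffix", "s", "", "noun", "noun", "plural"),
    Rule("suffix", "s", "", "verb", "verb", "present tense"),
    Rule("suffix", "ed", "", "verb", "verb", "past tense"),
    Rule("suffix", "y", "", "noun", "adj", ""),
    Rule("suffix", "y", "e", "noun", "adj", ""),
    Rule("prefix", "un", "", "adj", "adj", ""),
    Rule("prefix", "anti", "", "noun", "adj", ""),
]


def candidate_root(word: str, rule: Rule) -> Optional[str]:
    """Return the candidate root the rule produces, or None if it does not apply."""
    if len(word) <= len(rule.affix):
        return None
    if rule.kind == "suffix" and word.endswith(rule.affix):
        return word[:-len(rule.affix)] + rule.repl
    if rule.kind == "prefix" and word.startswith(rule.affix):
        return word[len(rule.affix):]
    return None


print(candidate_root("noisy", RULES[4]))    # strips "y", appends "e"
print(candidate_root("antiwar", RULES[6]))  # strips "anti"
```

A candidate root is only a guess; it still has to be checked against a dictionary with the part-of-speech the rule expects.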

Errors by the Porter Stemmer

organization      organ
doing             doe
generalization    generic
numerical         numerous
policy            police
university        universe
easy              easily
addition          additive
negligible        negligent
execute           executive
define            definite
past              paste
ignore            ignorant
special           specialized
arm               army
head              heading

Omissions by the Porter Stemmer

european          europe
analysis          analyzes
matrices          matrix
noise             noisy
sparse            sparsity
explain           explanation
resolve           resolution
triangle          triangular
urgency           urgent
cylinder          cylindrical

Morphological Analysis as Search

•  Morphological analysis can be implemented as exhaustive search of the rule set.
•  The rules must be applied recursively!
•  All successful derivations should be returned, so that the NLP system is informed of all possible parts-of-speech for the word. Multiple derivations are common!

Morphological Analysis

Given a word w:
IF w is in the dictionary, return its definition
ELSE apply all of the morphology rules to find all possible root forms of w

For each rule, strip off the affix and add the replacement chars to produce a candidate root. If the candidate root is in the dictionary with the appropriate POS, then success!
•  If success, return the derived word with the POS and properties assigned by the rule.
•  If failure, then recursively apply the rule set to the candidate root to see if it can be derived.

An Example

(1) SUFFIX ed - verb -> adjective
(2) SUFFIX ed - verb -> verb TENSE past
(3) PREFIX re - verb -> verb
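The recursive procedure and the three example rules can be sketched in Python as follows. The one-word dictionary, the test word "repainted", and the function name `analyze` are hypothetical, picked only to exercise all three rules. One design choice goes beyond the slides: when a rule assigns no properties, the sketch keeps the properties of the root, so that re + painted stays past tense.

```python
# Hypothetical dictionary: word -> set of parts-of-speech.
DICTIONARY = {"paint": {"verb"}}

# (kind, affix, replacement chars, root POS, derived POS, properties)
RULES = [
    ("suffix", "ed", "", "verb", "adj", ""),       # (1)
    ("suffix", "ed", "", "verb", "verb", "past"),  # (2)
    ("prefix", "re", "", "verb", "verb", ""),      # (3)
]


def analyze(word):
    """Return the set of all (POS, properties) analyses found for word."""
    if word in DICTIONARY:
        return {(pos, "") for pos in DICTIONARY[word]}
    results = set()
    for kind, affix, repl, root_pos, derived_pos, props in RULES:
        if len(word) <= len(affix):
            continue
        if kind == "suffix" and word.endswith(affix):
            root = word[:-len(affix)] + repl
        elif kind == "prefix" and word.startswith(affix):
            root = word[len(affix):]
        else:
            continue
        # Recurse: the candidate root may itself be a derived word.
        for pos, root_props in analyze(root):
            if pos == root_pos:
                results.add((derived_pos, props or root_props))
    return results


print(sorted(analyze("repainted")))
```

For "repainted" the search finds an adjective reading (rule 1 applied to repaint) and a past-tense verb reading (rule 2 applied to repaint, and also rule 3 applied to painted), which illustrates why all successful derivations are collected rather than stopping at the first. Every rule here has empty replacement chars, so each recursive call works on a shorter string and the search terminates; a rule set with longer replacements would need an explicit depth limit.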

Morphology with FSAs

•  Morphological analysis is often performed with finite-state transducers (FSTs).
•  An FST is a finite-state automaton that maps between two sets of symbols.
•  An FST can be viewed as a generator of, or a recognizer for, pairs of strings.

Simple FSA for Morphology

[Figure: FSA with states q0, q1, q2, q3. Arc labels: reg-verb-stem (walk, listen); irreg-verb-stem (stand, grow, eat); irreg-past-verb-form (stood, grew, ate, eaten); past (-ed); past participle (-ed); present participle (-ing); 3rd singular (-s).]
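A small recognizer shows how such an FSA for verbs could be driven. The figure text gives the states and arc labels but not which states each arc connects, so the transition table and the accepting states below are assumptions (regular stems take all four suffixes, irregular stems take only -s and -ing, irregular past forms stand alone). The names `TRANSITIONS`, `ACCEPTING`, and `accepts` are hypothetical, and spelling changes such as consonant doubling are ignored.

```python
# Word classes taken from the figure's examples.
CLASSES = {
    "reg-verb-stem": {"walk", "listen"},
    "irreg-verb-stem": {"stand", "grow", "eat"},
    "irreg-past-verb-form": {"stood", "grew", "ate", "eaten"},
}

# Assumed topology: (state, arc label) -> next state.
TRANSITIONS = {
    ("q0", "reg-verb-stem"): "q1",
    ("q0", "irreg-verb-stem"): "q2",
    ("q0", "irreg-past-verb-form"): "q3",
    ("q1", "-ed"): "q3",   # past and past participle
    ("q1", "-ing"): "q3",  # present participle
    ("q1", "-s"): "q3",    # 3rd singular
    ("q2", "-ing"): "q3",
    ("q2", "-s"): "q3",
}
ACCEPTING = {"q1", "q2", "q3"}  # assumed: bare stems are valid words


def accepts(word):
    """Try every split of word into stem + optional suffix and run the FSA."""
    for i in range(1, len(word) + 1):
        stem, suffix = word[:i], word[i:]
        for label, members in CLASSES.items():
            if stem not in members:
                continue
            state = TRANSITIONS.get(("q0", label))
            if suffix:
                state = TRANSITIONS.get((state, "-" + suffix))
            if state in ACCEPTING:
                return True
    return False


for w in ["walked", "listening", "eats", "stood", "eated", "stooding"]:
    print(w, accepts(w))
```

Under these assumptions "eated" is rejected because no -ed arc leaves the irregular-stem state, and "stooding" is rejected because no arc leaves the final state.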

Simple FSA for Adjectives

[Figure: FSA with states q0, q1, q2, q3. Arc labels: un-; adj (clear, happy, true); adjAL (equal, formal, natural); -er; -est.]