| In computational linguistics, a '''morphological dictionary''' is a linguistic resource that contains correspondences between the surface forms and lexical forms of words. Surface forms are the forms of words as they appear in text. The lexical form corresponding to a surface form is the [[Lemma (morphology)|lemma]] followed by grammatical information (for example the [[part of speech]], [[Grammatical gender|gender]] and [[Grammatical number|number]]). In English, ''give'', ''gives'', ''giving'', ''gave'' and ''given'' are surface forms of the verb ''give''. In each case the lexical form is the lemma "give" tagged as a verb. There are two kinds of morphological dictionaries: aligned and non-aligned.
| | |
| ==Aligned morphological dictionaries==
| |
| | |
| In an aligned morphological dictionary, the correspondence between the surface form and the lexical form of a word is aligned at the character level, for example:
| |
| | |
| :(h,h) (o,o) (u,u) (s,s) (e,e) (s,<n>) (θ,<pl>)
| |
| | |
| Here θ is the empty symbol, <n> signifies "noun", and <pl> signifies "plural".
| |
| | |
| In each pair of the example, the left-hand side belongs to the surface form (input) and the right-hand side to the lexical form (output). This order is used in [[Morphology (linguistics)|morphological analysis]], where a lexical form is generated from a surface form. In morphological generation the order is reversed.
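The aligned entry above can be sketched as a small data structure. This is only an illustration: representing θ as the empty string and the helper names `surface_form`, `lexical_form` and `invert` are assumptions of this sketch, not part of any particular toolkit.

```python
# Assumption of this sketch: the empty symbol θ is the empty string.
THETA = ""

# One aligned entry: a sequence of (surface symbol, lexical symbol) pairs.
HOUSES = [
    ("h", "h"), ("o", "o"), ("u", "u"), ("s", "s"), ("e", "e"),
    ("s", "<n>"), (THETA, "<pl>"),
]


def surface_form(entry):
    """Concatenate the left-hand symbols; θ contributes nothing."""
    return "".join(left for left, _ in entry)


def lexical_form(entry):
    """Concatenate the right-hand symbols; θ contributes nothing."""
    return "".join(right for _, right in entry)


def invert(entry):
    """Swap input and output, as needed for morphological generation."""
    return [(right, left) for left, right in entry]


print(surface_form(HOUSES))
print(lexical_form(HOUSES))
```

Reading off the left-hand symbols gives the surface form and reading off the right-hand symbols gives the lexical form, so the same entry serves analysis as written and generation once inverted.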
| |
| | |
| Formally, if Σ is the alphabet of the input symbols, and <math> \Gamma </math> is the alphabet of the output symbols, an aligned morphological dictionary is a subset <math> A \subseteq L^* </math>, where:
| |
| | |
| :<math> L = (( \Sigma \cup \{ \theta \} ) \times \Gamma) \cup (\Sigma \times ( \Gamma \cup \{ \theta \} )) </math>
| |
| | |
| is the alphabet of all the possible alignments including the empty symbol. That is, an aligned morphological dictionary is a set of strings in <math>L^*</math>.
| |
| | |
| ==Non-aligned morphological dictionaries==
| |
| | |
| A non-aligned morphological dictionary is simply a set <math> U \subseteq \Sigma^* \times \Gamma^* </math> of pairs of input and output strings. A non-aligned morphological dictionary would represent the previous example as:
| |
| | |
| :(houses, house<n><pl>)
| |
| | |
| It is possible to convert a non-aligned dictionary into an aligned dictionary. Besides trivial alignments to the left or to the right, linguistically motivated alignments which align characters to their corresponding morphemes are possible.
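A trivial alignment to the left can be sketched as follows. The sketch assumes that every tag in the lexical form is written between angle brackets and counts as a single output symbol, and that θ is the empty string. The function names are hypothetical. Linguistically motivated alignments would need morpheme information that is not modelled here.

```python
import re
from itertools import zip_longest

# Assumption of this sketch: the empty symbol θ is the empty string.
THETA = ""


def lexical_symbols(lexical):
    """Split a lexical form into symbols: each <tag> is one symbol,
    every other character is a symbol of its own."""
    return re.findall(r"<[^>]+>|.", lexical)


def align_left(surface, lexical):
    """Pair symbols position by position from the left and pad the
    shorter side with θ."""
    return list(zip_longest(surface, lexical_symbols(lexical), fillvalue=THETA))


for pair in align_left("houses", "house<n><pl>"):
    print(pair)
```

For the pair (houses, house<n><pl>) this reproduces the aligned entry shown earlier, because the two forms only differ after their common prefix "house".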
| |
| | |
| == Lexical ambiguities ==
| |
| | |
| A surface form frequently has more than one associated lexical form. For example, "house" may be a noun in the singular, {{IPA|/haʊs/}}, or a verb in the present tense, {{IPA|/haʊz/}}. It is therefore necessary to have a function that relates each input string to all of its corresponding output strings.
| |
| | |
| If we define the set <math> E \subset \Sigma^* </math> of input words such that <math> E = \{ w : (w,w') \in U \} </math>, the correspondence function would be <math> \tau : E \rightarrow 2^{\Gamma^{*}} </math> defined as <math> \tau(w) = \{ w' : (w,w') \in U \} </math>.
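As a minimal sketch, the correspondence function can be built from a non-aligned dictionary by grouping the pairs by their input string. The tags `<sg>`, `<vblex>` and `<pres>` and the name `build_tau` are invented for this illustration and do not come from a specific dictionary.

```python
from collections import defaultdict

# A toy non-aligned dictionary U: a set of (surface form, lexical form) pairs.
# The tag names are hypothetical and chosen only for this example.
U = {
    ("house", "house<n><sg>"),
    ("house", "house<vblex><pres>"),
    ("houses", "house<n><pl>"),
}


def build_tau(pairs):
    """Map each input word w to the set {w' : (w, w') in pairs}."""
    table = defaultdict(set)
    for w, w_prime in pairs:
        table[w].add(w_prime)
    return dict(table)


TAU = build_tau(U)
E = set(TAU)  # the set of input words that occur in U

print(sorted(TAU["house"]))
```

The ambiguous surface form "house" maps to a set with two lexical forms, while "houses" maps to a set with one.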
| |
| | |
| ==List of online morphological dictionaries==
| |
| * [http://www.canoo.net Canoo.net – German]
| |
| * [http://www.babelpoint.org/english Babelpoint.org – English]
| |
| * [http://www.babelpoint.org/french Babelpoint.org – French]
| |
| * [http://www.babelpoint.org/german Babelpoint.org – German]
| |
| * [http://www.babelpoint.org/russian Babelpoint.org – Russian]
| |
| * [http://www.babelpoint.org/spanish Babelpoint.org – Spanish]
| |
| * [http://www.babelpoint.org/swedish Babelpoint.org – Swedish]
| |
| | |
| ==References==
| |
| {{reflist}}
| |
| * Garrido-Alenda, A. and Forcada, M. L. (2002). "[http://www.dlsi.ua.es/~mlf/docum/garrido02j.pdf Comparing nondeterministic and quasideterministic finite-state transducers built from morphological dictionaries]". ''Procesamiento del Lenguaje Natural'', (XVIII Congreso de la Sociedad Española de Procesamiento del Lenguaje Natural, Valladolid, Spain, 11-13.09.2002)
| |
| | |
| [[Category:Computational linguistics]]
| |
| [[Category:Translation databases]]
| |
| [[Category:Morphology|dictionary]]
| |