Cross-lingual Semantic Relatedness Using Encyclopedic Knowledge. Samer Hassan and Rada Mihalcea. EMNLP 2009
- Key Ideas
- Introduce the problem of cross-lingual semantic relatedness.
- Map words in different languages to their concept vectors (concepts are Wikipedia articles, similar to Gabrilovich and Markovitch, IJCAI 2007). Map concepts across languages using Wikipedia's interlanguage links (langlinks), which makes the vectors comparable.
- Comments 
- Wikipedia data accessed using Wikipedia Miner.
- Experiments
- The WS-30 (G. Miller and W. Charles. Contextual correlates of semantic similarity. Language and Cognitive Processes 1991) and WS-353 (L. Finkelstein, E. Gabrilovich, Y. Matias, E. Rivlin, Z. Solan, G. Wolfman, and E. Ruppin. Placing search in context: the concept revisited. WWW 2001) semantic similarity evaluation sets were translated and used for evaluation.
- Detailed description of creation of data sets and evaluation sets (including instructions given to annotators).
- Also devise an "obvious" baseline which illustrates where their method helps.
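A minimal sketch of the key idea above, on toy data: each word has a concept vector over article titles in its own language, and a langlink table (here Spanish to English) projects one vector into the other language's concept space so the two can be compared. The vectors, the link table, the function names, and the choice of cosine as the comparison are all illustrative assumptions, not the paper's data or its actual scoring function.

```python
from math import sqrt

# Toy concept vectors, invented for illustration: word -> {article title: weight}.
EN_VECTORS = {
    "car": {"Automobile": 0.9, "Engine": 0.4},
    "flower": {"Flower": 0.8, "Garden": 0.5},
}
ES_VECTORS = {
    "coche": {"Automóvil": 0.8, "Motor": 0.5},
    "flor": {"Flor": 0.9, "Jardín": 0.3},
}

# Hypothetical interlanguage links: Spanish article title -> English article title.
ES_TO_EN = {
    "Automóvil": "Automobile",
    "Motor": "Engine",
    "Flor": "Flower",
    "Jardín": "Garden",
}


def project(vector, links):
    """Re-express a concept vector in the target language's concept space.

    Concepts without a langlink are dropped, since they have no counterpart.
    """
    projected = {}
    for concept, weight in vector.items():
        target = links.get(concept)
        if target is not None:
            projected[target] = projected.get(target, 0.0) + weight
    return projected


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(concept, 0.0) for concept, weight in u.items())
    norm_u = sqrt(sum(weight * weight for weight in u.values()))
    norm_v = sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def cross_lingual_relatedness(en_word, es_word):
    """Compare an English word and a Spanish word in the English concept space."""
    return cosine(EN_VECTORS[en_word], project(ES_VECTORS[es_word], ES_TO_EN))


if __name__ == "__main__":
    print(cross_lingual_relatedness("car", "coche"))
    print(cross_lingual_relatedness("car", "flor"))
```

On this toy data the pair car/coche scores higher than car/flor, because only the former share linked concepts. Dropping unlinked concepts is one simple way to handle articles that exist in only one language; other treatments are possible.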
 
Evaluating WordNet-based Measures of Lexical Semantic Relatedness. Alexander Budanitsky, Graeme Hirst. CL 2006
- Comments on the problem
- Distinguishing semantic similarity and relatedness
- "... semantic relatedness is a more general concept than similarity; similar entities are semantically related by virtue of their similarity (bank–trust company), but dissimilar entities may also be semantically related by lexical relationships such as meronymy (car–wheel) and antonymy (hot–cold), ..."
- "the more-general idea of relatedness, not just similarity ... not just ... relationships in WordNet ... but also associative and ad hoc relationships ... just about any kind of functional relation or frequent association in the world. ... Morris and Hirst (2004, 2005) have termed these non-classical lexical semantic relationships ... shown in experiments ... that around 60% of the lexical relationships ... in a text are of this nature."
- "[A study found that] the words sex, drinking, and drag racing were semantically related, by all being “dangerous behaviors”, in the context of an article about teenagers emulating what they see in movies. Thus lexical semantic relatedness is sometimes constructed in context and cannot always be determined purely from an a priori lexical resource ... However, [such] ad hoc relationships accounted for only a small fraction of those reported [in the study]"
- "... in this paper the term concept will refer to a particular sense of a given word. ... when we say that two words are “similar”, ... they denote similar concepts; ... [and] not ... similarity of distributional or co-occurrence behavior of the words, ... While similarity of denotation might be inferred from similarity of distributional or co-occurrence behavior (Dagan 2000; Weeds 2003), the two are distinct ideas."
- "All approaches to measuring semantic relatedness that use a lexical resource construe the resource, in one way or another, as a network or directed graph, and then base the measure of relatedness on properties of paths in this graph." (Compare with probabilistic approaches.)
- Relating semantic relatedness and distributional similarity
- "Weeds (2003), in her study of 15 distributional-similarity measures, found that words distributionally similar to hope (noun) included confidence, dream, feeling, and desire; Lin (1998b) found pairs such as earnings–profit, biggest–largest, nylon–silk, and pill–tablet. ... if two concepts are similar or related, it is likely that their role in the world will be similar, so similar things will be said about them, and so the contexts of occurrence of the corresponding words will be similar. And conversely (albeit with less certainty), if the contexts of occurrence of two words are similar, then similar things are being said about each, so they are playing similar roles in the world and hence are semantically similar — at least to the extent of these roles."
- Differences between the two
- "while semantic relatedness is inherently a relation on concepts, ... distributional similarity is a (corpus-dependent) relation on words."
- "whereas semantic relatedness is symmetric, distributional similarity is a potentially asymmetrical relationship. If distributional similarity is conceived of as substitutability, ... then asymmetries arise ...; for example, ... fruit substitutes for apple better than apple substitutes for fruit."
- "Imbalance in the corpus and data sparseness is an additional source of anomalous results even for “good” measures."
- Evaluation issues
- "severe limitation on the data means that this was not really a fair test of the principles underlying the [distributional] hypothesis; a fair test would require data allowing the comparison of any ... two words in WordNet, but obtaining such [corpus] data for less-frequent words ... would be a massive task."
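The graph view quoted earlier in these comments (a lexical resource treated as a network, with relatedness derived from path properties) can be illustrated with a small sketch. The graph below is a hand-made toy, not real WordNet data, and inverse shortest-path length is a deliberately simple stand-in, not any of the specific measures the paper evaluates. All names are hypothetical.

```python
from collections import deque

# Toy undirected graph, invented for illustration; edges mix relation types.
EDGES = [
    ("car", "vehicle"),       # hypernymy
    ("truck", "vehicle"),     # hypernymy
    ("car", "wheel"),         # meronymy
    ("vehicle", "artifact"),  # hypernymy
    ("fork", "artifact"),     # hypernymy
]


def build_graph(edges):
    """Build an adjacency map, treating every edge as bidirectional."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the edge count, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        for neighbor in graph[node]:
            if neighbor == target:
                return distance + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


def path_relatedness(graph, a, b):
    """Map path length to a score in (0, 1]; unreachable pairs score 0."""
    distance = shortest_path_length(graph, a, b)
    return 0.0 if distance is None else 1.0 / (1.0 + distance)


GRAPH = build_graph(EDGES)

if __name__ == "__main__":
    for pair in [("car", "wheel"), ("car", "truck"), ("car", "fork")]:
        print(pair, path_relatedness(GRAPH, *pair))
```

Because the graph is undirected, this score is symmetric, in line with the quoted point that semantic relatedness is symmetric while distributional similarity need not be.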
- Comments on experiments 
- Lists 3 kinds of evaluation
- "theoretical examination ... for ... mathematical properties thought desirable, such as whether it is a metric ..., whether it has singularities, whether its parameter-projections are smooth functions, ..."
- "comparison with human judgments. Insofar as human judgments of similarity and relatedness are deemed to be correct by definition, this clearly gives the best assessment of the “goodness” of a measure."
- "evaluate ... with respect to ... performance in the framework of a particular application."
- "While comparison with human judgments is the ideal way to evaluate a measure of similarity or semantic relatedness, in practice the tiny amount of data available (and only for similarity, not relatedness) is quite inadequate." and "Finkelstein [-353] ... is still very small, and, as Jarmasz and Szpakowicz (2003) point out, is culturally and politically biased."
- "... often what we are really interested in is the relationship between the concepts for which the words are merely surrogates; the human judgments that we need are of the relatedness of word-senses, not words. So the experimental situation would need to set up contexts that bias the sense selection for each target word and yet don’t bias the subject’s judgment of their a priori relationship, an almost self-contradictory situation." (and hence justifying extrinsic evaluation)
- Application to malapropism detection
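Comparison with human judgments, the second kind of evaluation listed above, is commonly done by correlating a measure's scores with averaged human ratings over the same word pairs. A minimal sketch follows, assuming invented ratings and scores (not taken from any data set) and hypothetical function names. It uses only the standard library and computes Spearman correlation as Pearson correlation over average ranks.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Invented numbers for four hypothetical word pairs, for illustration only.
HUMAN_RATINGS = [3.9, 3.0, 1.2, 0.4]
MEASURE_SCORES = [0.95, 0.60, 0.70, 0.05]

if __name__ == "__main__":
    print("Pearson:", pearson(HUMAN_RATINGS, MEASURE_SCORES))
    print("Spearman:", spearman(HUMAN_RATINGS, MEASURE_SCORES))
```

With only a handful of pairs, as in this toy example, a correlation says little, which echoes the concern quoted above about the small amount of human-judgment data available.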