Thursday, March 7, 2013

Recent Advances in Methods of Lexical Semantic Relatedness – a Survey. Ziqi Zhang, Anna Lisa Gentile, Fabio Ciravegna. NLE 2012

  • Corpora: Wikipedia, Wiktionary, WordNet, various biomedical corpora
  • Methods: 
    • based on Path, Information Content, Gloss, Vector
      • all methods use structure, mainly from WordNet/Wikipedia
      • some methods treat Wikipedia articles as concepts (and use no other structure)
    • based on distributional similarity
      • PMI, Chi-squared test
      • Dice, Jaccard and Cosine (search engine based)
    • hybrid
      • combination: run each method separately, and then combine scores e.g. by linear combination
      • integration: run each method separately, and then use scores as features in hybrid model
  • Notes
    • "distributional similarity methods ... have been used as a proxy for [methods of semantic relatedness]."
    • Distinguish concept and word; model relatedness between concepts and between words separately; usually a (polysemous) word w has several associated concepts C(w), and the relatedness between words w1 and w2 would be some function of the relatedness between the concepts in C(w1) and C(w2).
    • Distinguish similarity and relatedness between words/concepts; model them separately; also model distance.
      • "two words are distributionally similar if (1) they tend to occur in each other’s context; or (2) the contexts each tends to occur in are similar; or (3) that if one word is substituted for another in a context, its “plausibility” is unchanged [(measured using search engines)]. Different methods have adopted different definitions of contexts ..."
      • Method surveys: Weeds (2003), Turney and Pantel (2010)
      • "Budanitsky and Hirst (2006) argued that there are three essential differences between [semantic relatedness and distributional similarity] ... Firstly, semantic relatedness is inherently a relation on concepts, while distributional similarity is a relation on words; secondly, semantic relatedness is typically symmetric, whereas distributional similarity can be potentially asymmetric; finally, semantic relatedness depends on a structured lexicographic or knowledge bases, distributional similarity is relative to a corpus."
  • Evaluation
    • In-vitro
      • "In-vitro evaluation ... [i.e.] correlation with human judgement ... does not assess how well the method performs on real data ... Spearman correlation is a more robust measure ... [but] it may yield skewed results on datasets with many tied ranks."
      • "we argue that vector based methods are generally superior to other[s]"
    • In-vivo
      • text similarity, word choice (e.g. TOEFL), WSD, sense clustering, IR: {document ranking, query expansion}, coreference resolution, ontology construction and matching, malapropism detection
    • "there is no strong evidence of a positive correlation between the ... [performance] in in-vitro evaluation ... and in in-vivo evaluation"
    • Data sets: Rubenstein and Goodenough, Finkelstein et al., and many others; all were originally used for similarity (not relatedness)
  • Tools
    • Wikipedia: Parse::MediaWikiDump, Ponzetto and Strube (2007)
    • DEXTRACT: creating evaluation datasets
    • WordNet::Similarity
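
To make the distributional measures listed under Methods concrete, here is a minimal sketch of PMI, Dice, Jaccard and a count-based cosine computed from co-occurrence counts. The function names and the count-based formulation (for example, n_x as the number of search hits or documents containing word x) are my own assumptions for illustration, not something taken from the survey; the chi-squared test is left out.

```python
import math


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information: log2(P(x, y) / (P(x) * P(y))).

    Counts are assumed to come from the same collection of n_total contexts.
    """
    if n_xy == 0:
        return float("-inf")  # the two words never co-occur
    return math.log2((n_xy * n_total) / (n_x * n_y))


def dice(n_xy, n_x, n_y):
    """Dice coefficient on occurrence counts."""
    return 2 * n_xy / (n_x + n_y)


def jaccard(n_xy, n_x, n_y):
    """Jaccard coefficient: co-occurrences over the union of occurrences."""
    return n_xy / (n_x + n_y - n_xy)


def cosine_counts(n_xy, n_x, n_y):
    """Set-based cosine on counts: n_xy / sqrt(n_x * n_y)."""
    return n_xy / math.sqrt(n_x * n_y)


# Hypothetical counts: each word occurs in 100 of 10,000 contexts,
# and the two words occur together in 10 of them.
example = {
    "pmi": pmi(10, 100, 100, 10_000),
    "dice": dice(10, 100, 100),
    "jaccard": jaccard(10, 100, 100),
    "cosine": cosine_counts(10, 100, 100),
}
```

All four are symmetric in x and y here; asymmetric distributional measures, as mentioned in the Notes, would need a different formulation.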
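
The "combination" flavour of hybrid methods (run each method separately, then combine the scores) might look like the weighted linear combination below. This is only a sketch: the method names, the weights, and the assumption that all scores are already on a comparable scale (say, 0 to 1) are mine, not the survey's.

```python
def combine_linear(scores, weights):
    """Weighted average of per-method relatedness scores.

    scores:  dict mapping method name -> score for one word pair
    weights: dict mapping method name -> non-negative weight
    Assumes the scores were normalised to a comparable range beforehand.
    """
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same methods")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[m] * scores[m] for m in scores) / total_weight


# Hypothetical scores for one word pair from two methods.
pair_scores = {"path": 0.4, "vector": 0.8}
method_weights = {"path": 1.0, "vector": 3.0}
combined = combine_linear(pair_scores, method_weights)
```

The "integration" flavour would instead pass the same per-method scores as features to a trained model, which is not shown here.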
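
The word/concept distinction in the Notes says that relatedness between words w1 and w2 is some function of the relatedness between concepts in C(w1) and C(w2). A common choice for that function is the maximum over all concept pairs; the sketch below uses that choice as an assumption. The sense inventory and the concept scores are toy data invented for illustration.

```python
from itertools import product


def word_relatedness(w1, w2, concepts_of, concept_rel):
    """Lift a concept-level measure to words by taking the best concept pair.

    concepts_of: dict mapping a word to its candidate concepts, i.e. C(w)
    concept_rel: function (concept, concept) -> relatedness score
    Returns 0.0 when either word has no known concepts.
    """
    pairs = product(concepts_of.get(w1, ()), concepts_of.get(w2, ()))
    return max((concept_rel(c1, c2) for c1, c2 in pairs), default=0.0)


# Toy sense inventory and concept scores (hypothetical values).
CONCEPTS = {
    "bank": ["bank#finance", "bank#river"],
    "money": ["money#currency"],
}
TOY_SCORES = {
    frozenset(("bank#finance", "money#currency")): 0.9,
    frozenset(("bank#river", "money#currency")): 0.1,
}


def toy_concept_rel(c1, c2):
    """Symmetric lookup into the toy score table; unknown pairs score 0.0."""
    return TOY_SCORES.get(frozenset((c1, c2)), 0.0)


bank_money = word_relatedness("bank", "money", CONCEPTS, toy_concept_rel)
```

Taking the maximum means the finance sense of "bank" decides the score for the pair; an average over concept pairs would be another possible aggregation.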
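
In-vitro evaluation correlates a method's scores with human judgements, and the quoted caveat about tied ranks is easier to see in code. Below is a small pure-Python sketch of Spearman correlation that assigns tied values their average rank and then takes the Pearson correlation of the ranks. The helper names and the example scores are assumptions for illustration; in practice a library routine such as SciPy's spearmanr would be the usual choice.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values all receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation, with ties handled via average ranks."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical human ratings and system scores for four word pairs;
# the system gives two pairs the same score, producing a tie.
human = [3.9, 3.0, 1.0, 0.1]
system = [0.9, 0.8, 0.8, 0.2]
rho = spearman(human, system)
```

With many ties, large groups of pairs share one rank, so small score differences that humans do perceive no longer influence the coefficient.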
