## Wednesday, January 30, 2013

#### A Relational Model of Semantic Similarity between Words using Automatically Extracted Lexical Pattern Clusters from the Web. Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuka. EMNLP 2009

• Key ideas
• Past work modelled similarity between two words in terms of context overlap, where the context consisted of other words known to be closely related to the word (derived either from a corpus or from an ontology like WordNet). In contrast, the authors claim:
• We propose a relational model to compute the semantic similarity between two words. Intuitively, if the relations that exist between a and b are typical relations that hold between synonymous word pairs, then we get a high similarity score for a and b.
• Define relations as lexical patterns such as "X is a Y". For each word pair, compute a feature vector n with one weight per pattern (relation). Do this for a set of seed pairs, and compute a "prototype" vector p from them. A new word pair is declared similar if its vector is similar to the prototype vector (i.e. if n^T p is high).
• Many patterns represent the same or similar relations. They solve this problem at two levels:
• They cluster similar patterns together, and use the clusters as features (instead of patterns).
• Since the clusters may also be similar, use a correlation matrix in the dot product, i.e. instead of n^T p, use n^T C p.
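To make the difference between n^T p and n^T C p concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the three cluster features, the weights, the correlation matrix C, and the choice to build the prototype as the mean of the seed vectors are all assumptions made purely for illustration.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(C, v):
    """Multiply matrix C (list of rows) by vector v."""
    return [dot(row, v) for row in C]


def prototype(seed_vectors):
    """Assumed here: the prototype is the mean of the seed pair vectors."""
    count = len(seed_vectors)
    return [sum(column) / count for column in zip(*seed_vectors)]


def relational_score(n, C, p):
    """Score a word pair's cluster vector n against prototype p as n^T C p."""
    return dot(n, mat_vec(C, p))


# Hypothetical cluster features: [cluster 0, cluster 1, cluster 2].
# Toy weights for two seed (synonymous) word pairs.
seed_vectors = [
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
]
p = prototype(seed_vectors)

# Toy inter-cluster correlation: clusters 0 and 1 are related, cluster 2 is not.
C = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
# The identity matrix reduces n^T C p to the plain dot product n^T p.
I = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

pair_a = [1.0, 0.0, 0.0]  # fires only on cluster 0
pair_b = [0.0, 0.0, 1.0]  # fires only on cluster 2

score_a_plain = relational_score(pair_a, I, p)
score_a_corr = relational_score(pair_a, C, p)
score_b_corr = relational_score(pair_b, C, p)
print(score_a_plain, score_a_corr, score_b_corr)
```

In this toy setup, pair_a gets a higher score with C than with the plain dot product, because cluster 0 is correlated with cluster 1, which dominates the prototype. pair_b, which only fires on the uncorrelated cluster 2, gains nothing.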