## Friday, February 8, 2013

#### A Graph-Theoretic Framework for Semantic Distance. Vivian Tsang, Suzanne Stevenson. CL 2010

• Problem: similarity of texts (not single words)
• Claims
• "[we do] integration of distributional and ontological factors in measuring semantic distance between two sets of concepts (mapped from two texts) [within a network flow formalism]"
• Key ideas
• "Our goal is to measure the distance between two subgraphs (representing two texts to be compared), taking into account both the ontological distance between the component concepts and their frequency distributions. To achieve this, we measure the amount of “effort” required to transform one profile to match the other graphically: The more similar they are, the less effort it takes to transform one into the other. (This view is similar to that motivating the use of “earth mover’s distance” in computer vision [Levina and Bickel 2001].)"
• "[our] notion of semantic distance as transport effort of concept frequency over the relations (edges) of an ontology differs significantly from ... [using] concept vectors of frequency. ... our approach can [compare] texts that use related but non-equivalent concepts." (seems to be the main argument in favor of graph-based iterative methods)
• "viewed as a supply–demand problem, in which we find the minimum cost flow (MCF) from the supply profile to the demand profile ... Each edge ... has a cost ... Each node [has a] supply ... [or] demand ... The goal is to find a flow from supply nodes to demand nodes that satisfies the supply/demand constraints of each node and minimizes the overall “transport cost.”"
• Comments
• Requires an ontology
• "Distributional" refers to term frequencies within the texts being compared (not frequencies in an external corpus)
• Interesting papers
• Using network flows
• Pang, Bo and Lillian Lee. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. ACL 2004
• Barzilay, Regina and Mirella Lapata. Collective content selection for concept-to-text generation. HLT/EMNLP 2005
• Mihalcea, Rada. Unsupervised large-vocabulary word sense disambiguation with graph-based algorithms for sequence data labeling. HLT/EMNLP 2005
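
To make the supply/demand formulation from the key ideas concrete for myself, here is a minimal sketch of a minimum cost flow distance between two concept profiles. It is not the authors' implementation. The function name `min_cost_flow_distance`, the toy `ONTOLOGY`, the unit edge costs, and the use of integer counts with equal totals are all my assumptions. The solver is a plain successive-shortest-paths routine with Bellman-Ford, chosen only because it fits in a few lines.

```python
from collections import defaultdict


def min_cost_flow_distance(edges, supply, demand):
    """Minimum total cost of moving `supply` mass onto `demand` over the ontology.

    edges: iterable of (u, v, cost), undirected links with non-negative integer cost.
    supply, demand: dict concept -> integer mass, with equal totals (an assumption
    of this sketch).
    """
    total = sum(supply.values())
    if total != sum(demand.values()):
        raise ValueError("supply and demand must carry the same total mass")

    # Residual graph: arcs[i] = [head, capacity, cost]; arc i ^ 1 is the reverse of arc i.
    arcs = []
    adj = defaultdict(list)

    def add_arc(u, v, cap, cost):
        adj[u].append(len(arcs))
        arcs.append([v, cap, cost])
        adj[v].append(len(arcs))
        arcs.append([u, 0, -cost])

    for u, v, cost in edges:
        # Ontology links are usable in both directions and effectively uncapacitated.
        add_arc(u, v, total, cost)
        add_arc(v, u, total, cost)
    SRC, SNK = object(), object()  # artificial super-source and super-sink
    for node, mass in supply.items():
        add_arc(SRC, node, mass, 0)
    for node, mass in demand.items():
        add_arc(node, SNK, mass, 0)

    nodes = list(adj)
    shipped, total_cost = 0, 0
    while shipped < total:
        # Bellman-Ford: cheapest residual path from SRC (reverse arcs have negative cost).
        dist, prev = {SRC: 0}, {}
        for _ in range(len(nodes)):
            changed = False
            for u in nodes:
                if u not in dist:
                    continue
                for i in adj[u]:
                    v, cap, cost = arcs[i]
                    if cap > 0 and dist[u] + cost < dist.get(v, float("inf")):
                        dist[v] = dist[u] + cost
                        prev[v] = i
                        changed = True
            if not changed:
                break
        if SNK not in dist:
            raise ValueError("some demand is unreachable from the supply")

        # Find the bottleneck capacity on the path, then push that much flow along it.
        push = total - shipped
        v = SNK
        while v is not SRC:
            push = min(push, arcs[prev[v]][1])
            v = arcs[prev[v] ^ 1][0]
        v = SNK
        while v is not SRC:
            i = prev[v]
            arcs[i][1] -= push
            arcs[i ^ 1][1] += push
            v = arcs[i ^ 1][0]
        shipped += push
        total_cost += push * dist[SNK]
    return total_cost


# Hypothetical toy ontology with unit-cost links.
ONTOLOGY = [
    ("entity", "animal", 1),
    ("entity", "vehicle", 1),
    ("animal", "dog", 1),
    ("animal", "cat", 1),
    ("vehicle", "car", 1),
]

print(min_cost_flow_distance(ONTOLOGY, {"dog": 2}, {"cat": 2}))  # siblings: 4
print(min_cost_flow_distance(ONTOLOGY, {"dog": 2}, {"car": 2}))  # distant concepts: 8
```

The two profiles share no concept in either call, so a frequency-vector comparison would treat both pairs as equally dissimilar, while the flow cost separates the nearby pair (dog/cat) from the distant one (dog/car).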

#### Structural Semantic Relatedness: A Knowledge-Based Method to Named Entity Disambiguation. Xianpei Han, Jun Zhao. ACL 2010

• Claims
• "proposes a reliable semantic relatedness measure between concepts ... which can capture both the explicit semantic relations between concepts and the implicit semantic knowledge embedded in [multiple] graphs and networks."
• Key ideas
• “two concepts are semantic related if they are both semantic related to the neighbor concepts of each other”
• Interesting papers
• Amigo, E., Gonzalo, J., Artiles, J. and Verdejo, F. A comparison of extrinsic clustering evaluation metrics based on formal constraints. Information Retrieval 2008
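
The recursive definition quoted under key ideas reads like a fixed-point computation, so here is a minimal sketch of that reading. This is not the formula from the paper: the update rule (average relatedness to the other concept's neighbors, damped by `decay`, in the spirit of SimRank), the function name `structural_relatedness`, and the toy `GRAPH` are my assumptions.

```python
def structural_relatedness(neighbors, decay=0.8, iterations=20):
    """Iteratively score concept pairs on an undirected graph.

    neighbors: dict concept -> list of adjacent concepts (symmetric adjacency).
    A pair (a, b) scores high when a is related to b's neighbors and b is
    related to a's neighbors. Assumed update rule, not the paper's.
    """
    nodes = list(neighbors)
    # Start from identity: each concept is fully related to itself only.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[(a, b)] = 1.0
                elif not neighbors[a] or not neighbors[b]:
                    new[(a, b)] = 0.0
                else:
                    # How related is a to b's neighbors, and b to a's neighbors?
                    a_to_nb = sum(sim[(a, n)] for n in neighbors[b]) / len(neighbors[b])
                    b_to_na = sum(sim[(m, b)] for m in neighbors[a]) / len(neighbors[a])
                    new[(a, b)] = decay * (a_to_nb + b_to_na) / 2
        sim = new
    return sim


# Hypothetical toy graph with two disconnected groups of concepts.
GRAPH = {
    "jordan": ["bulls", "nba"],
    "bulls": ["jordan", "nba"],
    "nba": ["jordan", "bulls"],
    "mit": ["research"],
    "research": ["mit"],
}

scores = structural_relatedness(GRAPH)
print(scores[("jordan", "bulls")], scores[("jordan", "mit")])
```

Concepts in the same connected group end up with a positive score even without a direct link between them, while concepts in different groups stay at zero.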