Wednesday, January 16, 2013

Notes on COLING 2012 - Part 3

Grammarless Parsing for Joint Inference. Jason Naradowsky, Tim Vieira, David A. Smith

  • Problem: Do parsing and NER jointly rather than one after the other, in the hope that each helps the other (e.g. an NE span suggests there is a noun phrase)
  • Approach: I am new to the methods applied in this area and need more background to make sense of it.
  • Interesting papers:
    • Finkel, J. R. and Manning, C. D. Joint parsing and named entity recognition. NAACL-HLT 2009
    • Sarawagi, S. and Cohen, W. W. Semi-Markov conditional random fields for information extraction. NIPS 2004

Text Reuse Detection Using a Composition of Text Similarity Measures. Daniel Bär, Torsten Zesch, Iryna Gurevych.

  • Problem: Measure the similarity of two pieces of text (e.g. for plagiarism detection)
  • Key idea: Previous efforts used only content-based measures; this work additionally uses structure and style as features.
    • content: words, synonyms, semantically related words, LSA representations
    • structure: stopword/POS n-grams
    • style: type/token ratio, function word frequency, token/sentence length
    • Use the above as features for a machine-learned classifier (Naive Bayes and decision tree)
  • Comments
    • Experiments on each corpus are discussed separately, including error analysis.
    • Report confusion matrix when discussing classification performance.
  • Interesting papers:
    • Lin, D. An information-theoretic definition of similarity. ICML 1998
    • Gabrilovich, E. and Markovitch, S. Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis. IJCAI 2007
    • Artstein, R. and Poesio, M. Inter-Coder Agreement for Computational Linguistics. CL 2008
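
To make the structure and style features from the text reuse paper more concrete, here is a minimal Python sketch of how a few of them could be computed. This is not the authors' implementation: the tiny stopword list, the regex tokenizer, the naive sentence splitter, and the function names (`style_features`, `stopword_ngrams`, `stopword_ngram_overlap`) are all assumptions made for illustration, and the Jaccard overlap is just one plausible way to compare stopword n-grams.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; the notes do not say
# which list the paper uses).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "or",
             "to", "is", "was", "it", "that"}


def tokenize(text):
    """Lowercased word tokens from a deliberately simple regex tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?'."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def style_features(text):
    """Style features for one text: type/token ratio, function word
    frequency, average token length, average sentence length (in tokens)."""
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0:
        return {"type_token_ratio": 0.0, "function_word_freq": 0.0,
                "avg_token_length": 0.0, "avg_sentence_length": 0.0}
    n_sentences = max(len(split_sentences(text)), 1)
    return {
        "type_token_ratio": len(set(tokens)) / n,
        "function_word_freq": sum(t in STOPWORDS for t in tokens) / n,
        "avg_token_length": sum(len(t) for t in tokens) / n,
        "avg_sentence_length": n / n_sentences,
    }


def stopword_ngrams(text, n=3):
    """Structure feature: n-grams over the stopword-only token sequence."""
    stops = [t for t in tokenize(text) if t in STOPWORDS]
    return Counter(tuple(stops[i:i + n]) for i in range(len(stops) - n + 1))


def stopword_ngram_overlap(text_a, text_b, n=3):
    """Jaccard overlap of the two texts' stopword n-gram sets
    (a hypothetical similarity choice for this sketch)."""
    a = set(stopword_ngrams(text_a, n))
    b = set(stopword_ngrams(text_b, n))
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)
```

In a setup like the one described, per-text style features and pairwise scores such as `stopword_ngram_overlap` would be combined with content-based similarity scores into one feature vector per text pair and passed to the classifier.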
