Notes on COLING 2012 - Part 3
Grammarless Parsing for Joint Inference. Jason Naradowsky, Tim Vieira, David A. Smith
- Problem: Do syntactic parsing and NER jointly, rather than one after the other, in the hope that they help each other (e.g. an NE span suggests there is a noun phrase)
- Approach: I am new to the methods applied in this area and need more background to make sense of it.
- Interesting papers:
- Finkel, J. R. and Manning, C. D. Joint parsing and named entity recognition. NAACL-HLT 2009
- Sarawagi, S. and Cohen, W. W. Semi-Markov conditional random fields for information extraction. NIPS 2004
Text Reuse Detection Using a Composition of Text Similarity Measures. Daniel Bär, Torsten Zesch, Iryna Gurevych.
- Problem: Measure the similarity of two pieces of text (e.g. for plagiarism detection)
- Key idea: Previous efforts used content-based measures; this work additionally uses structure and style as features.
- content: words, synonyms, semantically related words, LSA representations
- structure: stopword/POS n-grams
- style: type/token ratio, function word frequency, token/sentence length
- Use the above as features for a machine-learned classifier (Naive Bayes and decision tree)
- Comments
- Experiments on each corpus are discussed separately, including error analysis.
- A confusion matrix is reported when discussing classification performance.
- Interesting papers:
- Lin, D. An information-theoretic definition of similarity. ICML 1998
- Gabrilovich, E. and Markovitch, S. Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis. IJCAI 2007
- Artstein, R. and Poesio, M. Inter-Coder Agreement for Computational Linguistics. CL 2008
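To make the structure and style features from the text reuse notes above more concrete, here is a minimal Python sketch. It is not the authors' implementation: the tiny stopword list, the tokenizer, the Jaccard overlap, and all function names are my own assumptions for illustration, and the paper's exact measures may be defined differently.

```python
import re

# Tiny illustrative stopword list (an assumption, not the list used in the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "was", "it", "that"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def type_token_ratio(tokens):
    """Style: number of distinct tokens divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def function_word_freq(tokens):
    """Style: fraction of tokens that are function words (here: stopwords)."""
    if not tokens:
        return 0.0
    return sum(t in STOPWORDS for t in tokens) / len(tokens)


def avg_token_length(tokens):
    """Style: mean number of characters per token."""
    return sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0


def stopword_ngrams(tokens, n=3):
    """Structure: set of n-grams over the stopword-only token sequence."""
    sw = [t for t in tokens if t in STOPWORDS]
    return {tuple(sw[i:i + n]) for i in range(len(sw) - n + 1)}


def jaccard(a, b):
    """Overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(text_a, text_b):
    """Feature dict for a text pair, usable as input to a classifier."""
    ta, tb = tokenize(text_a), tokenize(text_b)
    return {
        "stopword_3gram_overlap": jaccard(stopword_ngrams(ta), stopword_ngrams(tb)),
        "ttr_diff": abs(type_token_ratio(ta) - type_token_ratio(tb)),
        "function_word_diff": abs(function_word_freq(ta) - function_word_freq(tb)),
        "token_length_diff": abs(avg_token_length(ta) - avg_token_length(tb)),
    }


if __name__ == "__main__":
    a = "The cat sat on the mat and it was happy."
    b = "The dog sat on the rug and it was sad."
    print(pair_features(a, b))
```

The two example sentences share no content nouns beyond "sat", yet their stopword sequences are identical, which is the kind of structural signal that content-based measures alone would miss.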