Research Notes: Notes on COLING 2012 - Part 3

Wednesday, January 16, 2013

Grammarless Parsing for Joint Inference. Jason Naradowsky, Tim Vieira, David A. Smith.

Problem: Perform parsing and NER jointly rather than one after the other, in the hope that each task helps the other (e.g. an NE span suggests there is a noun phrase).
Approach: I am new to the methods applied in this area and need more background to make sense of it.
Interesting papers:

Finkel, J. R. and Manning, C. D. Joint parsing and named entity recognition. NAACL-HLT 2009
Sarawagi, S. and Cohen, W. W. Semi-Markov conditional random fields for information extraction. NIPS 2004

Text Reuse Detection Using a Composition of Text Similarity Measures. Daniel Bär, Torsten Zesch, Iryna Gurevych.

Problem: Measure the similarity of two pieces of text (e.g. for plagiarism detection).
Key idea: Previous efforts used content-based measures only; this work additionally uses structure and style as features.

content: words, synonyms, semantically related words, LSA representations
structure: stopword/POS n-grams
style: type/token ratio, function word frequency, token/sentence length
Use the above as features for a machine-learned classifier (Naive Bayes and a decision tree)
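
To make the three feature groups concrete, here is a minimal sketch of one feature from each, computed for a pair of texts. It is not the authors' implementation: the tiny stopword list, the Jaccard overlap, the regex tokenizer, and all function names are my own assumptions for illustration, and the paper's actual measures (synonyms, LSA, POS n-grams) are richer than this.

```python
import re

# Tiny illustrative stopword list (an assumption; not the list used in the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "was",
             "on", "for", "with", "that", "it"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def sentences(text):
    """Naive sentence splitter on ., ! and ? (hypothetical helper)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]

def ngrams(items, n):
    """All contiguous n-grams of a sequence, as tuples."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]

def jaccard(a, b):
    """Set overlap in [0, 1]; defined as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def type_token_ratio(tokens):
    """Distinct tokens divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def function_word_freq(tokens):
    """Fraction of tokens that are in the stopword list."""
    if not tokens:
        return 0.0
    return sum(t in STOPWORDS for t in tokens) / len(tokens)

def avg_sentence_length(text):
    """Mean number of tokens per sentence."""
    sents = sentences(text)
    if not sents:
        return 0.0
    return sum(len(tokenize(s)) for s in sents) / len(sents)

def features(text_a, text_b, n=2):
    """Feature dictionary for one text pair: content, structure, style."""
    ta, tb = tokenize(text_a), tokenize(text_b)
    # Structure: keep only the stopwords, preserving their order.
    sa = [t for t in ta if t in STOPWORDS]
    sb = [t for t in tb if t in STOPWORDS]
    return {
        "content_word_overlap": jaccard(ta, tb),
        "structure_stopword_ngram_overlap": jaccard(ngrams(sa, n), ngrams(sb, n)),
        "style_ttr_diff": abs(type_token_ratio(ta) - type_token_ratio(tb)),
        "style_function_word_diff": abs(function_word_freq(ta) - function_word_freq(tb)),
        "style_sentence_length_diff": abs(avg_sentence_length(text_a)
                                          - avg_sentence_length(text_b)),
    }
```

Each text pair then becomes one feature vector, and vectors like this are what a classifier such as Naive Bayes or a decision tree would be trained on.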

Comments

Experiments on each corpus discussed separately, including error analysis.
Report confusion matrix when discussing classification performance.
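
For reference, a confusion matrix for such a classifier can be tabulated in a few lines. This is a minimal sketch: the function name and the label names "reuse" and "no_reuse" are hypothetical and not taken from the paper.

```python
from collections import Counter

def confusion_matrix(gold, predicted, labels):
    """Rows are gold labels, columns are predicted labels, in `labels` order."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    counts = Counter(zip(gold, predicted))
    return [[counts[(g, p)] for p in labels] for g in labels]

# Toy data, made up purely to show the layout (not results from the paper).
labels = ["reuse", "no_reuse"]
gold = ["reuse", "reuse", "no_reuse", "no_reuse"]
predicted = ["reuse", "no_reuse", "no_reuse", "no_reuse"]
matrix = confusion_matrix(gold, predicted, labels)
```

Unlike a single accuracy figure, the off-diagonal cells show which classes get confused with which.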

Interesting papers:

Lin, D. An information-theoretic definition of similarity. ICML 1998
Gabrilovich, E. and Markovitch, S. Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis. IJCAI 2007
Artstein, R. and Poesio, M. Inter-Coder Agreement for Computational Linguistics. CL 2008
