Research Notes

Wednesday, January 30, 2013

Machine Learning that Matters. Kiri L. Wagstaff. ICML 2012

Key message: An analysis of what ails ML research today, especially with respect to its impact on real-life problems

Comments on empirical analysis

Needed: domain interpretation of reported results

Which classes were well-classified; which were not
What are the common error types
Why particular data sets were chosen
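
As a rough illustration of the first two points, here is a minimal Python sketch that breaks a single accuracy number into per-class recall and a ranked list of confusions. The function name per_class_report and the toy labels A, B, C are my own assumptions for illustration; none of this comes from the paper.

```python
from collections import Counter

def per_class_report(y_true, y_pred):
    """Return per-class recall and the (true, predicted) confusions, most frequent first."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Recall per class: which classes were well classified, which were not.
    recall = {c: correct[c] / totals[c] for c in totals}
    # Error types: count each (true, predicted) pair where the prediction was wrong.
    confusions = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return recall, confusions.most_common()

# Toy labels, invented for illustration only.
y_true = ["A", "A", "A", "B", "B", "B", "C", "C"]
y_pred = ["A", "A", "B", "A", "A", "B", "C", "C"]

recall, confusions = per_class_report(y_true, y_pred)
print(recall)
print(confusions)
```

On this toy data the overall accuracy hides that class B is mostly misclassified as A, which is the kind of observation a domain expert can actually interpret.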

Metrics

Instead of domain-independent metrics like accuracy or F-measure, domain-specific metrics might shed more light

For example, when classifying mushrooms, 80% accuracy might be good enough for botany, but we need more than 99% when deciding whether a mushroom is poisonous or safe to eat.
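
One way to make the mushroom example concrete is a cost-weighted metric. The Python sketch below is only an assumption of what such a domain-specific metric could look like: the cost values, the expected_cost helper and the two toy classifiers are all hypothetical and not taken from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def expected_cost(y_true, y_pred, cost):
    """Average cost per prediction; pairs missing from `cost` are charged 0."""
    return sum(cost.get((t, p), 0.0) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical costs, keyed by (true label, predicted label):
# eating a poisonous mushroom is assumed far worse than discarding an edible one.
COST = {("poisonous", "edible"): 100.0, ("edible", "poisonous"): 1.0}

y_true = ["edible"] * 8 + ["poisonous"] * 2
# "cautious" raises two false alarms; "risky" misses one poisonous mushroom.
cautious = ["edible"] * 6 + ["poisonous"] * 2 + ["poisonous"] * 2
risky = ["edible"] * 8 + ["edible", "poisonous"]

for name, y_pred in [("cautious", cautious), ("risky", risky)]:
    print(name, accuracy(y_true, y_pred), expected_cost(y_true, y_pred, COST))
```

Under these assumed costs the classifier with the higher accuracy is the worse choice for someone who plans to eat the mushroom, which is the point of preferring a domain-specific metric.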

Don't just compare the performance of algorithms; analyze

where and why each algorithm does well
how domain characteristics affect the results

Threshold ablation

Also discuss which threshold ranges or performance regimes are relevant to the domain
Do not summarize over all regimes, especially those irrelevant to the domain
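
A small Python sketch of what reporting only the relevant regime might look like: sweep the decision threshold, then keep only the operating points that satisfy a domain constraint. The constraint used here (a minimum recall), the function names and the toy scores are assumptions made for illustration, and the code assumes the data contains at least one positive and one negative example.

```python
def operating_points(scores, labels):
    """Yield (threshold, recall, false_positive_rate) for each distinct score."""
    pos = sum(labels)
    neg = len(labels) - pos
    for thr in sorted(set(scores), reverse=True):
        # Predict positive when the score is at or above the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        yield thr, tp / pos, fp / neg

def points_in_regime(scores, labels, min_recall):
    """Keep only the operating points whose recall meets the domain requirement."""
    return [pt for pt in operating_points(scores, labels) if pt[1] >= min_recall]

# Toy scores and binary labels (1 = positive), invented for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

relevant = points_in_regime(scores, labels, min_recall=1.0)
for thr, rec, fpr in relevant:
    print(thr, rec, round(fpr, 3))
```

Comparing algorithms on these filtered points, rather than on a summary over every threshold, keeps the comparison inside the regime the domain actually cares about.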

Comments on impact

Take the method all the way through, to deployment
"What matters is achieving performance sufficient to make an impact on the world. As an analogy, consider a sick child in a rural setting. A neighbor who runs two miles to fetch the doctor need not achieve Olympic-level running speed (performance), so long as the doctor arrives in time to address the sick child’s needs (impact)."
The proposed solution might be complex internally, but it should be easy to use externally, i.e. a layperson should be able to apply it to their problem without having to know a lot about ML.

Interesting citations

The changing science of machine learning. Pat Langley. Machine Learning 2011.
