Machine Learning that Matters. Kiri L. Wagstaff. ICML 2012
- Key message: An analysis of what ails ML research today, especially with respect to its impact on real-life problems
- Comments on empirical analysis
- Needed: domain interpretation of reported results
- Which classes were well-classified; which were not
- What are the common error types
- Why particular data sets were chosen
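One way to report which classes were well classified and which error types dominate is a per-class breakdown like the minimal sketch below. The class names and predictions are invented purely for illustration; they do not come from the paper.

```python
from collections import Counter

# Hypothetical ground truth and predictions (made-up data for illustration).
y_true = ["cat", "cat", "dog", "dog", "dog", "bird", "bird", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "cat", "bird", "dog", "dog"]

# Per-class recall: of the items truly in class c, how many were labeled c?
totals = Counter(y_true)
correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
per_class_recall = {c: correct[c] / totals[c] for c in totals}

# Error types: count each (true class, predicted class) confusion.
errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
most_common_error = errors.most_common(1)[0]

for c, r in per_class_recall.items():
    print(f"{c}: recall = {r:.2f}")
print("most common error (true, predicted), count:", most_common_error)
```

A breakdown like this makes it visible when a single confusion (here, one class being mistaken for another) accounts for most of the errors, which an aggregate score would hide.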
- Metrics
- Instead of domain-independent metrics like accuracy or F-measure, domain-specific metrics might shed more light
- For example, in mushroom classification, 80% accuracy might be good enough for botany, but deciding whether a mushroom is safe to eat requires more than 99%.
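The mushroom example can be made concrete with a small sketch that contrasts plain accuracy with a domain-specific metric. The labels below and the metric name `missed_poison_rate` are assumptions made for illustration; the paper does not prescribe this particular metric.

```python
# Hypothetical labels: 1 = poisonous, 0 = edible (made-up data for illustration).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def missed_poison_rate(y_true, y_pred):
    """Fraction of poisonous mushrooms that were predicted edible."""
    poisonous = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    missed = sum(1 for _, p in poisonous if p == 0)
    return missed / len(poisonous)

acc = accuracy(y_true, y_pred)
fnr = missed_poison_rate(y_true, y_pred)
print(f"accuracy = {acc:.2f}")
print(f"missed-poison rate = {fnr:.2f}")
```

On this toy data the classifier is 80% accurate, yet it labels one in four poisonous mushrooms as edible, which is the number someone deciding what to eat actually cares about.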
- Don't just compare the performance of algorithms; analyze
- why each algorithm performs well
- how domain characteristics affect the results
- Threshold ablation
- Also discuss which threshold ranges or performance regimes are relevant to the domain
- Do not summarize over all regimes, especially those irrelevant to the domain
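As a rough sketch of evaluating only within a domain-relevant regime, the code below sweeps decision thresholds and reports the best true positive rate among operating points whose false positive rate stays under a bound. The scores, labels, and the bound `MAX_FPR` are hypothetical; in practice the bound would come from the domain.

```python
def roc_points(scores, labels):
    """Return (threshold, fpr, tpr) for each distinct score used as a threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        # Everything scoring at or above the threshold is predicted positive.
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((thr, fp / neg, tp / pos))
    return points

def best_tpr_within(points, max_fpr):
    """Best true positive rate among operating points with fpr <= max_fpr."""
    ok = [tpr for _, fpr, tpr in points if fpr <= max_fpr]
    return max(ok, default=0.0)

# Hypothetical classifier scores and true labels (1 = positive).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

MAX_FPR = 0.25  # assumed domain constraint on tolerable false positives

points = roc_points(scores, labels)
tpr_in_regime = best_tpr_within(points, MAX_FPR)
print(f"best TPR with FPR <= {MAX_FPR}: {tpr_in_regime:.2f}")
```

Reporting a figure like this, rather than a summary over every threshold, keeps operating points the domain would never use from influencing the comparison.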
- Comments on impact
- Take the method all the way through, to deployment
- "What matters is achieving performance sufficient to make an impact on the world. As an analogy, consider a sick child in a rural setting. A neighbor who runs two miles to fetch the doctor need not achieve Olympic-level running speed (performance), so long as the doctor arrives in time to address the sick child’s needs (impact)."
- The proposed solution may be complex internally, but it should be easy to use externally; i.e., a layperson should be able to apply it to their problem without knowing much about ML.
- Interesting citations
- The changing science of machine learning. Pat Langley. Machine Learning 2011.