Timothy Chklovski at the Factual Blog has a cool list of 5 principles for applying machine learning techniques. His data-centric principles are:
- Don’t Ignore the Corners – The “corners” are the unusual cases in the data.
- Be Attentive to the Boundaries – If you use a linear discriminant or decision tree, pay special attention to boundary cases.
- Spend Time on Special Cases – i.e. special cases in the data.
- Listen to the Data
- Love Your Data
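To make the "boundaries" point concrete, here is a minimal sketch of one way to pull out the cases that sit close to a linear discriminant's decision boundary so they can be reviewed by hand. It is not from the original post: the function name `boundary_cases`, the weights, the bias, the margin, and the sample points are all made up for illustration.

```python
import math

def boundary_cases(points, weights, bias, margin):
    """Return (distance, point) pairs within `margin` of the line w.x + b = 0.

    Hypothetical helper: `weights` and `bias` stand in for an already
    fitted linear discriminant; they are not learned here.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    near = []
    for p in points:
        # Raw discriminant score; its sign is the predicted class.
        score = sum(w * x for w, x in zip(weights, p)) + bias
        # Perpendicular distance from the point to the boundary.
        distance = abs(score) / norm
        if distance <= margin:
            near.append((distance, p))
    # Closest (most ambiguous) cases first.
    return sorted(near)

# Illustrative data only: two points sit near the line x + y = 1.
points = [(0.0, 0.0), (1.0, 1.0), (0.45, 0.5), (0.6, 0.5), (2.0, 2.0)]
review = boundary_cases(points, weights=(1.0, 1.0), bias=-1.0, margin=0.1)

for distance, point in review:
    print(f"{point} is {distance:.3f} from the boundary")
```

The points returned are the ones where a small change in the data or the model would flip the prediction, which is why they are worth a closer look.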
Jack Coughlin, also at the Factual Blog, adds this list:
- Ask for help first.
- The documentation is your best friend.
- Know the ecosystem. (Python, Java/Hadoop/Weka, R, MATLAB, …)
- Machine Learning applications are mostly the boring stuff. “The majority of the effort is in pre-processing.”
- Save the ML for the problems you can’t think to solve in any other way.
- Coding in R makes you feel like a ninja. “The R core library is full of awesome one-liners ….”