“Detecting Adversarial Advertisements in the Wild”

January 20, 2013 in General ML by hundalhh | Permalink

David Andrzejewski at Bayes’ Cave wrote up a nice summary of practical machine learning advice from the KDD 2011 paper “Detecting Adversarial Advertisements in the Wild”. I’ve quoted below several of the main points from David’s summary:

ABE: Always Be Ensemble-ing
Throw a ton of features at the model and let L1 sparsity figure it out
Map features with the “hashing trick”
Handle the class imbalance problem with ranking
Use a cascade of classifiers
Make sure the system “still works” as its inputs evolve over time
Make efficient use of expert effort
Allow humans to hard-code rules
Periodically use non-expert evaluations to make sure the system is working
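
For readers who haven’t seen the “hashing trick” mentioned in the list above, here is a minimal sketch of the idea in Python. It is only an illustration and not the paper’s implementation: the function name `hash_features`, the bucket count, the choice of SHA-256, and the example tokens are all assumptions made for this example. The point is that each string feature is hashed straight to an index in a fixed-length vector, so no vocabulary has to be stored and previously unseen features still land somewhere.

```python
import hashlib


def hash_features(tokens, n_buckets=16):
    """Map a list of string features to a fixed-length vector (hypothetical helper).

    Each token is hashed to a bucket index; a second part of the same digest
    picks a +1/-1 sign so that colliding features tend to cancel rather than
    always add up.
    """
    vec = [0.0] * n_buckets
    for tok in tokens:
        digest = hashlib.sha256(tok.encode("utf-8")).digest()
        # The first 8 bytes of the digest choose the bucket.
        idx = int.from_bytes(digest[:8], "big") % n_buckets
        # One further byte chooses the sign.
        sign = 1.0 if digest[8] % 2 == 0 else -1.0
        vec[idx] += sign
    return vec


if __name__ == "__main__":
    # Made-up features for a single ad; the names are illustrative only.
    ad_tokens = ["word=free", "word=pills", "domain=example.com"]
    print(hash_features(ad_tokens, n_buckets=8))
```

The vector length stays at `n_buckets` no matter how many distinct features show up, which is what makes it practical to throw a very large feature set at a linear model and let L1 regularization prune it. The cost is that distinct features can collide in the same bucket, so the bucket count is a trade-off between memory and collision rate.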