In the seminal paper “Gene Selection for Cancer Classification using Support Vector Machines”, Guyon, Weston, Barnhill, and Vapnik (2002) use Recursive Feature Elimination to find the genes that are most predictive of cancer. Recursive Feature Elimination repeatedly ranks the features and eliminates the lowest-ranked feature until only a small subset of the original features remains. Although several feature ranking methods were explored, the main method was a soft margin SVM classifier, with which the authors found 8 key colon cancer genes out of 7000.
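The rank-and-eliminate loop is easy to sketch. The code below is a minimal illustration in plain Python, not the authors' implementation: `train_linear_svm` is a small subgradient-descent stand-in for a real soft margin SVM solver, features are ranked by squared weight, and `make_toy_data` is a hypothetical data generator invented for this example. The hyperparameters (`lam`, `lr`, `epochs`) are arbitrary assumptions.

```python
import random


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=50):
    """Fit a linear soft-margin classifier by subgradient descent on the hinge loss.

    A stand-in for a real SVM solver, kept tiny for illustration.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Shrink the weights (the L2 regularization term).
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge loss contributes only when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def recursive_feature_elimination(X, y, n_keep):
    """Return the indices of the n_keep features that survive elimination."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        # Retrain on the surviving features only.
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # Rank by squared weight and drop the lowest-ranked feature.
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active


def make_toy_data(n_samples=40, seed=0):
    """Hypothetical data: features 0 and 3 carry the label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = 1 if i % 2 == 0 else -1
        row = [rng.uniform(-0.1, 0.1) for _ in range(6)]
        row[0] = 2.0 * label
        row[3] = 1.0 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
print(recursive_feature_elimination(X, y, n_keep=2))
```

Retraining after every elimination is what separates this from a one-shot ranking: the weights of the surviving features are re-estimated each round. With thousands of genes, a practical version would probably drop features in chunks rather than one at a time.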
David Andrzejewski at Bayes’ Cave wrote up a nice summary of practical machine learning advice from the KDD 2011 paper “Detecting Adversarial Advertisements in the Wild”. I’ve quoted below several of the main points from David’s summary:
- ABE: Always Be Ensemble-ing
- Throw a ton of features at the model and let L1 sparsity figure it out
- Map features with the “hashing trick”
- Handle the class imbalance problem with ranking
- Use a cascade of classifiers
- Make sure the system “still works” as its inputs evolve over time
- Make efficient use of expert effort
- Allow humans to hard-code rules
- Periodically use non-expert evaluations to make sure the system is working
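For readers who haven’t met the “hashing trick” from the list above, here is a minimal sketch of the general idea, not the paper’s system: each feature name is hashed straight to an index in a fixed-length vector, so no feature dictionary has to be stored. The function name `hash_features`, the bucket count, and the use of MD5 are my own assumptions for illustration.

```python
import hashlib


def hash_features(tokens, n_buckets=16):
    """Map an arbitrary bag of string features into a fixed-length vector."""
    vec = [0.0] * n_buckets
    for tok in tokens:
        # hashlib gives a hash that is stable across runs; Python's built-in
        # hash() for strings is randomized per process by default.
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h % n_buckets
        # Use one extra bit as a sign so that colliding features
        # tend to cancel rather than always pile up.
        sign = 1.0 if (h >> 63) == 0 else -1.0
        vec[index] += sign
    return vec


print(hash_features(["price", "free", "free", "click-here"], n_buckets=8))
```

The vector length stays fixed no matter how many distinct features show up, which is what makes it practical to throw a ton of features at the model. The cost is that unrelated features occasionally collide in the same bucket.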
The core of Kurzweil’s theory is that the brain is made up of pattern processing units, each composed of around 100 neurons, and he suggests that the brain can be understood and simulated primarily by looking at how these lego-like building blocks are interconnected.
NIPS was pretty fantastic this year. There were a number of breakthroughs in the areas that interest me most: Markov Decision Processes, Game Theory, Multi-Armed Bandits, and Deep Belief Networks. Here is the list of papers, workshops, and presentations I found the most interesting or potentially useful:
- Representation, Inference and Learning in Structured Statistical Models
- Stochastic Search and Optimization
- Quantum information and the Brain
- Relax and Randomize: From Value to Algorithms (Great)
- Classification with Deep Invariant Scattering Networks
- Discriminative Learning of Sum-Product Networks
- On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes
- A Unifying Perspective of Parametric Policy Search Methods for Markov Decision Processes
- Regularized Off-Policy TD-Learning
- Multi-Stage Multi-Task Feature Learning
- Graphical Models via Generalized Linear Models (Great)
- No voodoo here! Learning discrete graphical models via inverse covariance estimation (Great)
- Gradient Weights help Nonparametric Regressors
- Dropout: A simple and effective way to improve neural networks (Great)
- Efficient Monte Carlo Counterfactual Regret Minimization in Games with Many Player Actions
- A Better Way to Pre-Train Deep Boltzmann Machines
- Bayesian Optimization and Decision Making
- Practical Bayesian Optimization of Machine Learning Algorithms
- Modern Nonparametric Methods in Machine Learning
- Deep Learning and Unsupervised Feature Learning
Timothy Chklovski at Factual Blog has this cool list of 5 principles for Applying Machine Learning Techniques. His data-centric techniques are:
- Don’t Ignore the Corners – The “Corners” are unusual cases in the Data
- Be Attentive to the Boundaries – If you use a linear discriminant or decision tree, pay special attention to boundary cases.
- Spend Time on Special Cases – i.e. special cases in the data.
- Listen to the Data
- Love Your Data
- Ask for help first.
- The documentation is your best friend.
- Know the ecosystem. (Python, Java/Hadoop/Weka, R, Matlab, …)
- Machine Learning applications are mostly the boring stuff. “The majority of the effort is in pre-processing”
- Save the ML for the problems you can’t think to solve in any other way.
- Coding in R makes you feel like a ninja. “The R core library is full of awesome one-liners ….”
In “Machine Learning Techniques for Stock Prediction”, Vatsal H. Shah (2007) evaluates several machine learning techniques applied to stock market prediction. The techniques used are: support vector machines, linear regression, “prediction using decision stumps”, expert weighting, text data mining, and online learning (the code was from YALE/Weka). The main stock features used were the moving average, exponential moving average, rate of change, and relative strength index. He concludes with “Of all the Algorithms we applied, we saw that only Support Vector Machine combined with Boosting gave us satisfactory results.”
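The four stock features are simple to compute from a list of closing prices. The sketch below uses common textbook definitions, which may differ in detail from the ones in Shah’s paper: the EMA uses a smoothing factor of 2/(n+1) seeded with the first price, and the RSI uses plain averages of gains and losses rather than Wilder’s smoothing. The price series is made up for illustration.

```python
def sma(prices, n):
    """Simple moving average over a sliding window of n prices."""
    return [sum(prices[i - n + 1:i + 1]) / n for i in range(n - 1, len(prices))]


def ema(prices, n):
    """Exponential moving average with alpha = 2 / (n + 1), seeded with prices[0]."""
    alpha = 2.0 / (n + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1.0 - alpha) * out[-1])
    return out


def rate_of_change(prices, n):
    """Percentage change relative to the price n periods earlier."""
    return [100.0 * (prices[i] - prices[i - n]) / prices[i - n]
            for i in range(n, len(prices))]


def rsi(prices, n):
    """Relative strength index using simple averages of gains and losses."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    out = []
    for i in range(n, len(changes) + 1):
        window = changes[i - n:i]
        avg_gain = sum(c for c in window if c > 0) / n
        avg_loss = sum(-c for c in window if c < 0) / n
        # With no losses in the window the index saturates at 100.
        out.append(100.0 if avg_loss == 0 else 100.0 - 100.0 / (1.0 + avg_gain / avg_loss))
    return out


prices = [10, 11, 12, 11, 13]  # hypothetical closing prices
print(sma(prices, 3))
print(ema(prices, 3))
print(rate_of_change(prices, 2))
print(rsi(prices, 2))
```

Each function returns one value per day for which a full window is available, so the outputs have different lengths and need to be aligned on date before being fed to a classifier as features.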
In “A Review of Studies on Machine Learning Techniques”, Singh, Bhatia, and Sangwan (2007) comment on neural nets, self-organizing maps, case-based reasoning, classification trees (CART), rule induction, and genetic algorithms. They include a nice chart at the end of the article that could be quite useful for managers.