Support Vector Machines


Nuit Blanche's article “The Summer of the Deeper Kernels” references the two-page paper “Deep Support Vector Machines for Regression Problems” by Schutten, Meijster, and Schomaker (2013).

The deep SVM is a pretty cool idea.  A normal support vector machine (SVM) classifier finds $\alpha_i$ such that

$f(x) = \sum_i \alpha_i K(x_i, x)$ is positive for inputs $x$ in one class and negative for inputs in the other class (sometimes allowing exceptions).  ($K(x,y)$ is called the kernel function; in the simplest case it is just the dot product of $x$ and $y$.)  SVMs are great because they are fast and the solution is sparse (i.e., most of the $\alpha_i$ are zero).
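
To make the notation concrete, here is a tiny numpy sketch of the decision function $f$ above; the support vectors and alphas below are made up for illustration rather than learned from data:

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; the simplest choice would be the dot product np.dot(x, y)."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def decision_function(x, support_vectors, alphas, b=0.0, kernel=rbf_kernel):
    """f(x) = sum_i alpha_i K(x_i, x) + b; the sign of f(x) gives the class."""
    return sum(a * kernel(xi, x) for a, xi in zip(alphas, support_vectors)) + b

# Toy example: two support vectors with opposite-signed alphas.
support_vectors = np.array([[1.0, 1.0], [-1.0, -1.0]])
alphas = np.array([1.0, -1.0])  # in a trained SVM most alphas are zero (sparsity)

print(decision_function(np.array([0.9, 1.1]), support_vectors, alphas))   # positive
print(decision_function(np.array([-1.0, -0.8]), support_vectors, alphas)) # negative
```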

Schutten, Meijster, and Schomaker apply the ideas of deep neural nets to SVMs.

They construct $d$ SVMs of the form

$f_a(x) = \sum_i \alpha_i(a) K(x_i, x)+b_a, \qquad a = 1, \ldots, d,$

and then compute a more complex two-layer SVM

$g(x) = \sum_i \alpha_i  K(f(x_i), f(x))+b$

where $f(x) = (f_1(x), f_2(x), \ldots, f_d(x))$.  They use a simple gradient descent algorithm to optimize the alphas and obtain numerical results on ten different data sets, comparing the mean squared error against that of a standard SVM.
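
To make the architecture concrete, here is a minimal numpy sketch of such a two-layer kernel machine trained by gradient descent on a toy regression problem.  It omits the biases $b_a$ and $b$, uses a crude finite-difference gradient for the first-layer alphas, and is not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_gram(A, B, gamma=0.5):
    """Gram matrix K[i, j] = exp(-gamma * ||A_i - B_j||^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Toy 1-D regression problem: y = sin(x), n training points, d first-layer SVMs.
n, d, lr, eps = 30, 3, 0.05, 1e-4
X = rng.uniform(-3, 3, (n, 1))
y = np.sin(X[:, 0])

K1 = rbf_gram(X, X)                        # first-layer kernel on the raw inputs
A = 0.01 * rng.standard_normal((n, d))     # alpha_i(a) for a = 1..d
beta = 0.01 * rng.standard_normal(n)       # second-layer alphas

def loss(A, beta):
    F = K1 @ A                             # F[i, a] = f_a(x_i)
    K2 = rbf_gram(F, F)                    # second-layer kernel on learned features
    return 0.5 * np.mean((K2 @ beta - y) ** 2), K2

print("initial MSE:", 2 * loss(A, beta)[0])
for step in range(200):
    base, K2 = loss(A, beta)
    grad_beta = K2 @ (K2 @ beta - y) / n   # analytic gradient for beta
    grad_A = np.zeros_like(A)
    for a in range(d):                     # finite-difference gradient for A
        for i in range(n):
            A[i, a] += eps
            grad_A[i, a] = (loss(A, beta)[0] - base) / eps
            A[i, a] -= eps
    beta -= lr * grad_beta
    A -= lr * grad_A
print("final MSE:", 2 * loss(A, beta)[0])
```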

In the seminal paper “Gene Selection for Cancer Classification using Support Vector Machines”, Guyon, Weston, Barnhill, and Vapnik (2002) use Recursive Feature Elimination to find the genes that are most predictive of cancer. Recursive Feature Elimination repeatedly ranks the features and eliminates the worst feature until only a small subset of the original set of features remains. Although several feature ranking methods were explored, the main method was a soft-margin SVM classifier, with which the authors found 8 key colon cancer genes out of 7000.
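
As a rough illustration (not the authors' original code), here is how SVM-based Recursive Feature Elimination looks with scikit-learn's RFE, using synthetic data as a stand-in for the gene-expression matrix:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Synthetic stand-in for gene-expression data: many features, few informative.
X, y = make_classification(n_samples=60, n_features=500, n_informative=8,
                           random_state=0)

# Linear soft-margin SVM; RFE drops the lowest-ranked features each round
# (here 10% per step) until only 8 remain.
selector = RFE(estimator=SVC(kernel="linear", C=1.0),
               n_features_to_select=8, step=0.1)
selector.fit(X, y)
print("selected feature indices:",
      [i for i, kept in enumerate(selector.support_) if kept])
```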

Jacek Gondzio has some nice slides (2009) on interior point methods for large scale support vector machines.  He focuses on primal-dual logarithmic barrier methods (see e.g. Wright 1987) for soft-margin classification.  Great explanations, diagrams, and numerical results are provided.

Kristian Woodsend wrote his 2009 Ph.D. thesis on the same subject.  Woodsend applies interior point methods and low rank approximations of the SVM kernel to reduce the computational cost to order $n$, where $n$ is the number of data points.  He compares this approach to active set methods, gradient projection algorithms, and cutting-plane algorithms and concludes with numerical results.
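
Woodsend's actual solver is an interior point method; the sketch below illustrates only the low rank kernel idea that makes cost linear in $n$ possible, using scikit-learn's Nystroem approximation feeding a linear SVM (the dataset and rank are made up for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Synthetic data standing in for a large training set.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# Rank-100 Nystroem approximation of the RBF kernel followed by a linear SVM:
# the approximate feature map costs O(n) to apply, so training scales linearly
# in the number of data points rather than quadratically in the kernel matrix.
model = make_pipeline(
    Nystroem(kernel="rbf", n_components=100, random_state=0),
    LinearSVC(),
)
model.fit(X, y)
print("training accuracy:", model.score(X, y))
```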