Deep Belief Networks

You are currently browsing the archive for the Deep Belief Networks category.

Assorted Links Feb 2014

February 19, 2014 in Assorted Links, Deep Belief Networks, General ML by hundalhh | Permalink

Enjoying John Baez’s blog Azimuth. Especially the posts on good research practices and an older post on levels of mathematical understanding.
García-Pérez, Serrano, and Boguñá wrote a cool paper on primes, probability, and integers as a bipartite network.
Loved the idea behind the game theoretical book “Survival of the Nicest” (see Yes Magazine for a two page introduction).
Scott Young is learning Chinese quickly.
Cyber warriors to the rescue.
Mao, Fluxx, and Douglas Hofstadter‘s Nomic are fun games.
Healy and Caudell are applying category theory to semantic and neural networks.
Some MOOCs for data science and machine learning.
Here an old but good free online course on the Computational Complexity of Machine Learning.
Great TeX graphics.
Watch this Ted Video to learn anything in 20 hours (YMMV).
Where are all the Steeler fans? Cowboy fans? ….
Productivity Hints.
Copper + Magnets = Fun
Stray dogs on the subway.
Deep learning on NPR.
Happy 40th birthday D&D
Google is applying deep learning to images of house numbers
Deep learning in your browser.
How to write a great research paper.
Do Deep Nets Really Need to be Deep?
A variation on neural net dropout.
Provable algorithms for Machine Learning
100 Numpy Exercises
Learn and Practice Applied Machine Learning | Machine Learning Mastery

“Deep Learning 101″

December 14, 2013 in Deep Belief Networks, Graphical Models by hundalhh | Permalink

Check out Markus Beissinger’s blog post “Deep Learning 101″. Markus reviews a lot of deep learning basics derived from the papers “Representation Learning: A Review and New Perspectives” (Bengio, Courville, Vincen 2012) and “Deep Learning of Representations: Looking Forward” (Bengio 2013). Beissinger covers the following topics:

An easy intro to Deep Learing
The Current State of Deep Learing
Probabilistic Graphical Models
Principal Component Analysis
Restricted Boltzman Machines
Auto-Encoders
“Challenges Looking Ahead”

This is a great intro and I highly recommend it.

If you want more information, check out Ng’s lecture notes, Honglak Lee’s 2010 NIPS slides, and Hinton’s Videos ([2009] [2013]).

“13 NIPS Papers that caught our eye”

December 12, 2013 in Deep Belief Networks, General ML by hundalhh | 2 comments

Zygmunt Zając at fastml.com has a great post titled “13 NIPS Papers that caught our eye.” Zajac provides a short readable summary of his favorite NIPS 2013 papers. (NIPS 2013 which just ended last week.) The papers are:

Understanding Dropout by Baldi and Sadowski
Training and Analysing Deep Recurrent Neural Networks by Hermans and Schrauwen
RNADE: The real-valued neural autoregressive density-estimator by Uria, Murray, and Larochelle
Predicting Parameters in Deep Learning by Denil, Shakibi, Dinh, Ranzato, and Freitas
Pass-efficient unsupervised feature selection by Maung and Schweitzer
Multi-Prediction Deep Boltzmann Machines by Goodfellow, Mirza, Courville, and Bengio
Memoized Online Variational Inference for Dirichlet Process Mixture Models by Hughes, and Sudderth
Learning word embeddings efficiently with noise-contrastive estimation by Andriy , and Kavukcuoglu
Learning Stochastic Feedforward Neural Networks by Tang and Salakhutdinov
Distributed Representations of Words and Phrases and their Compositionality by Mikolov, Sutskever, Chen, Corrado, and Dean
Correlated random features for fast semi-supervised learning by McWilliams, Balduzzi, and Buhmann
Convex Two-Layer Modeling by Aslan, Cheng, Zhang, and Schuurmans, and
Approximate inference in latent Gaussian-Markov models from continuous time observations by Cseke, Opper, and Sanguinetti

I suggest reading Zajac’s summaries before diving in.

Linear-Nonlinear-Poisson Neurons

July 16, 2013 in Deep Belief Networks, Neural Nets by hundalhh | Permalink

The linear-nonlinear-Poisson (LNP) cascade model is a standard model of neuron responses. Louis Shao has recently shown that an artificial neural net consisting of LNP neurons can simulate any Boltzmann machine and perform “a semi-stochastic Bayesian inference algorithm lying between Gibbs sampling and variational inference.” In his paper he notes that the “properties of visual area V2 are found to be comparable to those on the sparse autoencoder networks [3]; the sparse coding learning algorithm [4] is originated directly from neuroscience observations; also psychological phenomenon such as end-stopping is observed in sparse coding experiments [5].”

“Deep Support Vector Machines for Regression Problems”

July 9, 2013 in Deep Belief Networks, Sparsity, Support Vector Machines by hundalhh | Permalink

Nuit Blanche‘s article “The Summer of the Deeper Kernels” references the two page paper “Deep Support Vector Machines for Regression Problems” by Schutten, Meijster, and Schomaker (2013).

The deep SMV is a pretty cool idea. A normal support vector machine (SVM) classifier, finds $\alpha_i$ such that

$f(x) = \sum_i \alpha_i K(x_i, x)$ is positive for one class of $x_i$ and negative for the other class (sometimes allowing exceptions). ($K(x,y)$ is called the kernel function which is in the simplest case just the dot product of $x$ and $y$.) SVM’s are great because they are fast and the solution is sparse (i.e. most of the $\alpha_i$ are zero).

Schutten, Meijster, and Schomaker apply the ideas of deep neural nets to SVMs.

They construct $d$ SVMs of the form

$f_a(x) = \sum_i \alpha_i(a) K(x_i, x)+b_a$

and then compute a more complex two layered SVM

$g(x) = \sum_i \alpha_i K(f(x_i), f(x))+b$

where $f(x) = (f_1(x), f_2(x), \ldots, f_d(x))$. They use a simple gradient descent algorithm to optimize the alphas and obtain numerical results on ten different data sets comparing the mean squared error to a standard SVM.

Video: “Deep Belief Networks for Speech”

April 16, 2013 in Deep Belief Networks by hundalhh | Permalink

Here’s a pretty cool video by Alex Acero (Microsoft) titled

$\ $

“Deep Belief Networks for Speech“.

$\ $

Check out minutes 47 to 50 where he says that the deep belief network approach created a 30% improvement over the state of the art speech recognition systems.

“Resurgence in Neural Networks”

April 6, 2013 in Deep Belief Networks, Neural Nets by hundalhh | Permalink

T Jake Luciani wrote a nice, easy to read blog post on the recent developments in neural networks.

“Autoencoders, MDL, and Helmholtz Free Energy”

April 4, 2013 in Clustering, Deep Belief Networks, Information Theory, Neural Nets, Sparsity by hundalhh | Permalink

In “Autoencoders, MDL, and Helmholtz Free Energy“, Hinton and Zemel (2001) use Minimum Description Length as an objective function for formulating generative and recognition weights for an autoencoding neural net. They develop a stochastic Vector Quantization method very similar to mixture of Gaussians where each input vector is encoded with

$$E_i = – \log \pi_i – k \log t + {k\over2} \log 2 \pi \sigma^2 + {{d^2} \over{2\sigma^2}}$$

nats (1 nat = 1/log(2) bits = 1.44 bits) where $t$ is the quantization width, $d$ is the Mahalanobis distance to the mean of the Gaussian, $k$ is the dimension of the input space, $\pi_i$ is the weight of the $i$th Gaussian. They call this the “energy” of the code. Encoding only using this scheme wastes bits because, for example, there may be vectors that are equally distant from two Gaussian. The amount wasted is

$$H = -\sum p_i \log p_i$$

where $p_i$ the probability that the code will be assigned to the $i$th Gaussian. So the “true” expected description length is

$$F = \sum_i p_i E_i – H$$

which “has exactly the form of the Helmholtz free energy.” This free energy is minimized by setting

$$p_i = {{e^{-E_i}}\over{\sum_j e^{-E_j}}}.$$

In order to make computation practical, they recommend using a suboptimal distributions “as a Lyapunov function for learning” (see Neal and Hinton 1993). They apply their method to learn factorial codes.

“Temporal Autoencoding Restricted Boltzmann Machine”

February 22, 2013 in Deep Belief Networks by hundalhh | Permalink

In “Temporal Autoencoding Restricted Boltzmann Machine“, Hausler and Susemihl explain how to train a deep belief RBM to learn to recognize patterns in sequences of inputs (mostly video). The resulting networks could recognize the patterns in human motion capture or the non-linear dynamics of a bouncing ball.

Bengio LeCun Deep Learning Video

February 16, 2013 in Deep Belief Networks, Neural Nets by hundalhh | Permalink

Bengio and Lecun created this wonderful video on Deep Neural Networks. Any logical function can be represented by a neural net with 3 layers (one hidden, see e.g. CNF), however simple 4 level logical functions with a small number of nodes may require a large number of nodes in a 3 layer representation. They point to theorems that show that the number of nodes required to represent a k level logical function can require an exponential number of nodes in a k-1 level network. They go on to explain denoising auto encoders for the training of deep neural nets.

« Older entries § Newer entries »

Artificial Intelligence Blog