Rumelhart, Hinton, Williams. Learning internal representations by error propagation.

Blum, Rivest. Training a 3-node neural network is NP-complete.

** Basic energy-based models **

Hopfield. Neural networks and physical systems with emergent collective computational abilities.

Hinton, Sejnowski. Learning and relearning in Boltzmann machines.

Freund, Haussler. Unsupervised learning of distributions on binary vectors using two layer networks.

Long, Servedio. Restricted Boltzmann machines are hard to approximately evaluate or simulate.

** Independent component analysis **

Bell, Sejnowski. The independent components of natural scenes are edge filters.

Lewicki. Efficient coding of natural sounds.

Comon. Independent component analysis, a new concept?

** Sparse coding **

Olshausen, Field. Emergence of simple-cell receptive field properties by learning a sparse code for natural images.

Simoncelli, Olshausen. Natural image statistics and neural representation.

Rozell, Johnson, Baraniuk, Olshausen. Sparse coding via thresholding and local competition in neural circuits.

** Gibbs sampling **

Geman, Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images.

** Deep networks: architectures and algorithms **

Hinton, Osindero, Teh. A fast learning algorithm for deep belief nets.

Hinton, Salakhutdinov. Reducing the dimensionality of data with neural networks.

Ranzato, Boureau, LeCun. Sparse feature learning for deep belief networks.

Shan, Zhang, Cottrell. Recursive ICA.

Valiant. Memorization and association on a realistic neural model.

Coates, Ng. The importance of encoding versus training with sparse coding and vector quantization.

Coates, Lee, Ng. An analysis of single-layer networks in unsupervised feature learning.

** Deep networks: applications **

Collobert, Weston. A unified architecture for natural language processing.

Ngiam, Chen, Koh, Ng. Learning deep energy models.

Jarrett, Kavukcuoglu, Ranzato, LeCun. What is the best multi-stage architecture for object recognition?

Socher, Lin, Ng, Manning. Parsing natural scenes and natural language with recursive neural networks.

** Deep architectures in the brain **

Serre, Kreiman, Kouh, Cadieu, Knoblich, Poggio. A quantitative theory of immediate visual recognition.

** Circuit complexity of deep networks **

Håstad, Goldmann. On the power of small-depth threshold circuits.