With the reinvigoration of neural networks in the 2000s, deep learning has become an extremely active area of research, one that’s paving the way for modern machine learning. In this practical book, author Nikhil Buduma provides examples and clear explanations to guide you through major concepts of this complicated field.
Companies such as Google, Microsoft, and Facebook are actively growing in-house deep-learning teams. For the rest of us, however, deep learning is still a pretty complex and difficult subject to grasp. If you’re familiar with Python, and have a background in calculus, along with a basic understanding of machine learning, this book will get you started.
Examine the foundations of machine learning and neural networks
Learn how to train feed-forward neural networks
Use TensorFlow to implement your first neural network
Manage problems that arise as you begin to make networks deeper
Build neural networks that analyze complex images
Perform effective dimensionality reduction using autoencoders
Dive deep into sequence analysis to examine language
Understand the fundamentals of reinforcement learning
Fundamentals of Deep Learning: Designing Next-Generation Machine Intelligence Algorithms
Nikhil Buduma is a computer science student at MIT with deep interests in machine learning and the biomedical sciences. He is a two-time gold medalist at the International Biology Olympiad, a student researcher, and a “hacker.” He was selected as a finalist in the 2012 International BioGENEius Challenge for his research on the pertussis vaccine, and served as the lab manager of the Veregge Lab at San Jose State University at the age of 16. At age 19, he had a first-author publication on using protist models for high-throughput drug screening with flow cytometry. Nikhil also has a passion for education: he regularly writes technical posts on his blog, teaches machine learning tutorials at hackathons, and recently received the Young Innovator Award from the Gordon and Betty Moore Foundation for re-envisioning the traditional chemistry set using augmented reality.
Table of Contents
Preface

1 The Neural Network
  Building Intelligent Machines
  The Limits of Traditional Computer Programs
  The Mechanics of Machine Learning
  The Neuron
  Expressing Linear Perceptrons as Neurons
  Feed-Forward Neural Networks
  Linear Neurons and Their Limitations
  Sigmoid, Tanh, and ReLU Neurons
  Softmax Output Layers
  Looking Forward

2 Training Feed-Forward Neural Networks
  The Fast-Food Problem
  Gradient Descent
  The Delta Rule and Learning Rates
  Gradient Descent with Sigmoidal Neurons
  The Backpropagation Algorithm
  Stochastic and Minibatch Gradient Descent
  Test Sets, Validation Sets, and Overfitting
  Preventing Overfitting in Deep Neural Networks
  Summary

3 Implementing Neural Networks in TensorFlow
  What Is TensorFlow?
  How Does TensorFlow Compare to Alternatives?
  Installing TensorFlow
  Creating and Manipulating TensorFlow Variables
  TensorFlow Operations
  Placeholder Tensors
  Sessions in TensorFlow
  Navigating Variable Scopes and Sharing Variables
  Managing Models over the CPU and GPU
  Specifying the Logistic Regression Model in TensorFlow
  Logging and Training the Logistic Regression Model
  Leveraging TensorBoard to Visualize Computation Graphs and Learning
  Building a Multilayer Model for MNIST in TensorFlow
  Summary

4 Beyond Gradient Descent
  The Challenges with Gradient Descent
  Local Minima in the Error Surfaces of Deep Networks
  Model Identifiability
  How Pesky Are Spurious Local Minima in Deep Networks?
  Flat Regions in the Error Surface
  When the Gradient Points in the Wrong Direction
  Momentum-Based Optimization
  A Brief View of Second-Order Methods
  Learning Rate Adaptation
  AdaGrad: Accumulating Historical Gradients
  RMSProp: Exponentially Weighted Moving Average of Gradients
  Adam: Combining Momentum and RMSProp
  The Philosophy Behind Optimizer Selection
  Summary

5 Convolutional Neural Networks
  Neurons in Human Vision
  The Shortcomings of Feature Selection
  Vanilla Deep Neural Networks Don't Scale
  Filters and Feature Maps
  Full Description of the Convolutional Layer
  Max Pooling
  Full Architectural Description of Convolution Networks
  Closing the Loop on MNIST with Convolutional Networks
  Image Preprocessing Pipelines Enable More Robust Models
  Accelerating Training with Batch Normalization
  Building a Convolutional Network for CIFAR-10
  Visualizing Learning in Convolutional Networks
  Leveraging Convolutional Filters to Replicate Artistic Styles
  Learning Convolutional Filters for Other Problem Domains
  Summary

6 Embedding and Representation Learning
  Learning Lower-Dimensional Representations
  Principal Component Analysis
  Motivating the Autoencoder Architecture
  Implementing an Autoencoder in TensorFlow
  Denoising to Force Robust Representations
  Sparsity in Autoencoders
  When Context Is More Informative than the Input Vector
  The Word2Vec Framework
  Implementing the Skip-Gram Architecture
  Summary

7 Models for Sequence Analysis
  Analyzing Variable-Length Inputs
  Tackling seq2seq with Neural N-Grams
  Implementing a Part-of-Speech Tagger
  Dependency Parsing and SyntaxNet
  Beam Search and Global Normalization
  A Case for Stateful Deep Learning Models
  Recurrent Neural Networks
  The Challenges with Vanishing Gradients
  Long Short-Term Memory (LSTM) Units
  TensorFlow Primitives for RNN Models
  Implementing a Sentiment Analysis Model
  Solving seq2seq Tasks with Recurrent Neural Networks
  Augmenting Recurrent Networks with Attention
  Dissecting a Neural Translation Network
  Summary

8 Memory Augmented Neural Networks
  Neural Turing Machines
  Attention-Based Memory Access
  NTM Memory Addressing Mechanisms
  Differentiable Neural Computers
  Interference-Free Writing in DNCs
  DNC Memory Reuse
  Temporal Linking of DNC Writes
  Understanding the DNC Read Head
  The DNC Controller Network
  Visualizing the DNC in Action
  Implementing the DNC in TensorFlow
  Teaching a DNC to Read and Comprehend
  Summary

9 Deep Reinforcement Learning
  Deep Reinforcement Learning Masters Atari Games
  What Is Reinforcement Learning?
  Markov Decision Processes (MDP)
  Policy
  Future Return
  Discounted Future Return
  Explore Versus Exploit
  Policy Versus Value Learning
  Policy Learning via Policy Gradients
  Pole-Cart with Policy Gradients
  OpenAI Gym
  Creating an Agent
  Building the Model and Optimizer
  Sampling Actions
  Keeping Track of History
  Policy Gradient Main Function
  PGAgent Performance on Pole-Cart
  Q-Learning and Deep Q-Networks
  The Bellman Equation
  Issues with Value Iteration
  Approximating the Q-Function
  Deep Q-Network (DQN)
  Training DQN
  Learning Stability
  Target Q-Network
  Experience Replay
  From Q-Function to Policy
  DQN and the Markov Assumption
  DQN's Solution to the Markov Assumption
  Playing Breakout with DQN
  Building Our Architecture
  Stacking Frames
  Setting Up Training Operations
  Updating Our Target Q-Network
  Implementing Experience Replay
  DQN Main Loop
  DQNAgent Results on Breakout
  Improving and Moving Beyond DQN
  Deep Recurrent Q-Networks (DRQN)
  Asynchronous Advantage Actor-Critic Agent (A3C)
  Unsupervised REinforcement and Auxiliary Learning (UNREAL)
  Summary

Index