
Reinforcement Learning with TensorFlow : A Beginner's Guide to Designing Self-Learning Systems with TensorFlow and OpenAI Gym.

By:
Material type: Text
Publisher: Birmingham : Packt Publishing, Limited, 2018
Copyright date: ©2018
Edition: 1st ed.
Description: 1 online resource (327 pages)
Content type:
  • text
Media type:
  • computer
Carrier type:
  • online resource
ISBN:
  • 9781788830713
Subject(s):
Genre/Form:
Additional physical formats: Print version: Reinforcement Learning with TensorFlow
DDC classification:
  • 006.31
LOC classification:
  • Q325.6 .D888 2018
Online resources:
Contents:
Cover -- Title Page -- Copyright and Credits -- Packt Upsell -- Contributors -- Table of Contents -- Preface -- Chapter 1: Deep Learning - Architectures and Frameworks -- Deep learning -- Activation functions for deep learning -- The sigmoid function -- The tanh function -- The softmax function -- The rectified linear unit function -- How to choose the right activation function -- Logistic regression as a neural network -- Notation -- Objective -- The cost function -- The gradient descent algorithm -- The computational graph -- Steps to solve logistic regression using gradient descent -- What is xavier initialization? -- Why do we use xavier initialization? -- The neural network model -- Recurrent neural networks -- Long Short Term Memory Networks -- Convolutional neural networks -- The LeNet-5 convolutional neural network -- The AlexNet model -- The VGG-Net model -- The Inception model -- Limitations of deep learning -- The vanishing gradient problem -- The exploding gradient problem -- Overcoming the limitations of deep learning -- Reinforcement learning -- Basic terminologies and conventions -- Optimality criteria -- The value function for optimality -- The policy model for optimality -- The Q-learning approach to reinforcement learning -- Asynchronous advantage actor-critic -- Introduction to TensorFlow and OpenAI Gym -- Basic computations in TensorFlow -- An introduction to OpenAI Gym -- The pioneers and breakthroughs in reinforcement learning -- David Silver -- Pieter Abbeel -- Google DeepMind -- The AlphaGo program -- Libratus -- Summary -- Chapter 2: Training Reinforcement Learning Agents Using OpenAI Gym -- The OpenAI Gym -- Understanding an OpenAI Gym environment -- Programming an agent using an OpenAI Gym environment -- Q-Learning -- The Epsilon-Greedy approach -- Using the Q-Network for real-world applications -- Summary.
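
As a taste of the Q-learning and epsilon-greedy topics listed above for Chapter 2, the following is a minimal illustrative sketch (not code from the book) of a tabular agent on FrozenLake-v0. It assumes the classic gym package (pre-0.26 API, where reset() returns only the observation and step() returns four values); the hyperparameters alpha, gamma, and epsilon are arbitrary example choices.

# Minimal tabular Q-learning with an epsilon-greedy policy (illustrative sketch).
import numpy as np
import gym

env = gym.make("FrozenLake-v0")
q_table = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.8, 0.95, 0.1   # learning rate, discount, exploration rate

for episode in range(5000):
    state = env.reset()                  # pre-0.26 gym: reset() returns the observation
    done = False
    while not done:
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        next_state, reward, done, _ = env.step(action)
        # Q-learning update toward the bootstrapped (off-policy) target.
        target = reward + gamma * np.max(q_table[next_state]) * (not done)
        q_table[state, action] += alpha * (target - q_table[state, action])
        state = next_state

print("Q-values for the start state:", q_table[0])
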
Chapter 3: Markov Decision Process -- Markov decision processes -- The Markov property -- The S state set -- Actions -- Transition model -- Rewards -- Policy -- The sequence of rewards - assumptions -- The infinite horizons -- Utility of sequences -- The Bellman equations -- Solving the Bellman equation to find policies -- An example of value iteration using the Bellman equation -- Policy iteration -- Partially observable Markov decision processes -- State estimation -- Value iteration in POMDPs -- Training the FrozenLake-v0 environment using MDP -- Summary -- Chapter 4: Policy Gradients -- The policy optimization method -- Why policy optimization methods? -- Why stochastic policy? -- Example 1 - rock, paper, scissors -- Example 2 - state aliased grid-world -- Policy objective functions -- Policy Gradient Theorem -- Temporal difference rule -- TD(1) rule -- TD(0) rule -- TD(λ) rule -- Policy gradients -- The Monte Carlo policy gradient -- Actor-critic algorithms -- Using a baseline to reduce variance -- Vanilla policy gradient -- Agent learning pong using policy gradients -- Summary -- Chapter 5: Q-Learning and Deep Q-Networks -- Why reinforcement learning? -- Model based learning and model free learning -- Monte Carlo learning -- Temporal difference learning -- On-policy and off-policy learning -- Q-learning -- The exploration exploitation dilemma -- Q-learning for the mountain car problem in OpenAI gym -- Deep Q-networks -- Using a convolutional neural network instead of a single layer neural network -- Use of experience replay -- Separate target network to compute the target Q-values -- Advancements in deep Q-networks and beyond -- Double DQN -- Dueling DQN -- Deep Q-network for mountain car problem in OpenAI gym -- Deep Q-network for Cartpole problem in OpenAI gym -- Deep Q-network for Atari Breakout in OpenAI gym.
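
The Chapter 3 topics above (the Bellman equations, value iteration, and the FrozenLake-v0 environment) can be summarized in a few lines of code. This is a hedged sketch rather than the book's implementation: it assumes the classic gym package, where the toy-text environments expose their transition model as env.unwrapped.P, and the discount factor and convergence threshold are example values.

# Value iteration via the Bellman optimality backup (illustrative sketch).
import numpy as np
import gym

env = gym.make("FrozenLake-v0")
P = env.unwrapped.P                      # P[s][a] = [(prob, next_state, reward, done), ...]
n_states, n_actions = env.observation_space.n, env.action_space.n
gamma, theta = 0.99, 1e-8                # discount factor, convergence threshold
V = np.zeros(n_states)

while True:
    delta = 0.0
    for s in range(n_states):
        # V(s) <- max_a sum_{s'} p(s'|s,a) * [r + gamma * V(s')]
        q_values = [
            sum(p * (r + gamma * V[ns] * (not done)) for p, ns, r, done in P[s][a])
            for a in range(n_actions)
        ]
        best = max(q_values)
        delta = max(delta, abs(best - V[s]))
        V[s] = best
    if delta < theta:
        break

# Greedy policy extracted from the converged value function (0=left, 1=down, 2=right, 3=up).
policy = [
    int(np.argmax([
        sum(p * (r + gamma * V[ns] * (not done)) for p, ns, r, done in P[s][a])
        for a in range(n_actions)
    ]))
    for s in range(n_states)
]
print(np.array(policy).reshape(4, 4))
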
The Monte Carlo tree search algorithm -- Minimax and game trees -- The Monte Carlo Tree Search -- The SARSA algorithm -- SARSA algorithm for mountain car problem in OpenAI gym -- Summary -- Chapter 6: Asynchronous Methods -- Why asynchronous methods? -- Asynchronous one-step Q-learning -- Asynchronous one-step SARSA -- Asynchronous n-step Q-learning -- Asynchronous advantage actor critic -- A3C for Pong-v0 in OpenAI gym -- Summary -- Chapter 7: Robo Everything - Real Strategy Gaming -- Real-time strategy games -- Reinforcement learning and other approaches -- Online case-based planning -- Drawbacks to real-time strategy games -- Why reinforcement learning? -- Reinforcement learning in RTS gaming -- Deep autoencoder -- How is reinforcement learning better? -- Summary -- Chapter 8: AlphaGo - Reinforcement Learning at Its Best -- What is Go? -- Go versus chess -- How did Deep Blue defeat Garry Kasparov? -- Why is the game tree approach no good for Go? -- AlphaGo - mastering Go -- Monte Carlo Tree Search -- Architecture and properties of AlphaGo -- Energy consumption analysis - Lee Sedol versus AlphaGo -- AlphaGo Zero -- Architecture and properties of AlphaGo Zero -- Training process in AlphaGo Zero -- Summary -- Chapter 9: Reinforcement Learning in Autonomous Driving -- Machine learning for autonomous driving -- Reinforcement learning for autonomous driving -- Creating autonomous driving agents -- Why reinforcement learning? -- Proposed frameworks for autonomous driving -- Spatial aggregation -- Sensor fusion -- Spatial features -- Recurrent temporal aggregation -- Planning -- DeepTraffic - MIT simulator for autonomous driving -- Summary -- Chapter 10: Financial Portfolio Management -- Introduction -- Problem definition -- Data preparation -- Reinforcement learning -- Further improvements -- Summary -- Chapter 11: Reinforcement Learning in Robotics.
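
The SARSA algorithm for the mountain car problem, listed at the start of the section above, is the on-policy counterpart of the Q-learning update shown earlier. The sketch below is an assumption-laden illustration: the 20x20 state discretization, the hyperparameters, and the pre-0.26 gym API are example choices, not details taken from the book.

# Tabular SARSA on MountainCar-v0 with a coarse state discretization (illustrative sketch).
import numpy as np
import gym

env = gym.make("MountainCar-v0")
n_bins = 20
low, high = env.observation_space.low, env.observation_space.high

def discretize(obs):
    # Map the continuous (position, velocity) observation onto a 20x20 grid.
    ratios = (obs - low) / (high - low)
    return tuple(np.clip((ratios * n_bins).astype(int), 0, n_bins - 1))

q_table = np.zeros((n_bins, n_bins, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1

def epsilon_greedy(state):
    if np.random.rand() < epsilon:
        return env.action_space.sample()
    return int(np.argmax(q_table[state]))

for episode in range(2000):
    state = discretize(env.reset())
    action = epsilon_greedy(state)
    done = False
    while not done:
        next_obs, reward, done, _ = env.step(action)
        next_state = discretize(next_obs)
        next_action = epsilon_greedy(next_state)   # on-policy: choose the next action first
        # SARSA bootstraps on the action actually taken next, not on max_a Q(s', a).
        td_target = reward + gamma * q_table[next_state][next_action] * (not done)
        q_table[state][action] += alpha * (td_target - q_table[state][action])
        state, action = next_state, next_action
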
Reinforcement learning in robotics -- Evolution of reinforcement learning -- Challenges in robot reinforcement learning -- High dimensionality problem -- Real-world challenges -- Issues due to model uncertainty -- What's the final objective a robot wants to achieve? -- Open questions and practical challenges -- Open questions -- Practical challenges for robotic reinforcement learning -- Key takeaways -- Summary -- Chapter 12: Deep Reinforcement Learning in Ad Tech -- Computational advertising challenges and bidding strategies -- Business models used in advertising -- Sponsored-search advertisements -- Search-advertisement management -- Adwords -- Bidding strategies of advertisers -- Real-time bidding by reinforcement learning in display advertising -- Summary -- Chapter 13: Reinforcement Learning in Image Processing -- Hierarchical object detection with deep reinforcement learning -- Related works -- Region-based convolutional neural networks -- Spatial pyramid pooling networks -- Fast R-CNN -- Faster R-CNN -- You Only Look Once -- Single Shot Detector -- Hierarchical object detection model -- State -- Actions -- Reward -- Model and training -- Training specifics -- Summary -- Chapter 14: Deep Reinforcement Learning in NLP -- Text summarization -- Deep reinforced model for Abstractive Summarization -- Neural intra-attention model -- Intra-temporal attention on input sequence while decoding -- Intra-decoder attention -- Token generation and pointer -- Hybrid learning objective -- Supervised learning with teacher forcing -- Policy learning -- Mixed training objective function -- Text question answering -- Mixed objective and deep residual coattention for Question Answering -- Deep residual coattention encoder -- Mixed objective using self-critical policy learning -- Summary -- Appendix: Further topics in Reinforcement Learning.
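
For the Chapter 13 material above on hierarchical object detection, the reward signal for such an agent is commonly derived from the Intersection over Union (IoU) between the current bounding box and the ground truth. The helper below only illustrates that idea; the (x1, y1, x2, y2) box format, the sign-based reward, and the sample boxes are assumptions, not details taken from the book.

# Illustrative IoU computation of the kind used as a detection reward signal.
def iou(box_a, box_b):
    # Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1 (assumed format).
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return intersection / float(area_a + area_b - intersection)

def movement_reward(iou_before, iou_after):
    # A simple sign reward: +1 if the agent's last box refinement improved IoU, else -1.
    return 1 if iou_after > iou_before else -1

print(iou((0, 0, 100, 100), (50, 50, 150, 150)))   # ~0.143
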
Continuous action space algorithms -- Trust region policy optimization -- Deterministic policy gradients -- Scoring mechanism in sequential models in NLP -- BLEU -- What is BLEU score and what does it do? -- ROUGE -- Summary -- Other Books You May Enjoy -- Index.
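
The appendix entries above mention the BLEU and ROUGE scoring mechanisms for sequential models in NLP. As a quick, hedged illustration of what BLEU measures (clipped n-gram precision of a candidate against one or more references), the snippet below uses NLTK's sentence_bleu; NLTK and the toy sentences are assumptions chosen for the example, not tools taken from the book.

# Toy BLEU-2 computation (unigram and bigram precision) with NLTK.
from nltk.translate.bleu_score import sentence_bleu

reference = [["the", "cat", "is", "on", "the", "mat"]]   # a list of tokenized references
candidate = ["the", "cat", "sat", "on", "the", "mat"]

# weights=(0.5, 0.5) restricts the score to 1-grams and 2-grams so the toy example is non-zero.
score = sentence_bleu(reference, candidate, weights=(0.5, 0.5))
print(round(score, 3))   # ~0.707: 5/6 unigrams and 3/5 bigrams match the reference
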
Summary: Reinforcement learning allows you to develop intelligent, self-learning systems. This book shows you how to put the concepts of reinforcement learning to work to train efficient models. You will use popular reinforcement learning algorithms to implement use cases in image processing and NLP, combining the power of TensorFlow and OpenAI Gym.

Description based on publisher supplied metadata and other sources.

Electronic reproduction. Ann Arbor, Michigan : ProQuest Ebook Central, 2024. Available via World Wide Web. Access may be limited to ProQuest Ebook Central affiliated libraries.
