E-book, English, 196 pages
Series: Springer Theses
Gatti: Design of Experiments for Reinforcement Learning
2015
ISBN: 978-3-319-12197-0
Publisher: Springer Nature Switzerland
Format: PDF
Copy protection: PDF watermark
This thesis takes an empirical approach to understanding the behavior of, and the interactions between, the two main components of reinforcement learning: the learning algorithm and the functional representation of learned knowledge. The author studies these components using design of experiments, an approach not commonly employed in the analysis of machine learning methods. The results outlined in this work provide insight into what enables, and what influences, successful reinforcement learning implementations, so that this learning method can be applied to more challenging problems.
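To make the general idea concrete, the following is a minimal sketch, not the author's code: the learning rate and exploration rate of a small tabular Q-learning agent are treated as experimental factors, the agent's performance is evaluated at a set of sampled design points, and a CART model is fit to the resulting response. The toy chain task, the parameter ranges, the uniform random design, and the use of plain rather than sequential CART are all illustrative assumptions; the thesis itself applies sequential CART and Kriging metamodels to neural-network-based agents on problems such as mountain car and the truck backer-upper.

    # Minimal illustrative sketch (assumed details, not the thesis code):
    # model RL performance as a function of two tuning parameters with CART.
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)

    def run_q_learning(alpha, epsilon, episodes=200, n_states=10):
        # Tabular Q-learning on a toy chain: the agent must move right to reach the goal.
        q = np.zeros((n_states, 2))                  # actions: 0 = left, 1 = right
        returns = []
        for _ in range(episodes):
            s, total = 0, 0.0
            for _ in range(50):
                a = rng.integers(2) if rng.random() < epsilon else int(q[s].argmax())
                s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
                r = 1.0 if s_next == n_states - 1 else 0.0
                q[s, a] += alpha * (r + 0.95 * q[s_next].max() - q[s, a])
                s, total = s_next, total + r
                if r == 1.0:
                    break
            returns.append(total)
        return float(np.mean(returns[-50:]))         # average late-training return

    # Designed experiment over (learning rate alpha, exploration rate epsilon).
    design = rng.uniform(low=[0.01, 0.0], high=[1.0, 0.5], size=(60, 2))
    response = np.array([run_q_learning(a, e) for a, e in design])

    # CART model of performance as a function of the two parameters;
    # the early splits indicate which parameter regions matter most.
    tree = DecisionTreeRegressor(max_depth=3).fit(design, response)
    print("first split on parameter", tree.tree_.feature[0],
          "at value", round(float(tree.tree_.threshold[0]), 3))

In practice each design point would be replicated, since an individual reinforcement learning run is highly stochastic, and the experiment would then be refined sequentially in the parameter regions that appear most promising.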
Christopher Gatti received his PhD in Decision Sciences and Engineering Systems from Rensselaer Polytechnic Institute (RPI). During his time at RPI, his work focused on machine learning and statistics, with applications in reinforcement learning, graph search, stem cell RNA analysis, and neuro-electrophysiological signal analysis. Before beginning his graduate work at RPI, he received a BSE in mechanical engineering and an MSE in biomedical engineering, both from the University of Michigan, and then spent three years at the University of Michigan working on computational biomechanics of the shoulder and knee. He has been a gymnast since childhood and is currently an acrobat with Cirque du Soleil.
Authors/Editors
Further information & material
Foreword
Acknowledgment
Book Note
  Parts of this Thesis Have Been Published As
Contents
Chapter 1 Introduction
  References
Chapter 2 Reinforcement Learning
  2.1 Applications of Reinforcement Learning
    2.1.1 Benchmark Problems
    2.1.2 Games
    2.1.3 Real-World Applications
    2.1.4 Generalized Domains
  2.2 Components of Reinforcement Learning
    2.2.1 Domains
      2.2.1.1 General Characteristics
      2.2.1.2 State Space Dimensions
      2.2.1.3 Action Space Dimensions
      2.2.1.4 Reward Dimension
      2.2.1.5 State Encoding-Dependent Characteristics
    2.2.2 Representations
      2.2.2.1 Look-up Tables
      2.2.2.2 Linear Methods
      2.2.2.3 Neural Networks
    2.2.3 Learning Algorithms
      2.2.3.1 Policy Evaluation Approaches
      2.2.3.2 Learning Algorithm Convergence
      2.2.3.3 Additional Reinforcement Learning Algorithms
  2.3 Heuristics and Performance Effectors
    2.3.1 Heuristics for Reinforcement Learning
      2.3.1.1 Effectors of Reinforcement Learning Performance
  References
Chapter 3 Design of Experiments
  3.1 Classical Design of Experiments
  3.2 Contemporary Design of Experiments
  3.3 Design of Experiments for Empirical Algorithm Analysis
  References
Chapter 4 Methodology
  4.1 Sequential CART
    4.1.1 CART Modeling
    4.1.2 Sequential CART Modeling
    4.1.3 Analysis of Sequential CART
    4.1.4 Empirical Convergence Criteria
    4.1.5 Example: 2-D 6-hump Camelback Function
  4.2 Kriging Metamodeling
    4.2.1 Kriging
    4.2.2 Deterministic Kriging
    4.2.3 Stochastic Kriging
    4.2.4 Covariance Function
    4.2.5 Implementation
    4.2.6 Analysis of Kriging Metamodels
  References
Chapter 5 The Mountain Car Problem
  5.1 Reinforcement Learning Implementation
  5.2 Sequential CART
    5.2.1 Convergent Subregions
  5.3 Response Surface Metamodeling
  5.4 Discussion
  References
Chapter 6 The Truck Backer-Upper Problem
  6.1 Reinforcement Learning Implementation
  6.2 Sequential CART
    6.2.1 Convergent Subregions
  6.3 Response Surface Metamodeling
  6.4 Discussion
  References
Chapter 7 The Tandem Truck Backer-Upper Problem
  7.1 Reinforcement Learning Implementation
  7.2 Sequential CART
    7.2.1 Convergent Subregions
  7.3 Discussion
  References
Chapter 8 Discussion
  8.1 Reinforcement Learning
    8.1.1 Parameter Effects
    8.1.2 Neural Network
  8.2 Experimentation
    8.2.1 Sequential CART
    8.2.2 Stochastic Kriging
  8.3 Innovations
  8.4 Future Work
  References
Appendix A Parameter Effects in the Game of Chung Toi
  A.1 Introduction
  A.2 Methodology
    A.2.1 Chung Toi
    A.2.2 The Reinforcement Learning Method
    A.2.3 The Environment Model
    A.2.4 The Agent Model
    A.2.5 Training and Performance Evaluation Methods
    A.2.6 Experiments
  A.3 Results
    A.3.1 Individual Experiments
    A.3.2 Optimal Experiments
  A.4 Discussion
  A.5 Conclusion
  References
Appendix B Design of Experiments for the Mountain Car Problem
  B.1 Introduction
  B.2 Methodology
    B.2.1 Mountain Car Domain
    B.2.2 Agent Representation
    B.2.3 Experimental Design and Analysis
  B.3 Results
  B.4 Discussion
  References
Appendix C Supporting Tables
Glossary




