E-book, English, 424 pages
Zhang / Liu / Luo Adaptive Dynamic Programming for Control
1st edition 2012
ISBN: 978-1-4471-4757-2
Publisher: Springer
Format: PDF
Copy protection: 1 - PDF Watermark
Algorithms and Stability
Series: Communications and Control Engineering
There are many methods of stable controller design for nonlinear systems. In seeking to go beyond the minimum requirement of stability, Adaptive Dynamic Programming for Control approaches the challenging topic of optimal control for nonlinear systems using the tools of adaptive dynamic programming (ADP). The range of systems treated is extensive: affine, switched, singularly perturbed and time-delay nonlinear systems are discussed, as are the uses of neural networks and the techniques of value and policy iteration. The text features three main aspects of ADP in which the methods proposed for stabilization, tracking and games benefit from the incorporation of optimal control methods:
• infinite-horizon control, for which the difficulty of solving partial differential Hamilton-Jacobi-Bellman equations directly is overcome, and a proof is provided that the iterative value function updating sequence converges to the infimum of all the value functions obtained by admissible control law sequences;
• finite-horizon control, implemented in discrete-time nonlinear systems, showing the reader how to obtain suboptimal control solutions within a fixed number of control steps, with results more easily applied in real systems than those usually gained from infinite-horizon control;
• nonlinear games, for which a pair of mixed optimal policies is derived that solves the game when the saddle point does not exist and, when it does exist, avoids the need for the existence conditions of the saddle point.
Non-zero-sum games are studied in the context of a single-network scheme in which policies are obtained that guarantee system stability and minimize the individual performance functions, yielding a Nash equilibrium.
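To make the infinite-horizon item above concrete, the following is a minimal sketch of value iteration on a scalar linear-quadratic problem. It is not taken from the book: the system x_{k+1} = a*x_k + b*u_k, the stage cost q*x^2 + r*u^2, the numerical values and the function names are all assumptions chosen for illustration. In this special case the value function is V_i(x) = P_i*x^2, so the iteration can be written in closed form without neural networks.

```python
def value_iteration(a, b, q, r, iterations=50):
    """Return [P_0, P_1, ...] where V_i(x) = P_i * x**2 (illustrative toy problem)."""
    p = 0.0  # V_0(x) = 0 is the assumed initial value function
    history = [p]
    for _ in range(iterations):
        # V_{i+1}(x) = min_u { q*x^2 + r*u^2 + V_i(a*x + b*u) }, minimized analytically
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        history.append(p)
    return history


def greedy_gain(a, b, r, p):
    """Gain k of the control u = -k*x that minimizes the one-step lookahead cost."""
    return a * b * p / (r + b * b * p)


# Assumed example values: an open-loop unstable plant (|a| > 1).
history = value_iteration(a=1.2, b=1.0, q=1.0, r=1.0)
p_final = history[-1]
k = greedy_gain(1.2, 1.0, 1.0, p_final)
print("P after 50 iterations:", p_final)
print("closed-loop pole a - b*k:", 1.2 - 1.0 * k)
```

Starting from zero, the sequence P_i is nondecreasing and settles at the fixed point of the update, and the resulting feedback gain places the closed-loop pole inside the unit circle. The nonlinear settings described in the blurb replace the closed-form update with function approximators such as neural networks.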
To make the coverage suitable for the student as well as for the expert reader, Adaptive Dynamic Programming for Control:
• clearly establishes the fundamental theory involved, with each chapter devoted to an identifiable control paradigm;
• presents convergence proofs of the ADP algorithms to deepen understanding of how stability and convergence are derived for the iterative computational methods used; and
• shows how ADP methods can be put to use both in simulation and in real applications.
This text will be of considerable interest to researchers working on optimal control and its applications in operations research, applied mathematics, computational intelligence and engineering. Graduate students working in control and operations research will also find the ideas presented here to be a source of powerful methods for furthering their study.
Further Information & Material
1;Adaptive Dynamic Programming for Control;2
1.1;Preface;4
1.1.1;Background of This Book;4
1.1.2;Why This Book?;5
1.1.3;The Content of This Book;5
1.1.4;Acknowledgments;8
1.2;Contents;10
2;Chapter 1: Overview;15
2.1;1.1 Challenges of Dynamic Programming;15
2.2;1.2 Background and Development of Adaptive Dynamic Programming;17
2.2.1;1.2.1 Basic Structures of ADP;18
2.2.1.1;1.2.1.1 Heuristic Dynamic Programming (HDP);18
2.2.1.2;1.2.1.2 Dual Heuristic Programming (DHP);19
2.2.2;1.2.2 Recent Developments of ADP;20
2.2.2.1;1.2.2.1 Development of ADP Structures;20
2.2.2.2;1.2.2.2 Development of Algorithms and Convergence Analysis;23
2.2.2.3;1.2.2.3 Applications of ADP Algorithms;24
2.3;1.3 Feedback Control Based on Adaptive Dynamic Programming;25
2.4;1.4 Non-linear Games Based on Adaptive Dynamic Programming;31
2.5;1.5 Summary;33
2.6;References;33
3;Chapter 2: Optimal State Feedback Control for Discrete-Time Systems;40
3.1;2.1 Introduction;40
3.2;2.2 Infinite-Horizon Optimal State Feedback Control Based on DHP;40
3.2.1;2.2.1 Problem Formulation;41
3.2.2;2.2.2 Infinite-Horizon Optimal State Feedback Control via DHP;43
3.2.3;2.2.3 Simulations;57
3.3;2.3 Infinite-Horizon Optimal State Feedback Control Based on GDHP;65
3.3.1;2.3.1 Problem Formulation;65
3.3.2;2.3.2 Infinite-Horizon Optimal State Feedback Control Based on GDHP;67
3.3.2.1;2.3.2.1 NN Identification of the Unknown Nonlinear System;67
3.3.2.2;2.3.2.2 Derivation of the Iterative ADP Algorithm;70
3.3.2.3;2.3.2.3 Convergence Analysis of the Iterative ADP Algorithm;71
3.3.2.4;2.3.2.4 NN Implementation of the Iterative ADP Algorithm Using GDHP Technique;77
3.3.3;2.3.3 Simulations;80
3.4;2.4 Infinite-Horizon Optimal State Feedback Control Based on GHJB Algorithm;84
3.4.1;2.4.1 Problem Formulation;84
3.4.2;2.4.2 Constrained Optimal Control Based on GHJB Equation;86
3.4.3;2.4.3 Simulations;91
3.5;2.5 Finite-Horizon Optimal State Feedback Control Based on HDP;93
3.5.1;2.5.1 Problem Formulation;95
3.5.2;2.5.2 Finite-Horizon Optimal State Feedback Control Based on HDP;97
3.5.2.1;2.5.2.1 Derivation and Properties of the Iterative ADP Algorithm;97
3.5.2.2;2.5.2.2 The epsilon-Optimal Control Algorithm;104
3.5.3;2.5.3 Simulations;115
3.6;2.6 Summary;119
3.7;References;119
4;Chapter 3: Optimal Tracking Control for Discrete-Time Systems;121
4.1;3.1 Introduction;121
4.2;3.2 Infinite-Horizon Optimal Tracking Control Based on HDP;121
4.2.1;3.2.1 Problem Formulation;122
4.2.2;3.2.2 Infinite-Horizon Optimal Tracking Control Based on HDP;123
4.2.2.1;3.2.2.1 System Transformation;123
4.2.2.2;3.2.2.2 Derivation of the Iterative HDP Algorithm;124
4.2.2.3;3.2.2.3 Summary of the Algorithm;129
4.2.2.4;3.2.2.4 Neural-Network Implementation for the Tracking Control Scheme;130
4.2.3;3.2.3 Simulations;130
4.3;3.3 Infinite-Horizon Optimal Tracking Control Based on GDHP;132
4.3.1;3.3.1 Problem Formulation;135
4.3.2;3.3.2 Infinite-Horizon Optimal Tracking Control Based on GDHP;138
4.3.2.1;3.3.2.1 Design and Implementation of Feedforward Controller;138
4.3.2.2;3.3.2.2 Design and Implementation of Optimal Feedback Controller;139
4.3.2.3;3.3.2.3 Convergence Characteristics of the Neural-Network Approximation Process;147
4.3.3;3.3.3 Simulations;149
4.4;3.4 Finite-Horizon Optimal Tracking Control Based on ADP;150
4.4.1;3.4.1 Problem Formulation;153
4.4.2;3.4.2 Finite-Horizon Optimal Tracking Control Based on ADP;156
4.4.2.1;3.4.2.1 Derivation of the Iterative ADP Algorithm;156
4.4.2.2;3.4.2.2 Convergence Analysis of the Iterative ADP Algorithm;158
4.4.2.3;3.4.2.3 The epsilon-Optimal Control Algorithm;162
4.4.2.4;3.4.2.4 Summary of the Algorithm;163
4.4.2.5;3.4.2.5 Neural-Network Implementation of the Iterative ADP Algorithm via HDP Technique;163
4.4.3;3.4.3 Simulations;166
4.5;3.5 Summary;170
4.6;References;171
5;Chapter 4: Optimal State Feedback Control of Nonlinear Systems with Time Delays;173
5.1;4.1 Introduction;173
5.2;4.2 Infinite-Horizon Optimal State Feedback Control via Delay Matrix;174
5.2.1;4.2.1 Problem Formulation;174
5.2.2;4.2.2 Optimal State Feedback Control Using Delay Matrix;175
5.2.2.1;4.2.2.1 Model Network;184
5.2.2.2;4.2.2.2 The M Network;185
5.2.2.3;4.2.2.3 Critic Network;185
5.2.2.4;4.2.2.4 Action Network;186
5.2.3;4.2.3 Simulations;187
5.3;4.3 Infinite-Horizon Optimal State Feedback Control via HDP;189
5.3.1;4.3.1 Problem Formulation;189
5.3.2;4.3.2 Optimal Control Based on Iterative HDP;192
5.3.3;4.3.3 Simulations;198
5.4;4.4 Finite-Horizon Optimal State Feedback Control for a Class of Nonlinear Systems with Time Delays;200
5.4.1;4.4.1 Problem Formulation;200
5.4.2;4.4.2 Optimal Control Based on Improved Iterative ADP;202
5.4.3;4.4.3 Simulations;208
5.5;4.5 Summary;209
5.6;References;210
6;Chapter 5: Optimal Tracking Control of Nonlinear Systems with Time Delays;212
6.1;5.1 Introduction;212
6.2;5.2 Problem Formulation;212
6.3;5.3 Optimal Tracking Control Based on Improved Iterative ADP Algorithm;213
6.4;5.4 Simulations;224
6.5;5.5 Summary;231
6.6;References;231
7;Chapter 6: Optimal Feedback Control for Continuous-Time Systems via ADP;233
7.1;6.1 Introduction;233
7.2;6.2 Optimal Robust Feedback Control for Unknown General Nonlinear Systems;233
7.2.1;6.2.1 Problem Formulation;234
7.2.2;6.2.2 Data-Based Robust Approximate Optimal Tracking Control;234
7.2.3;6.2.3 Simulations;246
7.3;6.3 Optimal Feedback Control for Nonaffine Nonlinear Systems;252
7.3.1;6.3.1 Problem Formulation;252
7.3.2;6.3.2 Robust Approximate Optimal Control Based on ADP Algorithm;253
7.3.3;6.3.3 Simulations;260
7.4;6.4 Summary;263
7.5;References;264
8;Chapter 7: Several Special Optimal Feedback Control Designs Based on ADP;266
8.1;7.1 Introduction;266
8.2;7.2 Optimal Feedback Control for a Class of Switched Systems;267
8.2.1;7.2.1 Problem Description;267
8.2.2;7.2.2 Optimal Feedback Control Based on Two-Stage ADP Algorithm;268
8.2.3;7.2.3 Simulations;277
8.3;7.3 Optimal Feedback Control for a Class of Descriptor Systems;280
8.3.1;7.3.1 Problem Formulation;280
8.3.2;7.3.2 Optimal Controller Design for a Class of Descriptor Systems;282
8.3.3;7.3.3 Simulations;288
8.4;7.4 Optimal Feedback Control for a Class of Singularly Perturbed Systems;290
8.4.1;7.4.1 Problem Formulation;290
8.4.2;7.4.2 Optimal Controller Design for Singularly Perturbed Systems;292
8.4.2.1;7.4.2.1 Algorithm Design;292
8.4.2.2;7.4.2.2 Neural Network Approximation;295
8.4.3;7.4.3 Simulations;297
8.5;7.5 Optimal Feedback Control for a Class of Constrained Systems Via SNAC;297
8.5.1;7.5.1 Problem Formulation;297
8.5.2;7.5.2 Optimal Controller Design for Constrained Systems via SNAC;301
8.5.3;7.5.3 Simulations;308
8.6;7.6 Summary;315
8.7;References;315
9;Chapter 8: Zero-Sum Games for Discrete-Time Systems Based on Model-Free ADP;317
9.1;8.1 Introduction;317
9.2;8.2 Zero-Sum Differential Games for a Class of Discrete-Time 2-D Systems;317
9.2.1;8.2.1 Problem Formulation;318
9.2.2;8.2.2 Data-Based Optimal Control via Iterative ADP Algorithm;325
9.2.2.1;8.2.2.1 The Derivation of Data-Based Iterative ADP Algorithm;326
9.2.2.2;8.2.2.2 Properties of Data-Based Iterative ADP Algorithm;327
9.2.2.3;8.2.2.3 Neural Network Implementation;334
9.2.2.4;8.2.2.4 Critic Network;334
9.2.2.5;8.2.2.5 Action Networks;335
9.2.3;8.2.3 Simulations;336
9.3;8.3 Zero-Sum Games for a Class of Discrete-Time Systems via Model-Free ADP;339
9.3.1;8.3.1 Problem Formulation;340
9.3.2;8.3.2 Data-Based Optimal Output Feedback Control via ADP Algorithm;342
9.3.3;8.3.3 Simulations;349
9.4;8.4 Summary;351
9.5;References;351
10;Chapter 9: Nonlinear Games for a Class of Continuous-Time Systems Based on ADP;353
10.1;9.1 Introduction;353
10.2;9.2 Infinite Horizon Zero-Sum Games for a Class of Affine Nonlinear Systems;354
10.2.1;9.2.1 Problem Formulation;354
10.2.2;9.2.2 Zero-Sum Differential Games Based on Iterative ADP Algorithm;355
10.2.2.1;9.2.2.1 Derivation of the Iterative ADP Method;355
10.2.2.2;9.2.2.2 The Iterative ADP Algorithm;357
10.2.2.3;9.2.2.3 Properties of the Iterative ADP Algorithm;358
10.2.3;9.2.3 Simulations;363
10.3;9.3 Finite Horizon Zero-Sum Games for a Class of Nonlinear Systems;366
10.3.1;9.3.1 Problem Formulation;368
10.3.2;9.3.2 Finite Horizon Optimal Control of Nonaffine Nonlinear Zero-Sum Games;370
10.3.3;9.3.3 Simulations;378
10.4;9.4 Non-Zero-Sum Games for a Class of Nonlinear Systems Based on ADP;380
10.4.1;9.4.1 Problem Formulation of Non-Zero-Sum Games;381
10.4.2;9.4.2 Optimal Control of Nonlinear Non-Zero-Sum Games Based on ADP;384
10.4.3;9.4.3 Simulations;395
10.5;9.5 Summary;399
10.6;References;400
11;Chapter 10: Other Applications of ADP;402
11.1;10.1 Introduction;402
11.2;10.2 Self-Learning Call Admission Control for CDMA Cellular Networks Using ADP;403
11.2.1;10.2.1 Problem Formulation;403
11.2.2;10.2.2 A Self-Learning Call Admission Control Scheme for CDMA Cellular Networks;405
11.2.2.1;10.2.2.1 Adaptive Critic Designs for Problems with Finite Action Space;405
11.2.2.2;10.2.2.2 Self-learning Call Admission Control for CDMA Cellular Networks;409
11.2.3;10.2.3 Simulations;413
11.3;10.3 Engine Torque and Air-Fuel Ratio Control Based on ADP;419
11.3.1;10.3.1 Problem Formulation;419
11.3.2;10.3.2 Self-learning Neural Network Control for Both Engine Torque and Exhaust Air-Fuel Ratio;420
11.3.3;10.3.3 Simulations;422
11.3.3.1;10.3.3.1 Critic Network;422
11.3.3.2;10.3.3.2 Controller/Action Network;424
11.3.3.3;10.3.3.3 Simulation Results;424
11.4;10.4 Summary;426
11.5;References;427
12;Index;430




