Sequential Inverse Optimal Control of Discrete-Time Systems

Sheng Cao; Zhiwei Luo; Changqin Quan

doi:10.1109/JAS.2023.123762

Volume 11 Issue 3

Mar. 2024

IEEE/CAA Journal of Automatica Sinica

S. Cao, Z. Luo, and C. Quan, “Sequential inverse optimal control of discrete-time systems,” IEEE/CAA J. Autom. Sinica, vol. 11, no. 3, pp. 608–621, Mar. 2024. doi: 10.1109/JAS.2023.123762

Abstract

This paper presents a novel sequential inverse optimal control (SIOC) method for discrete-time systems. The method calculates the unknown weight vectors of the cost function in real time from the input and output of an optimally controlled discrete-time system. It overcomes a limitation of previous approaches by removing the need for the invertible Jacobian assumption. It calculates the possible-solution spaces and their intersections sequentially until the dimension of the intersection decreases to one; the vector spanning this one-dimensional intersection is the SIOC solution. The paper gives clear conditions for convergence and addresses noisy data by clarifying the conditions on the singular values of the matrices associated with the possible-solution spaces. Simulation results demonstrate the effectiveness of the proposed method.

Keywords

- Inverse optimal control
- Promised calculation step
- Sequential calculation
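
The sequential intersection described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: in the paper the constraint matrices are built from the observed input and output of the controlled system, whereas here each step simply supplies a hypothetical block of integer rows M_k satisfying M_k w = 0. The sketch relies on the fact that the intersection of null spaces equals the null space of the stacked rows. It uses exact rational arithmetic, so it does not cover the noisy-data case, where the rank decision would have to rest on singular values. The function names (rref, null_space_basis, sequential_intersection) and the example data are assumptions made for illustration.

```python
from fractions import Fraction


def rref(rows, n):
    """Reduce rows (each of length n) to reduced row-echelon form.

    Returns the nonzero reduced rows and the list of pivot columns.
    """
    mat = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(n):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(mat)) if mat[i][c] != 0), None)
        if p is None:
            continue
        mat[r], mat[p] = mat[p], mat[r]
        piv = mat[r][c]
        mat[r] = [x / piv for x in mat[r]]
        # Eliminate column c from every other row.
        for i in range(len(mat)):
            if i != r and mat[i][c] != 0:
                f = mat[i][c]
                mat[i] = [a - f * b for a, b in zip(mat[i], mat[r])]
        pivots.append(c)
        r += 1
        if r == len(mat):
            break
    return mat[:r], pivots


def null_space_basis(reduced, pivots, n):
    """Build one basis vector of the null space per free column."""
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row, p in zip(reduced, pivots):
            v[p] = -row[free]
        basis.append(v)
    return basis


def sequential_intersection(constraint_blocks, n):
    """Intersect the null spaces of the blocks one step at a time.

    Stops as soon as the intersection is one-dimensional and returns
    (step, spanning_vector), or (None, None) if that never happens.
    """
    reduced, pivots = [], []
    for step, block in enumerate(constraint_blocks, start=1):
        # Stacking the new rows onto the reduced rows intersects the spaces.
        reduced, pivots = rref(reduced + [list(row) for row in block], n)
        if n - len(pivots) == 1:
            return step, null_space_basis(reduced, pivots, n)[0]
    return None, None


# Hypothetical data: every row is orthogonal to the weights (1, 2, 3).
blocks = [[[2, -1, 0]], [[4, -2, 0]], [[3, 0, -1]]]
step, w = sequential_intersection(blocks, 3)
```

In this example the second block is redundant, so the intersection only becomes one-dimensional at the third step. The returned vector is (1/3, 2/3, 1), which equals the hypothetical weights (1, 2, 3) up to scale; a cost function recovered this way is determined only up to a scalar factor.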

Highlights

- A sequential inverse optimal control (SIOC) method is proposed that recovers the cost function in real time.
- The SIOC method applies even without the invertible Jacobian assumption.
- A convergence condition is established for each step of the SIOC.
- An effective strategy is introduced to handle noisy data in the SIOC calculation.
- Advantages of the SIOC method: fewer calculation steps, robustness to noise, and adaptability to changes in the cost weights.
