Possieri, C., Sassano, M. (2022). Q-Learning for Continuous-Time Linear Systems: A Data-Driven Implementation of the Kleinman Algorithm. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 1-11 [10.1109/TSMC.2022.3145693].

Q-Learning for Continuous-Time Linear Systems: A Data-Driven Implementation of the Kleinman Algorithm

Possieri, C.; Sassano, M.
2022-01-01

Abstract

A data-driven strategy is proposed for estimating the optimal feedback gain and the value function in an infinite-horizon, continuous-time, linear-quadratic optimal control problem for an unknown system. The method constructs the optimal policy without any knowledge of the model, without requiring that the time derivatives of the state be available for the design, and without even assuming that an initial stabilizing feedback policy is available. Two alternative architectures are discussed: the first scheme revolves around the periodic computation of matrix inversions involving the Q-function, whereas the second relies on a purely continuous-time implementation of dynamic systems whose trajectories are uniformly attracted by the solutions to the above algebraic equations. Interestingly, the proposed strategy essentially constitutes a (direct) data-driven implementation of the celebrated Kleinman algorithm, hence inheriting the particularly appealing features of the latter, such as quadratic, monotone convergence to the optimal solution. The theory is then validated by means of practically motivated applications.
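For background, the classic model-based Kleinman iteration that the abstract refers to can be sketched as follows. This is a minimal illustration of the iteration itself, not of the paper's data-driven scheme: it assumes the system matrices A and B and an initial stabilizing gain K0 are known, which is precisely the knowledge the paper dispenses with. Each step solves a Lyapunov equation for the cost matrix P of the current policy, then improves the gain; the iterates converge quadratically and monotonically to the Riccati solution.

```python
import numpy as np

def solve_lyapunov(Acl, Qbar):
    """Solve Acl^T P + P Acl + Qbar = 0 via Kronecker vectorization."""
    n = Acl.shape[0]
    I = np.eye(n)
    # vec(Acl^T P + P Acl) = (I (x) Acl^T + Acl^T (x) I) vec(P)
    M = np.kron(I, Acl.T) + np.kron(Acl.T, I)
    P = np.linalg.solve(M, -Qbar.flatten(order="F")).reshape((n, n), order="F")
    return (P + P.T) / 2  # symmetrize against round-off

def kleinman(A, B, Q, R, K0, iters=6):
    """Model-based Kleinman iteration for the continuous-time LQR problem."""
    K = K0
    for _ in range(iters):
        # Policy evaluation: cost matrix of the current (stabilizing) gain K
        P = solve_lyapunov(A - B @ K, Q + K.T @ R @ K)
        # Policy improvement
        K = np.linalg.solve(R, B.T @ P)
    return P, K
```

For a double integrator with Q = I and R = 1, the iteration started from the stabilizing gain K0 = [1, 1] converges in a handful of steps to the known optimal gain K* = [1, sqrt(3)].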
2022
Online ahead of print
International relevance
Article
Anonymous peer reviewers
Sector ING-INF/04 - Automatic Control
English
Convergence
Costs
Linear systems
Optimal control
Q-learning
reinforcement learning
Riccati equations
Symmetric matrices
Trajectory
uncertain/unknown systems
Possieri, C.; Sassano, M.
Journal article
Files in this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/2108/294506
Citations
  • PMC: ND
  • Scopus: 6
  • Web of Science: 6