Dynamic Multi-metric Thresholds for Scaling Applications Using Reinforcement Learning

IRIS

Cloud-native applications increasingly adopt the microservices architecture, which favors elasticity to satisfy the application performance requirements in the face of variable workloads. To simplify elasticity management, the trend is to create an auto-scaler instance per microservice, which controls its horizontal scalability by using the classic threshold-based policy. Although this policy is easy to implement, manually setting the scaling thresholds, which are usually statically defined on a single metric, may lead to poor scaling decisions when applications are heterogeneous in terms of resource consumption. In this paper, we study dynamic multi-metric threshold-based scaling policies that exploit Reinforcement Learning (RL) to autonomously update the scaling thresholds, one per controlled resource (CPU and memory). The proposed RL approaches (i.e., QL, MB, and DQL Threshold) use different degrees of knowledge about the system dynamics. To model the threshold adaptation actions, we consider two RL-based architectures. In the single-agent architecture, one agent drives the updates of both scaling thresholds. To speed up learning, the multi-agent architecture adopts a distinct agent per threshold. Simulation- and prototype-based results show the benefits of the proposed solutions when compared to state-of-the-art policies and highlight the advantages of the multi-agent MB Threshold and DQL Threshold approaches in terms of deployment objectives and execution times.
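The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the candidate threshold values, the scale-in rule (both metrics below half of their thresholds), the three-action set, the cost-minimizing tabular Q-learning update, and all names (`THRESHOLDS`, `scaling_decision`, `ThresholdAgent`) are assumptions made only for illustration.

```python
import random
from collections import defaultdict

# Hypothetical candidate values for one scaling threshold (utilization fraction).
THRESHOLDS = [0.5, 0.6, 0.7, 0.8, 0.9]


def scaling_decision(cpu_util, mem_util, cpu_thr, mem_thr, replicas, max_replicas=10):
    """Threshold rule on two metrics (assumed form, not taken from the paper).

    Scale out if either metric exceeds its threshold; scale in if both
    metrics are below half of their thresholds; otherwise do nothing.
    """
    if (cpu_util > cpu_thr or mem_util > mem_thr) and replicas < max_replicas:
        return replicas + 1
    if cpu_util < 0.5 * cpu_thr and mem_util < 0.5 * mem_thr and replicas > 1:
        return replicas - 1
    return replicas


class ThresholdAgent:
    """Tabular Q-learning agent that lowers, keeps, or raises one threshold."""

    ACTIONS = (-1, 0, 1)  # move one step down, stay, move one step up

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # (state, action) -> estimated long-run cost
        self.rng = random.Random(seed)
        self.idx = len(THRESHOLDS) // 2  # start from the middle candidate

    @property
    def threshold(self):
        return THRESHOLDS[self.idx]

    def act(self, state):
        """Epsilon-greedy choice; greedy means the lowest estimated cost."""
        if self.rng.random() < self.epsilon:
            action = self.rng.choice(self.ACTIONS)
        else:
            action = min(self.ACTIONS, key=lambda a: self.q[(state, a)])
        # Apply the action, clamping the index to the candidate range.
        self.idx = min(max(self.idx + action, 0), len(THRESHOLDS) - 1)
        return action

    def learn(self, state, action, cost, next_state):
        """One Q-learning update toward cost + discounted best next value."""
        best_next = min(self.q[(next_state, a)] for a in self.ACTIONS)
        target = cost + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

Under these assumptions, a multi-agent setup would create one `ThresholdAgent` for the CPU threshold and one for the memory threshold, each choosing among three actions, whereas a single agent controlling both thresholds would have to choose among the nine joint actions. That smaller per-agent action space is one plausible reading of why a distinct agent per threshold can speed up learning.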

Rossi, F., Cardellini, V., Lo Presti, F., Nardelli, M. (2023). Dynamic Multi-metric Thresholds for Scaling Applications Using Reinforcement Learning. IEEE TRANSACTIONS ON CLOUD COMPUTING, 11(2), 1807-1821 [10.1109/TCC.2022.3163357].

Rossi, Fabiana; Cardellini, Valeria; Lo Presti, Francesco; Nardelli, Matteo

2023-03-01

Publication date: Mar 2023
Publication status: Published
Article DOI: https://dx.doi.org/10.1109/TCC.2022.3163357
Relevance: International relevance
Type: Article
Referee: Anonymous experts
Scientific-disciplinary sector of the article (valid until 24/06/2024): Sector ING-INF/05 - Information Processing Systems
Content language: English
ISI Impact Factor: With ISI Impact Factor
Keywords: Elasticity; Self-adaptation; Reinforcement Learning; Deep Q-Learning; Microservice Architecture
Alternative URL: https://ieeexplore.ieee.org/document/9744560
All authors: Rossi, F; Cardellini, V; Lo Presti, F; Nardelli, M
Typology: Journal article
Appears in typologies: 01 - Journal article

Files in this product:

File	Access	Type	License	Size	Format
tcc2022.pdf	Authorized users only	Post-print document	Publisher's copyright	1.84 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/295811
