Hierarchical Auto-scaling Policies for Data Stream Processing on Heterogeneous Resources

IRIS

Data Stream Processing (DSP) applications analyze data flows in near real-time by means of operators, which process and transform incoming data. Operators handle high data rates by running parallel replicas across multiple processors and hosts. To guarantee consistent performance without wasting resources in the face of variable workloads, auto-scaling techniques have been studied to adapt operator parallelism at run-time. However, most of the effort has been spent under the assumption of homogeneous computing infrastructures, neglecting the complexity of modern environments. We consider the problem of deciding both how many operator replicas should be executed and which types of computing nodes should be acquired. We devise heterogeneity-aware policies by means of a two-layered hierarchy of controllers. While application-level components steer the adaptation process for whole applications, aiming to guarantee user-specified requirements, lower-layer components control auto-scaling of single operators. We tackle the fundamental challenge of performance and workload uncertainty, exploiting Bayesian optimization (BO) and reinforcement learning (RL) to devise policies. The evaluation shows that our approach is able to meet users' requirements in terms of response time and adaptation overhead, while minimizing the cost due to resource usage, outperforming state-of-the-art baselines. We also demonstrate how partial model information is exploited to reduce training time for learning-based controllers.
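To make the two-layer idea in the abstract concrete, the following is a minimal illustrative sketch and not the authors' method. It assumes a linear pipeline of operators, a simple M/M/1-style response-time estimate, an even split of the end-to-end response-time target among operators, and an exhaustive search in place of the BO- and RL-based policies described in the paper. All names (`NodeType`, `Operator`, `operator_controller`, `application_controller`) and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class NodeType:
    name: str
    speedup: float  # relative processing speed (hypothetical value)
    cost: float     # cost per replica per time unit (hypothetical value)


@dataclass(frozen=True)
class Operator:
    name: str
    base_rate: float  # tuples/s one replica serves on a speedup-1.0 node


def response_time(op, replicas, node, load):
    """M/M/1-style estimate for one replica; infinite if the replica is overloaded."""
    service_rate = op.base_rate * node.speedup
    arrival_rate = load / replicas
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


def operator_controller(op, load, budget, node_types, max_replicas=10):
    """Lower layer: cheapest (replicas, node type) that meets this operator's budget."""
    feasible = [
        (k * node.cost, k, node)
        for k, node in product(range(1, max_replicas + 1), node_types)
        if response_time(op, k, node, load) <= budget
    ]
    if not feasible:
        return None  # no configuration within max_replicas meets the budget
    cost, k, node = min(feasible, key=lambda t: (t[0], t[1]))
    return k, node, cost


def application_controller(ops, load, target, node_types):
    """Upper layer: split the end-to-end target evenly across the pipeline (assumption)."""
    budget = target / len(ops)
    return {op.name: operator_controller(op, load, budget, node_types) for op in ops}


if __name__ == "__main__":
    nodes = [NodeType("small", 1.0, 1.0), NodeType("large", 2.0, 1.5)]
    pipeline = [Operator("parse", 100.0), Operator("join", 50.0)]
    plan = application_controller(pipeline, load=150.0, target=0.1, node_types=nodes)
    for name, decision in plan.items():
        print(name, decision)
```

The sketch only shows the division of responsibilities: the upper layer translates an application-level requirement into per-operator budgets, and each lower-layer controller independently chooses a parallelism level and a node type. A learning-based controller would replace the exhaustive search and the fixed performance model, which in practice are uncertain.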

Russo Russo, G., Cardellini, V., Lo Presti, F. (2023). Hierarchical Auto-scaling Policies for Data Stream Processing on Heterogeneous Resources. ACM Transactions on Autonomous and Adaptive Systems, 18(4), 1-44 [10.1145/3597435].

Russo Russo Gabriele; Cardellini Valeria; Lo Presti Francesco

2023-12-01



Publication date: Dec 2023
Publication status: Published
Article DOI: https://dx.doi.org/10.1145/3597435
Relevance: International
Type: Article
Referees: Anonymous experts
Scientific-disciplinary sector of the article (valid until 24/06/2024): Sector ING-INF/05
Scientific-disciplinary sector of the article (valid from 09/05/2024): Sector IINF-05/A - Information Processing Systems
Content language: English
ISI Impact Factor: With ISI Impact Factor
Keywords: Auto-scaling; Data Stream Processing; reinforcement learning; resource management
All authors: Russo Russo, G; Cardellini, V; Lo Presti, F
Category: Journal article
Appears in categories: 01 - Journal article

Files in this record:

File	Size	Format
ACM.pdf (authorized users only; type: publisher's version (PDF); license: publisher's copyright)	2.68 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/345524
