

MEAD: Model-based vertical auto-scaling for data stream processing

Russo Russo Gabriele; Cardellini Valeria; Casale Giuseppe; Lo Presti Francesco
2021-07-01

Abstract

The unpredictable variability of Data Stream Processing (DSP) application workloads calls for advanced mechanisms and policies for elastically scaling the processing capacity of DSP operators. Whilst many different approaches have been used to devise policies, most of the solutions have focused on data arrival rate and operator resource utilization as key metrics for auto-scaling. We here show that, under burstiness in the data flows, overly simple characterizations of the input stream can lead to very inaccurate performance estimations that affect such policies, resulting in sub-optimal resource allocation. We then present MEAD, a vertical auto-scaling solution that relies on an online state-based representation of burstiness to drive resource allocation. In particular, we use Markovian Arrival Processes (MAPs), which are composable with analytical queueing models, allowing us to efficiently predict performance at run-time under burstiness. We integrate MEAD in Apache Flink, and evaluate its benefits over simpler yet popular auto-scaling solutions, using both synthetic and real-world workloads. Unlike existing approaches, MEAD satisfies response time requirements under burstiness, while saving up to 50% CPU resources with respect to a static allocation.
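The burstiness argument in the abstract can be made concrete with a minimal sketch (not the paper's implementation): a two-state Markov-Modulated Poisson Process, the simplest non-trivial MAP, alternates a quiet phase with a burst phase. Its inter-arrival times have a squared coefficient of variation (SCV) well above 1, whereas a Poisson stream with the same mean rate has SCV close to 1. A rate-only characterization cannot distinguish the two, which is exactly the gap the abstract points at. All parameter values below are illustrative.

```python
import random
import statistics

def simulate_mmpp2(rates, sojourns, n_arrivals, seed=42):
    """Simulate a two-state Markov-Modulated Poisson Process (a simple MAP).

    rates:    arrival rate in each state (events/sec)
    sojourns: mean sojourn time in each state (sec)
    Returns a list of n_arrivals inter-arrival times.
    """
    rng = random.Random(seed)
    state, t, last_arrival = 0, 0.0, 0.0
    next_switch = rng.expovariate(1.0 / sojourns[state])
    interarrivals = []
    while len(interarrivals) < n_arrivals:
        dt = rng.expovariate(rates[state])  # candidate next arrival
        if t + dt < next_switch:
            t += dt
            interarrivals.append(t - last_arrival)
            last_arrival = t
        else:
            # Phase change: memorylessness lets us restart the arrival
            # clock at the switch instant with the new state's rate.
            t = next_switch
            state = 1 - state
            next_switch = t + rng.expovariate(1.0 / sojourns[state])
    return interarrivals

def scv(samples):
    """Squared coefficient of variation: 1.0 for Poisson, > 1 under burstiness."""
    m = statistics.fmean(samples)
    return statistics.pvariance(samples) / (m * m)

# Bursty stream: long quiet phase (5 ev/s) alternating with short bursts (100 ev/s).
bursty = simulate_mmpp2(rates=(5.0, 100.0), sojourns=(2.0, 0.5), n_arrivals=20000)

# Poisson stream with the same long-run mean rate (0.8*5 + 0.2*100 = 24 ev/s).
rng = random.Random(1)
poisson = [rng.expovariate(24.0) for _ in range(20000)]

print(f"SCV bursty:  {scv(bursty):.2f}")   # well above 1
print(f"SCV Poisson: {scv(poisson):.2f}")  # close to 1
```

Both streams look identical to a policy that only tracks mean arrival rate, yet the bursty one produces much longer queues at the same utilization, which is why a state-based representation such as a MAP is needed.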
2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid)
2021
International relevance
May 2021
July 2021
Sector ING-INF/05 - Information Processing Systems
English
https://ieeexplore.ieee.org/document/9499481
Conference paper
RUSSO RUSSO, G., Cardellini, V., Casale, G., LO PRESTI, F. (2021). MEAD: Model-based vertical auto-scaling for data stream processing. In 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid) (pp.314-323). IEEE [10.1109/CCGrid51090.2021.00041].
RUSSO RUSSO, G; Cardellini, V; Casale, G; LO PRESTI, F
Files in this product:
There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/279169
Citations
  • Scopus: 6
  • Web of Science: 6