Russo Russo, G., Cardellini, V., Lo Presti, F. (2023). Hierarchical Auto-scaling Policies for Data Stream Processing on Heterogeneous Resources. ACM Transactions on Autonomous and Adaptive Systems, 18(4), 1-44 [10.1145/3597435].

Hierarchical Auto-scaling Policies for Data Stream Processing on Heterogeneous Resources

Russo Russo Gabriele; Cardellini Valeria; Lo Presti Francesco
2023-12-01

Abstract

Data Stream Processing (DSP) applications analyze data flows in near real-time by means of operators, which process and transform incoming data. Operators handle high data rates by running parallel replicas across multiple processors and hosts. To guarantee consistent performance without wasting resources in the face of variable workloads, auto-scaling techniques have been studied to adapt operator parallelism at run-time. However, most of this effort has assumed homogeneous computing infrastructures, neglecting the complexity of modern environments. We consider the problem of deciding both how many operator replicas should be executed and which types of computing nodes should be acquired. We devise heterogeneity-aware policies by means of a two-layered hierarchy of controllers. Application-level components steer the adaptation process for whole applications, aiming to guarantee user-specified requirements, while lower-layer components control the auto-scaling of single operators. We tackle the fundamental challenge of performance and workload uncertainty by exploiting Bayesian optimization (BO) and reinforcement learning (RL) to devise policies. The evaluation shows that our approach meets users' requirements in terms of response time and adaptation overhead while minimizing the cost of resource usage, outperforming state-of-the-art baselines. We also demonstrate how partial model information can be exploited to reduce training time for learning-based controllers.
Dec-2023
Published
International relevance
Article
Anonymous expert reviewers
Sector ING-INF/05
English
With ISI Impact Factor
Auto-scaling
Data Stream Processing
reinforcement learning
resource management
https://dl.acm.org/doi/10.1145/3597435
Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/2108/345524