An increasing number of data-driven applications rely on the ability to process data flows in a timely manner, using Data Stream Processing (DSP) systems for this purpose. Elasticity is an essential feature of DSP systems, as workload variability calls for automatic scaling of the application's processing capacity to avoid both overload and resource wastage. In this work, we implement auto-scaling in Pulsar Functions, a function-based streaming framework built on top of Apache Pulsar, a distributed publish-subscribe messaging platform that natively supports serverless functions. Considering various state-of-the-art policies, we show that the proposed solution is able to scale application parallelism with minimal overhead.
Russo Russo, G., Schiazza, A., Cardellini, V. (2021). Elastic Pulsar Functions for distributed stream processing. In ICPE '21: Companion of the ACM/SPEC International Conference on Performance Engineering (pp. 9-16). ACM [10.1145/3447545.3451901].
Elastic Pulsar Functions for distributed stream processing
Russo Russo Gabriele;Cardellini Valeria
2021-04-01