Quality model for evaluating and choosing a stream processing framework architecture

Stream processing framework decision tree

Abstract

Today, we must handle large volumes of data (big data) coming from different domains, and we need to choose an architectural framework to analyze them. Processing these data is challenging, and even more so when the data are continuous. In the traditional approach, data are first received, then stored, and then queried. This is what we call batch processing. It works well for large amounts of data, but it reaches its limits when fast (real-time) results are required, as in financial trading, sensor monitoring, or user session activity tracking. The solution to this problem is stream processing. In the stream processing approach, data arrive record by record, and rather than being stored first, they are processed as they arrive. In this paper, we propose a quality model to evaluate and choose stream processing frameworks. We briefly describe different architectural frameworks that address stream processing, such as Spark Streaming, Storm, Flink, and Samza. Using our quality model, we present a decision tree that helps engineers choose a framework according to these quality aspects. Finally, we evaluate our model through a case study on Twitter and Netflix streaming. The model is intended to serve both engineers and future framework designers.
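
The difference between batch and stream processing can be illustrated with a minimal Python sketch. The function names and sample readings are hypothetical and are not taken from the paper or from any of the frameworks mentioned. The sketch only assumes that the quantity of interest is a simple average.

```python
from typing import Iterable, Iterator, List


def batch_average(records: List[float]) -> float:
    """Batch style: every record is stored first, then queried once."""
    return sum(records) / len(records)


def stream_average(records: Iterable[float]) -> Iterator[float]:
    """Stream style: update the result as each record arrives.

    Only the running count and total are kept, not the records themselves.
    """
    count = 0
    total = 0.0
    for value in records:
        count += 1
        total += value
        yield total / count  # a result is available after every record


# Hypothetical sensor readings used purely for illustration.
readings = [10.0, 20.0, 30.0, 40.0]

batch_result = batch_average(readings)                 # one result, after all data is stored
stream_results = list(stream_average(iter(readings)))  # one result per arriving record
```

In the batch version a result exists only once the whole data set is available, whereas the streaming version produces an up-to-date result after each record while keeping a constant amount of state.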

Publication
16th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 2019)