A Systematic Literature Review on Automated Log Abstraction Techniques

Abstract

Context: Logs are often the first and only information available to software engineers to understand and debug their systems. Automated log-analysis techniques help software engineers gain insights into large volumes of log data. These techniques have several steps, among which log abstraction is the most important because it transforms raw log data into high-level information and thus allows software engineers to perform further analyses. Existing log-abstraction techniques vary significantly in their designs and performance. To the best of our knowledge, no study examines the performance of these techniques with respect to the following seven quality aspects concurrently: mode, coverage, delimiter independence, efficiency, scalability, system knowledge independence, and parameter tuning effort.

Objectives: We want (1) to build a quality model for evaluating automated log-abstraction techniques and (2) to evaluate and recommend existing automated log-abstraction techniques using this quality model.

Method: We perform a systematic literature review (SLR) of automated log-abstraction techniques. We review 89 research papers out of 2,864 initial papers.

Results: Through this SLR, we (1) identify 17 automated log-abstraction techniques, (2) build a quality model composed of seven desirable aspects: coverage, delimiter independence, efficiency, system knowledge independence, mode, parameter tuning effort required, and scalability, and (3) make recommendations for researchers on future research directions.

Conclusion: Our quality model and recommendations help researchers learn about the state-of-the-art automated log-abstraction techniques, identify research gaps to enhance existing techniques, and develop new ones. We also support software engineers in understanding the advantages and limitations of existing techniques and in choosing the technique best suited to their use cases.
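To make the notion of log abstraction concrete, the following is a minimal illustrative sketch, not one of the 17 techniques covered by the review. It assumes that variable parts of a log message can be recognized with a few hand-written regular expressions (the `MASKS` list, the `<*>` placeholder, and the sample log lines are all hypothetical), replaces them with the placeholder, and counts how many raw lines collapse into each resulting template.

```python
import re
from collections import Counter

# Hypothetical masking rules, applied in order. Real log-abstraction
# techniques infer the variable parts of a message in other ways.
MASKS = [
    re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}(?::\d+)?\b"),  # IPv4 address, optional port
    re.compile(r"\b0x[0-9a-fA-F]+\b"),                    # hexadecimal values
    re.compile(r"\b\d+\b"),                               # plain integers
]


def abstract_line(line: str) -> str:
    """Replace the variable parts of one raw log line with a placeholder."""
    template = line.strip()
    for pattern in MASKS:
        template = pattern.sub("<*>", template)
    return template


def abstract_logs(lines):
    """Map raw log lines to templates and count the lines per template."""
    return Counter(abstract_line(line) for line in lines if line.strip())


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    sample = [
        "Connection from 10.0.0.5:5432 closed after 120 ms",
        "Connection from 10.0.0.17:5432 closed after 98 ms",
        "Worker 3 failed with code 0x1F",
    ]
    for template, count in abstract_logs(sample).most_common():
        print(count, template)
```

In this sketch the two connection messages collapse into a single template, which is the kind of high-level information that later analysis steps can work with. Relying on hand-written patterns is also an example of the system knowledge and parameter tuning effort that the quality model treats as costs.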

Publication
Journal of Information and Software Technology
Associate Professor

My research interests are mainly Empirical Software Engineering, Software Quality, Debugging, and Software Engineering for Computer Games. I am the creator of Swarm Debugging.