ClusterLog: Clustering Logs for Effective Log-Based Anomaly Detection
DescriptionWith the increasing prevalence of scalable file systems in the context of HPC, the importance of accurate anomaly detection on runtime logs is increasing. But as it currently stands, many log-based anomaly detection methods have encountered numerous challenges when applied to logs from many parallel file systems (PFSes) due to their irregularity and ambiguity in time-based log sequences. To circumvent these problems, this study proposes ClusterLog, a log pre-processing method to cluster temporal sequence of log keys based on their semantic similarity. By grouping semantically and sentimentally similar logs, it aims to represent log sequences with the smallest amount of unique log keys, intending to improve the ability for a downstream sequence based model to learn the log patterns. The preliminary results indicate not only its effectiveness in reducing the granularity of log sequences without the loss of important sequence information, but also its generalizability to different file systems’ logs.
Event Type
Workshop
TimeMonday, 14 November 20222:35pm - 2:59pm CST
LocationC143-149
Registration Categories
W
Session Formats
Recorded
Back To Top Button