Anomaly Detection in Continuous-Time Temporal Provenance Graphs

Snapshot of a provenance graph cyber attack

Abstract

Recent advances in Graph Neural Networks (GNNs) have matured the field of learning on graphs, making GNNs essential for prediction tasks in complex, interconnected, and evolving systems. In this paper, we focus on self-supervised, inductive learning for continuous-time dynamic graphs. Without loss of generality, we propose an approach to learn representations and mine anomalies in provenance graphs, a form of large-scale, heterogeneous, attributed, continuous-time dynamic graph used in the cybersecurity domain that syntactically resembles a complex temporal knowledge graph. We adapt the Temporal Graph Network (TGN) framework to handle heterogeneous input data and directed edges, refining it specifically for inductive learning on provenance graphs. We present and release two pioneering large-scale, continuous-time temporal, heterogeneous, attributed benchmark graph datasets. The datasets include expert-labeled anomalies, supporting further research on representation learning and anomaly detection in intricate real-world networks. Comprehensive experimental analyses of modules, datasets, and baselines demonstrate the effectiveness of TGN-based inductive learning and its practical utility in identifying semantically significant anomalies in real-world systems.

Publication
In Temporal Graph Learning Workshop @ NeurIPS 2023

Jakub Reha
PhD Candidate in Machine Learning

My research interests include causality and graph structured data.