FEDAMON: A Forecast-Based, Error-Bounded and Data-Aware Approach to Continuous Distributed Monitoring
Paper in proceedings, 2025
such as data center load balancing, fleet management, and smart
grid energy optimization. Traditional continuous monitoring solutions
often require significant communication overhead, straining
network resources. This paper addresses the continuous distributed
monitoring problem, where a central coordinator needs to track
statistics from numerous distributed nodes in real-time. We propose
a novel forecast-based, error-bounded, and data-aware approach
that significantly reduces communication costs while maintaining
accurate monitoring. Instead of transmitting all observed values
to the central coordinator, our event-based monitoring leverages
lightweight forecasting models at edge nodes. Both the coordinator
and distributed nodes predict the evolution of local values,
communicating only when deviations exceed a predefined error
threshold. To adapt to dynamically changing trends in data streams,
we introduce a data-aware model selection strategy that optimizes
the balance between communication frequency and monitoring
accuracy. Our solution is evaluated on diverse datasets and results
demonstrate a substantial reduction in communication overhead
with minimal impact on accuracy, outperforming baseline monitoring
in terms of communication complexity, e.g., sending, on average,
only 10% of baseline update events while maintaining less than
2% average error across all monitored streams. Furthermore, we
show that our standard parameter solution even surpasses the best
calibrated single models, achieving up to a 17% improvement in
communication overhead with identical guarantees on maximum
error. Optimizing the control factor in the data-aware approach leads to
a 13% improvement in performance, reducing error by 1%, without
incurring additional communication costs. We believe our approach
offers a scalable and efficient solution, enabling fully automatic,
real-time monitoring with optimized performance.
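
The event-based scheme described in the abstract can be illustrated with a minimal sketch. It assumes a single node, a simple linear-extrapolation forecast, and a fixed error threshold `epsilon`; the function name `monitor` and the choice of forecasting model are hypothetical and are not taken from the paper. The node and the coordinator hold the same model state, so the node only transmits when its observation drifts more than `epsilon` from the shared prediction.

```python
from typing import List, Sequence, Tuple


def monitor(stream: Sequence[float], epsilon: float) -> Tuple[List[float], int]:
    """Simulate one node and the coordinator (illustrative assumption, not the paper's code).

    Returns the coordinator's estimates and the number of messages sent.
    """
    # Model state shared by node and coordinator: last synced value and slope.
    last_value = stream[0]
    slope = 0.0
    steps_since_sync = 0

    estimates: List[float] = [stream[0]]
    messages = 1  # the initial value is always transmitted

    for observed in stream[1:]:
        steps_since_sync += 1
        # Both sides compute the same prediction from the shared state.
        predicted = last_value + slope * steps_since_sync
        if abs(observed - predicted) > epsilon:
            # Deviation exceeds the error bound: the node sends an update
            # and both sides refit the (hypothetical) linear model.
            slope = (observed - last_value) / steps_since_sync
            last_value = observed
            steps_since_sync = 0
            estimates.append(observed)
            messages += 1
        else:
            # No message: the coordinator keeps using the prediction.
            estimates.append(predicted)
    return estimates, messages


if __name__ == "__main__":
    est, sent = monitor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], epsilon=0.5)
    print(sent, "messages for", len(est), "observations")
```

By construction, every coordinator estimate in this sketch stays within `epsilon` of the observed value, which mirrors the error-bounded guarantee the abstract describes. The data-aware model selection is not shown, since the abstract does not specify how it works.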
network monitoring
distributed tracking
data-aware approaches
continuous monitoring
distributed data streams
Authors
Yixing Zhang
Networks and Systems
Romaric Duvignau
Networks and Systems
DEBS 2025 - Proceedings of the 19th ACM International Conference on Distributed and Event-based Systems
39-50
979-8-4007-1332-3 (ISBN)
Gothenburg, Sweden
READY: Rethinking Monitoring for Large Distributed Systems
Computer Science and Engineering, 2024-03-01 -- 2029-03-01.
Subject Categories (SSIF 2025)
Computer Sciences
DOI
10.1145/3701717.3730544