The accuracy and performance of Machine Learning (ML) models can degrade gradually or even suddenly when the underlying statistical distribution of a data stream changes over time; this phenomenon is known as concept drift. It can adversely affect IoT data management and analysis, which rely heavily on data-driven cognitive technologies. Concept drift should therefore be detected immediately, which is challenging because of the increasing number of feature dimensions and the lack of ground truth. Adaptive countermeasures are also difficult to design when data streams are generated at high frequency and require latency-sensitive responses. The uncertainty and temporal dependencies of IoT data streams further complicate concept drift management. This work proposes RADAR, a reactive drift management framework for streaming IoT applications that simultaneously detects and reacts to concept drift using two novel methods: a temporal discrepancy measure and an intensity-aware analyser. Together, these methods inform the adaptation decision so that performance remains reliable, thereby limiting the scope of frequent ML model updates. Experiments on synthetic and real-world setups comprising end-to-end systems demonstrate that RADAR outperforms the benchmark methods, achieving greater performance improvement with a best F-score of 0.86 and an efficient runtime on large data streams.
Start page: 1995
End page: 2007
Total pages: 13
Outlet: International Conference on Data Engineering
Name of conference: 2023 IEEE 39th International Conference on Data Engineering (ICDE)