RMIT University

Data summarization for network traffic monitoring

journal contribution
posted on 2024-11-01, 13:26 authored by Demetris Hoplaros, Zahir Tari, Ibrahim Khalil
Network traffic monitoring is difficult because even small networks generate large volumes of traffic. One approach to facilitate this task is network traffic summarization. Data summarization is a key concept in data mining, yet no measures currently exist for evaluating summaries. This paper presents four metrics that can be used to characterize data summarization results. Conciseness and Information Loss have already been defined, but we modified Information Loss because it was biased towards attributes that recur across individual summaries. We also propose two additional metrics, Interestingness and Intelligibility. Using the proposed metrics, we evaluated existing summarization techniques on well-known network traffic datasets. We also propose a summarization technique, based on an existing one but incorporating the proposed metrics as its objective function. To further demonstrate the usability of the metrics, we performed classification on summarized datasets, showing that the metrics can facilitate the selection of summaries for data mining. Using summarized datasets with reasonable conciseness, we achieved similar accuracy at a fraction of the running time, proportional to the conciseness of the summarized dataset.
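The abstract names Conciseness and Information Loss without giving formulas, so the following is only a rough sketch of how two such measures might be computed. Everything in it is an assumption made for illustration: a summary row is modelled as a tuple in which None marks a generalized (wildcard) attribute, compaction_ratio is taken as input size divided by summary size, and information_loss is the average fraction of wildcard attributes in the best-matching row. These are not the paper's definitions, and in particular this is not the modified Information Loss the authors describe.

```python
from typing import Optional, Sequence

# Hypothetical data model (not from the paper): a record is a tuple of
# attribute values; a summary row uses None for a generalized attribute.
Record = Sequence[object]
SummaryRow = Sequence[Optional[object]]


def covers(row: SummaryRow, record: Record) -> bool:
    """True if every non-wildcard attribute of the row matches the record."""
    return all(s is None or s == r for s, r in zip(row, record))


def compaction_ratio(records: Sequence[Record], summary: Sequence[SummaryRow]) -> float:
    """Assumed conciseness-style measure: input size divided by summary size."""
    return len(records) / len(summary)


def information_loss(records: Sequence[Record], summary: Sequence[SummaryRow]) -> float:
    """Assumed loss measure: mean fraction of attributes lost per record.

    For each record, take the covering row with the fewest wildcards.
    A record covered by no row counts as fully lost (1.0).
    """
    total = 0.0
    for rec in records:
        losses = [
            sum(v is None for v in row) / len(row)
            for row in summary
            if covers(row, rec)
        ]
        total += min(losses) if losses else 1.0
    return total / len(records)


if __name__ == "__main__":
    # Toy flow records: (protocol, destination port, source host).
    flows = [("tcp", 80, "a"), ("tcp", 80, "b"), ("udp", 53, "a"), ("udp", 53, "a")]
    summary = [("tcp", 80, None), ("udp", 53, "a")]
    print(compaction_ratio(flows, summary))
    print(information_loss(flows, summary))
```

In this toy setup, a summary with more wildcards tends to need fewer rows but loses more attribute detail, which is the trade-off such metrics are meant to expose when choosing a summary for later data mining.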

History

Journal

Journal of Network and Computer Applications

Volume

37

Issue

1

Start page

194

End page

205

Total pages

12

Publisher

Academic Press

Place published

United Kingdom

Language

English

Copyright

© 2013 Published by Elsevier Ltd. All rights reserved.

Former Identifier

2006041052

Esploro creation date

2020-06-22

Fedora creation date

2014-04-01
