We investigate the potential for using neighbourhood attributes alone to match unidentified entities across networks and to classify them within networks. The motivation is to identify individuals across the dark social networks that underlie recorded networks. We test an Enron email database and show that the out-neighbourhoods of email addresses are highly distinctive. Then, using citation databases as proxies, we show that a paper in CiteSeer that is also in DBLP is highly likely to be matched successfully based on its (uncertainly labelled) in-neighbours alone. A paper in SPIRES can be classified with 80% accuracy based on classification ratios in its in-neighbourhood alone.
Start page
99
End page
110
Total pages
12
Outlet
Complex Networks VI: Proceedings of the 6th Workshop on Complex Networks (CompleNet 2015)
Editors
Giuseppe Mangioni, Filippo Simini, Stephen Miles Uzzo, Dashun Wang