Despite the growing number of tools being used to analyze so-called big data, researchers are only now beginning to find ways to handle big networks. A new method, described in the International Journal of Data Science, takes a local community approach to studying networks and could have applications in understanding how disease outbreaks become pandemics, defeating terrorist networks, thwarting malware, and understanding the effect of influencers and viral advertising on marketing.
Ali Choumane and Abbass Al-Akhrass of the Faculty of Sciences in the LaRIFA Lab at the Lebanese University in Nabatieh, Lebanon, explain that analyzing huge networks is computationally very expensive in terms of the time and resources needed to process all the nodes and the connections between them in order to find hubs and other interesting features. This is especially the case where a network contains densely connected nodes.
Community detection is one approach to circumventing this mammoth task, allowing researchers to find the local connections from the busiest of individual nodes. The team is developing an algorithm to find such local communities in a huge network quickly and at a lower computational cost than earlier approaches. The team explains how they start with a seed node and allow the algorithm to expand on it iteratively, identifying a community around that node that most resembles known community structures previously seen in real life. Such communities are likely to be the most realistic, after all.
The expansion process relies on a neural network classifier that can discern which nodes ought to be added to the local community and which ought to be discarded. The classifier can be fine-tuned to adjust resolution so that smaller or larger communities can be found within a huge network without the need to retrain the algorithm each time.
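The seed-and-expand idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the trained neural network classifier is replaced here by a simple stand-in score (the share of a node's edges that lead into the community), and the names `expand_from_seed`, `inside_fraction`, `threshold` and the toy `GRAPH` are all hypothetical. The sketch only ever adds nodes, and the `threshold` parameter plays the role of a resolution setting in this simplified version.

```python
from typing import Callable, Dict, Set

Graph = Dict[str, Set[str]]  # undirected graph stored as adjacency sets

# Toy network (an assumption for illustration): two triangles joined by the edge c-d.
GRAPH: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}


def inside_fraction(graph: Graph, node: str, community: Set[str]) -> float:
    """Stand-in for the classifier: share of the node's edges that enter the community."""
    neighbours = graph[node]
    if not neighbours:
        return 0.0
    return len(neighbours & community) / len(neighbours)


def expand_from_seed(
    graph: Graph,
    seed: str,
    score: Callable[[Graph, str, Set[str]], float] = inside_fraction,
    threshold: float = 0.5,
) -> Set[str]:
    """Grow a local community around `seed` until no candidate passes the threshold."""
    community = {seed}
    changed = True
    while changed:
        changed = False
        # Candidates are nodes outside the community that touch at least one member.
        frontier: Set[str] = set()
        for member in community:
            frontier |= graph[member]
        frontier -= community
        # Sorted order keeps the result deterministic.
        for node in sorted(frontier):
            if score(graph, node, community) >= threshold:
                community.add(node)
                changed = True
    return community


if __name__ == "__main__":
    print(sorted(expand_from_seed(GRAPH, "a", threshold=0.5)))
    print(sorted(expand_from_seed(GRAPH, "a", threshold=0.3)))
```

In this toy example, a stricter threshold stops the expansion at the first triangle, while a looser one lets the community absorb the whole network, all without changing the scoring function.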
“We trained this classifier using three measures that allowed us to mutually quantify the strength of the relation between nodes and communities,” the team explains. “These measures depend on the proportion of edges that the node has with its community, how much the neighbours of the node are involved in its community and finally the membership degree of the node in the community.”
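The article does not give the exact formulas behind these three measures, so the following Python sketch is only one plausible reading of them. Every definition here is an assumption made for illustration: `edge_proportion` is the fraction of a node's edges that lead into the community, `neighbour_involvement` averages that same fraction over the node's neighbours, and `membership_degree` is taken to be the fraction of the other community members the node is directly linked to. The function names and the toy `GRAPH` are hypothetical.

```python
from typing import Dict, Set, Tuple

Graph = Dict[str, Set[str]]  # undirected graph stored as adjacency sets

# Toy network (an assumption for illustration): two triangles joined by the edge c-d.
GRAPH: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}


def edge_proportion(graph: Graph, node: str, community: Set[str]) -> float:
    """Assumed measure 1: fraction of the node's edges that lead into the community."""
    neighbours = graph[node]
    if not neighbours:
        return 0.0
    return len(neighbours & community) / len(neighbours)


def neighbour_involvement(graph: Graph, node: str, community: Set[str]) -> float:
    """Assumed measure 2: mean edge proportion of the node's neighbours."""
    neighbours = graph[node]
    if not neighbours:
        return 0.0
    total = sum(edge_proportion(graph, n, community) for n in neighbours)
    return total / len(neighbours)


def membership_degree(graph: Graph, node: str, community: Set[str]) -> float:
    """Assumed measure 3: fraction of the other community members adjacent to the node."""
    others = community - {node}
    if not others:
        return 0.0
    return len(graph[node] & others) / len(others)


def node_features(graph: Graph, node: str, community: Set[str]) -> Tuple[float, float, float]:
    """Feature vector that a classifier could take as input for one (node, community) pair."""
    return (
        edge_proportion(graph, node, community),
        neighbour_involvement(graph, node, community),
        membership_degree(graph, node, community),
    )


if __name__ == "__main__":
    print(node_features(GRAPH, "c", {"a", "b", "c"}))
    print(node_features(GRAPH, "d", {"a", "b", "c"}))
```

Under these assumed definitions, a node sitting firmly inside a community scores high on all three values, while a node attached by a single bridging edge scores low, which is the kind of contrast a classifier could learn to separate.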
The researchers add that they used the well-known Lancichinetti–Fortunato–Radicchi (LFR) synthetic networks as a benchmark, as well as real-world networks from different application domains, to demonstrate experimentally the high performance of their approach.
Choumane, A. and Al-Akhrass, A. (2020) ‘Supervised local community detection algorithm’, Int. J. Data Science, Vol. 5, No. 3, pp.247–261.