K-center: An Approach on the Multi-source Identification of Information Diffusion


The global diffusion of epidemics, computer viruses, and rumors causes great damage to society. It is critical to identify the diffusion sources and quarantine them in a timely manner. However, most methods proposed so far are unsuitable for diffusion with multiple sources because of their high computational cost and the complexity of the spatiotemporal diffusion processes. In this paper, based on knowledge of the infected nodes and their connections, we propose a novel method to identify multiple diffusion sources that addresses three main questions in this area: (i) How many sources are there? (ii) Where did the diffusion emerge? (iii) When did the diffusion break out? First, we derive an optimization formulation for the multi-source identification problem. The formulation is based on altering the original network into a new network that accounts for two key elements: the propagation probability and the number of hops between nodes.

Experiments demonstrate that the altered network accurately reflects the complex diffusion processes with multiple sources. Second, we derive a fast method to optimize the formulation. The proposed method is proved to be convergent, with computational complexity O(mn log α), where α = α(m, n) is the slowly growing inverse Ackermann function, n is the number of infected nodes, and m is the number of edges connecting them. Finally, we introduce an efficient algorithm to estimate the spreading time and the number of diffusion sources. To evaluate the proposed method, we compare it with many competing methods on various real-world network topologies. Our method shows significant advantages in estimating multiple sources and predicting the spreading time.
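To make the idea of a K-center formulation on an altered network concrete, the following is a minimal sketch under stated assumptions and is not the optimization method proposed in this paper. It assumes that each edge carries a propagation probability p and that the altered network uses edge length -log(p), so that a probable multi-hop path is short. It then selects centers with the standard greedy farthest-first heuristic for k-center. The function names `effective_distances` and `greedy_k_center`, the edge-length transform, and the toy graph are all hypothetical choices made for illustration.

```python
import heapq
import math


def effective_distances(graph, source):
    """Dijkstra on the assumed altered network, where an edge with
    propagation probability p has length -log(p)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, p in graph[u].items():
            nd = d - math.log(p)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def greedy_k_center(graph, k):
    """Farthest-first heuristic: repeatedly add the node that is farthest
    from the centers chosen so far. Returns (centers, covering radius)."""
    nodes = sorted(graph)
    centers = [nodes[0]]  # arbitrary first center (an assumption)
    nearest = effective_distances(graph, centers[0])
    while len(centers) < k:
        farthest = max(nodes, key=lambda v: nearest.get(v, math.inf))
        centers.append(farthest)
        d_new = effective_distances(graph, farthest)
        nearest = {
            v: min(nearest.get(v, math.inf), d_new.get(v, math.inf))
            for v in nodes
        }
    radius = max(nearest.get(v, math.inf) for v in nodes)
    return centers, radius


# Hypothetical infected subgraph: two tightly linked groups {0, 1, 2} and
# {3, 4, 5}, joined by one unlikely edge. Values are propagation probabilities.
graph = {
    0: {1: 0.9},
    1: {0: 0.9, 2: 0.9},
    2: {1: 0.9, 3: 0.1},
    3: {2: 0.1, 4: 0.9},
    4: {3: 0.9, 5: 0.9},
    5: {4: 0.9},
}
centers, radius = greedy_k_center(graph, 2)
```

On this toy graph the heuristic places one center in each group, because the low-probability bridge makes the two groups far apart in the altered network. Trying several values of k and watching how the covering radius shrinks is one simple way to explore the question of how many sources there are, although the paper's own estimator may differ.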