Link prediction in social networks aims to infer the links most likely to form next or to reconstruct links that are currently missing. It has attracted considerable recent interest because one of the most important goals of social networks is to connect people, so that they can interact with friends from the real world or make new friends through the Internet. Predicted links can therefore help people connect with each other. Beyond pure topological structure, social networks also carry rich information about each user's social activities, such as tweeting, retweeting, and replying.
Social science theories, such as social influence, suggest that a user's social activities can affect their neighbors, and that links in social networks result from such effects between users. This motivates us to perform link prediction by taking advantage of activity information.
Many methods have been proposed to measure social influence through user activity information. However, traditional methods assign each user a single, universal influence measure based on their social activities, such as the number of retweets or mentions the user has. But the influence of one user on others may differ from neighbor to neighbor, which calls for a personalized learning scheme. Moreover, learning social influence from heterogeneous social activities is a nontrivial problem, since the information carried in the activities is implicit and sometimes noisy.
Motivated by time-series analysis, we investigate the potential of modeling influence patterns based purely on timestamps; that is, we simplify the problem of processing heterogeneous social activities to that of processing a sequence of timestamps. We then use timestamps as an abstraction of each activity to calculate the reduction in uncertainty of one user's social activities given knowledge of another's. The key idea is that if user i has an impact on user j, then given the activity timestamps of user i, the uncertainty in user j's activity timestamps is reduced. Uncertainty is measured by entropy from information theory, which has proven useful for detecting significant influence flow in time-series signals in information-theoretic applications.
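The uncertainty-reduction idea can be illustrated with a minimal sketch. The abstract does not specify the estimator, so everything below is an assumption made for illustration: timestamps are binned into a 0/1 activity series with a fixed bin width, only one step of history is used, and the reduction is computed as a plug-in estimate of H(j_t | j_{t-1}) - H(j_t | j_{t-1}, i_{t-1}), in the style of transfer entropy. The function names (`binarize`, `uncertainty_reduction`) are hypothetical and not taken from the original work.

```python
import math
from collections import Counter


def binarize(timestamps, start, end, width):
    """Map raw timestamps to a 0/1 series: 1 if any activity fell in the bin.

    The fixed bin width is an assumption of this sketch.
    """
    n_bins = int(math.ceil((end - start) / width))
    series = [0] * n_bins
    for t in timestamps:
        if start <= t < end:
            series[int((t - start) // width)] = 1
    return series


def entropy(counts):
    """Shannon entropy (bits) of the empirical distribution in a Counter."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def conditional_entropy(pairs):
    """Plug-in estimate of H(X | Y) from (x, y) samples, as H(X, Y) - H(Y)."""
    joint = Counter(pairs)
    marginal = Counter(y for _, y in pairs)
    return entropy(joint) - entropy(marginal)


def uncertainty_reduction(series_i, series_j):
    """Estimate how much knowing i's past reduces uncertainty about j (bits).

    Computes H(j_t | j_{t-1}) - H(j_t | j_{t-1}, i_{t-1}) with one-step history.
    """
    steps = range(1, min(len(series_i), len(series_j)))
    without_i = [(series_j[t], series_j[t - 1]) for t in steps]
    with_i = [(series_j[t], (series_j[t - 1], series_i[t - 1])) for t in steps]
    return conditional_entropy(without_i) - conditional_entropy(with_i)


# Toy data: user j repeats user i's activity one bin later.
series_i = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
series_j = [0] + series_i[:-1]
score = uncertainty_reduction(series_i, series_j)
```

In this toy example user j's activity is fully determined by user i's previous bin, so the estimated reduction is positive; replacing `series_i` with a constant series gives a reduction of zero. Because the score is computed per ordered pair (i, j), it is directional and can differ for each neighbor.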
Using the proposed influence metric, we incorporate social activity information into the network structure and learn a unified low-dimensional representation for all users. We can then perform link prediction effectively based on the learned representation. Through comprehensive experiments, we demonstrate that the proposed method outperforms state-of-the-art methods on different real-world link prediction tasks.
- Hu, James Professor