The properties (or labels) of nodes in networks can often be predicted based on their proximity and their connections to other labeled nodes. So-called “label propagation algorithms” predict the labels of unlabeled nodes by propagating information about local label density iteratively through the network. These algorithms are fast, simple and scale to large networks but nonetheless regularly perform better than slower and much more complex algorithms on benchmark problems. We show here, however, that these algorithms have an intrinsic limitation that prevents them from adapting to some common patterns of network node labeling; we introduce a new algorithm, 3Prop, that retains all their advantages but is much more adaptive. As we show, 3Prop performs very well on node labeling problems ill-suited to label propagation, including predicting gene function in protein and genetic interaction networks and gender in friendship networks, and also performs slightly better on problems already well-suited to label propagation such as labeling blogs and patents based on their citation networks. 3Prop gains its adaptability by assigning separate weights to label information from different steps of the propagation. Surprisingly, we found that for many networks, the third iteration of label propagation receives a negative weight.
The code is available from the authors by request.
Publisher: Public Library of Science
Date Published: 28-December-2012
Author(s): Mostafavi S., Goldenberg A., Morris Q.