Transductive Classification on Heterogeneous Information Networks with Edge Betweenness-based Normalization
WSDM 2016, 2016/2
Data Science
- This paper proposes a novel method for transductive classification
on heterogeneous information networks composed of
multiple types of vertices. Such networks naturally represent
many kinds of real-world Web data, such as DBLP data (authors, papers,
and conferences). Given a network where some vertices
are labeled, the classifier aims to predict labels for the remaining
vertices by propagating the labels to the entire network.
In the label propagation process, many studies reduce
the importance of edges connected to a high-degree vertex.
This assumption is unsatisfactory when the reliability of a vertex's
label cannot be inferred from its degree. On the basis
of our intuition that edges bridging across communities are
less trustworthy, we adapt edge betweenness to indicate the importance
of edges. Since directly applying the conventional
edge betweenness is inefficient on heterogeneous networks,
we propose two additional refinements. First, the centrality
utilizes the fact that networks contain multiple types of vertices.
Second, the centrality ignores flows originating from
the endpoints of the edge under consideration. The experimental results on
real-world datasets show our proposed method is more effective
than a state-of-the-art method, GNetMine. On average,
our method yields 92.79 ± 1.25% accuracy on a DBLP network
even if only 1.92% of vertices are labeled. Our simple
weighting scheme yields an increase of more than 5 percentage points
in accuracy compared with GNetMine.
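
The idea of down-weighting edges that bridge communities can be illustrated with a minimal sketch. It is not the paper's method. It computes conventional edge betweenness (Brandes-style) on a small homogeneous toy graph and omits both proposed refinements (type awareness and ignoring flows that originate from an edge's endpoints). The inverse-betweenness weight, the propagation rule, the toy graph, and the names `edge_betweenness`, `propagate`, `alpha`, and `iters` are all assumptions made for illustration.

```python
from collections import deque, defaultdict

def edge_betweenness(adj):
    """Conventional edge betweenness for an unweighted, undirected graph."""
    bet = defaultdict(float)
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist, sigma, preds = {s: 0}, defaultdict(float), defaultdict(list)
        sigma[s] = 1.0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest vertices back to s.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    # Every vertex pair was visited from both ends, so halve the totals.
    return {e: b / 2.0 for e, b in bet.items()}

def propagate(adj, weights, seeds, n_classes, alpha=0.8, iters=100):
    """Iterate F <- alpha * P F + (1 - alpha) * Y, with P row-normalized."""
    Y = {v: [0.0] * n_classes for v in adj}
    for v, c in seeds.items():
        Y[v][c] = 1.0
    F = {v: list(Y[v]) for v in adj}
    for _ in range(iters):
        new_F = {}
        for v in adj:
            total = sum(weights[frozenset((v, w))] for w in adj[v])
            row = [0.0] * n_classes
            for w in adj[v]:
                p = weights[frozenset((v, w))] / total
                for c in range(n_classes):
                    row[c] += p * F[w][c]
            new_F[v] = [alpha * row[c] + (1 - alpha) * Y[v][c]
                        for c in range(n_classes)]
        F = new_F
    # Predict the class with the highest propagated score.
    return {v: max(range(n_classes), key=lambda c: F[v][c]) for v in adj}

# Toy graph (hypothetical): two triangles joined by the bridge c-d.
adj = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
bet = edge_betweenness(adj)
# Assumed weighting: an edge's trust is the inverse of its betweenness.
weights = {e: 1.0 / b for e, b in bet.items()}
pred = propagate(adj, weights, seeds={"a": 0, "f": 1}, n_classes=2)
```

In this toy graph the bridge `c`-`d` has the highest betweenness, so it receives the smallest weight and each seed label stays mostly within its own triangle.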
Poster Download (904KB)
Slides Download (540KB)
Transductive Classification on Heterogeneous Information Networks with Edge Betweenness-based Normalization (External Site Link)