TY - GEN
T1 - Modeling graphs using a mixture of Kronecker models
AU - Mahapatra, Suchismit
AU - Chandola, Varun
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/12/22
Y1 - 2015/12/22
N2 - Generative models for graphs are increasingly becoming a popular tool for researchers to generate realistic approximations of graphs. While in the past, focus was on generating graphs which follow general laws, such as the power law for degree distribution, current models have the ability to learn from observed graphs and generate synthetic approximations. The primary emphasis of existing models has been to closely match different properties of a single observed graph. Such models, though stochastic, tend to generate samples which do not have significant variance in terms of the various graph properties. We argue that in many cases real graphs are sampled drawn from a graph population (e.g., networks sampled at various time points, social networks for individual schools, healthcare networks for different geographic regions, etc.). Such populations typically exhibit significant variance. However, existing models are not designed to model this variance, which could lead to issues such as overfitting. We propose a graph generative model that focuses on matching the properties of real graphs and the natural variance expected for the corresponding population. The proposed model adopts a mixture-model strategy to expand the expressiveness of Kronecker product based graph models (KPGM), while building upon the two strengths of KPGM, viz., ability to model several key properties of graphs and to scale to massive graph sizes using its elegant fractal growth based formulation. The proposed model, called x-Kronecker Product Graph Model, or xKPGM, allows scalable learning from observed graphs and generates samples that match the mean and variance of several salient graph properties. We experimentally demonstrate the capability of the proposed model to capture the inherent variability in real world graphs on a variety of publicly available graph data sets.
AB - Generative models for graphs are increasingly becoming a popular tool for researchers to generate realistic approximations of graphs. While in the past, focus was on generating graphs which follow general laws, such as the power law for degree distribution, current models have the ability to learn from observed graphs and generate synthetic approximations. The primary emphasis of existing models has been to closely match different properties of a single observed graph. Such models, though stochastic, tend to generate samples which do not have significant variance in terms of the various graph properties. We argue that in many cases real graphs are sampled drawn from a graph population (e.g., networks sampled at various time points, social networks for individual schools, healthcare networks for different geographic regions, etc.). Such populations typically exhibit significant variance. However, existing models are not designed to model this variance, which could lead to issues such as overfitting. We propose a graph generative model that focuses on matching the properties of real graphs and the natural variance expected for the corresponding population. The proposed model adopts a mixture-model strategy to expand the expressiveness of Kronecker product based graph models (KPGM), while building upon the two strengths of KPGM, viz., ability to model several key properties of graphs and to scale to massive graph sizes using its elegant fractal growth based formulation. The proposed model, called x-Kronecker Product Graph Model, or xKPGM, allows scalable learning from observed graphs and generates samples that match the mean and variance of several salient graph properties. We experimentally demonstrate the capability of the proposed model to capture the inherent variability in real world graphs on a variety of publicly available graph data sets.
UR - https://www.scopus.com/pages/publications/84963761995
U2 - 10.1109/BigData.2015.7363817
DO - 10.1109/BigData.2015.7363817
M3 - Conference contribution
T3 - Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015
SP - 727
EP - 736
BT - Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015
A2 - Luo, Feng
A2 - Ogan, Kemafor
A2 - Zaki, Mohammed J.
A2 - Haas, Laura
A2 - Ooi, Beng Chin
A2 - Kumar, Vipin
A2 - Rachuri, Sudarsan
A2 - Pyne, Saumyadipta
A2 - Ho, Howard
A2 - Hu, Xiaohua
A2 - Yu, Shipeng
A2 - Hsiao, Morris Hui-I
A2 - Li, Jian
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd IEEE International Conference on Big Data, IEEE Big Data 2015
Y2 - 29 October 2015 through 1 November 2015
ER -