Motif Counting via Subgraph Sampling

Sumit Mukherjee – Columbia University

Abstract

Consider the subgraph sampling model, where we observe a random subgraph of a given (possibly non random) large graph $G_n$, by choosing vertices of $G_n$ independently at random with probability $p_n$. In this setting, we study the question of estimating the number of copies $N(H,G_n)$ of a fixed motif/small graph (think of $H$ as edges, two stars, triangles) in the big graph $G_n$. We derive necessary and sufficient conditions for the consistency and the asymptotic normality of a natural Horvitz-Thompson (HT) type estimator.

As it turns out, the asymptotic normality of the HT estimator exhibits an interesting fourth-moment phenomenon, which asserts that the HT estimator (appropriately centered and rescaled) converges in distribution to the standard normal whenever its fourth-moment converges to 3. We apply our results to several natural graph ensembles, such as sparse graphs with bounded degree, Erdős-Renyi random graphs, random regular graphs, and dense graphons.

This talk is based on joint work with Bhaswar B. Bhattacharya and Sayan Das.