Invited Speakers
Elizabeth Ogburn, Johns Hopkins - Causal inference with social network data exhibiting dependence, contagion, and interference
Karl Rohe, Wisconsin - Network sampling for Stochastic Blockmodel inference
Jennifer Neville, Purdue - Lifted and Constrained Sampling of Attributed Graphs with Generative Network Models
Elizabeth Ogburn, Johns Hopkins - Causal inference with social network data exhibiting dependence, contagion, and interference
Interest in and availability of social network data have led to more and more attempts to make causal and statistical inferences using data collected from subjects linked by social network ties. But inference about all kinds of estimands, from simple sample means to complicated causal peer effects, is difficult when only a single network of non-independent observations is available. Estimation of causal effects is complicated not only by dependence, but also by the fact that one subject's exposure or treatment may affect not only his/her own outcome but also the outcomes of his/her social contacts. This phenomenon, known as interference, poses challenges for nonparametric identification of causal effects. There is a dearth of principled methods for identifying causal effects using observational data of this kind and for statistical inference that takes into account the dependence that such observations can manifest. We describe methods for causal and semiparametric inference when the dependence is due to the transmission of information or outcomes along network ties, and discuss extensions to other, more general sources of dependence.
Karl Rohe, Wisconsin - Network sampling for Stochastic Blockmodel inference
Web crawling, snowball sampling, and respondent-driven sampling (RDS) are three types of network sampling techniques that are popular when it is difficult to contact individuals in the population of interest. For example, the CDC uses RDS to estimate the proportion of injection drug users who are HIV+ in major metropolitan areas. It has been previously shown that if participants refer "too many" other participants, then the standard approaches do not provide "square root n" consistent estimators of HIV prevalence. This talk will show (1) how we can use the network sample to make inferences about the Stochastic Blockmodel parameters and (2) how this inference aids the estimation of HIV prevalence.
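For readers unfamiliar with the Stochastic Blockmodel, the following is a minimal Python sketch of the model itself, not of the talk's method. It assumes an undirected graph in which each node belongs to one block and each pair of nodes is connected independently with a probability that depends only on their two blocks. The function names `sample_sbm` and `estimate_block_densities` are hypothetical, and the estimator assumes that the full graph and all block labels are observed, which is exactly what a network sample such as RDS does not provide.

```python
import random
from collections import Counter


def sample_sbm(block_sizes, edge_probs, seed=None):
    """Draw an undirected graph from a stochastic blockmodel.

    block_sizes: number of nodes in each block, e.g. [3, 2]
    edge_probs:  symmetric matrix; edge_probs[a][b] is the probability of an
                 edge between a node in block a and a node in block b
    """
    rng = random.Random(seed)
    # Node i gets the label of the block it belongs to.
    labels = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(labels)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            # Each pair is connected independently given the block labels.
            if rng.random() < edge_probs[labels[i]][labels[j]]:
                edges.append((i, j))
    return labels, edges


def estimate_block_densities(labels, edges):
    """Empirical edge density for each block pair (full graph observed)."""
    sizes = Counter(labels)
    observed = Counter()
    for i, j in edges:
        a, b = sorted((labels[i], labels[j]))
        observed[(a, b)] += 1
    densities = {}
    blocks = sorted(sizes)
    for idx, a in enumerate(blocks):
        for b in blocks[idx:]:
            # Number of node pairs that could have been connected.
            if a == b:
                possible = sizes[a] * (sizes[a] - 1) // 2
            else:
                possible = sizes[a] * sizes[b]
            if possible > 0:
                densities[(a, b)] = observed[(a, b)] / possible
    return densities
```

With the whole graph in hand, the block-pair densities are simple plug-in estimates of the blockmodel parameters. The harder problem the abstract describes is making the same inference from a referral-based sample of the network.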
Jennifer Neville, Purdue - Lifted and Constrained Sampling of Attributed Graphs with Generative Network Models
Recently, a number of generative network models (GNMs) have been developed that accurately capture characteristics of real-world networks, but since they are typically defined in a procedural manner, it is difficult to identify commonalities in their structure. Moreover, procedural definitions make it difficult to develop statistical sampling algorithms that are both efficient and correct. In this talk, we identify a family of GNMs that share a common latent structure, create a Bayesian network (BN) representation to capture their common form, and show how to transform existing GNMs to this representation. Then, we develop a provably correct sampling method that exploits parametric symmetries and context-specific dependence in the BNs to maximize computational efficiency.
Next, we show how to use the sampling method to generate networks with correlated vertex attributes. Most methods in network science focus on modeling the graph structure P(G) without considering node attributes. Other methods in relational learning focus on modeling the attribute distribution conditioned on the network structure, i.e., P(X|G). However, no method effectively represents and samples from the joint distribution P(G,X), due to the complexities of modeling both structure and correlated node attributes. To address this, we outline a novel sampling method, CSAG, which approximates sampling from P(G,X) by combining hierarchical GNMs with an attribute model in a proposal distribution. CSAG constrains every step of the sampling process to consider the structure of the GNM, biasing the search toward regions of the space with higher likelihood and producing more accurate samples.
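For reference, the three distributions named above are related by the chain rule of probability. This is a standard identity written in the abstract's own symbols (G for graph structure, X for node attributes); it is not a description of how CSAG itself works.

```latex
P(G, X) = P(G)\, P(X \mid G)
```

The identity shows why a structure model for P(G) and a conditional attribute model for P(X|G) are each only one piece of the joint distribution.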