The Statistics Seminar speaker for Wednesday, March 2, 2016, is Edoardo Airoldi, Associate Professor of Statistics at Harvard University, where he also leads the Harvard Laboratory for Applied Statistical Methodology & Data Science. Airoldi's research explores modeling, inferential, and other methodological issues that arise in applied problems where network data (i.e., measurements on pairs of units, or tuples more generally) must be considered and standard statistical theory and methods are no longer adequate to support the goals of the analysis. More broadly, his research interests encompass statistical methodology and theory with applications to molecular biology and computational social science, including:

• Theory and methods for the analysis of network data

• Design and analysis of experiments in the presence of interference

• Design and evaluation of network sampling mechanisms

• Geometry of inference in ill-posed inverse problems, including network tomography and contingency tables

• Modeling and inference in high-throughput biology, including mass spectrometry and next-generation sequencing

• Applied methodology in computational social science

• Statistical inference strategies for massive data sets

Areas of technical interest include approximation theorems, inequalities, convex and combinatorial optimization, and geometry.

*Title*: Elements of Causal Inference on Social, Biomedical and Biological Networks

*Abstract*: Estimating the causal effect of an intervention in a network setting is the primary interest, and a major challenge, in many modern endeavors at the nexus of science, technology and society. Examples include HIV testing and awareness campaigns on mobile phones, improving healthcare in rural populations using social interventions, promoting standard of care practices among US oncologists on dedicated social media platforms, and gaining a mechanistic understanding of cellular and regulation dynamics in the cell.

A salient feature of these problems is that the response measured on any one unit likely depends on the intervention given to other units, a situation technically referred to as “interference” in the parlance of statistics and machine learning. Importantly, the causal effect of interference itself is often among the inferential targets of primary interest. Classical approaches to causal inference, however, largely rely on the assumption of “lack of interference”, and/or on designing experiments that limit the role of interference, and are therefore untenable in many modern endeavors.

In this talk, we will focus on technical issues that arise in estimating causal effects when interference can be attributed to a network among the units of analysis. We will outline new methodology for making causal inferences in this setting, offer some theoretical insights, and, time permitting, we will discuss strategies for optimal experimental design that involve a piecewise constant approximation of a certain graphon.

*Refreshments will be served after the seminar in 1181 Comstock Hall.*