ABSTRACT: Prediction problems typically assume the training data are independent samples, but in many modern applications samples come from individuals connected by a network. For example, in adolescent health studies of risk-taking behaviors, information on the subjects’ social networks is often available and plays an important role through network cohesion, the empirically observed phenomenon of friends behaving similarly. Taking cohesion into account should allow us to improve prediction. Here we propose a regression-based framework with a network penalty on individual node effects to encourage similarity between predictions for linked nodes, and show that it outperforms traditional models both theoretically and empirically when network cohesion is present. The framework is easily extended to other models, such as the generalized linear model and Cox’s proportional hazards model. Applications to predicting teenagers’ behavior based on both demographic covariates and their friendship networks from the AddHealth data are discussed in detail.
BIO: Liza Levina received her PhD in Statistics from UC Berkeley in 2002 and joined the University of Michigan the same year. Her research interests include networks, high-dimensional data, and sparsity. She has worked on estimating large covariance matrices, graphical models, and other topics in inference for high-dimensional data. She also works on statistical inference for network data, including problems of community detection and link prediction. Her research covers methodology, theory, and applications, especially to spectroscopy, remote sensing and, in the past, computer vision. She received the junior Noether Award from the ASA in 2010 and was elected a member of ISI in 2011.