A framework for distributed large-scale sparse regression

The event takes place on Friday, Nov 3rd 2017 at 11.00
Theme/s: Statistics
Location of Event: Alan Turing Room 306
This event is a: Public Seminar

Abstract: An attractive approach for down-scaling a Big Data problem is to partition the dataset into subsets before fitting. For a dataset with a large number of variables, this is best done by partitioning the features; a naive feature partition, however, fails to take correlations between features into account. We propose a framework named DECO, which applies a simple decorrelation step before performing sparse regression on each subset. The framework works for elliptically distributed features, heavy-tailed errors and a general class of sparsity penalties. Its performance is illustrated via synthesized and real data analysis. This is joint work with Xiangyu Wang at Google and David Dunson at Duke.
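
The idea in the abstract (decorrelate first, then run sparse regression on each feature subset) can be sketched as follows. This is a minimal illustration under stated assumptions, not the speaker's implementation: the abstract does not give the decorrelation formula, so the sketch assumes it is a premultiplication of the data by (X X^T / p + ridge * I)^(-1/2), and it uses a plain coordinate-descent lasso as one example of a sparsity penalty. All function names and the parameters lam, ridge and n_subsets are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent lasso for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]          # remove feature j's contribution
            rho = X[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]          # add the updated contribution back
    return beta


def decorrelate(X, y, ridge=1.0):
    """Assumed decorrelation step: premultiply X and y by
    F = (X X^T / p + ridge * I)^(-1/2), computed via an eigendecomposition."""
    n, p = X.shape
    gram = X @ X.T / p + ridge * np.eye(n)
    vals, vecs = np.linalg.eigh(gram)           # gram is symmetric positive definite
    F = (vecs / np.sqrt(vals)) @ vecs.T         # V diag(vals^(-1/2)) V^T
    return F @ X, F @ y


def deco_fit(X, y, n_subsets, lam, ridge=1.0):
    """Decorrelate once, then fit a lasso separately on each block of features."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    X_t, y_t = decorrelate(X, y, ridge)
    p = X.shape[1]
    beta = np.zeros(p)
    # Hypothetical partition: contiguous blocks of columns, one per worker.
    for idx in np.array_split(np.arange(p), n_subsets):
        beta[idx] = lasso_cd(X_t[:, idx], y_t, lam)
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 50, 200
    X = rng.standard_normal((n, p))
    beta_true = np.zeros(p)
    beta_true[:3] = 5.0
    y = X @ beta_true + 0.1 * rng.standard_normal(n)
    beta_hat = deco_fit(X, y, n_subsets=4, lam=0.1)
    print("indices of nonzero estimates:", np.flatnonzero(beta_hat))
```

In this sketch each call to lasso_cd touches only its own block of decorrelated columns, so the per-subset fits could run on separate machines; only the decorrelation step needs the full feature matrix.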

External Speakers

Prof Chenlei Leng (University of Warwick)