The Statistics Seminar speaker for Wednesday, February 1, 2017, is Oscar Hernan Madrid Padilla, a PhD student in statistics at the University of Texas at Austin. He works on regularized likelihood methods and Bayesian non-parametrics under the supervision of Dr. James Scott. He enjoys both the theoretical and applied sides of problems. Prior to joining UT, he earned a B.S. in Mathematics at the Mathematical Research Center (CIMAT) in Mexico.

*Title*: The DFS Fused Lasso: Linear-Time Denoising over General Graphs

*Abstract*: The fused lasso, also known as (anisotropic) total variation denoising, is widely used for piecewise constant signal estimation with respect to a given undirected graph. The fused lasso estimate is highly nontrivial to compute when the underlying graph is large and has arbitrary structure. But for a special graph structure, namely, the chain graph, the fused lasso (or simply, the 1d fused lasso) can be computed in linear time. In this paper, we establish a surprising connection between the total variation of a generic signal defined over an arbitrary graph, and the total variation of this signal over a chain graph induced by running depth-first search (DFS) over the nodes of the graph. Specifically, we prove that for any signal, its total variation over the induced chain graph is no more than twice its total variation over the original graph. This connection leads to several interesting theoretical and computational conclusions. Denoting by m and n the number of edges and nodes, respectively, of the graph in question, our result implies that for an underlying signal with total variation t over the graph, the fused lasso achieves a mean squared error rate of t^{2/3} n^{−2/3}. Moreover, precisely the same mean squared error rate is achieved by running the 1d fused lasso on the induced chain graph from running DFS. Importantly, the latter estimator is simple and computationally cheap, requiring only O(m) operations for constructing the DFS-induced chain and O(n) operations for computing the 1d fused lasso solution over this chain. Further, for trees that have bounded max degree, the error rate of t^{2/3} n^{−2/3} cannot be improved, in the sense that it is the minimax rate for signals that have total variation t over the tree. Finally, we establish several related results, for example, a similar result for a roughness measure defined by the L0 norm of differences across edges in place of the total variation metric.
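
To make the central inequality concrete, here is a minimal Python sketch that numerically checks the factor-of-two bound on a small example. It is an illustration only, not the speaker's implementation: the 5-by-5 grid graph, the random signal, and the helper names (`grid_graph`, `dfs_order`, `total_variation_graph`, `total_variation_chain`) are all assumptions chosen for the example. The sketch covers only the O(m) step of building the DFS-induced chain; it does not compute the 1d fused lasso solution.

```python
import random


def grid_graph(rows, cols):
    """Build a 2d grid graph (hypothetical example graph).

    Returns adjacency lists and an edge list; node (r, c) has index r * cols + c.
    """
    n = rows * cols
    adj = [[] for _ in range(n)]
    edges = []
    for r in range(rows):
        for c in range(cols):
            u = r * cols + c
            if c + 1 < cols:  # edge to the right neighbor
                v = u + 1
                adj[u].append(v)
                adj[v].append(u)
                edges.append((u, v))
            if r + 1 < rows:  # edge to the neighbor below
                v = u + cols
                adj[u].append(v)
                adj[v].append(u)
                edges.append((u, v))
    return adj, edges


def dfs_order(adj, root=0):
    """Return the nodes in the order an iterative DFS first visits them."""
    visited = set()
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        if u in visited:
            continue
        visited.add(u)
        order.append(u)
        # Push in reverse so that earlier-listed neighbors are explored first.
        for v in reversed(adj[u]):
            if v not in visited:
                stack.append(v)
    return order


def total_variation_graph(edges, x):
    """Sum of |x_u - x_v| over the edges of the original graph."""
    return sum(abs(x[u] - x[v]) for u, v in edges)


def total_variation_chain(order, x):
    """Sum of |x_a - x_b| over consecutive nodes in the DFS-induced chain."""
    return sum(abs(x[order[i + 1]] - x[order[i]]) for i in range(len(order) - 1))


random.seed(0)
adj, edges = grid_graph(5, 5)
x = [random.gauss(0.0, 1.0) for _ in range(len(adj))]  # arbitrary signal

order = dfs_order(adj)
tv_graph = total_variation_graph(edges, x)
tv_chain = total_variation_chain(order, x)

print("TV over graph:", tv_graph)
print("TV over DFS chain:", tv_chain)
print("bound holds:", tv_chain <= 2 * tv_graph)
```

The bound in the abstract is stated for any signal, so the check should pass whatever signal replaces `x`, as long as the graph is connected so that DFS from one root reaches every node.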