Randomized iterative methods for corrupted data


When the data is large, or comes in a streaming way, randomized iterative methods provide an efficient way to solve a variety of problems, including solving linear systems, finding least square solutions, optimizing with gradient descent and Newton methods, solving feasibility problems and beyond. However, if the data is corrupted with arbitrarily large sparse errors, one can expect the iterates are diverted far from the solution with each corrupted equation they encounter. I am going to talk about our recently proposed robust versions of Randomized Kaczmarz and Stochastic Gradient Descent methods for solving linear systems that avoid harmful corrupted equations by exploring the space as they proceed with the iterates. We show that the convergence rates in expectation are comparable with the convergence of the vanilla methods on uncorrupted systems. I will also touch on how to use the information obtained on the exploration phase efficiently, and what structural characteristics of the data matrix are crucial for such methods. Based on the joint work with Deanna Needell, Jamie Haddock, Will Swartworth, Ben Jarman, and Lu Cheng.