Regularized Fine-Tuning for Representation Multi-Task Learning: Adaptivity, Minimaxity, and Robustness

YANG FENG – NEW YORK UNIVERSITY

ABSTRACT

This talk presents new theory and methods for multi-task linear regression in which related tasks share a latent low-dimensional structure. Each task’s regression vector lies in a subspace whose intrinsic dimension is much smaller than the ambient dimension, but unlike classical models that assume a single subspace common to all tasks, we allow each task’s subspace to drift from a reference subspace within a controllable similarity radius, and we accommodate an unknown fraction of outlier tasks. We develop a penalized empirical-risk minimization algorithm and a spectral algorithm, both of which automatically adapt to the unknown similarity radius and outlier proportion. We establish information-theoretic lower bounds showing that both algorithms are nearly minimax optimal over a broad parameter regime, with the spectral algorithm attaining optimality when no outlier tasks are present. Moreover, the proposed estimators are provably robust: they never perform worse than independent single-task regression, and they achieve strict gains whenever the tasks are moderately similar. The theoretical guarantees are supported by extensive numerical experiments.
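
To make the setup concrete, the following is a minimal simulation sketch of the data-generating model described above. All specifics here are illustrative assumptions rather than details from the talk: the dimensions (T, p, r, n), the similarity radius h, the outlier fraction eps, the noise level, and the particular perturbation mechanism (a Gaussian perturbation of the reference basis followed by re-orthonormalization) are placeholders standing in for the paper's formal definitions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumed, not from the talk): T tasks, ambient
# dimension p, intrinsic dimension r << p, n samples per task.
T, p, r, n = 10, 50, 3, 100
h = 0.1     # similarity radius: how far a task's subspace may drift
eps = 0.2   # fraction of outlier tasks
sigma = 0.5 # noise standard deviation

# Reference subspace: a p x r matrix with orthonormal columns.
A, _ = np.linalg.qr(rng.standard_normal((p, r)))

tasks = []
for t in range(T):
    if rng.random() < eps:
        # Outlier task: coefficient vector unrelated to the shared structure.
        beta_t = rng.standard_normal(p)
    else:
        # Inlier task: perturb the reference basis on a scale set by h,
        # re-orthonormalize, then draw coefficients in the drifted subspace.
        A_t, _ = np.linalg.qr(A + h * rng.standard_normal((p, r)))
        beta_t = A_t @ rng.standard_normal(r)
    X_t = rng.standard_normal((n, p))
    y_t = X_t @ beta_t + sigma * rng.standard_normal(n)
    tasks.append((X_t, y_t))
```

Under this kind of model, h = 0 with eps = 0 recovers the classical identical-subspace setting, while larger h or eps moves the data toward the single-task regime that the proposed estimators are guaranteed never to fall below.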

RELATED PAPER: