We consider convergence rates of the least squares estimator (LSE) in a regression model with possibly heavy-tailed errors. Despite its importance in practical applications, theoretical understanding of this problem has largely been limited to light-tailed ("Gaussian") errors. We show that, from a worst-case perspective, the convergence rate of the LSE in a general non-parametric regression model is given by the maximum of the Gaussian regression rate and the noise rate induced by the errors. This rate reveals both positive and negative aspects of the LSE as an estimation procedure in a heavy-tailed regression setting. As is often the case, a worst-case analysis can be conservative in certain concrete settings. In the more interesting statistical setting where the errors have only a second moment, we show that the sizes of the localized envelopes of the model give a sharp interpolation for the convergence rate of the LSE between the worst-case rate and the (optimal) parametric rate. The key technical innovation is a new multiplier inequality that sharply controls the size of the multiplier empirical process, which also proves useful in other applications, including shape-restricted and sparse linear regression models.
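As an illustrative sketch (not taken from the paper), the setting can be simulated numerically: the LSE over monotone functions (isotonic regression, one of the shape-restricted models mentioned above), computed via the pool-adjacent-violators algorithm, with heavy-tailed Student-t errors whose degrees of freedom are chosen so that little more than a second moment exists. All names and parameter choices below are hypothetical.

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: least squares fit of y under a
    nondecreasing constraint (the isotonic LSE at the design points)."""
    sums, counts = [], []
    for v in y:
        sums.append(float(v))
        counts.append(1)
        # merge adjacent blocks while their means violate monotonicity
        while len(sums) > 1 and sums[-2] / counts[-2] > sums[-1] / counts[-1]:
            s, c = sums.pop(), counts.pop()
            sums[-1] += s
            counts[-1] += c
    return np.concatenate([np.full(c, s / c) for s, c in zip(sums, counts)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 500
    x = np.sort(rng.uniform(0.0, 1.0, n))
    f0 = x ** 2                      # a monotone regression function
    # heavy-tailed errors: t with df slightly above 2, so the variance
    # barely exists -- a stand-in for the "second moment only" regime
    eps = rng.standard_t(2.1, n)
    y = f0 + eps
    fit = pava(y)
    rmse = np.sqrt(np.mean((fit - f0) ** 2))
    print(f"empirical L2 error of the isotonic LSE: {rmse:.3f}")
```

Re-running the simulation over a grid of sample sizes (or error distributions with fewer moments) gives a rough empirical view of how the convergence rate degrades as the tails of the errors become heavier.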