Chennai Mathematical Institute

Seminars




Data Science Seminar
Date: Friday, 2 February 2024
Time: 2:00 PM
Venue: NKN Hall
When to use Linear Regression? Model assumptions, diagnostics, and robustness

Miheer Dewaskar
Duke University, USA.


Abstract

Given observations, linear regression aims to learn a linear relationship between an outcome variable and a set of predictor variables. The usual least squares estimator and its associated confidence intervals are derived under the assumption that the errors are Gaussian with zero mean and constant variance. In this talk, we discuss diagnostic measures for detecting when some of the assumptions underlying linear regression are violated in ways that may adversely affect our inferences. Such violations include non-Gaussian errors, non-constant error variance, and outliers or high-leverage data points that unduly influence the model fit. We also discuss strategies for correcting these problems, based on transforming the data and on robust regression methods.
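
As a rough illustration of the kind of diagnostics the abstract mentions, the sketch below fits ordinary least squares with NumPy and computes three standard per-observation quantities: leverage (the diagonal of the hat matrix), internally studentized residuals, and Cook's distance. It is not taken from the talk. The function name `ols_diagnostics`, the simulated data, and the planted outlier are all assumptions made for this example, and the code assumes the design matrix has full column rank.

```python
import numpy as np


def ols_diagnostics(X, y):
    """Fit OLS with an intercept and return per-observation diagnostics.

    Hypothetical helper for illustration; assumes a full-rank design matrix.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n = X.shape[0]

    # Design matrix with an intercept column.
    A = np.column_stack([np.ones(n), X])
    p = A.shape[1]

    # Least squares coefficients and raw residuals.
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta

    # Leverage h_i = diagonal of H = A (A^T A)^{-1} A^T, computed from a thin QR.
    Q, _ = np.linalg.qr(A)
    leverage = np.sum(Q**2, axis=1)

    # Residual variance estimate and internally studentized residuals.
    sigma2 = resid @ resid / (n - p)
    studentized = resid / np.sqrt(sigma2 * (1.0 - leverage))

    # Cook's distance: D_i = (r_i^2 / p) * h_i / (1 - h_i).
    cooks = studentized**2 * leverage / (p * (1.0 - leverage))

    return {
        "beta": beta,
        "residuals": resid,
        "leverage": leverage,
        "studentized": studentized,
        "cooks_distance": cooks,
    }


# Simulated data (an assumption for this sketch): a clean linear trend
# plus one planted point that is far out in x and far off the trend.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=50)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=50)
x = np.append(x, 30.0)
y = np.append(y, 0.0)

diag = ols_diagnostics(x, y)
most_influential = int(np.argmax(diag["cooks_distance"]))
print("most influential observation:", most_influential)
```

Observations with large leverage or Cook's distance are candidates for closer inspection, not automatic deletion. Plotting the studentized residuals against the fitted values is a common informal check for non-constant variance.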