A Study on Identification of High Leverage Points in Multiple Linear Regression

  • Nor Azima Ismail Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA Kelantan, Bukit Ilmu, Machang, Kelantan, Malaysia
  • Prof. Dr. Habshah Midi Faculty of Sciences, Universiti Putra Malaysia
  • Nurul Bariyah Ibrahim Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA Kelantan, Bukit Ilmu, Machang, Kelantan, Malaysia
  • Norafefah Mohamad Sobri Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA Kelantan, Bukit Ilmu, Machang, Kelantan, Malaysia
  • Siti Nurani Zulkifli Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA Kelantan, Bukit Ilmu, Machang, Kelantan, Malaysia
Keywords: Multicollinearity, High Leverage Points, Robust Mahalanobis Distance, Robust Diagnostic Method

Abstract

Outliers with respect to the predictor variables are called high leverage points. Observations that differ even slightly from the rest can produce large differences in the results of a regression analysis. Detecting high leverage points is therefore essential, because they can have a large impact on the estimated values and can also lead to multicollinearity problems. In this situation, robust regression procedures can be very useful for dealing with the problems that arise from the presence of high leverage points. The aim of this study is to compare the performance of three methods for detecting high leverage points. In the first stage, two well-known data sets are considered: the artificial data set generated by Hawkins, Bradu and Kass (1984) and the stack loss data of Brownlee (1965). In the second stage, a simulation study is conducted in which data are generated under both clean and contaminated conditions. The three measures considered are the leverage method (twice-the-mean rule), Generalized Potentials, and Diagnostic Robust Generalized Potentials (DRGP). The results indicate that DRGP detects high leverage points more effectively than the other two methods on both the well-known data sets and the simulated data.
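
As an illustration of the simplest of the three measures, the twice-the-mean rule flags observation i as a high leverage point when its hat value h_ii exceeds 2p/n, where p is the number of regression parameters (including the intercept) and n is the number of observations. The following is a minimal sketch of that rule only, not the authors' implementation; the function names and the small data set are hypothetical and chosen for illustration.

```python
import numpy as np


def hat_values(X):
    """Diagonal of the hat matrix H = X (X'X)^{-1} X' for design matrix X."""
    # With the reduced QR decomposition X = QR, H = Q Q',
    # so h_ii is the squared norm of the i-th row of Q.
    Q, _ = np.linalg.qr(X)
    return np.sum(Q ** 2, axis=1)


def twice_the_mean_flags(X):
    """Return hat values and a boolean mask for h_ii > 2p/n."""
    n, p = X.shape  # p counts the intercept column as well
    h = hat_values(X)
    return h, h > 2.0 * p / n


# Hypothetical data: nine ordinary predictor values and one extreme value.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0])
X = np.column_stack([np.ones_like(x), x])  # intercept plus one predictor

h, flags = twice_the_mean_flags(X)
print(np.round(h, 3))
print(flags)
```

Here the cutoff is 2p/n = 0.4, and only the last observation (x = 30) exceeds it. Generalized Potentials and DRGP are not sketched here, since the abstract does not give their details.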

Published
2016-06-10