May 15: Academic Talk by Associate Professor Li Zhonghua of Nankai University

Published: 2018-05-09

Speaker: Associate Professor Li Zhonghua (Institute of Statistics, Nankai University)

Title: Multiple Outliers Detection in Sparse High Dimensional Regression




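  Li Zhonghua is an associate professor at the Institute of Statistics, Nankai University, and has been a visiting scholar at the University of North Carolina at Chapel Hill, the University of Minnesota, the National University of Singapore, the Hong Kong University of Science and Technology, and City University of Hong Kong. Li's research interests include statistical quality control, change-point analysis, quality engineering, and high-dimensional statistics. Li has co-authored one monograph and published more than 30 SCI-indexed papers, in journals including Technometrics and Journal of Quality Technology. Li currently serves as an executive council member of the Industrial Engineering Branch of the Chinese Society of Optimization, Overall Planning and Economic Mathematics, a council member of the High-Dimensional Data Statistics Branch of the Chinese Association for Applied Statistics, a council member of the Beijing Big Data Association, a council member of the Tianjin Association for Applied Statistics, and a reviewer for Mathematical Reviews (USA), among other roles.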


  Outliers inevitably distort analysis and degrade prediction, especially in high-dimensional regression, where the high dimensionality of the data can increase the chance that one or more observations are outlying. Because outlier detection is both necessary and important in high-dimensional regression analysis, in this talk we propose a feasible outlier detection approach for sparse high-dimensional linear regression models. First, we search for a clean subset using the sure independence screening (SIS) method and least trimmed squares (LTS) regression estimates. Then, we define a high-dimensional outlier detection (HOD) measure and propose a multiple-outlier detection approach based on multiple testing procedures. In addition, to enhance efficiency, we refine the outlier detection rule after obtaining a relatively reliable non-outlier subset from the initial detection approach. Comparison studies based on Monte Carlo simulation show that the proposed method performs well in detecting multiple outliers in sparse high-dimensional linear regression models. We further illustrate the method with an empirical analysis of real-life protein and gene expression data.
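
  The pipeline described in the abstract (screen, find a clean subset, test each observation) can be sketched roughly as follows. This is only an illustrative sketch, not the speaker's method: the abstract does not define the HOD measure, so the code substitutes two-sided normal p-values of robustly scaled residuals and applies the Benjamini-Hochberg rule as the multiple testing step. The LTS fit is approximated by random starts plus concentration steps, the refinement step is omitted, and all function names, default parameters, and the simulated data are assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist, median

def sis_screen(X, y, d):
    """SIS: keep the d predictors with the largest |marginal correlation| with y."""
    n, y_bar = len(y), sum(y) / len(y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        c_bar = sum(col) / n
        cov = sum((c - c_bar) * (yi - y_bar) for c, yi in zip(col, y))
        norm = math.sqrt(sum((c - c_bar) ** 2 for c in col))
        scores.append(abs(cov) / norm if norm > 0 else 0.0)  # proportional to |corr|
    return sorted(sorted(range(len(scores)), key=lambda j: -scores[j])[:d])

def ols(X, y):
    """Least squares with intercept: normal equations solved by Gauss-Jordan elimination."""
    Z = [[1.0] + list(row) for row in X]
    k = len(Z[0])
    A = [[sum(z[a] * z[b] for z in Z) for b in range(k)]
         + [sum(z[a] * yi for z, yi in zip(Z, y))] for a in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[r][k] / A[r][r] for r in range(k)]

def residuals(X, y, beta):
    return [yi - beta[0] - sum(b * x for b, x in zip(beta[1:], row)) for row, yi in zip(X, y)]

def lts_clean_subset(X, y, h, n_starts=20, n_csteps=10, seed=0):
    """Approximate LTS: random small starts refined by concentration steps."""
    rng = random.Random(seed)
    n, k = len(y), len(X[0]) + 1
    best_obj, best_idx = float("inf"), None
    for _ in range(n_starts):
        idx = rng.sample(range(n), k + 1)
        for _ in range(n_csteps):
            beta = ols([X[i] for i in idx], [y[i] for i in idx])
            res = residuals(X, y, beta)
            idx = sorted(range(n), key=lambda i: abs(res[i]))[:h]  # h best-fitting points
        obj = sum(res[i] ** 2 for i in idx)  # trimmed sum of squares
        if obj < best_obj:
            best_obj, best_idx = obj, sorted(idx)
    return best_idx

def benjamini_hochberg(pvals, alpha):
    """Indices rejected by the Benjamini-Hochberg step-up rule at FDR level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    n_reject = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            n_reject = rank
    return sorted(order[:n_reject])

def detect_outliers(X, y, d=5, trim=0.75, alpha=0.05):
    keep = sis_screen(X, y, d)                           # screening
    Xs = [[row[j] for j in keep] for row in X]
    clean = lts_clean_subset(Xs, y, int(trim * len(y)))  # clean subset
    beta = ols([Xs[i] for i in clean], [y[i] for i in clean])
    res = residuals(Xs, y, beta)
    sigma = 1.4826 * median(abs(r) for r in res)         # robust scale (MAD about zero)
    norm = NormalDist()
    # Stand-in for the HOD measure (assumption): normal p-values of scaled residuals.
    pvals = [2 * (1 - norm.cdf(abs(r) / sigma)) for r in res]
    return keep, clean, benjamini_hochberg(pvals, alpha)

def simulate(n=100, p=200, n_out=5, shift=15.0, seed=1):
    """Toy data (assumed): 2 active predictors, first n_out responses shifted."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [5 * row[0] + 5 * row[1] + rng.gauss(0, 0.5) for row in X]
    for i in range(n_out):
        y[i] += shift
    return X, y

X, y = simulate()
selected, clean, flagged = detect_outliers(X, y)
print("selected predictors:", selected)
print("flagged observations:", flagged)
```

  In this toy setting the planted outliers sit far from the clean fit, so they should be excluded from the clean subset and then flagged by the testing step. A few clean observations may also be flagged, since the Benjamini-Hochberg rule controls the false discovery rate rather than eliminating false positives.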