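"Digital+" and Statistical Data Engineering Lecture Series (Lecture 106): Announcement of a Lecture at Our School by Professor Wang Qihua of the Chinese Academy of Sciences on September 15
Published: 2025-09-12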

Title: Model-Free Feature Screening via Subsampling: A Unified Framework and Adaptive Two-Step Recovery of Mysterious Features

Speaker: Wang Qihua

Time: Monday, September 15, 2025, 14:30

Venue: Conference Room 615, Comprehensive Building

About the speaker:

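Wang Qihua is a professor and doctoral supervisor. He is a recipient of the National Science Fund for Distinguished Young Scholars, a Distinguished Professor under the Ministry of Education's Changjiang Scholars Program, a member of the Chinese Academy of Sciences "Hundred Talents Program", and a Distinguished Professor at Zhejiang Gongshang University. His research focuses on incomplete data analysis, high-dimensional data analysis, and statistical inference for large-scale data. He has led a National Science Fund for Distinguished Young Scholars project, a key project, and several general projects. He has published more than 150 papers in major international journals and three monographs, and some of his results have had a sustained academic influence for two decades.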

Abstract:

Feature screening is a common strategy in ultrahigh-dimensional data analysis. However, in the era of big data, classic feature screening methods face challenges in two respects. First, a very large sample size creates computational difficulties for classic screening methods because of limits on data storage and computing resources. Second, covariates are often highly correlated in ultrahigh-dimensional data, in which case most existing screening methods may miss significant covariates that are marginally uncorrelated with the response. In this paper, we first develop a general subsampling-based feature screening framework built on a sampling-with-replacement scheme. This framework accommodates a wide range of correlation measures, including model-free screening measures. The proposed general method enjoys the sure screening property under some mild assumptions and is computationally attractive. Furthermore, when strong dependence exists between covariates, we propose a two-step subsampling method based on the proposed general framework. In the first step, uniform sampling is used to select covariates that have a strong marginal correlation with the response. In the second step, kernel-based non-uniform sampling is designed to recruit important features from the remaining covariates.

The two-step method can help recover significant covariates that have no marginal correlation with the response. The sure screening property is established for the two-step method. Simulation studies and a real data analysis are conducted to illustrate the empirical performance of the proposed methods.
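For readers unfamiliar with the idea, the following is a minimal sketch of subsampling-based marginal screening with uniform sampling with replacement. It is not the speaker's method: the function names (`abs_pearson`, `subsample_screen`, `demo`), the subsample size `r`, the number of retained features `d`, and the simulated data are all assumptions made for illustration. Absolute Pearson correlation is used only as a simple stand-in for the correlation measure; it is not model-free, and the kernel-based non-uniform second step described in the abstract is not reproduced here.

```python
import math
import random


def abs_pearson(x, y):
    """Absolute Pearson correlation; a stand-in for a screening measure."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / math.sqrt(sxx * syy)


def subsample_screen(X, y, r, d, rng, measure=abs_pearson):
    """Score every covariate on a uniform subsample and keep the top d.

    X is a list of n rows with p values each; r row indices are drawn
    uniformly with replacement, so the full data set is never scanned
    when the scores are computed.
    """
    n, p = len(X), len(X[0])
    idx = [rng.randrange(n) for _ in range(r)]  # sampling with replacement
    y_sub = [y[i] for i in idx]
    scores = [measure([X[i][j] for i in idx], y_sub) for j in range(p)]
    top = sorted(range(p), key=lambda j: scores[j], reverse=True)[:d]
    return sorted(top), scores


def demo(seed=0):
    """Hypothetical simulation: only covariates 0 and 1 drive the response."""
    rng = random.Random(seed)
    n, p = 5000, 50
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3 * row[0] - 2 * row[1] + rng.gauss(0, 1) for row in X]
    kept, _ = subsample_screen(X, y, r=500, d=5, rng=rng)
    return kept
```

Calling `demo()` returns the indices of the five covariates with the largest subsample scores. In this simulated setting the two active covariates are independent of the noise covariates, so marginal screening is enough; the abstract's two-step procedure targets the harder case in which an important covariate has no marginal correlation with the response.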


