
A Statistical Understanding of Deep Learning

Posted: 2020-10-09
Speaker: Hansheng Wang
Speaker Bio:

Professor Hansheng Wang is Chair of the Department of Business Statistics and Econometrics at the Guanghua School of Management, Peking University. He received his bachelor's degree in Statistics from the Department of Probability and Statistics, School of Mathematical Sciences, Peking University, in 1998, and his Ph.D. in Statistics from the University of Wisconsin-Madison in 2001. He is a member of the International Statistical Institute, the American Statistical Association, the Institute of Mathematical Statistics, the Royal Statistical Society, and the International Chinese Statistical Association.

He has published more than fifty academic papers in English and nearly twenty in Chinese, co-authored one English monograph, and independently written two Chinese textbooks. He has served as Associate Editor for several academic journals, including The Annals of Statistics (2008-2009), Computational Statistics & Data Analysis (2008-2011), Statistics and Its Interface (2010-present), Journal of the American Statistical Association (2011-present), and Statistica Sinica (2011-present). His main theoretical research interests are high-dimensional data analysis, variable selection, dimension reduction, extreme value theory, and semiparametric models. His main applied research interests are search engine marketing and social networks.

Host: Zhonglei Wang
Abstract:

Deep learning is a method of fundamental importance for many AI-related applications. From a statistical perspective, deep learning can be viewed as a regression method with a complicated input X and output Y. One unique characteristic of deep learning is that the input X can be highly unstructured data (e.g., images and sentences), while the output Y is a usual response (e.g., a class label). Another unique characteristic is its highly nonlinear model structure with many layers. This provides a basic theoretical foundation for understanding deep learning from a statistical perspective. We believe that classical statistical wisdom, combined with modern deep learning techniques, can bring us novel methodology with outstanding empirical performance. In this talk, I would like to report three recent advances made by our team in this regard.

The first advance concerns the stochastic gradient descent (SGD) algorithm. SGD is the most popular optimization algorithm for training deep learning models. Its practical implementation relies on the specification of a tuning parameter: the learning rate. In current practice, the choice of learning rate is highly subjective and depends on personal experience. To solve this problem, we propose a local quadratic approximation (LQA) idea for an automatic and nearly optimal determination of this tuning parameter.

The second advance focuses on model compression for convolutional neural networks (CNNs). The CNN is one of the representative models in deep learning and has shown excellent performance in computer vision. However, it is extremely complicated, with a huge number of parameters. We therefore apply the classical principal component analysis (PCA) method to each convolutional layer, which leads to a new method called progressive principal component analysis (PPCA). Our PPCA method brings a significant reduction in model complexity without significantly sacrificing out-of-sample forecasting accuracy.

The last advance considers factor modeling. The input feature of a deep learning model is often of ultrahigh dimension (e.g., an image). In this case, a strong factor structure can consistently be detected in the input feature. This means that a significant proportion of the variability in the input feature can be explained by a low-dimensional latent factor, which can be modeled very easily. We therefore develop a novel factor normalization (FN) methodology. We first decompose an input feature X into two parts: the factor part and the residual part. Next, we reconstruct the baseline DNN model into a factor-assisted DNN model. Lastly, we provide a new SGD algorithm with adaptive learning rates for training the new model. Our method leads to superior convergence speed and excellent out-of-sample performance compared with the original model under a series of gradient-type algorithms.
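
The abstract describes LQA only at a high level, so the following is a minimal sketch of the underlying idea rather than the authors' actual algorithm: along the negative-gradient ray, evaluate the loss at three step sizes, fit a quadratic exactly through those points, and take the quadratic's minimizer as the learning rate. All names here (lqa_learning_rate, delta, fallback) are illustrative.

```python
import numpy as np

def lqa_learning_rate(loss_fn, theta, grad, delta=1e-2, fallback=1e-3):
    """Choose a near-optimal step size by fitting a quadratic to the loss
    along the negative-gradient direction (three function evaluations)."""
    phi0 = loss_fn(theta)                     # loss at step size 0
    phi1 = loss_fn(theta - delta * grad)      # loss at step size delta
    phi2 = loss_fn(theta - 2 * delta * grad)  # loss at step size 2*delta

    # Fit phi(a) ~ c0 + c1*a + c2*a^2 exactly through the three points.
    c2 = (phi2 - 2 * phi1 + phi0) / (2 * delta ** 2)
    c1 = (4 * phi1 - phi2 - 3 * phi0) / (2 * delta)

    # Use the quadratic's minimizer -c1/(2*c2) when curvature is positive;
    # otherwise fall back to a small default learning rate.
    return -c1 / (2 * c2) if c2 > 0 else fallback

# Toy usage: on a quadratic loss the fit is exact, so every step performs
# an exact line search and the iterates converge quickly.
A = np.diag([1.0, 4.0])
loss = lambda t: 0.5 * t @ A @ t
theta = np.array([3.0, -2.0])
for _ in range(10):
    theta = theta - lqa_learning_rate(loss, theta, A @ theta) * (A @ theta)
print(loss(theta))  # close to 0
```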
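
PPCA is likewise described only in outline. Below is a minimal sketch, assuming the layer-wise building block is a PCA of the channel dimension of one convolutional layer's activations; in the progressive scheme one would compress one layer at a time (the projection can be realized as a 1x1 convolution) and fine-tune before moving to the next layer. The function name and arguments are hypothetical, not the authors' code.

```python
import numpy as np

def pca_compress_channels(feats, var_kept=0.95):
    """Find a channel-wise PCA projection for one conv layer's activations.

    feats: (N, C, H, W) feature maps sampled from the layer.
    Returns P of shape (C, k): projecting onto P keeps `var_kept` of the
    channel variance, so the layer can be followed by a 1x1 "projection"
    convolution with k output channels instead of C.
    """
    N, C, H, W = feats.shape
    X = feats.transpose(0, 2, 3, 1).reshape(-1, C)  # one row per pixel
    X = X - X.mean(axis=0, keepdims=True)
    cov = X.T @ X / X.shape[0]                      # C x C channel covariance
    eigval, eigvec = np.linalg.eigh(cov)            # ascending eigenvalues
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]  # sort descending
    k = int(np.searchsorted(np.cumsum(eigval) / eigval.sum(), var_kept)) + 1
    return eigvec[:, :k]

# Progressive use: compress layer 1, fine-tune the network, then layer 2, ...
feats = np.random.randn(32, 64, 8, 8)               # stand-in activations
P = pca_compress_channels(feats, var_kept=0.9)
print(P.shape)                                      # (64, k), with k <= 64
```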
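
The factor normalization methodology starts from the decomposition of the input feature X into a factor part and a residual part. Here is a minimal PCA-based sketch of that first step, with illustrative names; the factor-assisted DNN and the adaptive-learning-rate SGD from the talk are not reproduced here.

```python
import numpy as np

def factor_decompose(X, r):
    """Split input features into a low-dimensional factor part and a residual.

    X: (N, p) feature matrix; r: number of latent factors.
    Returns factor scores F (N, r) and residuals R (N, p) such that the
    centered features satisfy X_c = F @ B.T + R for the top-r loadings B.
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:r].T            # (p, r) loading matrix (orthonormal columns)
    F = Xc @ B              # (N, r) factor part, low-dimensional
    R = Xc - F @ B.T        # (N, p) residual part, what the factors miss
    return F, R

# A factor-assisted model would take F and R as two separate inputs,
# e.g. a small dense branch on F joined with the usual network on R.
X = np.random.randn(256, 100)
F, R = factor_decompose(X, r=5)
print(F.shape, R.shape)     # (256, 5) (256, 100)
```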

This lecture requires advance registration, and seats are limited. If you would like to attend online, please sign up via the link below; if you register but cannot attend, please cancel your registration.

Registration link: Click to register


After successful registration, Zoom will send the meeting link by email. Please add no-reply@zoom.us to your whitelist, or revisit the registration page to retrieve the meeting link.

Time: 2020-10-09 (Friday) 10:30-12:00
Venue: Economics Building N402, N302, N203, N303
Language: Chinese
Organizers: 太阳成tyc7111cc Gregory Chow Center for Economic Research; XMU-CAS Fundamental Science Center for Econometric Modeling and Economic Policy Research
Series: Lecture 2 of the "Gregory Chow Lectures" in Econometrics and Statistics