About the Speaker
Jiashun Jin is a Professor of Statistics & Data Science and an Affiliated Professor of Machine Learning at Carnegie Mellon University. His earlier work was on the analysis of Rare/Weak signals in big data, focusing on the development of (Tukey’s) Higher Criticism and practical False Discovery Rate (FDR) controlling methods. His more recent interest is in the analysis of complex network and text data, where he has led a team in collecting MADStat, a large-scale data set on statistical publications. In these areas, Jin has co-authored three Editor’s Invited Discussion papers and three Editor’s Invited Review papers. Jin is an elected IMS Fellow and an elected ASA Fellow, and he delivered the highly selective IMS Medallion Lecture in 2015 and the IMS AoAS (Annals of Applied Statistics) Lecture in 2016. He is also a recipient of the NSF CAREER Award and the IMS Tweedie Award. He has served as Associate Editor for several statistical journals and is currently serving as IMS Treasurer. Beyond his academic career, Jin has also gained valuable industry experience through research at Two Sigma Investments and Google LLC.
Abstract
In his 1996 Fisher Lecture, Efron suggested that there is a philosophical triangle in statistics with “Bayesian”, “Fisherian”, and “Frequentist” as the three vertices, and that most statistical methods can be viewed as a convex combination of the three philosophies. We collected and cleaned a data set consisting of the citation and BibTeX (e.g., title, abstract, author information) data of 83,331 papers published in 36 journals in statistics and related fields, spanning 41 years. Using the data set, we constructed 21 co-citation networks, one for each time window between 1990 and 2015. We propose a dynamic Degree-Corrected Mixed-Membership (dynamic-DCMM) model, in which the research interests of an author are modeled by a low-dimensional weight vector (called the network memberships) that evolves slowly over time. We propose dynamic-SCORE as a new approach to estimating the memberships. We discover a triangle in the spectral domain, which we call the Statistical Triangle, and use it to visualize the research trajectories of individual authors. We interpret the three vertices of the triangle as the three primary research areas in statistics: “Bayes”, “Biostatistics”, and “Non-parametrics”. The Statistical Triangle further splits into 15 sub-regions, which we interpret as the 15 representative sub-areas in statistics. These results provide useful insights into the research trends and behavior of statisticians.