2019 - 2022
Statistical learning of many real systems can be significantly enhanced by translating domain knowledge into meaningful parameter restrictions. With the advent of high-throughput datasets, such restrictions often apply to high-dimensional parameter spaces, which complicates inference. This research aims to develop novel statistical methods and computational algorithms for such problems, drawing motivation from a number of real applications. Working within a nonparametric Bayes framework, the first part of the research project emphasizes calibrating prior distributions in these constrained problems and theoretically quantifying the impact of the constraints on parameter learning. The second part aims to develop efficient Markov chain Monte Carlo and variational algorithms and to analyze their convergence behavior for these problems. The PIs will also propose undergraduate courses that focus on the modeling and applied components of Bayesian methods. When teaching the courses, the PIs will use examples from daily life and from different scientific disciplines to inspire students' learning. The Activity-Based Learning (ABL) courses aim to enrich students' academic experience and learning outcomes by connecting theory with practice and concepts with methods, using data and insights obtained through engagement with the larger world.

The research project is motivated by statistical and computing challenges posed by a number of real scientific applications in which complex restrictions are imposed on key parameters, necessitating novel statistical methods and associated computational algorithms. The project operates in a Bayesian paradigm, which enables incorporation of various constraints in a principled framework and provides the readily available uncertainty estimates often sought in scientific applications. A major emphasis will be laid on calibration of prior distributions on these constrained spaces.
Examples will be provided where seemingly innocuous prior choices routinely used in practice can lead to biased inferences in certain situations. A rigorous theoretical understanding of such phenomena will be provided, along with the development of alternative default priors on these constrained spaces. The methodological and theoretical developments will be accompanied by efficient computational algorithms that use novel approximation techniques in the context of Markov chain Monte Carlo and variational algorithms, and that meet the scalability demanded by the specific applications and beyond. The algorithm development will be paralleled by novel convergence analysis, bridging ideas between the optimization and sampling literature.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.