Course Descriptions

Required courses and elective course options are listed below.

STA 502 Introduction to Statistical Inference

3 Credits, Fall Semester

Introduces the basic principles of probability, distribution theory, and statistical inference. Topics include the axioms of probability theory, independence, conditional probability, random variables, discrete and continuous distributions, functions of random variables, moment generating functions, the central limit theorem, point and interval estimation, maximum likelihood methods, tests of significance, and the Neyman-Pearson theory of testing hypotheses.

STA 509 Statistical Genetics

3 Credits, Spring Semester

Prerequisite: STA 505, STA 527, or STA 503

Statistical tools for analyzing experiments involving genomic data. Topics include basic genetics and statistics; linkage analysis and map construction using genetic markers; association studies; quantitative trait loci (QTL) analysis with ANOVA, variance components analysis, and marker regression (including multiple and partial regression); QTL mapping with interval mapping and composite interval mapping; the LOD test; and supervised and unsupervised methods for gene expression microarray data across multiple conditions.

STA 525 Statistics for Bioinformatics

3 Credits, Spring Semester

Since the completion of the Human Genome Project, statistics has found a burgeoning field of new applications in high-throughput experiments designed to gather large amounts of information on biological systems. This course covers the wide array of approaches and technologies used to gather this information and the statistical issues involved, from initial data processing to end-stage research objectives. Time permitting, the technologies examined include two-dimensional protein gel electrophoresis, protein mass spectrometry, and several types of microarray experiments. The course text is “Bioinformatics and Computational Biology Solutions Using R and Bioconductor.” Much of the coursework involves analyzing data sets from class and from the text using the R language.

STA 545 Data Mining I

3 Credits, Fall Semester

Prerequisite: STA 511


This course presents the topic of data mining from a statistical perspective, with attention directed towards both applied and theoretical considerations. Emphasis will be placed on supervised learning methods. Topics include linear and logistic regression, discriminant analysis, shrinkage methods, subset selection, dimension reduction techniques, classification and regression trees, ensemble methods, neural networks, and random forests. Model selection and estimation of generalization error will be emphasized. Issues that arise in high-dimensional (N << p) applications will be highlighted. Applications will be presented in R to illustrate methods and concepts.

STA 546 Data Mining II

3 Credits, Fall Semester

Prerequisite: STA 511

This course presents the topic of data mining from a statistical perspective, with attention directed towards both applied and theoretical considerations. The focus will be on supervised learning, which concerns outcome prediction from input data. Students will be introduced to a number of topics in supervised learning, including linear and logistic regression, shrinkage methods, the lasso, partial least squares, tree-based methods, model assessment and selection, model inference and averaging, and neural networks. Computational applications using R and high-dimensional data will be presented to reinforce theoretical concepts.