The field of Statistics aims to interpret large data sets that contain random variation. Baseball is a simple game that contains a high degree of randomness, and because professional baseball has been played since the 19th century, a large amount of data has been collected about players’ performance. In this class we examine key concepts in Statistics and Data Science using baseball as a motivating example. We will also discuss how newer statistics, created by sabermetric researchers, have led to additional insights, and will learn how to use the R programming language to analyze data. Assignments will consist of weekly problem sets and a short final project. By taking this class students will develop an understanding of key statistical concepts that will be useful for interpreting data from many fields.
R resources: R tutorial; R Markdown cheat sheet; article on using R to analyze baseball data; Learning R videos: Intro, common functions, vectors, descriptive statistics, Visualizing Univariate Data, scatter plots
Class 1: Introduction
Class 8: Simple linear regression
Class 16: Introduction to statistical inference
Class 21: Parametric tests for two or more means
Class 23: Confidence intervals
Class 24: Final project presentations