Data Science Inference and Modeling
1 Learning Objectives
- The concepts necessary to define estimates and margins of error (populations, parameters, estimates, and standard errors) in order to make predictions about data
- How to use models to aggregate data from different sources
- The very basics of Bayesian statistics and predictive modeling
1.1 Course Overview
1.1.1 Section 1: Parameters and Estimates
You will learn how to estimate population parameters.
1.1.2 Section 2: The Central Limit Theorem in Practice
You will apply the central limit theorem to assess how close a sample estimate is to the population parameter of interest.
1.1.3 Section 3: Confidence Intervals and p-Values
You will learn how to calculate confidence intervals and how they relate to p-values.
1.1.4 Section 4: Statistical Models
You will learn about statistical models in the context of election forecasting.
1.1.5 Section 5: Bayesian Statistics
You will learn about Bayesian statistics through looking at examples from rare disease diagnosis and baseball.
1.1.6 Section 6: Election Forecasting
You will learn about election forecasting, building on what you’ve learned in the previous sections about statistical modeling and Bayesian statistics.
1.1.7 Section 7: Association Tests
You will learn how to use association and chi-squared tests to perform inference for binary, categorical, and ordinal data through an example looking at research funding rates.
1.2 Introduction to Inference
The textbook for this section is available here.
In this course, we will learn:
- statistical inference, the process of deducing characteristics of a population using data from a random sample
- the statistical concepts necessary to define estimates and margins of error
- how to forecast future results and estimate the precision of our forecast
- how to calculate and interpret confidence intervals and p-values
Key points
- Information gathered from a small random sample can be used to infer characteristics of the entire population.
- Opinion polls are useful when asking everyone in the population is impossible.
- A common use for opinion polls is determining voter preferences in political elections for the purposes of forecasting election results.
- The spread of a poll is the estimated difference between the support for two candidates or options.
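The key points above can be illustrated with a minimal Python sketch. It assumes a two-candidate race, so that if a proportion p supports one candidate, the other has 1 - p and the spread is p - (1 - p) = 2p - 1. The function names (`simulate_poll`, `spread`), the population value of 0.52, and the sample size of 1000 are hypothetical choices made for illustration only; they are not taken from the course materials.

```python
import random


def simulate_poll(true_p, n, seed=0):
    """Simulate polling n randomly chosen voters.

    true_p is the (normally unknown) population proportion supporting
    candidate A. Returns the sample proportion, our estimate of true_p.
    """
    rng = random.Random(seed)
    # Each sampled voter supports candidate A with probability true_p.
    votes = [1 if rng.random() < true_p else 0 for _ in range(n)]
    return sum(votes) / n


def spread(p_hat):
    """Estimated spread in a two-candidate race: p_hat - (1 - p_hat)."""
    return 2 * p_hat - 1


# Hypothetical population value, used only to generate the sample.
true_p = 0.52
p_hat = simulate_poll(true_p, n=1000)

print(f"estimated proportion: {p_hat:.3f}")
print(f"estimated spread:     {spread(p_hat):.3f}")
```

Running the sketch with different seeds shows that the estimate varies from sample to sample but stays close to the population value, which is the sense in which a small random sample carries information about the whole population.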