By the end of this course, students should be able to:
- Describe the statistical investigation process
- Identify observational units, variables, and variable types in a statistical study
- Evaluate, analyze, prioritize, and synthesize information in research articles
- Understand and explain the effects of multiple testing
- Understand and explain the role variability plays in statistics
- Demonstrate an awareness of ethical issues associated with sound statistical practice
- Explain the purpose of random sampling and its effect on scope of inference
- Explain the purpose of random assignment and its effect on scope of inference
- Identify whether a study is observational or an experiment
- Identify potential types of sampling bias in a study (selection, response, non-response)
- Identify confounding variables in observational studies and explain why they are confounding
- Understand and explain the role of randomness in designing studies and drawing conclusions
- Recognize and simulate probabilities as long-run frequencies
- Construct two-way tables to evaluate conditional and unconditional probabilities
- Identify and create appropriate summary statistics and plots given a data set or research question
- Interpret the following summary statistics in context: median, lower quartile, upper quartile, standard deviation, interquartile range, coefficient of determination, regression line slope
- Given a plot or set of plots, describe and compare the distribution(s) of a single quantitative variable (center, variability, shape, outliers)
- Given a plot or set of plots, describe the association between two quantitative variables (form, direction, strength, outliers)
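Two of the outcomes above, probabilities as long-run frequencies and conditional versus unconditional probabilities from a two-way table, can be illustrated with a short sketch. The course itself works in R; the example below uses Python only for illustration. The dice event, the function name `long_run_frequency`, and the table counts are all hypothetical and not taken from the course materials.

```python
import random

def long_run_frequency(n_trials, seed=1):
    """Fraction of simulated rolls of two fair dice whose sum is 7."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(
        1 for _ in range(n_trials)
        if rng.randint(1, 6) + rng.randint(1, 6) == 7
    )
    return hits / n_trials

# The estimate settles near the true value 6/36 as the number of trials grows.
for n in (100, 10_000, 200_000):
    print(n, round(long_run_frequency(n), 4))

# Hypothetical two-way table of counts: rows are groups, columns are outcomes.
table = {
    "treatment": {"improved": 30, "not improved": 20},
    "control": {"improved": 15, "not improved": 35},
}
grand_total = sum(sum(row.values()) for row in table.values())

# Unconditional probability P(improved): uses the whole table.
p_improved = sum(row["improved"] for row in table.values()) / grand_total

# Conditional probability P(improved | treatment): uses only the treatment row.
treatment_total = sum(table["treatment"].values())
p_improved_given_treatment = table["treatment"]["improved"] / treatment_total

print(p_improved, p_improved_given_treatment)
```

The conditional probability divides by a row total rather than the grand total, which is the only difference between the two calculations.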
Create and interpret the following summary statistics and plots:
- Summary statistics for a single categorical variable: frequencies, relative frequencies
- Summary statistics for a single quantitative variable: mean, median, percentiles, standard deviation, interquartile range, range, 5-number summary
- Summary statistics for association between two quantitative variables: correlation, coefficient of determination (R-squared), regression line
- Plots for a single categorical variable: bar plot
- Plots for association between two categorical variables: segmented bar plot, mosaic plot
- Plots for a single quantitative variable: dotplot, boxplot, histogram, density plot
- Plots for association between two quantitative variables: scatterplot
- Plots for association between one quantitative and one categorical variable: side-by-side boxplots; stacked histograms, density plots or dotplots
- Multivariable plots (e.g., scatterplot with factor)
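As a rough illustration of the summary statistics listed above, the following sketch computes them for a small invented data set. It is written in Python for illustration only (the course itself uses R), and the values of `x` and `y` are hypothetical. Quartile conventions differ between software packages, so the quartiles here may not match those reported by other tools.

```python
import math
import statistics

# Hypothetical paired measurements, invented for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Single quantitative variable: center and variability of y.
mean_y = statistics.mean(y)
median_y = statistics.median(y)
sd_y = statistics.stdev(y)  # sample standard deviation (divides by n - 1)
q1, _, q3 = statistics.quantiles(y, n=4, method="inclusive")
iqr_y = q3 - q1
data_range = max(y) - min(y)
five_number_summary = (min(y), q1, median_y, q3, max(y))

# Two quantitative variables: correlation and least-squares regression line.
mean_x = statistics.mean(x)
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)       # correlation
r_squared = r ** 2                   # coefficient of determination
slope = sxy / sxx                    # predicted change in y per unit of x
intercept = mean_y - slope * mean_x  # predicted y when x is 0

print(five_number_summary, round(sd_y, 3), iqr_y, data_range)
print(round(r, 3), round(r_squared, 3), round(slope, 3), round(intercept, 3))
```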
General Inference Concepts
- Identify the sample and population of interest
- Distinguish between statistics and parameters, and recognize the symbols used for each
- Describe the scope of inference for a study
- Understand and quantify sampling variability
- Understand how different factors of a study (e.g., sample size) affect power, p-values and confidence intervals
- Explain the difference between practical importance and statistical significance
- Identify the two possible explanations (one assuming the null hypothesis, and one assuming the alternative hypothesis) for a relationship seen in sample data
- Identify and describe in context the consequences of a Type I and Type II error
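One way to see sampling variability, and how sample size affects it, is to draw many samples and look at how much the sample proportion changes from sample to sample. The sketch below is in Python for illustration only (the course itself uses R). The population proportion of 0.5, the sample sizes, and the function name `sd_of_sample_proportions` are assumptions made for the example.

```python
import math
import random
import statistics

def sd_of_sample_proportions(n, p=0.5, n_samples=2000, seed=1):
    """Standard deviation of the sample proportion across repeated samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    proportions = []
    for _ in range(n_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return statistics.stdev(proportions)

# Compare the simulated variability with the theory-based value sqrt(p(1 - p) / n).
for n in (25, 100, 400):
    simulated = sd_of_sample_proportions(n)
    theoretical = math.sqrt(0.5 * 0.5 / n)
    print(n, round(simulated, 4), round(theoretical, 4))
```

Larger samples give sample proportions that vary less, which is why sample size affects the width of confidence intervals, the size of p-values, and power.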
The Five Scenarios
- One proportion
- Difference in proportions
- Paired mean difference (single mean)
- Difference in means
- Simple linear regression (slope and correlation)
For each of the five scenarios:
- Given a research question, construct the null and alternative hypotheses in words and using appropriate statistical symbols
- Describe and perform simulation-based hypothesis tests
- Carry out theory-based hypothesis tests
- Interpret and evaluate a p-value
- Calculate and interpret a standardized statistic
- Construct and interpret a theory-based confidence interval
- Use a confidence interval to determine the conclusion of a hypothesis test
- Conduct exploratory data analyses and inferential statistical analyses in R in a reproducible manner through R Markdown
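For the one-proportion scenario, the steps above might look roughly like the following sketch. It is written in Python for illustration only (the course itself uses R and R Markdown). The data (60 successes in 100 trials), the null value of 0.5, and the function name `simulated_p_value` are hypothetical.

```python
import math
import random
from statistics import NormalDist

# Hypothetical data: 60 successes in 100 trials; null hypothesis p = 0.5.
successes, n, p0 = 60, 100, 0.5
p_hat = successes / n

def simulated_p_value(successes, n, p0, n_sims=10_000, seed=1):
    """Two-sided simulation-based p-value for a single proportion."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    observed_distance = abs(successes - n * p0)
    extreme = 0
    for _ in range(n_sims):
        # Simulate one sample assuming the null hypothesis is true.
        sim_count = sum(1 for _ in range(n) if rng.random() < p0)
        # Count samples at least as far from the null expectation as the data.
        if abs(sim_count - n * p0) >= observed_distance:
            extreme += 1
    return extreme / n_sims

sim_p = simulated_p_value(successes, n, p0)

# Theory-based test: the standardized statistic uses the null standard error.
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
theory_p = 2 * (1 - NormalDist().cdf(abs(z)))

# Theory-based 95% confidence interval: uses the sample standard error.
z_star = NormalDist().inv_cdf(0.975)
margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)

print(round(sim_p, 4), round(z, 2), round(theory_p, 4))
print(tuple(round(bound, 3) for bound in ci))
```

With these invented numbers the 95% interval lies entirely above 0.5, which agrees with the theory-based p-value falling below 0.05. The simulated p-value is close to the 0.05 cutoff and may land slightly on either side of it.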