**Module Four**

In this last module, we continue to focus on statistical analyses that researchers and statisticians use to make "statistical inferences" (generalizations) about populations based on samples taken from those populations.

The first sub-module, Chi-Square Goodness of Fit, introduces one of the few statistical analyses (hypothesis tests) that can be performed on qualitative data. Have you ever wondered whether lottery numbers are evenly distributed or whether some numbers occur with greater frequency? How about whether the types of movies people prefer differ across age groups? The Chi-Square Goodness of Fit hypothesis test allows researchers and statisticians to answer such questions by determining whether the data "fit" a particular distribution.
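To make the idea of "fit" concrete, here is a minimal sketch of the goodness-of-fit calculation in plain Python. The die-roll tallies are hypothetical data invented for illustration, and the critical value is taken from a standard chi-square table (5 degrees of freedom, 0.05 significance level); the sub-module itself may use a software package instead.

```python
# Hypothetical tallies: how often each face appeared in 120 rolls of a die.
observed = [18, 22, 16, 25, 19, 20]

# Under the null hypothesis of a fair die, every face is equally likely,
# so each expected count is the total divided by the number of categories.
n = sum(observed)
expected = [n / len(observed)] * len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1

# Table value for 5 degrees of freedom at the 0.05 level (an assumed alpha).
CRITICAL_VALUE = 11.070
reject_null = chi_square > CRITICAL_VALUE

print(f"chi-square = {chi_square:.2f}, df = {degrees_of_freedom}")
if reject_null:
    print("Reject the null hypothesis: the data do not fit a uniform distribution.")
else:
    print("Fail to reject the null hypothesis: the data are consistent with a fair die.")
```

With these made-up counts the statistic is small compared with the critical value, so the sample gives no evidence that the die is unfair.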

Many statistical applications in psychology, social science, business administration, and the natural sciences involve several groups. For example, an environmentalist is interested in comparing the average amount of pollution in several bodies of water. A sociologist is researching whether the amount of income a person earns varies according to their upbringing. A consumer, like you, looking for a new car might compare the average gas mileage of several models. The second sub-module discusses Analysis of Variance (ANOVA), a method that allows researchers and statisticians to determine whether a "statistically significant" difference exists among several group means.
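The car example can be sketched as a one-way ANOVA computed by hand. The mileage figures below are hypothetical, and the critical value comes from a standard F table (2 and 12 degrees of freedom, 0.05 level); treat this as an illustration of the arithmetic, not as the procedure the sub-module prescribes.

```python
from statistics import mean

# Hypothetical gas mileage (mpg) samples for three car models.
groups = {
    "Model A": [30, 32, 31, 29, 33],
    "Model B": [27, 28, 26, 29, 25],
    "Model C": [31, 30, 32, 33, 29],
}

samples = list(groups.values())
k = len(samples)                          # number of groups
n_total = sum(len(s) for s in samples)    # total number of observations
grand_mean = mean([x for s in samples for x in s])

# Between-group variation: how far each group mean is from the grand mean.
ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
# Within-group variation: how far observations are from their own group mean.
ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)

df_between = k - 1
df_within = n_total - k

# F statistic: ratio of the between-group to the within-group mean square.
f_statistic = (ss_between / df_between) / (ss_within / df_within)

# Table value for F(2, 12) at the 0.05 level (an assumed alpha).
CRITICAL_VALUE = 3.89
reject_null = f_statistic > CRITICAL_VALUE

print(f"F = {f_statistic:.2f} with ({df_between}, {df_within}) degrees of freedom")
print("Significant difference among means" if reject_null else "No significant difference detected")
```

A large F indicates that the group means differ by more than the variation within the groups would explain; it does not say which models differ from one another.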

Finally, we consider bivariate data (two variables paired together) to determine whether a relationship exists and to create a model of that relationship. This two-pronged procedure is called Correlation and Regression.
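Both prongs can be illustrated with a short sketch: the sample correlation measures the strength of a linear relationship, and the least-squares line models it. The paired values below (hours studied and exam score) are hypothetical, and the variable names are chosen only for this example.

```python
from math import sqrt

# Hypothetical paired data: hours studied (explanatory) and exam score (response).
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 80]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Sums of squares and cross-products of the deviations from the means.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
s_xx = sum((x - mean_x) ** 2 for x in hours)
s_yy = sum((y - mean_y) ** 2 for y in scores)

# Prong 1: sample correlation coefficient r (always between -1 and 1).
r = s_xy / sqrt(s_xx * s_yy)

# Prong 2: least-squares regression line, predicted score = intercept + slope * hours.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print(f"r = {r:.3f}")
print(f"predicted score = {intercept:.1f} + {slope:.1f} * hours")
```

A value of r near 1 suggests a strong positive linear relationship in this sample, but on its own it does not show that studying more causes higher scores.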

## Overview

- Chi-Square Goodness of Fit

- Analysis of Variance (ANOVA)

- Correlation and Regression

**COURSE OBJECTIVES**

The student will be able to:

- Demonstrate fundamental concepts in exploratory data analysis
- Describe the concept of the sampling distribution of a statistic, and characterize the behavior of the sample means
- Utilize the foundations of inferential statistics involving confidence intervals and hypothesis testing
- Analyze summary data and modeling techniques for multivariate data
- Communicate and present statistical ideas clearly in oral and written forms using appropriate technical terms, and deliver data analysis results to a non-statistical audience

**MODULE FOUR OBJECTIVES**

The student will be able to:

- Use an appropriate software tool for data summary and exploratory data analysis
- Describe properties of the sampling distribution of the sample mean
- Identify the components of a hypothesis test, including the parameter of interest, the null and alternative hypotheses, and the test statistic, for:
  - Chi-square Goodness of Fit
  - Analysis of Variance (ANOVA)
- Compute the p-value of a test statistic
- Diagnose if a possible relationship in bivariate data exists from a scatterplot
- Interpret a sample correlation
- Fit a linear model to a bivariate data set using an appropriate software tool
- Deduce which variable is the explanatory variable in a regression analysis
- Infer which variable is the response variable in a regression analysis
- In the context of a scenario, articulate why correlation need not imply causation
- Summarize the output from a software package for a regression analysis