## Failing to reject the null hypothesis when it is true.

### Rejecting the null hypothesis when it is true.

Research Hypothesis: The researcher’s expectation or prediction; also called the hypothesis or alternative hypothesis.

Null Hypothesis: No differences, no effects; any differences or effects observed are the result of sampling error.

What you’re doing with statistics:

1. Have a question or topic.
2. Consult the literature and create a hypothesis (which has an accompanying null hypothesis).
3. Sample a population and measure.
4. Use statistics to estimate how probable your results would be if the null hypothesis were true; ruling out the null hypothesis is how you test the (research) hypothesis.

Significance testing: If you set your significance level at .05 and you get the following, significant or not?

### Rejecting the null hypothesis when it is false.

How you want to "summarize" the exam performances will determine how you write a more specific null and alternative hypothesis. For example, you could compare the **mean** exam performance of each group (i.e., the "seminar" group and the "lectures-only" group). This is what we will demonstrate here, but other options include comparing the **distributions** or **medians**, among other things. As such, we can state the null hypothesis as no difference in mean exam performance between the two groups, and the alternative hypothesis as a difference in mean exam performance between them.
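In symbols, a comparison of means could be written roughly as follows. The subscripted symbols are notation assumed here for the population mean exam performance of each group, and the two-sided form of the alternative is also an assumption (a one-sided alternative would replace $\neq$ with $>$ or $<$):

$$
H_0: \mu_{\text{seminar}} = \mu_{\text{lectures}} \qquad \text{versus} \qquad H_A: \mu_{\text{seminar}} \neq \mu_{\text{lectures}}
$$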

Would you reject the hypothesis $H_0: \mu = 72$ versus the alternative $H_1: \mu \neq 72$ on the basis of the observations, when testing at level $\alpha = .05$?
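A question of this kind is commonly answered with a one-sample t test. The sketch below shows the mechanics only: the observations for the exercise are not given here, so the `scores` list is hypothetical, and the critical value 2.262 is the standard two-sided table value for $\alpha = .05$ with 9 degrees of freedom (it would change with a different sample size).

```python
import math
import statistics

# Hypothetical observations (NOT the data from the exercise, which are not shown).
scores = [68, 75, 71, 79, 66, 73, 70, 77, 69, 74]

MU_0 = 72            # value of the population mean under H0
T_CRITICAL = 2.262   # two-sided critical t for alpha = .05 and n - 1 = 9 degrees of freedom


def one_sample_t(sample, mu_0):
    """Return the one-sample t statistic for H0: mu = mu_0."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)  # uses the n - 1 denominator
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu_0) / standard_error


t = one_sample_t(scores, MU_0)
reject = abs(t) > T_CRITICAL  # reject H0 only if t is far from 0 in either direction
print(f"t = {t:.3f}; reject H0: {reject}")
```

With these made-up scores the statistic is close to zero, so the null hypothesis would not be rejected; a sample mean far from 72 relative to its standard error would lead to the opposite decision.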

## The failure to reject does not imply the null hypothesis is true.

If the observed t statistic is small in absolute value (that is, below the critical value for the test), we say the data are consistent with a population mean difference of 0 (because t has the sort of value we expect to see when the population value is 0) or "we **fail to reject the hypothesis that the population mean difference is 0**". For example, if t were 0.76, we would fail to reject the hypothesis that the population mean difference is 0 because we've observed a value of t that is unremarkable if the hypothesis were true.

## Type 1 Error: Rejecting a True Null Hypothesis

The outcome of a statistical test is a decision either to reject $H_0$ (the null hypothesis) in favor of $H_{Alt}$ (the alternate hypothesis) or to fail to reject it. Because $H_0$ pertains to the population, it’s either true or false for the population you’re sampling from. You may never know what that truth is, but an objective truth is out there nonetheless.
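In a simulation, unlike in real research, the truth about the population is known, which makes the Type 1 error rate easy to see. The sketch below draws many samples from a population in which the null hypothesis really is true and counts how often a test at the .05 level rejects it anyway. The population values (`MU_0`, `SIGMA`, `N`) are hypothetical, and a z test with known standard deviation is assumed purely to keep the example within the Python standard library.

```python
import math
import random
from statistics import NormalDist

ALPHA = 0.05
MU_0, SIGMA, N = 72.0, 10.0, 25  # hypothetical population mean, SD, and sample size


def z_test_p_value(sample, mu_0, sigma):
    """Two-sided p-value for H0: mu = mu_0 when the population SD is known."""
    sample_mean = sum(sample) / len(sample)
    z = (sample_mean - mu_0) / (sigma / math.sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_1_error_rate(trials, seed=0):
    """Fraction of samples, drawn with H0 true, in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The sample comes from a population whose mean really is MU_0.
        sample = [rng.gauss(MU_0, SIGMA) for _ in range(N)]
        if z_test_p_value(sample, MU_0, SIGMA) < ALPHA:
            rejections += 1  # a true null hypothesis was rejected: Type 1 error
    return rejections / trials


rate = type_1_error_rate(trials=5000)
print(f"Observed Type 1 error rate: {rate:.3f} (nominal level {ALPHA})")
```

The observed rate should land near the chosen significance level, since the significance level is the long-run proportion of true null hypotheses that the test rejects.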

## Type 2 Error = fail to reject null when you should have ..

The null hypothesis is essentially the "devil's advocate" position. That is, it assumes that whatever you are trying to prove did not happen (*hint:* it usually states that something equals zero). For example, the two different teaching methods did not result in different exam performances (i.e., zero difference). Another example might be that there is no relationship between anxiety and athletic performance (i.e., the slope is zero). The alternative hypothesis states the opposite and is usually the hypothesis you are trying to prove (e.g., the two different teaching methods did result in different exam performances). Initially, you can state these hypotheses in more general terms (e.g., using terms like "effect", "relationship", etc.). For the teaching methods example, the null hypothesis would be that the teaching method has no effect on exam performance, and the alternative hypothesis that it does have an effect.

## Usually we focus on the null hypothesis and type 1 error, ..

The **level of statistical significance** is often expressed as the so-called ***p*-value**. Depending on the statistical test you have chosen, you will calculate a probability (i.e., the *p*-value) of observing your sample results (or more extreme) **given that the null hypothesis is true**. Another way of phrasing this is to consider the probability that a difference in a mean score (or other statistic) could have arisen based on the assumption that there really is no difference. Let us consider this statement with respect to our example where we are interested in the difference in mean exam performance between two different teaching methods. If there really is no difference between the two teaching methods in the population (i.e., given that the null hypothesis is true), how likely would it be to see a difference in the mean exam performance between the two teaching methods as large as (or larger than) that which has been observed in your sample?
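One way to make that question concrete is a permutation test, sketched below. This is an illustrative technique chosen here, not necessarily the test the original example goes on to use, and the exam marks are invented. If the teaching method truly made no difference, the group labels would be arbitrary, so the marks are shuffled between the groups many times and the sketch counts how often the shuffled difference in means is at least as large as the observed one.

```python
import random
from statistics import mean

# Hypothetical exam marks, invented for illustration only.
seminar = [72, 78, 81, 69, 75, 84, 77, 80]
lectures_only = [70, 65, 74, 68, 71, 66, 73, 69]


def permutation_p_value(group_a, group_b, n_shuffles=10000, seed=0):
    """Approximate two-sided p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        # Under H0 the labels carry no information, so reassign them at random.
        rng.shuffle(pooled)
        shuffled_diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if shuffled_diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles


p = permutation_p_value(seminar, lectures_only)
print(f"Approximate p-value: {p:.4f}")
```

A small value means that a difference as large as the observed one would rarely arise if the null hypothesis were true; it is not the probability that the null hypothesis itself is true.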