> Compare the means of two or more variables or groups in the data

The compare means t-test is used to compare the mean of a variable in one group to the mean of the same variable in one, or more, other groups. The null hypothesis for the difference between the groups in the population is set to zero. We test this hypothesis using sample data.

We can perform either a one-tailed test (i.e., `less than` or `greater than`) or a two-tailed test (see the 'Alternative hypothesis' dropdown). We use one-tailed tests to evaluate if the available data provide evidence that the difference in population means between groups is less than (or greater than) zero.

### Example: Professor salaries

We have access to the nine-month academic salary for Assistant Professors, Associate Professors and Professors in a college in the U.S. (2008-09). The data were collected as part of an on-going effort by the college's administration to monitor salary differences between male and female faculty members. The data has 397 observations and the following 6 variables.

- rank = a factor with levels AsstProf, AssocProf, and Prof
- discipline = a factor with levels A ("theoretical" departments) or B ("applied" departments)
- yrs.since.phd = years since PhD
- yrs.service = years of service
- sex = a factor with levels Female and Male
- salary = nine-month salary, in dollars

The data are part of the `car` package and are linked to the book: Fox, J. and Weisberg, S. (2011). An R Companion to Applied Regression, Second Edition. Sage.

Suppose we want to test if professors of lower rank earn lower salaries compared to those of higher rank. To test this hypothesis we first select professor `rank` and then select `salary` as the numerical variable to compare across ranks. In the `Choose combinations` box select all available entries to conduct pair-wise comparisons across the three levels. Note that removing all entries will automatically select all combinations.
We are interested in a one-sided hypothesis (i.e., `less than`).

Pairwise mean comparisons (t-test)

- Data: salary
- Variables: rank, salary
- Samples: independent
- Confidence: 0.95
- Adjustment: None

| rank | mean | n | n_missing | sd | se | me |
|---|---:|---:|---:|---:|---:|---:|
| AsstProf | 80,775.985 | 67 | 0 | 8,174.113 | 998.627 | 1,993.823 |
| AssocProf | 93,876.438 | 64 | 0 | 13,831.700 | 1,728.962 | 3,455.056 |
| Prof | 126,772.109 | 266 | 0 | 27,718.675 | 1,699.541 | 3,346.322 |

| Null hyp. | Alt. hyp. | diff | p.value | se | t.value | df | 0% | 95% | |
|---|---|---:|---:|---:|---:|---:|---:|---:|---|
| AsstProf = AssocProf | AsstProf < AssocProf | -13100.45 | < .001 | 1996.639 | -6.561 | 101.286 | -Inf | -9785.958 | `***` |
| AsstProf = Prof | AsstProf < Prof | -45996.12 | < .001 | 1971.217 | -23.334 | 324.340 | -Inf | -42744.474 | `***` |
| AssocProf = Prof | AssocProf < Prof | -32895.67 | < .001 | 2424.407 | -13.569 | 199.325 | -Inf | -28889.256 | `***` |

Signif. codes: `0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1`
* `se` is the standard error (i.e., the standard deviation of the sampling distribution of `diff`)
* `t.value` is the _t_ statistic associated with `diff` (i.e., `diff` / `se`) that we can compare to a t-distribution
* `df` is the degrees of freedom associated with the statistical test. Note that the Welch approximation is used for the degrees of freedom
* `0% 95%` show the 95% confidence interval around the difference in sample means. These numbers provide a range within which the true population difference is likely to fall. Because the test is one-sided, the lower bound is `-Inf`
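
As a rough check on these definitions, the base R sketch below recomputes `se`, `t.value`, and `df` for the assistant versus associate professor comparison from the summary statistics in the output. The variable names are illustrative and not part of Radiant, and the formulas assume the standard Welch (unequal variance) t-test, consistent with the note on the Welch approximation.

```r
# Summary statistics copied from the output above
m1 <- 80775.985; s1 <- 8174.113;  n1 <- 67  # AsstProf
m2 <- 93876.438; s2 <- 13831.700; n2 <- 64  # AssocProf

# Difference in sample means (the `diff` column)
mean_diff <- m1 - m2

# Variance of each sample mean; the variances are not pooled
v1 <- s1^2 / n1
v2 <- s2^2 / n2

# Standard error of the difference and the t statistic
se <- sqrt(v1 + v2)
t_value <- mean_diff / se

# Welch-Satterthwaite approximation for the degrees of freedom
df_welch <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

round(c(diff = mean_diff, se = se, t.value = t_value, df = df_welch), 3)
```

Up to rounding, the results should agree with the first row of the comparison table.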
### Testing
There are three approaches we can use to evaluate the null hypothesis. We will choose a significance level of 0.05. Of course, each approach will lead to the same conclusion.
#### p.value
Because each of the p.values is **smaller** than the significance level we reject the null hypothesis for each evaluated pair of professor ranks. The data suggest that associate professors make more than assistant professors and professors make more than assistant and associate professors. Note also the '***' that are used as an indicator for significance.
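
The p.values can be approximated in base R from the reported `t.value` and `df` columns. The sketch below is illustrative only: the inputs are copied from the output, the variable names are assumptions, and the lower tail of the t-distribution is used because the alternative hypothesis is `less than`.

```r
# t statistics and Welch degrees of freedom copied from the output
comparisons <- c("AsstProf < AssocProf", "AsstProf < Prof", "AssocProf < Prof")
t_values <- c(-6.561, -23.334, -13.569)
df_welch <- c(101.286, 324.340, 199.325)

# Significance level chosen in the text
alpha <- 0.05

# One-sided (lower-tail) p-values: P(T <= t.value)
p_values <- pt(t_values, df = df_welch)

# Show each comparison with its p-value and the test decision
data.frame(
  comparison = comparisons,
  p.value = format.pval(p_values, eps = 0.001),
  reject_null = p_values < alpha
)
```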
#### confidence interval
Because zero is **not** contained in any of the confidence intervals we reject the null hypothesis for each evaluated combination of ranks. Because our alternative hypothesis is `Less than`, the confidence interval is actually an upper bound for the difference in salaries in the population at a 95% confidence level (i.e., -9785.958, -42744.474, and -28889.256).
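
A one-sided upper bound of this kind can be sketched in base R as `diff` plus the 0.95 quantile of the t-distribution times `se`. The inputs below are copied from the output and the variable names are illustrative; this is an assumption about how the bound is formed, not Radiant's own code.

```r
# Values copied from the `diff`, `se`, and `df` columns of the output
mean_diff <- c(-13100.45, -45996.12, -32895.67)
se <- c(1996.639, 1971.217, 2424.407)
df_welch <- c(101.286, 324.340, 199.325)

# Confidence level used in the output
conf_lev <- 0.95

# Upper bound of the one-sided interval (-Inf, upper]
upper <- mean_diff + qt(conf_lev, df = df_welch) * se
round(upper, 1)

# Zero lies above every upper bound, so each null hypothesis is rejected
all(upper < 0)
```

Small differences from the reported bounds are expected because the inputs are rounded.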
#### t.value
Because the calculated t.values (-6.561, -23.334, and -13.569) are **smaller** than the corresponding _critical_ t.value we reject the null hypothesis for each evaluated combination of ranks. We can obtain the critical t.value by using the probability calculator in the _Basics_ menu. Using the test for assistant versus associate professors as an example, we find that for a t-distribution with 101.286 degrees of freedom (see `df`) the critical t.value is -1.66. We choose 0.05 as the lower probability bound because the alternative hypothesis is `Less than`.
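
Outside the probability calculator, the same critical value can be looked up with base R's `qt()`. The sketch below uses the assistant versus associate professor comparison; the variable names are illustrative.

```r
# Significance level, used as the lower probability bound
alpha <- 0.05

# t statistic and degrees of freedom copied from the output
t_value <- -6.561
df_welch <- 101.286

# Critical value: the 0.05 quantile of the t-distribution (negative)
t_crit <- qt(alpha, df = df_welch)
round(t_crit, 2)

# Reject the null hypothesis if the t statistic falls below the critical value
t_value < t_crit
```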


    usethis::use_course("https://www.dropbox.com/sh/0xvhyolgcvox685/AADSppNSIocrJS-BqZXhD1Kna?dl=1")

Compare Means Hypothesis Test
* This video shows how to conduct a compare means hypothesis test
* Topics List:
    - Calculate summary statistics by groups
    - Set up a hypothesis test for compare means in Radiant
    - Use the p.value and confidence interval to evaluate the hypothesis test