If we abandon the assumption of equal variances, then our first sample is i.i.d. \({\mathcal{N}\left({\mu_X, \sigma_X^2}\right)}\) and the second sample i.i.d. \({\mathcal{N}\left({\mu_Y, \sigma_Y^2}\right)}\), with \(\sigma_X\) not necessarily equal to \(\sigma_Y\). Clearly, there is no single variance parameter to estimate, and the test which uses the pooled sample variance would not be appropriate. We won't cover the detail of the theory for this in lectures (nor is it examinable), but the idea is straightforward and easy to apply. Without the equal variance assumption, our test statistic becomes \[ t = \frac{(\bar{x}-\bar{y}) - (\mu_X - \mu_Y)}{\sqrt{\frac{s_X^2}{n} + \frac{s_Y^2}{m}}}, \] which still has a \(t\) distribution. The problem arises in determining the appropriate degrees of freedom for the distribution of \(t\). The degrees of freedom of the \(t\) distribution is no longer \(n+m-2\), but instead is approximated by this hideous expression (the Welch–Satterthwaite approximation) \[ \nu \approx \frac{\left(\frac{s^2_X}{n}+\frac{s^2_Y}{m}\right)^2}{\frac{s^4_X}{n^2(n-1)} + \frac{s^4_Y}{m^2(m-1)}}, \] though often in practice we take the lazier route of using \[\nu=\min(n,m)-1\] as the degrees of freedom (this simpler case corresponds to a conservative version of the test).
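Both the statistic and the approximate degrees of freedom are easy to compute directly. Here is a minimal sketch in R, assuming \(\mu_X - \mu_Y = 0\) under the null hypothesis; the samples x and y are made-up illustrations:

```r
# Illustrative samples (not from the lectures)
x <- c(5.1, 4.9, 6.2, 5.8, 5.5)
y <- c(4.2, 4.8, 5.0, 4.4, 4.6, 4.9)
n <- length(x); m <- length(y)

se2 <- var(x)/n + var(y)/m                        # s_X^2/n + s_Y^2/m
t_stat <- (mean(x) - mean(y)) / sqrt(se2)         # test statistic under H0: mu_X = mu_Y
nu <- se2^2 / (var(x)^2/(n^2*(n-1)) + var(y)^2/(m^2*(m-1)))  # Welch-Satterthwaite df
p_value <- 2 * pt(-abs(t_stat), df = nu)          # two-sided p-value
```

Note that nu is generally not an integer; the \(t\) distribution functions in R accept non-integer degrees of freedom, so this is not a problem.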
Write your own function that performs a two-sample \(t\)-test, mimicking R's built-in t.test function. Give it an equalvariance argument that can be TRUE or FALSE and will adjust the test performed. You will need to use an if statement to handle the different cases. Use the print and cat functions to get R to print output in the console. Use this to show the test statistic, degrees of freedom, and p-value.
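One possible shape for such a function is sketched below; the name my.t.test and the internal details are illustrative choices, not a model answer:

```r
# Hypothetical sketch of a two-sample t-test function.
my.t.test <- function(x, y, equalvariance = TRUE) {
  n <- length(x); m <- length(y)
  if (equalvariance) {
    # Pooled-variance test with n + m - 2 degrees of freedom
    sp2 <- ((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2)
    t_stat <- (mean(x) - mean(y)) / sqrt(sp2 * (1/n + 1/m))
    nu <- n + m - 2
  } else {
    # Unequal-variance (Welch) test with approximate degrees of freedom
    se2 <- var(x)/n + var(y)/m
    t_stat <- (mean(x) - mean(y)) / sqrt(se2)
    nu <- se2^2 / (var(x)^2/(n^2*(n-1)) + var(y)^2/(m^2*(m-1)))
  }
  cat("t statistic:       ", t_stat, "\n")
  cat("degrees of freedom:", nu, "\n")
  cat("p-value:           ", 2 * pt(-abs(t_stat), df = nu), "\n")
}
```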
Consider the immer data set from the MASS package, which contains pairs of measurements of the barley yield from the same fields in years 1931 (Y1) and 1932 (Y2). Perform a paired \(t\)-test on these data with the t.test function using the argument paired=TRUE.
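For reference, the call looks like this, assuming the MASS package is installed:

```r
library(MASS)   # provides the immer data set

# Paired t-test of the 1931 and 1932 yields
t.test(immer$Y1, immer$Y2, paired = TRUE)

# Equivalent to a one-sample t-test on the differences
t.test(immer$Y1 - immer$Y2)
```

The two calls give identical statistics: a paired test is just a one-sample test applied to the within-pair differences.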
You may find the qsignrank and psignrank functions useful to find critical values and \(p\)-values for the signed rank test statistic.
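These functions work like the other distribution functions in R (qnorm, pnorm, and so on). A quick sketch, for an illustrative sample size of \(n = 10\):

```r
n <- 10                 # illustrative sample size

# Lower critical value at the 5% two-sided level
qsignrank(0.025, n)

# Two-sided p-value for an observed statistic V = 8, where V falls
# below its mean, so we double the lower-tail probability
2 * psignrank(8, n)
```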
The stats package

While we could write our own code every time we want to do a \(t\)-test or rank-sum test, this gets rather tedious rather quickly. Thankfully, these tests are supported by the stats package in R, which allows us to pass the problem of computing the test to a pre-defined function; we can then simply interpret the results.
Use the library function to load the R package stats. Look up the t.test function and apply it to your two samples A and B. Use the optional argument var.equal=TRUE to perform an equal-variance test, and var.equal=FALSE to test without this assumption. Compare with your results from Section 2. Then look up the wilcox.test function and try it on A and B. The optional argument exact=TRUE will compute the test exactly, whereas exact=FALSE will use a Normal approximation. Do the results agree with your calculations?
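The calls can be sketched as follows; the samples A and B here are placeholders, so substitute your own data from the earlier sections:

```r
library(stats)   # usually loaded by default in an R session

# Placeholder samples; replace with your own A and B
A <- c(12.1, 11.4, 13.0, 12.7, 11.9)
B <- c(10.8, 11.2, 12.0, 10.5, 11.6, 11.0)

t.test(A, B, var.equal = TRUE)    # pooled-variance (equal-variance) test
t.test(A, B, var.equal = FALSE)   # Welch test (this is the default)

wilcox.test(A, B, exact = TRUE)   # exact rank-sum test
wilcox.test(A, B, exact = FALSE)  # Normal approximation
```

In each case the output reports the test statistic, the degrees of freedom where relevant, and the p-value, so the results can be compared directly with your hand computations.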