Description
The p-value is one of the most widely used (and perhaps most misused) instruments in Statistics. For a statistical test, it gives the probability of `obtaining
a result at least as extreme as the one observed, if the null hypothesis is true'. A statistical effect (for instance, the relevance of a regression parameter)
is usually deemed `significant' if its p-value falls below a given significance level, such as 0.05.
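To make this definition concrete, the following minimal Python sketch estimates a one-sided p-value by simulation; the data values and the choice of a z-test are assumptions made purely for illustration and are not part of the project description.

```python
import numpy as np

rng = np.random.default_rng(1)

# Null hypothesis: the data come from a standard normal (mean 0, variance 1).
x = np.array([0.4, 1.1, -0.2, 0.9, 0.7])   # made-up data
t_obs = np.sqrt(len(x)) * x.mean()          # z-statistic under H0

# p-value: probability, under H0, of a statistic at least as extreme
# as the one observed (one-sided), estimated by simulation.
sims = rng.standard_normal((100_000, len(x)))
t_null = np.sqrt(len(x)) * sims.mean(axis=1)
p_value = (t_null >= t_obs).mean()
print(f"t_obs = {t_obs:.3f}, simulated p-value = {p_value:.4f}")
```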
In order to publish the outcome of applied studies, many authors feel, rightly or wrongly, pressure to produce `significant' results; in other words,
studies are perceived as unpublishable if they do not demonstrate effects with sufficiently small p-values. This practice has several problematic consequences, two examples
of which are given below:
- Assume that a researcher has data available which seem interesting and appear to exhibit some sort of pattern, but it is not clear what the pattern is or with which variables the observed
data actually correlate. They therefore assume a null hypothesis of ``no pattern'' and keep trying different alternative hypotheses (correlation with different variables, for instance)
until they find one that yields a significant result. They do not report the many unsuccessful attempts, and only publish the final,
significant result.
- Assume that 20 research teams around the world work on the same research problem and pursue the same promising idea in order to solve it. Let us further assume that this
idea does, in fact, not work. All 20 teams carry out a statistical test at the 5% level in order to demonstrate the feasibility of their idea. Since the idea does not work, most
teams will fail to reject the null hypothesis (that there is no effect). However, by construction of statistical tests, we can expect rejection of the null
hypothesis in 5% of all cases even when it is true. That is, we can expect about one of the 20 teams to obtain a significant result (a short numerical sketch follows this list). This team is likely to publish their work (with a wrong result),
while the others (with the correct result) are likely not to publish, leading to an instance of publication bias.
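The arithmetic behind the second example can be checked directly: under a true null each team rejects with probability 0.05, so the expected number of false rejections among 20 independent teams is 20 x 0.05 = 1, and the chance that at least one team finds a `significant' result is 1 - 0.95^20, roughly 0.64. A minimal simulation sketch (the team count, seed, and number of replications are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_teams, alpha, n_worlds = 20, 0.05, 100_000

# Under a true null, each team's p-value is uniform on [0, 1],
# so each team rejects (p < alpha) with probability alpha.
p_values = rng.uniform(size=(n_worlds, n_teams))
rejections = (p_values < alpha).sum(axis=1)

print("expected false rejections per replication:", rejections.mean())    # ~1.0
print("P(at least one team 'significant')       :", (rejections > 0).mean())  # ~0.64
print("theory: 1 - 0.95**20 =", 1 - 0.95**20)
```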
There are many questions arising from the two bullet points above, which can form the basis for the work carried out within this project: How can one adjust for the many unsuccessful
attempts when reporting p-values in the first case? How can one detect and avoid publication bias in the second?
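As a pointer towards the first question, one classical answer is a multiple-testing correction such as the Bonferroni adjustment, which multiplies each p-value by the number of attempts actually made. The sketch below is illustrative only: the p-values are invented, and Bonferroni is just one of several standard corrections (Holm, Benjamini-Hochberg, ...) that the project could study.

```python
import numpy as np

def bonferroni(p_values):
    """Bonferroni correction: multiply each p-value by the number
    of tests performed, capping the result at 1."""
    p = np.asarray(p_values, dtype=float)
    return np.minimum(p * p.size, 1.0)

# Hypothetical p-values from 10 attempts, only the last one 'significant':
p_raw = [0.84, 0.31, 0.52, 0.77, 0.09, 0.66, 0.45, 0.23, 0.71, 0.04]
p_adj = bonferroni(p_raw)
print(p_adj)                 # the smallest raw p-value, 0.04, becomes 0.40
print(p_adj.min() < 0.05)    # False: no longer significant after adjustment
```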
Alternative directions in which such a project could develop include epistemic questions (what do p-values actually mean under different inferential approaches?) or more practical considerations
(the computation of confidence intervals from p-values, for instance).
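On the last practical point, a standard back-calculation (in the spirit of Altman and Bland's BMJ statistics notes) recovers an approximate confidence interval from a two-sided p-value and a point estimate, assuming the estimator is approximately normal; the numbers below are invented for illustration.

```python
from scipy.stats import norm

def ci_from_p(estimate, p_two_sided, level=0.95):
    """Approximate CI from a two-sided p-value for H0: effect = 0,
    assuming an approximately normal estimator."""
    z = norm.isf(p_two_sided / 2)       # |z| corresponding to the p-value
    se = abs(estimate) / z              # implied standard error
    z_ci = norm.isf((1 - level) / 2)    # e.g. 1.96 for a 95% interval
    return estimate - z_ci * se, estimate + z_ci * se

# Invented example: estimated effect 1.2 with a reported p-value of 0.03.
low, high = ci_from_p(1.2, 0.03)
print(f"approximate 95% CI: ({low:.2f}, {high:.2f})")
```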
There is the possibility of using data available through a collaboration with the Earth Sciences Department, particularly with a view to questions of the type raised in the first bullet point.
A further field of application in which such problems are being discussed with increasing intensity is Bioinformatics.
Prerequisites
Statistical Concepts II
Resources