Nonparametric Predictive Inference
This page is under construction - in due course, it will contain an introduction to
this topic, links to published material, preprints (a few already here!), and
comments on work in progress. There will also be information on possible topics for PhD study.
Introduction
A natural starting point for statistical inference is often the assumption of exchangeability
of random quantities. To put it simply for real-valued random quantities: if one has n exchangeable
random quantities, they are all equally likely to be the smallest, second smallest, etc. So, for one
such a random quantity, its rank among all these random quantities is uniformly
distributed over the values 1 to n (assuming no ties for simplicity). Bruce Hill (1968; JASA 63, 677-691)
introduced an assumption, called A(n), where this exchangeability property is actually used directly for prediction
of one (or more) future values, on the basis of n observations. He later discussed this in much more
detail, and summarized his findings as: `Let me conclude by observing that A(n) is supported by
all of the serious approaches to statistical inference. It is Bayesian, fiducial, and even a
confidence/tolerance procedure. It is simple, coherent, and plausible. It can even be argued, I
believe, that A(n) constitutes the fundamental solution to the problem of induction.'
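
The exchangeability property behind A(n) can be checked empirically. The following is only a minimal simulation sketch, not part of Hill's work: it assumes independent uniform draws (which are exchangeable and almost surely free of ties), and the function name `interval_frequencies` and its parameters are hypothetical, chosen for illustration.

```python
import random
from bisect import bisect_left


def interval_frequencies(n, trials, seed=0):
    """Estimate how often observation n+1 falls in each of the n+1
    intervals created by the first n observations.

    Assumption: observations are i.i.d. uniform on (0, 1), hence
    exchangeable, and ties occur with probability zero.
    """
    rng = random.Random(seed)
    counts = [0] * (n + 1)
    for _ in range(trials):
        past = sorted(rng.random() for _ in range(n))
        future = rng.random()
        # bisect_left gives the number of past values below the future one,
        # i.e. the index (0..n) of the interval the future value falls in.
        counts[bisect_left(past, future)] += 1
    return [c / trials for c in counts]


freqs = interval_frequencies(n=4, trials=100_000)
print(freqs)  # each of the 5 entries should be close to 1/5
```

Any other continuous distribution could be substituted for the uniform draws; only the ordering of the values matters.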
Nevertheless, A(n) has not received much attention in the statistical literature. A plausible
reason is that it only assigns equal probability, 1/(n+1), to the next observation falling in each
of the n+1 intervals created by the previous n observations, so very few inferences can be
based on this without additional assumptions. This is indeed so if one uses
only precise probabilities, where every event A of interest is assumed to occur with a single-valued
probability P(A). However, there is no need to quantify uncertainty via only a single number, and
indeed there are many arguments for using an interval-valued probability [L(A),U(A)] instead, for
which a variety of interpretations are available, all generalizing possible interpretations of P(A).
A convenient interpretation of these lower and upper probabilities, L(A) and U(A), is as the optimal
bounds for P(A) that can be deduced from the available information. Such interval-valued probabilities
have been around since the middle of the 19th century, and have received increasing attention since
the early 90s, under different names including `imprecise probability' (Walley) and
`interval probability' (Weichselberger). Clearly, within such a concept of interval-valued probability, one
can base statistical inference on Hill's A(n) alone, without requiring further assumptions.
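
To make the idea of optimal bounds concrete, here is a small sketch of how L(A) and U(A) follow from A(n) for an event of the form "the next observation lies in the open interval (a, b)". It assumes no ties among the observations. L(A) counts only the intervals between consecutive observations that lie entirely inside (a, b), and U(A) counts every such interval that overlaps (a, b). The function name `npi_interval_bounds` is hypothetical.

```python
import math
from fractions import Fraction


def npi_interval_bounds(data, a, b):
    """Lower and upper probability, based on A(n) alone, that the next
    observation lies in the open interval (a, b).

    Assumption: the values in `data` are distinct (no ties).
    """
    if not a < b:
        raise ValueError("need a < b")
    n = len(data)
    # Order the data and add -inf / +inf as the outer end points.
    xs = [-math.inf] + sorted(data) + [math.inf]
    inside = 0    # intervals (lo, hi) fully contained in (a, b)
    touching = 0  # intervals (lo, hi) that overlap (a, b)
    for lo, hi in zip(xs, xs[1:]):
        if a <= lo and hi <= b:
            inside += 1
        if lo < b and hi > a:
            touching += 1
    # Each of the n+1 intervals carries probability 1/(n+1) under A(n).
    return Fraction(inside, n + 1), Fraction(touching, n + 1)


lower, upper = npi_interval_bounds([2.0, 5.0, 7.0, 11.0], 4.0, 8.0)
print(lower, upper)  # 1/5 3/5
```

In this example only the interval (5, 7) lies wholly inside (4, 8), while (2, 5), (5, 7) and (7, 11) all overlap it, which gives the bounds 1/5 and 3/5.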
Inspired by the need to develop statistical methods that rely on few (modelling) assumptions, we have
been developing A(n)-based inferences, mostly using interval probability, since the mid 90s, in collaboration
with a number of colleagues and students, both at Durham and further afield. We have worked on general
statistical inferences, on problems in reliability, and on problems in operational research, the latter
leading to OR policies which are explicitly adaptive to available data, thereby dropping the often-made
assumption of fully known probability distributions. As such inferential methods are both nonparametric
and predictive, that is, directly in terms of one or more future observables, we like to refer to this
approach as `NONPARAMETRIC PREDICTIVE INFERENCE'. Below we summarize our work in each of these three
areas; of course, there is overlap between them. One exciting aspect of this approach is that the amount
of information available in the data is directly related to the difference between corresponding upper
and lower probabilities, providing a whole new dimension to uncertainty quantification when compared to
statistical methods which use only precise probabilities, such as standard Bayesian and frequentist methods
including most commonly used nonparametric methods.
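
The link between the amount of data and the gap between upper and lower probability can be illustrated for the simple event "the next observation exceeds t". The sketch below assumes no ties and that t does not coincide with an observed value. Under A(n) the bounds are then k/(n+1) and (k+1)/(n+1), where k is the number of observations above t, so the gap is 1/(n+1). The function name `npi_exceedance_bounds` and the data values are made up for illustration.

```python
from fractions import Fraction


def npi_exceedance_bounds(data, t):
    """Lower and upper probability, based on A(n) alone, that the next
    observation exceeds t.

    Assumptions: distinct observations, and t is not an observed value.
    """
    n = len(data)
    k = sum(1 for x in data if x > t)
    # k intervals lie wholly above t; one further interval contains t.
    return Fraction(k, n + 1), Fraction(k + 1, n + 1)


small = [3, 8, 12, 20]                      # n = 4 (illustrative values)
large = [1, 3, 5, 8, 12, 15, 20, 22, 30]    # n = 9 (illustrative values)

for sample in (small, large):
    lo, up = npi_exceedance_bounds(sample, 10)
    print(len(sample), lo, up, up - lo)
```

With four observations the gap is 1/5, and with nine it narrows to 1/10, reflecting the additional information in the larger sample.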
NPI: Statistics
NPI: Reliability
NPI: Operational Research
NPI: Recent Research
Several papers already published can be found here. (Not all these papers are on NPI: there will soon be descriptions on this page.)
Below are a few recent papers, mostly as preprints:
- Nonparametric predictive comparison of proportions: pdf version.
This is a preprint version of a paper, jointly with Pauline Coolen-Schrijner, that is currently
in submission to Journal of Statistical Planning and Inference (invited revised version).
- Learning from multinomial data: a nonparametric predictive alternative to the Imprecise Dirichlet Model:
pdf version. This paper, jointly with Thomas Augustin (Munich), has appeared in:
ISIPTA'05: Proceedings of the Fourth International Symposium on Imprecise Probabilities and Their Applications, F.G. Cozman, R. Nau and T. Seidenfeld (Eds), published by SIPTA, pp. 125-134.
- Nonparametric adaptive opportunity-based age replacement strategies: pdf version.
This is a preprint version of a paper, jointly with Pauline Coolen-Schrijner and Simon Shaw (Bath), that is to appear in
Journal of the Operational Research Society, probably early 2006.
- On nonparametric predictive inference and objective Bayesianism: pdf version.
This is a preprint version of a paper to appear in a special issue of Journal of Logic, Language and Information, containing papers
presented at the Progic2005 workshop.
Topics for PhD study
We invite strong(ly motivated) candidates for postgraduate study (PhD and MSc by research) to
contact us about opportunities to study topics in NPI at Durham, under our supervision.
Examples of interesting topics are available in the following areas (more details will follow);
of course, further suggestions are most welcome!
- Statistics: classification; bootstrapping; quality control; regression; multi-dimensional data
- Reliability: applications of the Coolen-Yan method; NPI alternatives to the Proportional Hazards model; competing risks
- Operational Research: NPI and stochastic processes; further applications to queueing problems; inventory models
- Combinatorics: NPI lower and upper probabilities for multiple future observations (discrete random quantities)
- Computational: we would like to establish a library of NPI algorithms in R, to make the method widely available
Frank Coolen
Pauline Coolen-Schrijner
Last revision: 20/10/05