The data operators described in this section all work independently of the current data selection. Thus, each data operator takes into account the maximum possible number of data cases.
Individual data cases can be accessed in equations by giving the name of the data carrier followed by the case number in parentheses; this is the same format used when the data are defined. For example, we could have
The argument supplied in parentheses for the observation number may be any valid equation; its value is rounded, and the result should be a positive integer. A simple example is as follows.
It is an error if the "observation number" thus supplied does not correspond to an observation. The other functions described in this section generally deal with all of the data defined over particular data carriers, rather than with individual cases.
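The lookup rule above can be sketched in Python. This is not the program's own syntax: the dictionary model of a data carrier and the helper name `case_value` are assumptions made purely for illustration, and Python's built-in `round` stands in for whatever rounding rule the program applies.

```python
# Hypothetical model: a data carrier maps case numbers to observed values.
temperature = {1: 14.2, 2: 15.1, 4: 13.8}  # no observation for case 3


def case_value(carrier, index_value):
    """Return the observation whose case number is the rounded index."""
    case = round(index_value)  # assumed rounding rule
    if case < 1:
        raise ValueError("case number must be a positive integer")
    if case not in carrier:
        raise KeyError(f"no observation for case {case}")
    return carrier[case]


# The index may come from any expression; 1.6 rounds to case 2.
second = case_value(temperature, 0.6 + 1.0)
```

Asking for case 3 in this sketch raises an error, mirroring the rule that the observation number must correspond to an observation.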
Usage
The centile operator gives the Ith percentile of the data values
on D. For example, if we order the data values as
$x_{(1)} \le x_{(2)} \le \dots \le x_{(n)}$, then the Ith percentile
is the smallest value such that at least I% of the values are
less than or equal to this value, and at least (100-I)% of the
values are greater than or equal to this value. The lower of two
candidates is returned. Note that placing I=0 gives the minimum,
and I=100 gives the maximum. All possible data cases defined for
the data carrier D are taken into account.
An example of the use of the command is as follows. Suppose that
Temperature is the name of a data carrier, and that we require the
median. (We write the 50 as 25+25 purely to illustrate that the
argument may be an equation.)
BD>print : (centile (temperature,25+25))
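The percentile rule described in this section can be checked with a short Python sketch. It is an illustration of the stated definition, not the program's implementation; the function name `centile` is reused here only for readability, and the data values are made up.

```python
def centile(values, i):
    """Smallest value v with at least i% of values <= v
    and at least (100 - i)% of values >= v."""
    ordered = sorted(values)
    n = len(ordered)
    for v in ordered:  # ascending, so the lower candidate is found first
        at_or_below = sum(1 for x in ordered if x <= v)
        at_or_above = sum(1 for x in ordered if x >= v)
        if at_or_below * 100 >= i * n and at_or_above * 100 >= (100 - i) * n:
            return v
    raise ValueError("no data values")


readings = [3, 1, 4, 2]          # made-up data
median = centile(readings, 25 + 25)
```

With four values, both 2 and 3 satisfy the two conditions at I=50; scanning in ascending order returns the lower candidate.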
This returns the sample correlation between two data carriers. All
possible data cases defined for both data carriers are taken into
account.
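As a rough illustration of what "cases defined for both data carriers" means for the correlation, the Python sketch below keeps only the case numbers present in both carriers. The dictionary model and the function name are hypothetical, and it assumes the means are taken over the shared cases only, which the text does not state.

```python
import math


def correlation(first, second):
    """Sample correlation over the cases present in both carriers."""
    shared = sorted(first.keys() & second.keys())
    n = len(shared)
    mean_1 = sum(first[c] for c in shared) / n
    mean_2 = sum(second[c] for c in shared) / n
    s12 = sum((first[c] - mean_1) * (second[c] - mean_2) for c in shared)
    s11 = sum((first[c] - mean_1) ** 2 for c in shared)
    s22 = sum((second[c] - mean_2) ** 2 for c in shared)
    return s12 / math.sqrt(s11 * s22)


# Made-up data: case 5 exists only on a, case 4 only on b.
a = {1: 1.0, 2: 2.0, 3: 3.0, 5: 9.0}
b = {1: 2.0, 2: 4.0, 3: 6.0, 4: 0.0}
r = correlation(a, b)  # uses cases 1, 2 and 3 only
```

The normalising factor cancels between numerator and denominator, so the result does not depend on which divisor the covariance uses.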
This returns the sample covariance between two data carriers, computed
as a normalising factor times the sum of cross-product deviations. All
possible data cases defined for both data carriers are taken into
account.
This returns the largest case number for which there is data on the
data carrier D. All possible data cases defined
for the data carrier D are taken into account.
This returns the number of cases for which there are observations on
both of two data carriers. All possible data cases defined for both
data carriers are taken into account. The match operator is the
bivariate analogue of the number operator.
This returns the arithmetic average of the observations defined for D.
All possible data cases defined for the data carrier D are taken into
account. If the total number of cases is zero, or if D has not been
defined, then the operator returns a value of zero and an error is
reported.
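The zero-case behaviour of the average can be sketched in Python as below. The dictionary model, the use of `None` for an undefined carrier, and the wording of the error message are all assumptions for illustration.

```python
import sys


def mean(carrier):
    """Arithmetic average of all observations defined on the carrier.

    Returns 0.0 and reports an error when there is nothing to average.
    """
    if not carrier:  # covers both None (undefined) and an empty carrier
        print("error: no data cases defined", file=sys.stderr)
        return 0.0
    return sum(carrier.values()) / len(carrier)


average = mean({1: 2.0, 2: 4.0, 5: 6.0})  # made-up data
empty_result = mean({})                   # reports an error, yields 0.0
```

Because zero is also a legitimate average, a caller in this sketch would need to check the reported error rather than the returned value.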
This returns the total number of observations for the data carrier D. All possible data cases defined
for the data carrier D are taken into account. The number
operator is the univariate analogue of the match operator.
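The difference between the counting operators in this section is easiest to see when case numbers have gaps. The Python sketch below uses a hypothetical dictionary model and made-up data; the function names other than `number` and `match` are illustrative only.

```python
# Made-up carriers with gaps in their case numbers.
temperature = {1: 14.2, 2: 15.1, 4: 13.8}
pressure = {2: 1012.0, 3: 1009.5, 4: 1011.2}


def number(carrier):
    """Total number of observations on one carrier."""
    return len(carrier)


def largest_case(carrier):
    """Largest case number for which the carrier has data."""
    return max(carrier)


def match(first, second):
    """Number of cases observed on both carriers."""
    return len(first.keys() & second.keys())
```

Here `temperature` has three observations but its largest case number is 4, and only cases 2 and 4 are observed on both carriers.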
This returns the sample variance of D, computed as a normalising factor
times the sum of squared deviations. All possible data cases defined for
the data carrier D are taken into account.