GSLIB Help Page: TRANS
-
Description:
-
trans is a generalization of the quantile transformation used
for normal scores: the $p$-quantile of the original distribution is
transformed to the $p$-quantile of the target distribution.
This transform preserves the $p$-quantile indicator variograms
of the original values. The variogram (standardized by the variance)
will also be stable provided that the target distribution is not too
different from the initial distribution.
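
The quantile-to-quantile mapping described above can be sketched in a few
lines of Python. This is an illustration only, not the trans source code:
the function names, the midpoint convention for the cumulative
probabilities, and the use of linear interpolation between data are all
assumptions made for this sketch. Values outside the range of the reference
distribution are simply clamped here; the tail options described under
Parameters are not applied.

```python
import numpy as np

def weighted_cdf(values, weights=None):
    """Sorted values and their cumulative probabilities (midpoint convention, assumed)."""
    values = np.asarray(values, dtype=float)
    if weights is None:
        weights = np.ones_like(values)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values, kind="mergesort")
    v = values[order]
    w = weights[order] / weights.sum()
    # Place each value at the centre of its probability interval.
    cdf = np.cumsum(w) - 0.5 * w
    return v, cdf

def quantile_transform(data, reference, data_wt=None, ref_wt=None):
    """Map each datum to the reference value that has the same p-quantile."""
    data = np.asarray(data, dtype=float)
    d_sorted, d_cdf = weighted_cdf(data, data_wt)
    r_sorted, r_cdf = weighted_cdf(reference, ref_wt)
    # Step 1: cumulative probability p of each datum in the original distribution.
    p = np.interp(data, d_sorted, d_cdf)
    # Step 2: the p-quantile of the target (reference) distribution.
    return np.interp(p, r_cdf, r_sorted)

original = [3.0, 1.0, 4.0, 2.0]
target = [10.0, 20.0, 30.0, 40.0]
transformed = quantile_transform(original, target)
```

Because the mapping is monotonic, the ranks of the original values are
unchanged, which is why the $p$-quantile indicator variograms are preserved.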
-
Parameters:
-
vartype: the variable type (1=continuous, 0=categorical)
-
refdist: the input data file with the target distribution and
weights.
-
ivr and iwt: the column for the values and the column
for the (declustering) weight. If there are no declustering weights
then set iwt= 0.
-
datafl: the input file with the distribution(s) to be
transformed.
-
ivrd and iwtd: the column for the values and the
declustering weights (0 if none).
-
tmin and tmax: all values strictly less than tmin
or strictly greater than tmax are ignored.
-
outfl: output file for the transformed values.
-
nsets: number of realizations or "sets" to transform.
Each set is transformed separately.
-
nx, ny, and nz: size of the 3-D model (for a categorical
variable). When transforming categorical variables, some type of
tie-breaking scheme is essential. A moving window (of the size given
by the next parameters) is used for tie-breaking with a
categorical variable.
-
wx, wy, and wz: size of 3-D window for categorical
variable tie-breaking.
-
nxyz: the number of values to transform at a time (when dealing
with a continuous variable). Recall that nxyz values will be considered
nsets times.
-
zmin and zmax: the minimum and maximum values that
will be used for extrapolation in the tails.
-
ltail and ltpar specify the back transformation
implementation in the lower tail of the distribution:
$ltail=1$ implements linear interpolation to
the lower limit zmin and $ltail=2$ implements power model
interpolation, with w=ltpar, to the lower limit
zmin.
-
utail and utpar specify the back transformation
implementation in the upper tail of the distribution:
$utail=1$ implements linear interpolation to the upper
limit zmax, $utail=2$ implements power model
interpolation, with w=utpar, to the upper limit zmax, and
$utail=4$ implements hyperbolic model extrapolation with
w=utpar.
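
The tail options can be illustrated with the following Python sketch. It is
not taken from the trans source: the function names are hypothetical, and
the exact way ltpar and utpar enter the exponent is not specified on this
page, so the exponent is passed explicitly. The hyperbolic form shown is a
commonly used one that is continuous with the last data value; it is an
assumption here and requires a positive last value and $p < 1$.

```python
def power_interp(p_low, p_high, z_low, z_high, p, exponent=1.0):
    """Power-model interpolation between (p_low, z_low) and (p_high, z_high).

    exponent = 1.0 reduces to linear interpolation.
    """
    if p_high <= p_low:
        return 0.5 * (z_low + z_high)
    frac = (p - p_low) / (p_high - p_low)
    return z_low + (z_high - z_low) * frac ** exponent

def lower_tail(p, p_first, z_first, zmin, exponent=1.0):
    """Value for a probability p below the first cdf value, reaching zmin at p = 0."""
    return power_interp(0.0, p_first, zmin, z_first, p, exponent)

def upper_tail(p, p_last, z_last, zmax, exponent=1.0):
    """Value for a probability p above the last cdf value, reaching zmax at p = 1."""
    return power_interp(p_last, 1.0, z_last, zmax, p, exponent)

def hyperbolic_upper_tail(p, p_last, z_last, w):
    """Hyperbolic extrapolation (assumed form): z**w * (1 - F(z)) is held constant."""
    lam = (z_last ** w) * (1.0 - p_last)
    return (lam / (1.0 - p)) ** (1.0 / w)

z_lin = lower_tail(0.05, p_first=0.1, z_first=2.0, zmin=0.0)                # linear
z_pow = lower_tail(0.05, p_first=0.1, z_first=2.0, zmin=0.0, exponent=2.0)  # power
z_hyp = hyperbolic_upper_tail(0.95, p_last=0.9, z_last=10.0, w=1.0)         # hyperbolic
```

Note that the hyperbolic model is unbounded as $p$ approaches 1, so it does
not use zmax, whereas the linear and power models always end at zmin or zmax.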
-
transcon: constrain transformation to honor local data? (1=yes,
0=no)
-
estvfl: an input file with the estimation variance (must
be of size nxyz).
-
icolev: column number in estvfl for the estimation
variance.
-
omega: the control parameter for how much weight is given to the
original data (w between 0.33 and 3.0)
-
seed: random number seed used when constraining a categorical
variable transformation to local data.
-
Application notes:
-
When "freezing" the original data values, the quantile transform is
applied progressively as the location gets further away from the set of
data locations. The distance measure used is
proportional to a kriging variance at the location of the value being
transformed. That kriging variance is zero at the data locations
(hence no transformation) and increases away from the data (the
transform is increasingly applied). An input kriging variance file
must be provided or, as an option, trans can calculate these
kriging variances using an arbitrary isotropic and smooth (Gaussian)
variogram model.
-
Because not all original values are transformed, reproduction of the
target histogram is only approximate. A control parameter,
w in [0,1], allows the desired degree of approximation to be
achieved at the cost of generating discontinuities around the data
locations. The greater w is, the smaller the discontinuities.
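
One way to picture the progressive application of the transform is the
following Python sketch. It is an assumption made for illustration, not the
rule coded in trans: the kriging variance is rescaled to [0,1] by its
maximum, raised to the power w, and used as a blending factor between the
original and the fully transformed value. The function name and argument
names are hypothetical.

```python
import numpy as np

def constrain_to_data(original, transformed, est_var, omega=1.0):
    """Blend original and transformed values using the kriging variance.

    factor = 0 at data locations (kriging variance zero, value frozen);
    factor = 1 where the kriging variance is largest (full transform).
    The power-law form of the factor is an assumption of this sketch.
    """
    original = np.asarray(original, dtype=float)
    transformed = np.asarray(transformed, dtype=float)
    est_var = np.asarray(est_var, dtype=float)
    ev_max = est_var.max()
    if ev_max <= 0.0:
        # No variance information: leave every value frozen.
        factor = np.zeros_like(est_var)
    else:
        factor = (est_var / ev_max) ** omega
    return original + factor * (transformed - original)

orig_vals = [1.0, 2.0, 3.0]
full_vals = [10.0, 20.0, 30.0]
krig_var = [0.0, 0.5, 1.0]          # first location is a datum
blended = constrain_to_data(orig_vals, full_vals, krig_var, omega=1.0)
```

Under this assumed rule, a larger w makes the factor grow more slowly away
from the data, so the transition around each datum is smoother while fewer
values receive the full transform.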
-
Program trans can be applied to either continuous or categorical
values. In the case of categorical values a hierarchy or spatial
sequencing of the $K$ categories is provided implicitly through the
integer coding $k=1,\ldots,K$ of these categories. Category $k$ may
be transformed into category $(k-1)$ or $(k+1)$ and only rarely into
categories further away.
-
An interesting side application of program trans is in cleaning
noisy simulated images. Two successive runs (a
"roundtrip") of trans, the first changing the original
proportions or distribution, the second restoring these original
proportions, would clean the original image while preserving data
exactitude.