ABIVARIATE procedure
Produces graphs and statistics for bivariate analysis of variance
(R.F.A. Poultney)
Options
APRINT = strings  Controls output from the (univariate) ANOVAs of Y1 and Y2 (usual ANOVA print options); default aovt
TREATMENTSTRUCTURE = formula  Treatment terms to be fitted in the analysis of variance; this option must be set
BLOCKSTRUCTURE = formula  Block model defining the error terms in the analysis of variance; if unset, the design is assumed to be unstratified (i.e. to have a single error term)
TERM = formula  Single model term identifying the treatment term whose means are to be plotted
STRATUM = formula  Stratum from which to extract treatment information; default is to take the bottom stratum
FACTORIAL = scalar  Limit on number of factors in a treatment term; default 3
PROBABILITY = scalar  Significance level to use in the calculation of the radius of the confidence region and the region of non-significance; default 0.95
GRAPHICS = string  Type of graphical output (lineprinter, highresolution); default high
STYLE = string  Controls the style of axes in a high-resolution graph (xy, none); default xy
LABELS = factor or text  Plotting symbols for the means; default is to take the letters A to Z, then a to z
Parameters
Y1 = variates  First variate for the bivariate analysis
Y2 = variates  Second variate for the bivariate analysis
TITLE = texts  Title for the graph
Description
ABIVARIATE produces a bivariate analysis of variance with a graphical representation of the results, as described by Dear & Mead (1983, 1984). The procedure was developed from a Genstat 4 macro, further information about which is given by Poultney & Riley (1986), and is intended primarily for data from intercropping experiments. The variates to be analysed (specified by parameters Y1 and Y2) are measurements, usually yields, taken on the two crops. The final parameter, TITLE, defines a title for the graph.
The procedure will work for any of the designs that can be analysed by ANOVA, except that there must be no pseudo-factors. Option TREATMENTSTRUCTURE defines the treatment formula for the analysis, and the block formula is defined by the BLOCKSTRUCTURE option. BLOCKSTRUCTURE can be omitted if there is a single error stratum (i.e. the analysis is of a completely randomized design). The FACTORIAL option controls the number of factors in each treatment term, as in the ANOVA directive.
First of all, ABIVARIATE calculates a univariate analysis of variance for each of the variates Y1 and Y2, with output controlled by the APRINT option. The settings are the same as those in the ANOVA directive; by default APRINT=aovtable.
Output from the bivariate analysis of variance, which follows, is controlled by the PRINT option. The setting error generates the error summary statistics from the bivariate analysis: Error Sum of Products, Variances after Adjustment for Covariance, Correlation Coefficient between Y1 and Y2, Radius of Standard Errors, Radius of Confidence Regions, and Radius of Non-Significance Regions. The setting treatment produces the following statistics for each treatment term estimated within the specified error stratum: Treatment Sum of Products, Wilks' Lambda, Bivariate F-Statistic.
The stratum from which the means (and other information) are to be taken is defined by the STRATUM option; if this is omitted, the lowest stratum is used. The significance level to use in the calculation of confidence regions is defined by the PROBABILITY option; by default this is 0.95.
The TERM option specifies a treatment term whose means are to be represented graphically. The means are plotted on axes transformed to allow for the variability in, and the correlation between, the two crop variates. The plotting symbols can be defined as a factor or text using the option LABELS. Otherwise they are taken to be the first n values of the series A to Z, a to z, where n is the number of means to be plotted. The graph can be either line printer or high resolution, the default being high resolution. The external axes of a high-resolution graph can be suppressed by setting STYLE=none.
Problems arise in situations where the table of means to be plotted is incomplete; this can occur when a whole factor level is restricted out, or where the treatment structure is nested within a control. The length of the vector LABELS is calculated as the number of cells in the table, including missing values. If LABELS is declared, it must have length equal to the dimension of the table, otherwise a fault will occur. Similarly, the calculation of the radius statistics is based on the assumption that the table of means is complete and has equal replication. These values, if printed, would be incorrect for a table with missing cells and so are suppressed. They can be calculated by hand as shown by Dear & Mead (1983).
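Options: PRINT, APRINT, TREATMENTSTRUCTURE, BLOCKSTRUCTURE, TERM, STRATUM, FACTORIAL, PROBABILITY, GRAPHICS, STYLE, LABELS.
Parameters: Y1, Y2, TITLE.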
Method
(1) calculate the SSP matrix for all terms in the formula
(2) transform the variables such that the new set are uncorrelated and have unit error variance
(3) calculate new axes based on the maximum and minimum points of the transformed variables
(4) draw the graph of the transformed means with the axes rotated such that they are at the same angle to the vertical
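Step (2) can be illustrated outside Genstat. The Python sketch below is not the procedure's own code. It assumes the error variances and covariance of the two variates are available as v11, v22 and v12 (illustrative names only), and it uses one common choice of transformation: the first variate is rescaled, and the second is adjusted for its regression on the first before rescaling. Other transformations with the same property exist.

```python
import math


def decorrelate(y1, y2, v11, v12, v22):
    """Transform paired values so that the error covariance becomes the identity.

    v11 and v22 are the error variances of y1 and y2, and v12 is their error
    covariance (illustrative names, not Genstat identifiers).
    """
    if v11 <= 0:
        raise ValueError("v11 must be positive")
    # Variance of y2 after adjustment for its covariance with y1.
    v22_adj = v22 - v12 * v12 / v11
    if v22_adj <= 0:
        raise ValueError("error covariance matrix must be positive definite")
    x1 = [a / math.sqrt(v11) for a in y1]
    x2 = [(b - (v12 / v11) * a) / math.sqrt(v22_adj) for a, b in zip(y1, y2)]
    return x1, x2


def transformed_covariance(v11, v12, v22):
    """Error covariance matrix of (x1, x2) implied by decorrelate()."""
    s1 = math.sqrt(v11)
    s2 = math.sqrt(v22 - v12 * v12 / v11)
    # Rows of the linear map T, where x = T y.
    t = [[1 / s1, 0.0], [-(v12 / v11) / s2, 1 / s2]]
    v = [[v11, v12], [v12, v22]]
    tv = [[sum(t[i][k] * v[k][j] for k in range(2)) for j in range(2)]
          for i in range(2)]
    # T V T' should be the 2 x 2 identity matrix.
    return [[sum(tv[i][k] * t[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]
```

Calling transformed_covariance with any positive-definite set of values should return the identity matrix to within rounding error, which is the property that step (2) requires.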
Action with RESTRICT
Variates Y1 and Y2 can be restricted; however, the restriction must be identical for the two variates. Some problems may occur when whole levels of factors are restricted out, leaving empty cells in the table of means to be plotted (see above).
References
Dear, K.B.G. & Mead, R. (1983). The use of bivariate analysis techniques for the presentation, analysis and interpretation of data. Statistics in Intercropping Technical Report No.1, Department of Applied Statistics, University of Reading, U.K.
Dear, K.B.G. & Mead, R. (1984). Testing assumptions and other topics in bivariate analysis. Statistics in Intercropping Technical Report No.2, Department of Applied Statistics, University of Reading, U.K.
Poultney, R.F.A. & Riley, J. (1986). A Genstat Macro for the Bivariate Analysis of Intercropping Data. Genstat Newsletter No.17, pp.27-46.
AFALPHA procedure
Generates alpha designs
(R.W. Payne)
Option
Parameters
GENERATOR = matrices  Generating array (of size number-of-plots-per-block by number-of-reps)
LEVELS = scalars or variates  Defines the levels of each treatment factor; if this is omitted, the levels of the TREATMENTS factor are used, if available, otherwise LEVELS is determined from the generating array on the assumption that the blocks are to be of equal size
SEED = scalar  Seed to be used to randomize the design, if required
TREATMENTS = factors  Specifies the treatment factor for each design
REPLICATES = factors  Specifies the replicate factor
BLOCKS = factors  Specifies the block factor
UNITS = factors  Specifies the factor to index the units within each block
Description
Alpha designs are a very flexible class of resolvable incomplete block designs. A resolvable design is one in which each block contains only a selection of the treatments, but the blocks can be grouped together into subsets in which each treatment is replicated once. The groupings of blocks thus form replicates, and the block structure of the design is
Replicates / Blocks / Units
Such designs are particularly useful when there are many treatments to examine and the variability of the units is such that the block size needs to be kept small. Alpha designs were thus devised originally for the analysis of plant breeding trials (Patterson & Williams 1976), where many varieties may need to be evaluated in a single trial, and have the advantage that they can provide effective designs for any number of treatments.
The construction of an alpha design requires a k × r array of integers between 0 and s-1, where r is the number of replicates, and s is the number of blocks per replicate. If the number of treatments, v, is a multiple of the number of blocks per replicate, k will be the number of units in each block, and v will be given by s × k. Otherwise, the design will have some blocks of size k and some of size k-1, and v will lie between s × (k-1) and s × k. Clearly, the properties of the design that is formed will be very dependent on the choice of array. Patterson, Williams & Hunter (1978) present 11 basic arrays to generate designs with up to 100 treatments and 2, 3 or 4 replicates when k is greater than 3 and s is greater than or equal to k; these arrays are reproduced in John (1987). Williams (1975) presents arrays for any sensible values of s and k with up to 100 treatments and 2 to 4 replicates.
Procedure AFALPHA generates the treatment, replicate, block and unit factors for an alpha design. The design can be printed by setting option PRINT=design, and the factors can be saved using the parameters TREATMENTS, REPLICATES, BLOCKS and UNITS. The generating array for the design must be specified as a k × r matrix using the GENERATOR parameter, and the number of levels of the treatment factor can be defined by the LEVELS parameter. If LEVELS is omitted, AFALPHA will see whether the TREATMENTS parameter has been set to a factor whose levels have already been defined; if not, AFALPHA will set LEVELS to the scalar value v = s × k. By default the design is unrandomized, but randomization can be requested by setting the SEED parameter.
Option: PRINT.
Parameters: GENERATOR, LEVELS, SEED, TREATMENTS, REPLICATES, BLOCKS, UNITS.
Method
Each column of the generating array is used to form s-1 further columns by successively adding 1 modulo s. Next, s is added to row 2 of every column, 2s to row 3, and so on. Each resulting column then gives one of the blocks of the design, and the replicates are formed by the sets of columns that were all generated from the same initial column.
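The construction can be followed step by step in a few lines of code. The Python sketch below is an illustration of the method as described, not the AFALPHA source. It numbers the treatments from 0 to s × k - 1, covers only the case of equal block sizes (v = s × k), and does no randomization. The function name and the small generating array used in the usage note are invented for the example and are not taken from the published tables.

```python
def alpha_design(generator, s):
    """Build an unrandomized alpha design from a k x r generating array.

    generator is a list of k rows, each holding r integers in 0..s-1;
    s is the number of blocks per replicate. The result is a list of r
    replicates, each a list of s blocks, each a list of k treatment
    numbers in 0..s*k-1.
    """
    k = len(generator)
    r = len(generator[0])
    design = []
    for rep in range(r):
        # One column of the generating array per replicate.
        column = [generator[row][rep] for row in range(k)]
        blocks = []
        for shift in range(s):
            # Add 1 modulo s repeatedly (shift = 0 is the original column),
            # then add s to row 2, 2s to row 3, and so on.
            block = [(value + shift) % s + row * s
                     for row, value in enumerate(column)]
            blocks.append(block)
        design.append(blocks)
    return design
```

For example, alpha_design([[0, 0], [0, 1], [0, 2]], 3) gives two replicates of three blocks of size three on nine treatments, and each replicate contains every treatment exactly once, as a resolvable design requires.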
References
Patterson, H.D. & Williams, E.R. (1976). A new class of resolvable incomplete block designs. Biometrika, 63, 83-92.
Patterson, H.D., Williams, E.R. & Hunter, E.A. (1978). Block designs for variety trials. J. Agric. Sci., 90, 395-400.
Williams, E.R. (1975). A new class of resolvable block designs. Ph.D. Thesis. Univ. of Edinburgh.
AFCYCLIC procedure
Generates block and treatment factors for cyclic designs
(R.W. Payne)
Option
Parameters
INITIAL = variates or pointers  Defines one (variate) or more (pointer to variates) initial blocks for a treatment factor
INCREMENT = scalars or pointers  Defines the size of the successive increment (scalar) or increments (pointer to scalars) for each initial block
LEVELS = scalars or variates  Defines the levels of each treatment factor; this need not be specified if the factor has already been declared
SEED = scalar  Seed to be used to randomize each design, if required
TREATMENTS = factors  Specifies treatment factors
BLOCKS = factors  Specifies block factors
UNITS = factors  Specifies factors to index the units within each block
Description
The cyclic method is a very powerful way of constructing incomplete block designs. In its simplest form, it starts with an initial block, containing some subset of the treatments. This subset is then represented by the ordinal number in the range 0...m-1 where m is the number of treatment levels. The second and subsequent blocks are then generated by successive addition modulo m of one to the numbers in the subset. Thus, for seven treatments (0...6) and an initial block (0,1,4), the subsequent blocks would contain treatments (1,2,5), (2,3,6), (3,4,0), (4,5,1), (5,6,2) and (6,0,3). As can be seen, if m is a prime number, m blocks are generated with each initial block. However, if m can be expressed as the product of other integers, shorter cycles can occur. For example, for m=8 and initial block (0,1,4,5), four blocks are generated altogether, the others being (1,2,5,6), (2,3,6,7) and (3,4,7,0). The procedure allows for all of this. It is also possible to have more than one initial block, and the increment need not be one.
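The two examples above can be reproduced with a short sketch. The Python code below is an illustration of the cyclic method for a single initial block, not the AFCYCLIC source. It assumes that generation stops as soon as a block would repeat the treatments of one already formed, which is one simple way of allowing for the shorter cycles; the function name is invented for the example.

```python
def cyclic_blocks(initial, m, increment=1):
    """Develop one initial block cyclically modulo m.

    initial holds treatment numbers in 0..m-1. Blocks are formed by
    repeatedly adding the increment modulo m, stopping when the next
    block would contain the same set of treatments as an earlier one.
    """
    blocks = []
    seen = set()
    block = tuple(initial)
    while frozenset(block) not in seen:
        seen.add(frozenset(block))
        blocks.append(block)
        block = tuple((t + increment) % m for t in block)
    return blocks
```

With m=7 and initial block (0,1,4) this produces the seven blocks listed above, and with m=8 and initial block (0,1,4,5) it stops after four blocks.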
The INITIAL parameter specifies the initial blocks. If the design is to be generated from a single initial block, INITIAL should be set to a variate containing the levels corresponding to the treatments concerned; if there are several, the appropriate variates should be placed into a pointer. Similarly the INCREMENT parameter, which specifies the increment to be used, should be set to a scalar if the same increment is to be used for all the initial blocks, otherwise to a pointer of scalars. The levels of the treatment factor are specified by the LEVELS parameter and the SEED parameter allows the design to be randomized. As is customary in Genstat, if LEVELS is set to a scalar the levels are assumed to be represented by the integers 1 upwards, but LEVELS can be set to a variate to specify other numbers. LEVELS can be omitted if the TREATMENTS parameter is used to supply a factor to store the treatments, provided the levels of that factor have already been defined outside the procedure. The factors for blocks and units within blocks can be saved similarly by the BLOCKS and UNITS parameters respectively. The design can also be printed, by setting option PRINT=design.
The properties of the cyclic designs that can be generated for any particular number of treatments or size of block vary according to the choice of initial block and increment. Tables showing the most efficient combinations have been presented, for example, by John, Wolock & David (1972), John (1981, 1987) and Lamacraft & Hall (1982).
Option: PRINT.
Parameters: INITIAL, INCREMENT, LEVELS, SEED, TREATMENTS, BLOCKS, UNITS.
Method
The procedure generates the design using the standard Genstat directives for calculation and manipulation.
References
John, J.A., Wolock, F.W. & David, H.A. (1972). Cyclic Designs. National Bureau of Standards, Applied Mathematics Series 62.
John, J.A. (1981). Efficient cyclic designs. J. Roy. Statist. Soc., B, 43, 76-80.
John, J.A. (1987). Cyclic Designs. Chapman & Hall, London.
Lamacraft, R.R. & Hall, W.B. (1982). Tables of incomplete cyclic block designs: r=k. Austral. J. Statist., 24, 350-360.
AFORMS procedure
Prints data forms for an experimental design
(R.W. Payne)
Options
BLOCKSTRUCTURE = formula  Defines the block factors to be used to label the units of the design; default takes those specified in an earlier BLOCKSTRUCTURE directive
TREATMENTSTRUCTURE = formula  Defines the treatment factors to be used, if any, to label the forms
NLINES = scalar  Number of lines to be allowed for each measurement; default 1
Parameters
LABEL = texts  Labels for the measurements to be recorded on the forms
FIELDWIDTH = scalar  Fieldwidth to be allowed for each label
Description
AFORMS prints data forms which can be used to record data from an experimental design. Several measurements can be recorded, in separate columns across the page, and space is provided for a row of values for each unit of the design. The block factors to label the units can be supplied by setting the BLOCKSTRUCTURE option to the block formula of the design. If this is not set, AFORMS will use the formula, if any, defined previously by the BLOCKSTRUCTURE directive.
The units can also be labelled with the treatments that have been used in the design, by setting the TREATMENTSTRUCTURE option to the appropriate treatment formula. However, to guard against bias, experimenters will often prefer not to know which treatments were applied to each unit when recording the results, so if this option is omitted no treatment information is included.
The LABEL parameter supplies the label to identify each column of measurements, and the FIELDWIDTH parameter can specify the width of the column. By default, a single line is provided for each row of measurements, but this can be increased using the NLINES option.
Options: BLOCKSTRUCTURE, TREATMENTSTRUCTURE, NLINES.
Parameters: LABEL, FIELDWIDTH.
Method
AFORMS uses the standard Genstat directives for printing and manipulation.
Action with RESTRICT
AFORMS needs to use RESTRICT in order to organise the labelling of the forms, and so any existing restrictions will be cancelled.
AFUNITS procedure
Forms a factor to index the units of the final stratum of a design
(R.W. Payne)
Option
BLOCKSTRUCTURE = formula  Defines the block factors for the design; the default is to take those specified by the BLOCKSTRUCTURE directive
Parameter
UNITS = factor  Factor to be formed
Description
When analysing experimental data in Genstat, it is usually unnecessary to specify the final stratum of the design. For example, ANOVA, as explained on page 417 of the Genstat Reference Manual, will set up an internal factor called *Units* to define (along with the other block factors) the final stratum. However, it is then impossible, for example, to put the residuals into a table classified by the block factors, or to tabulate the treatment levels according to the block structure. AFUNITS therefore takes a set of block factors (specified in either a pointer or a model formula by the BLOCKSTRUCTURE option) and sets up the necessary extra factor, which is then returned by the UNITS parameter.
Option: BLOCKSTRUCTURE.
Parameter: UNITS.
Method
The FCLASSIFICATION and FORMULA directives are used, if necessary, to form a list of factors from the block formula, and then the factor values are set up using the standard Genstat facilities for calculations and manipulation.
Action with RESTRICT
None of the block factors may be restricted.
AGALPHA procedure
Forms alpha designs by standard generators for up to 100 treatments
(M.F. Franklin & R.W. Payne)
Option
Parameters
LEVELS = scalars  Number of treatments
NREPLICATES = scalars  Number of replicates
NBLOCKS = scalars  Number of blocks per replicate
SEED = scalars  Seed for randomization; zero implies no randomization
TREATMENTS = factors  Identifier for the treatment factor
REPLICATES = factors  Identifier for the replicate factor
BLOCKS = factors  Identifier for the factor to index the blocks within replicates
UNITS = factors  Identifier for the factor to index the units (or plots) within each block
Description
Alpha designs are a very flexible class of resolvable incomplete block designs. A resolvable design is one in which each block contains only a selection of the treatments, but the blocks can be grouped together into subsets in which each treatment is replicated once. The groupings of blocks thus form replicates, and the block structure of the design is
Replicates / Blocks / Units
Such designs are particularly useful when there are many treatments to examine and the variability of the units is such that the block size needs to be kept small. Alpha designs were thus devised originally for the analysis of plant breeding trials (Patterson & Williams 1976), where many varieties may need to be evaluated in a single trial, and have the advantage that they can provide effective designs for any number of treatments.
The formation of an alpha design requires a generating array, as explained in the description of procedure AFALPHA, and the effectiveness of the design that is produced will be very dependent on the choice of array. Procedure AGALPHA selects an appropriate array from those presented by Patterson, Williams & Hunter (1978) and Williams (1975), and then calls AFALPHA to generate the design.
AGALPHA is easiest to use interactively. It then asks questions to determine the necessary information to select the generating array: for example, the number of treatments, the number of blocks per replicate and so on. The parameters allow you to anticipate questions, or to define all the necessary information if you want to use AGALPHA in batch.
The number of treatments can be defined using the LEVELS parameter. Similarly, the NREPLICATES and NBLOCKS parameters define the number of replicates and the number of blocks per replicate. If the number of blocks per replicate is greater than or equal to the number of units (or plots) per block, generators are available for either two, three or four replicates; otherwise there can only be two. The SEED parameter allows you to specify a seed to be used to randomize the design. In batch the default seed is zero, to suppress randomization. If you do not set SEED when running interactively, AGALPHA will ask for a seed, and again a zero value suppresses any randomization. The remaining parameters, TREATMENTS, REPLICATES, BLOCKS and UNITS, allow you to specify identifiers for the treatment, replicate, block-within-replicate and unit-within-block factors. If these are not specified in a batch run, AGALPHA will use identifiers that are local within the procedure and thus lost at the end of the procedure. If you are running interactively, AGALPHA will ask you to provide identifiers, and these will remain available after AGALPHA has finished running.
AGALPHA has a PRINT option which can be set to design to print the plan of the design. By default, if you are running Genstat in batch, the plan is not printed. If you do not set PRINT when running interactively, AGALPHA will ask whether or not you wish to print the design.
Option: PRINT.
Parameters: LEVELS, NREPLICATES, NBLOCKS, SEED, TREATMENTS, REPLICATES, BLOCKS, UNITS.
Method
The QUESTION directive is used to obtain the necessary details of the design. Procedure AFALPHA is then called to generate the design.
References
Patterson, H.D. & Williams, E.R. (1976). A new class of resolvable incomplete block designs. Biometrika, 63, 83-92.
Patterson, H.D., Williams, E.R. & Hunter, E.A. (1978). Block designs for variety trials. J. Agric. Sci., 90, 395-400.
Williams, E.R. (1975). A new class of resolvable block designs. Ph.D. Thesis. Univ. of Edinburgh.
AGBIB procedure
Generates balanced incomplete block designs
(R.W. Payne)
Options
ANALYSE = string  Controls whether or not to analyse the design, and produce a skeleton analysis-of-variance table using ANOVA (no, yes); default is to ask if this is unset in an interactive run, and not to analyse if it is unset in a batch run
Parameters
LEVELS = scalars  Number of treatments
NBLOCKS = scalars  Number of blocks
NUNITS = scalars  Number of units per block
SEED = scalars  Seed for randomization; zero implies no randomization
TREATMENTS = factors  Identifier for the treatment factor
BLOCKS = factors  Identifier for the factor to index the blocks
UNITS = factors  Identifier for the factor to index the units within each block
Description
Incomplete block designs occur when the units in an experiment need to be divided into blocks that are not large enough to contain a unit for every treatment. In a balanced incomplete block design the contents of the blocks are arranged so that every pair of treatments occurs in an equal number of blocks. All comparisons between treatments are thus made with equal accuracy, so the design is balanced and, in particular, can be analysed by ANOVA.
AGBIB provides a range of balanced incomplete block designs, and is easiest to use interactively. It then asks questions to determine the necessary information to form the design. The options and parameters allow you to anticipate questions, or to define all the necessary information if you want to use AGBIB in batch.
First of all, AGBIB asks you to select the design. It lists those available, specifying the number of treatments, the number of blocks, the size of each block and the number of blocks containing each pair of treatments. Alternatively, if you set the LEVELS parameter to the required number of treatments, the NBLOCKS parameter to the number of blocks and the NUNITS parameter to the number of units per block, AGBIB will select the design (if available) automatically.
The SEED parameter allows you to specify a seed to be used to randomize the design. In batch the default seed is zero, to suppress randomization. If you do not set SEED when running interactively, AGBIB will ask for a seed, and again a zero value suppresses any randomization.
Parameters TREATMENTS, BLOCKS and UNITS allow you to specify identifiers for the treatment, block and unit-within-block factors. If these are not specified in a batch run, AGBIB will use identifiers that are local within the procedure and thus lost at the end of the procedure. If you are running interactively, AGBIB will ask you to provide identifiers, and these will remain available after AGBIB has finished running.
The PRINT option controls printed output, with setting design to print a plan of the design, and catalogue to print a list of the available designs. By default, if you are running Genstat in batch, nothing is printed. If you do not set PRINT when running interactively, AGBIB will ask whether or not you wish to print the design, after it has been generated. Similarly the ANALYSE option governs whether or not AGBIB produces a skeleton analysis-of-variance table (containing just source of variation, degrees of freedom and efficiency factors). Again AGBIB assumes that this is not required if ANALYSE is unset in a batch run, and asks whether it is required if ANALYSE is unset in an interactive run.
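The balance property that defines these designs (every pair of treatments occurring in an equal number of blocks) is easy to check for any proposed set of blocks. The Python sketch below is a checker only; it does not generate designs and has no connection with the way AGBIB forms them. The function names are invented for the example.

```python
from collections import Counter
from itertools import combinations


def pair_concurrences(blocks):
    """Count, for every pair of treatments, the blocks containing both."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts


def is_balanced(blocks, treatments):
    """True if every pair of treatments occurs together equally often.

    A pair that never occurs together counts as zero, so a design with
    missing pairs is reported as unbalanced.
    """
    counts = pair_concurrences(blocks)
    values = {counts[pair] for pair in combinations(sorted(treatments), 2)}
    return len(values) == 1 and 0 not in values
```

For instance, the seven blocks (0,1,3), (1,2,4), (2,3,5), (3,4,6), (4,5,0), (5,6,1) and (6,0,2) on seven treatments contain every pair exactly once, so is_balanced returns True for them. A set of blocks in which some pair of treatments never meets is reported as unbalanced.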
Options: PRINT, ANALYSE.
Parameters: LEVELS, NBLOCKS, NUNITS, SEED, TREATMENTS, BLOCKS, UNITS.
Method
AGBIB generates the designs from Hadamard matrices, as described by Hedayat & Wallis (1978). The QUESTION directive is used to obtain the necessary details of the design. The matrices are then recovered from a backing-store file and the standard Genstat manipulation directives are used to generate the design.
Reference
Hedayat, A. & Wallis, W.D. (1978). Hadamard matrices and their applications. Annals of Statistics 6, 1184-1238.
AGBOXBEHNKEN procedure
Generates Box Behnken designs
(R.W. Payne)
Options
NCENTRALPOINTS = scalar  Defines the number of central points to include; default 4
LEVELS = variate  Defines the outer levels to be used; default !(-1,1)
NCOMBINATIONS = scalar  Number of factors to vary in combination at once; default 2
SEED = scalar  Seed to be used to randomize each design; default 0 implies no randomization
Parameter
TREATMENTFACTOR = factors  Treatment factors
Description
Box-Behnken designs are often used to study response surfaces. The design is usually formed to allow a quadratic response surface to be fitted. The factors are studied at three equally-spaced levels, below denoted by -1, 0 and 1. The construction uses a balanced incomplete block design to select successive sets of factors to be applied at all factorial combinations of -1 and +1, while other factors are held at 0. For example, with three factors A, B and C, the relevant balanced incomplete block design would have three blocks (A,B), (A,C) and (B,C). So the design would first have a section with A and B varying but C constant
A B C
-1 -1 0
-1 +1 0
+1 -1 0
+1 +1 0
then a section where B is held constant but A and C take all combinations of -1 and +1
A B C
-1 0 -1
-1 0 +1
+1 0 -1
+1 0 +1
and finally a section with A constant
A B C
0 -1 -1
0 -1 +1
0 +1 -1
0 +1 +1
In addition, there can be some "central points", where all the factors take the central value
A B C
0 0 0
0 0 0
0 0 0
0 0 0
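The sections shown above can be generated mechanically. The Python sketch below is an illustration, not the AGBOXBEHNKEN source. It covers only the case in which factors are varied two at a time, and it assumes that every pair of factors is used, as in the three-factor example. The function name and argument names are invented for the example, and no randomization is done.

```python
from itertools import combinations, product


def box_behnken_pairs(n_factors, n_central=4, levels=(-1, 1)):
    """List the runs of a Box-Behnken design with factors varied in pairs.

    Each run is a tuple of one level per factor. The centre level is
    taken to be midway between the two outer levels.
    """
    low, high = levels
    centre = (low + high) / 2
    runs = []
    # One section per pair of factors: the pair takes all combinations
    # of the outer levels while the other factors stay at the centre.
    for varied in combinations(range(n_factors), 2):
        for settings in product((low, high), repeat=2):
            run = [centre] * n_factors
            for factor, value in zip(varied, settings):
                run[factor] = value
            runs.append(tuple(run))
    # Central points, where all the factors take the centre level.
    runs.extend([tuple([centre] * n_factors)] * n_central)
    return runs
```

For three factors this gives the twelve runs of the three sections above, in the same order, followed by the four central points.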
The treatment factors are listed using the TREATMENTFACTOR parameter. If this is omitted in an interactive run, you will be asked how many factors you want and their names. The number of central points is specified by the NCENTRALPOINTS option; by default this is taken to be four. The LEVELS option can supply a variate to specify the outer treatment levels; the defaults are 1 and -1 (so the central point is at zero). The NCOMBINATIONS option defines the number of factors whose combinations of (outer) levels are to be varied at once. For the default of two, the relevant balanced incomplete block design is formed within AGBOXBEHNKEN. Other values can be supplied, but the corresponding balanced incomplete block design must be one of those obtainable from procedure AGBIB. You can find out the possibilities by putting
AGBIB [PRINT=catalogue]
The SEED option allows you to specify a seed for randomization, with a default of zero indicating no randomization. The PRINT option can be set to design to print the plan of the design. By default, if you are running Genstat in batch, the plan is not printed. If you do not set PRINT when running interactively, AGBOXBEHNKEN will ask whether or not you wish to print the design.
Options: PRINT, NCENTRALPOINTS, LEVELS, NCOMBINATIONS, SEED.
Parameters: TREATMENTFACTOR.
Method
The QUESTION directive is used to obtain the necessary details of the design and this is then generated by the standard Genstat manipulation directives and procedure AGBIB.
AGCENTRALCOMPOSITE procedure
Generates central composite designs
(R.W. Payne)
Options
NCENTRALPOINTS = scalar  Defines the number of central points to include; default 4
NSTARPOINTS = scalar  Defines the number of star points to include; default 1
LFACTORIAL = variate  Defines the treatment levels in the factorial part of the design; default !(-1,1)
LSTAR = variate  Defines the treatment levels for the star points; default is to use the levels defined by LFACTORIAL
FRACTION = scalar  Denominator for fractional factorial; default 1 specifies a complete design
SEED = scalar  Seed to be used to randomize each design; default 0 implies no randomization
Parameter
TREATMENTFACTOR = factors  Treatment factors
Description
Central composite designs are used for estimating quadratic response surfaces, that is, the model to be fitted to the results is a quadratic function of the various factors. The design is made up of three sets of points.
a) a factorial design: usually this contains all combinations of the factors at a pair of levels (l1,l2), but for five or more factors it is feasible to use a fractional factorial (and still be able to estimate all the parameters of the response surface)
b) star points: this contains a pair of points for each factor where the other factors take the value (l1+l2)/2 and the factor has the values s1 and s2
c) centre points: here all the factors have the value (l1+l2)/2
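The three sets of points can be listed with a short sketch. The Python code below is an illustration, not the AGCENTRALCOMPOSITE source. It uses a complete factorial for (a), so fractional factorials are not covered, and it does no randomization. The function and argument names are invented for the example, although they loosely follow the option names.

```python
from itertools import product


def central_composite(n_factors, lfactorial=(-1, 1), lstar=None,
                      n_central=4, n_star=1):
    """List the runs of a central composite design with a full factorial.

    lfactorial gives the levels (l1, l2) of the factorial part and lstar
    the levels (s1, s2) of the star points, which default to lfactorial.
    """
    l1, l2 = lfactorial
    s1, s2 = lstar if lstar is not None else lfactorial
    centre = (l1 + l2) / 2
    if abs((s1 + s2) / 2 - centre) > 1e-9:
        raise ValueError("star levels must be equally spaced around the centre")
    # (a) all combinations of the factors at the two factorial levels
    factorial = [tuple(run) for run in product((l1, l2), repeat=n_factors)]
    # (b) a pair of star points per factor, others held at the centre
    star = []
    for factor in range(n_factors):
        for value in (s1, s2):
            run = [centre] * n_factors
            run[factor] = value
            star.extend([tuple(run)] * n_star)
    # (c) centre points
    central = [tuple([centre] * n_factors)] * n_central
    return factorial + star + central
```

With two factors, star levels of -2 and 2 and the other settings left at their defaults, this gives four factorial points, four star points and four centre points.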
The treatment factors are listed using the TREATMENTFACTOR parameter. If this is omitted in an interactive run, you will be asked how many factors you want and their names. The number of central points is specified by the NCENTRALPOINTS option; by default this is taken to be four. The LFACTORIAL option can supply a variate to specify the levels to be used in (a); the defaults are 1 and -1 (so the central point is at zero). Similarly, LSTAR specifies the levels for (b), which are taken, by default, to be the same as in (a). The star levels must, however, be equally spaced around the centre point. Option NSTARPOINTS defines how many replicates to have of each star point. The FRACTION option supplies the denominator of a fractional design, if required for (a); the default of one indicates that a complete factorial design is to be used. The SEED option allows you to specify a seed for randomization, with a default of zero indicating no randomization. The PRINT option can be set to design to print the plan of the design. By default, if you are running Genstat in batch, the plan is not printed. If you do not set PRINT when running interactively, AGCENTRALCOMPOSITE will ask whether or not you wish to print the design.
Options: PRINT, NCENTRALPOINTS, NSTARPOINTS, LFACTORIAL, LSTAR, FRACTION, SEED.
Parameters: TREATMENTFACTOR.
Method
The QUESTION directive is used to obtain the necessary details of the design and this is then generated by the standard Genstat manipulation directives and procedure AGFRACTION.
AGCYCLIC procedure
Generates cyclic designs from standard generators
(M.F. Franklin & R.W. Payne)
Options
METHOD = string  Type of design: ordinary cyclic, cyclic change-over or cyclic superimposed (cyclic, changeover, superimposed); if unset in an interactive run AGCYCLIC will ask about the type of design, in batch the default is assumed to be cyclic
Parameters
LEVELS = scalars  Number of treatments
NBLOCKS = scalars  Number of blocks
NUNITS = scalars  Number of units per block, or number of periods in a cyclic change-over design
SEED = scalars  Seed for randomization; zero implies no randomization
TREATMENTS = factors  Identifier for the treatment factor
SUPERIMPOSED = factors  Identifier for the second treatment factor in a cyclic superimposed design
BLOCKS = factors  Identifier for the factor to index the blocks
UNITS = factors  Identifier for the factor to index the units within each block, or the periods of a cyclic change-over design
INITIAL = variates or pointers  To save one (variate) or more (pointer to variates) initial blocks
Description
Cyclic designs provide an effective way of assessing treatments using a block design where the blocks are each too small to hold all the treatments. In its simplest form, the cyclic method of generation starts with an initial block containing some subset of the treatments. This subset is represented by integers in the range 0...m-1, where m is the number of treatment levels. The second and subsequent blocks are then generated by successive addition modulo m of one to the numbers in the subset. Some designs have more than one initial block, and the increment need not be one. Further details of the method are given in the description of procedure AFCYCLIC.
The efficiency of the design depends very much on the choice of initial blocks. Procedure AGCYCLIC selects appropriate initial blocks from a repertoire obtained mainly from the program DSIGNX (Franklin & Mann 1986), and including designs from Davis & Hall (1969), Hall & Williams (1973) and John, Wolock & David (1972). It then calls AFCYCLIC to generate the design.
AGCYCLIC is easiest to use interactively. It then asks questions to determine the necessary information to form the design. In particular, it will tell you which block sizes are available for your chosen number of treatments. The options and parameters allow you to anticipate questions, or to define all the necessary information if you want to use AGCYCLIC in batch.
The first question, which can be anticipated by setting the METHOD option, determines the type of cyclic design. In addition to the standard cyclic designs, AGCYCLIC can also generate the cyclic change-over designs of Davis & Hall (1969) and the cyclic superimposed designs of Hall & Williams (1973). The change-over designs are used for trials in which subjects are given different treatments in different time periods; these thus have a crossed block structure subjects*periods. The extension in the cyclic superimposed design is that there are two treatment factors (each with the same number of levels); the design is intended to estimate their main effects but not their interaction.
The PRINT option controls whether AGCYCLIC prints a plan of the design. By default, if you are running Genstat in batch, the plan is not printed. If you do not set PRINT when running interactively, AGCYCLIC will ask whether or not you wish to print the design, after it has been generated.
The number of treatments can be defined using the LEVELS parameter. Similarly, the NBLOCKS and NUNITS parameters define the number of blocks and the number of units per block (or the number of periods in a cyclic change-over design). The SEED parameter allows you to specify a seed to be used to randomize the design. In batch the default seed is zero, to suppress randomization. If you do not set SEED when running interactively, AGCYCLIC will ask for a seed, and again a zero value suppresses any randomization.
Parameters TREATMENTS, SUPERIMPOSED, BLOCKS and UNITS allow you to specify identifiers for the treatment, the superimposed treatment (for a cyclic superimposed design), the block and unit-within-block factors. If these are not specified in a batch run, AGCYCLIC will use identifiers that are local within the procedure and thus lost at the end of the procedure. If you are running interactively, AGCYCLIC will ask you to provide identifiers, and these will remain available after AGCYCLIC has finished running.
Finally, the INITIAL parameter allows you to save the initial blocks, in a variate if there is only one, or in a pointer (to a list of variates) if there are several.
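The simplest form of cyclic generation described above (one initial block, increment of one, arithmetic modulo m) can be sketched in Python as follows. This is only an illustration of the idea, not the code of AGCYCLIC or AFCYCLIC. The initial block {0, 1, 3} for m = 7 is a hypothetical choice made for the example and is not taken from the stored repertoire; the function names are likewise invented.

```python
from collections import Counter
from itertools import combinations

def cyclic_design(initial_block, m):
    """Develop one initial block into m blocks by adding 0, 1, ..., m-1 modulo m."""
    return [[(t + shift) % m for t in initial_block] for shift in range(m)]

def concurrences(blocks):
    """Count how often each unordered pair of treatments shares a block."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts

blocks = cyclic_design([0, 1, 3], 7)
for block in blocks:
    print(block)
print(set(concurrences(blocks).values()))
```

Counting the concurrences gives a quick indication of how evenly a candidate initial block spreads the treatment pairs over the blocks, which is why the choice of initial block matters for efficiency.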
Options: PRINT, METHOD.
Parameters: LEVELS, NBLOCKS, NUNITS, SEED, TREATMENTS, SUPERIMPOSED, BLOCKS, UNITS, INITIAL.
Method
The QUESTION directive is used to obtain the necessary details of the design. The initial blocks are then recovered from a backing-store file and procedure AFCYCLIC is called to generate the design.
References
Davis, A.W. & Hall, W.B. (1969). Cyclic change-over designs. Biometrika 56, 283-293.
Hall, W.B. & Williams, E.R. (1973). Cyclic superimposed designs. Biometrika 60, 47-53.
John, J.A., Wolock, F.W. & David, H.A. (1972). Cyclic Designs. National Bureau of Standards, Applied Mathematics Series 62.
AGDESIGN procedure
Generates generally balanced designs
(R.W. Payne)
Options
ANALYSE = string Controls whether or not to analyse the design, and produce a skeleton analysis-of-variance table using ANOVA (no, yes); default is to ask if this is unset in an interactive run, and not to analyse if it is unset in a batch run
FILENAME = text Name of the backing store file containing the design information; default uses the standard design file
SUBFILE = identifier Subfile of the backing store file to be used
Parameters
DESIGN = variates Contains codes to indicate the choice of design
TREATMENTFACTORS = pointers Specifies identifiers for the treatment factors
BLOCKFACTORS = pointers Specifies identifiers for the block factors
PSEUDOFACTORS = pointers Specifies identifiers for any pseudo-factors
REPLICATEFACTOR = factors Specifies the identifier of the factor to represent the replicates (if any) in each design
UNITLABELS = variates Specifies the identifier of a variate to store a unique numerical label for each plot in the design
SEED = scalars Seed to be used to randomize each design; zero implies no randomization
Description
AGDESIGN generates the factors and, if necessary, pseudo-factors required to define a generally balanced design. It also sets the block and treatment formulae (using the BLOCKSTRUCTURE and TREATMENTSTRUCTURE directives) to allow the design to be analysed by ANOVA. It can be accessed most conveniently through the interactive Genstat design system, using the procedure DESIGN.
AGDESIGN relies upon a backing-store subfile that contains a repertoire of available designs, together with the information required to form them. FILENAME has a default file containing four subfiles.
FACTORIAL - factorial designs (with blocking): these have several treatment factors and a single blocking factor (giving strata for blocks and plots within blocks); the blocks are too small to contain a complete replicate of the treatment combinations and so various interactions are confounded with blocks.
LATTICE - lattice designs: designs for a single treatment factor with a number of levels that is the square of some integer k; the design has replicates, each containing k blocks of k plots, and different treatment contrasts can be confounded with blocks in each replicate.
LATTSQ - lattice squares: these are similar to lattices except that the blocking structure within the replicates has rows crossed with columns; again different treatment contrasts can be confounded with the rows and columns in each replicate.
LATIN - Latin squares: designs are available for 3 to 14 treatments; several different orthogonal squares are available for most of these so, for example, Graeco-Latin squares can be formed by using a different square for each of the two treatment factors.
If the default FILENAME is being used, the usual abbreviation rules are used to match SUBFILE with the names of the subfiles in the default file.
The backing-store file can be created by a procedure called FDESIGNFILE. This requires a data file, details of whose format can be obtained by setting option PRINT=filestructure when running FDESIGNFILE. You can thus provide additional files of designs, which can be accessed by setting the FILENAME and SUBFILE options as appropriate.
AGDESIGN has two other options. The PRINT option can be set to design to print the plan of the design. By default, if you are running Genstat in batch, the plan is not printed. If you do not set PRINT when running interactively, AGDESIGN will ask whether or not you wish to print the design. The other setting, catalogue, lists the designs in the subfile. Similarly, the ANALYSE option governs whether or not AGDESIGN produces a skeleton analysis-of-variance table (containing just source of variation, degrees of freedom and efficiency factors). Again AGDESIGN assumes that this is not required if ANALYSE is unset in a batch run, and asks whether it is required if ANALYSE is unset in an interactive run.
The information required to select the design and give identifiers to its factors can be defined using the parameters of AGDESIGN. In an interactive run, AGDESIGN will ask questions to obtain any necessary information that is not supplied in this way; when running in batch, if any of the required information has not been specified, AGDESIGN will terminate with a warning message.
It is thus easiest to use AGDESIGN interactively. Then only the SUBFILE option need be set (assuming that you are happy to use the standard default design file), and the other information will be obtained by (clearly explained) questions. You need set the parameters only if you wish to anticipate some of the questions, or if you wish to use AGDESIGN in batch.
The DESIGN parameter can supply a variate whose first value selects the "type" of design: for example, in the LATTICE subfile, this would select between a 3x3 lattice, a 4x4 lattice, and so on. Some of these designs are available in several different "versions": for example, in lattice designs there are several ways of defining which treatment contrasts are to be confounded with blocks. If there is more than one version, the second and subsequent values of the DESIGN variate indicate which version, or versions, are required. These need not be distinct so, for example, you can replicate a basic design several times. If the variate has a single value, AGDESIGN will select the first version.
The TREATMENTFACTORS parameter can specify a pointer to supply identifiers for the treatment factors in the design. For example, if there are two factors you could define their identifiers to be A and B by forming the pointer Tf (say) with the statement
POINTER [VALUES=A,B] Tf
and then setting TREATMENTFACTORS=Tf. Alternatively, and more succinctly, you could put TREATMENTFACTORS=!p(A,B), where !p(A,B) is an unnamed pointer containing the required two identifiers. Similarly, the BLOCKFACTORS parameter can specify a pointer to define the identifiers for the block factors in the basic design. If you have requested several versions, or several replicates, of the basic design, AGDESIGN will also need a factor to represent the replicates. The identifier of this factor can be supplied using the REPLICATEFACTOR parameter. Partially balanced designs, such as lattices, will require pseudo-factors in the treatment formula to enable the design to be analysed by ANOVA. Identifiers can be supplied for these using the PSEUDOFACTORS parameter.
The UNITLABELS parameter can specify a variate to store a unique number to label each of the plots in the design. In the first replicate (or version) in the generated design, the variate contains the numbers one up to the number of plots per replicate. The second replicate (if any) contains these numbers plus the smallest power of ten greater than the number of plots per replicate, the third replicate contains the numbers plus twice this power of ten, and so on.
The SEED parameter allows you to specify a seed to randomize the design. In a batch run, this has a default of zero, to suppress randomization. If SEED is unset in an interactive run, you will be asked to provide a seed (and again a zero value will leave the design unrandomized).
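The numbering rule described for UNITLABELS can be illustrated with a short Python sketch. It follows the wording above (plot numbers within a replicate, offset by multiples of the smallest power of ten greater than the number of plots per replicate); the function name and arguments are hypothetical, and the sketch is not taken from the procedure itself.

```python
def unit_labels(plots_per_replicate, n_replicates):
    """Return unit labels for every plot, replicate by replicate."""
    offset = 10
    while offset <= plots_per_replicate:    # smallest power of ten > plots per replicate
        offset *= 10
    return [rep * offset + plot
            for rep in range(n_replicates)
            for plot in range(1, plots_per_replicate + 1)]

# Three replicates of a 4x4 lattice (16 plots per replicate): the offset is 100.
labels = unit_labels(16, 3)
print(labels[:3], labels[16:19], labels[32:35])
```

The leading digit or digits of a label therefore identify the replicate, and the remainder identifies the plot within it.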
Options: PRINT, ANALYSE, FILENAME, SUBFILE.
Parameters: DESIGN, TREATMENTFACTORS, BLOCKFACTORS, PSEUDOFACTORS, REPLICATEFACTOR, UNITLABELS, SEED.
Method
The QUESTION directive is used to obtain the details of the required design. The design is then generated using GENERATE and the other standard Genstat directives for calculation and manipulation.
AGFRACTION procedure
Generates fractional factorial designs
(M.F. Franklin & R.W. Payne)
Options
ANALYSE = string Controls whether or not to analyse the design, and produce a skeleton analysis-of-variance table using ANOVA (no, yes); default is to ask if this is unset in an interactive run, and not to analyse if it is unset in a batch run
FACTORIAL = scalar Limit on number of factors in treatment terms in the analysis of variance; default 2
FILENAME = text Name of the backing store file containing the design information; default uses the standard fractional design file
Parameters
LEVELS = scalars Number of levels of the treatment factors in each design
FRACTION = scalars Denominator of required fraction
NTREATMENTFACTORS = scalars Number of treatment factors
NUNITS = scalars Number of units per block
SEED = scalars Seed to be used to randomize each design; zero implies no randomization
TREATMENTFACTORS = pointers Specifies identifiers for the treatment factors
BLOCKS = factors Identifier for the block factor
UNITS = factors Identifier for the factor to index the units (or plots) within each block
Description
AGFRACTION generates fractional factorial designs from stored keys and other information. It also sets the block and treatment formulae (using the BLOCKSTRUCTURE and TREATMENTSTRUCTURE directives) to allow the design to be analysed by ANOVA.
The procedure relies upon a backing-store file that contains a repertoire of available designs, together with the information required to form them. There is a standard file, used by default, but the FILENAME option allows you to specify another if you wish to form your own alternative file.
AGFRACTION has two other options. The PRINT option can be set to design to print the plan of the design. By default, if you are running Genstat in batch, the plan is not printed. If you do not set PRINT when running interactively, AGFRACTION will ask whether or not you wish to print the design. Similarly, the ANALYSE option governs whether or not AGFRACTION produces a skeleton analysis-of-variance table (containing just source of variation, degrees of freedom and efficiency factors). Again AGFRACTION assumes that this is not required if ANALYSE is unset in a batch run, and asks whether it is required if ANALYSE is unset in an interactive run. The FACTORIAL option sets a limit on the number of factors in the treatment terms in the analysis of variance; by default, this is two.
The information required to select the design and give identifiers to its factors can be defined using the parameters of AGFRACTION. In an interactive run, AGFRACTION will ask questions to obtain any necessary information that is not supplied in this way; when running in batch, if any of the required information has not been specified, AGFRACTION will terminate with a warning message.
It is thus easiest to use AGFRACTION interactively. Then all the information necessary to select and define the required design will be obtained by (clearly explained) questions. You need set the parameters only if you wish to anticipate some of the questions, or if you wish to use AGFRACTION in batch.
The number of levels of the treatment factors can be defined using the LEVELS parameter. The FRACTION parameter defines the denominator of the required fraction, and the NTREATMENTFACTORS parameter specifies how many treatment factors the design is to contain. Thus, for example,
AGFRACTION [PRINT=design] LEVELS=2; FRACTION=4; NTREATMENTFACTORS=6
would print the plan of a quarter replicate of a 2^6 design.
For some of the designs it is possible also to allow a blocking factor (and you will be given details of what is feasible if you are running AGFRACTION interactively). The NUNITS parameter can then be used to define the number of units per block.
The SEED parameter allows you to specify a seed to be used to randomize the design. In batch the default seed is zero, to suppress randomization. If you do not set SEED when running interactively, AGFRACTION will ask for a seed, and again a zero value suppresses any randomization.
The TREATMENTFACTORS parameter can specify a pointer to supply identifiers for the treatment factors in the design. For example, if there are two factors you could define their identifiers to be A and B by forming the pointer Tf (say) with the statement
POINTER [VALUES=A,B] Tf
and then setting TREATMENTFACTORS=Tf. Alternatively, and more succinctly, you could put TREATMENTFACTORS=!p(A,B), where !p(A,B) is an unnamed pointer containing the required two identifiers. The remaining parameters, BLOCKS and UNITS, allow you to specify identifiers for the block and unit-within-block factors. If the treatment, block or unit factors are not specified in a batch run, AGFRACTION will use identifiers that are local within the procedure and thus lost at the end of the procedure. If you are running interactively, AGFRACTION will ask you to provide identifiers, and these will remain available after AGFRACTION has finished running.
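To make the idea of a fraction concrete, the Python sketch below builds a quarter replicate of a design with six two-level factors, which has 2^6 / 4 = 16 units. It is an illustration only: the generators E = ABC and F = BCD are a common textbook choice assumed for the example, not the keys stored in the Genstat design file, and the levels are coded -1 and +1 rather than as Genstat factor levels.

```python
from itertools import combinations, product

def quarter_replicate_2_6():
    """Sixteen runs for factors A-F, with E = ABC and F = BCD (assumed generators)."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):   # full factorial in A, B, C, D
        runs.append((a, b, c, d, a * b * c, b * c * d))
    return runs

def no_main_effect_aliased_with_two_factor_interaction(runs):
    """Check that every main-effect contrast is orthogonal to every two-factor product."""
    k = len(runs[0])
    return all(sum(r[i] * r[j] * r[l] for r in runs) == 0
               for i, j, l in combinations(range(k), 3))

runs = quarter_replicate_2_6()
print(len(runs), no_main_effect_aliased_with_two_factor_interaction(runs))
```

With these generators the main effects are kept clear of the two-factor interactions, which is the kind of property that makes one choice of fraction preferable to another.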
Options: PRINT, ANALYSE, FACTORIAL, FILENAME.
Parameters: LEVELS, FRACTION, NTREATMENTFACTORS, NUNITS, SEED, TREATMENTFACTORS, BLOCKS, UNITS.
Method
The QUESTION directive is used to obtain the details of the required design. The design is then generated using GENERATE and the other standard Genstat directives for calculation and manipulation.
AGHIERARCHICAL procedure
Generates orthogonal hierarchical designs
(R.W. Payne)
Options
ANALYSE = string Controls whether or not to analyse the design, and produce a skeleton analysis-of-variance table using ANOVA (no, yes); default is to ask if this is unset in an interactive run, and not to analyse if it is unset in a batch run
SEED = scalars Seed to be used to randomize each design; zero implies no randomization
Parameters
BLOCKFACTORS = factors Specifies the identifier for the block factor used to index the units of each stratum (or level of the hierarchy)
TREATMENTFACTORS = factors or pointers Specifies the identifier of the treatment factor or factors applied to the units of each stratum
LEVELS = scalars or pointers Number of levels for the treatment factors in each stratum; if required, a pointer can contain an extra scalar to specify replication
Description
AGHIERARCHICAL forms orthogonal hierarchical designs: for example randomized blocks, split-plots, split-split-plots, and so on. The units of each stratum (or level of the hierarchy) are identified by a block factor: for example Replicates, Blocks, Plots, Subplots, Subjects, etc.
AGHIERARCHICAL can be used either interactively or in batch. Interactively, there is no need to set any options or parameters: the procedure will ask questions to ascertain the necessary details of the design. The first question is to find out how many block factors (and thus strata) there are in the design. For a randomized block design there would be two, for example Blocks and Plots, defining strata for blocks and for plots within blocks. In a split-plot design there are three (for example Blocks, Wholeplots and Subplots), giving strata for blocks, whole plots within blocks and subplots within whole plots.
The questions deal with each stratum in turn, asking first for the name of the block factor to be used to identify the units of the stratum. Next the procedure asks how many treatment factors are applied to the units of that stratum. In a randomized block design, there are no treatment factors applied to the blocks and one, or more, applied to the plots, whereas in a split-plot design treatments are applied to both the whole plots and the subplots. It then asks for the names of the treatment factors, and how many levels they are to have. Alternatively, if there are no treatments applied to the stratum, AGHIERARCHICAL asks how many levels the corresponding block factor should have; so, for example, it would ask how many blocks there should be in a randomized block or a split-plot design.
The parameters of AGHIERARCHICAL provide an alternative way of providing the details of the design. BLOCKFACTORS lists the block factors for the strata, TREATMENTFACTORS defines factors for the treatments applied to the units of the strata, and LEVELS defines the levels of treatments and replication of block factors. For example
AGHIERARCHICAL [PRINT=design; ANALYSE=yes] Blocks,Plots; *,A; 3,5
defines a randomized block design with three blocks, and a single treatment factor A (applied to the plots) with five levels. If there are several factors in a stratum, the identifiers should be placed into a pointer. For example,
AGHIERARCHICAL Blocks,Plots; *,!p(A,B); 3,2
defines a randomized block design with two treatment factors, A and B, both with two levels. Similarly, if the factors in a stratum have different numbers of levels, the LEVELS parameter may contain pointers.
AGHIERARCHICAL Blocks,Plots; *,!p(A,B); 3,!p(2,3)
defines A to have two levels and B to have three. The pointer can contain an extra element to indicate that there is to be replication (as well as treatments) in a stratum.
AGHIERARCHICAL Blocks,Plots; *,!p(A,B); 3,!p(2,3,4)
indicates that there are to be four replicates of the A and B combination on the plots of each block.
In an interactive run, AGHIERARCHICAL will ask about the treatment factors and the levels if these are not set. In a batch run all three parameters must be set.
The SEED option allows you to specify a seed to randomize the design. In a batch run, this has a default of zero, to suppress randomization. If SEED is unset in an interactive run, you will be asked to provide a seed (and again a zero value will leave the design unrandomized).
AGHIERARCHICAL has two other options. The PRINT option can be set to design to print the plan of the design. By default, if you are running Genstat in batch, the plan is not printed. If you do not set PRINT when running interactively, AGHIERARCHICAL will ask whether or not you wish to print the design. Similarly, the ANALYSE option governs whether or not AGHIERARCHICAL produces a skeleton analysis-of-variance table (containing just source of variation, degrees of freedom and efficiency factors). Again AGHIERARCHICAL assumes that this is not required if ANALYSE is unset in a batch run, and asks whether it is required if ANALYSE is unset in an interactive run.
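The way the TREATMENTFACTORS and LEVELS settings in the examples above determine the size of a design can be worked through with a small Python sketch. It reflects one reading of the description (a single scalar applies to every treatment factor in the stratum, or gives the number of block-factor levels when there are no treatments; a list gives one number per factor plus an optional replication count). The function and its argument layout are hypothetical and are not part of Genstat.

```python
from math import prod

def units_per_stratum(treatment_factors, levels):
    """Return the cumulative number of units in each stratum, top stratum first.

    treatment_factors: one list of factor names per stratum ([] if none).
    levels: one entry per stratum, either an int or a sequence of ints.
    """
    totals, units = [], 1
    for factors, entry in zip(treatment_factors, levels):
        if isinstance(entry, int):
            # one scalar: levels of each factor, or of the block factor itself
            per_parent = entry ** max(len(factors), 1)
        else:
            # one number per factor, optionally followed by a replication count
            per_parent = prod(entry)
        units *= per_parent
        totals.append(units)
    return totals

print(units_per_stratum([[], ["A"]], [3, 5]))                 # Blocks,Plots; *,A; 3,5
print(units_per_stratum([[], ["A", "B"]], [3, 2]))            # 3,2
print(units_per_stratum([[], ["A", "B"]], [3, (2, 3)]))       # 3,!p(2,3)
print(units_per_stratum([[], ["A", "B"]], [3, (2, 3, 4)]))    # 3,!p(2,3,4)
```

The last value in each result is the total number of plots, which is a useful check before generating a design in batch.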
Options: PRINT, ANALYSE, SEED.
Parameters: BLOCKFACTORS, TREATMENTFACTORS, LEVELS.
Method
The QUESTION directive is used to obtain the details of the required design. The design is then generated using GENERATE and the other standard Genstat directives for calculation and manipulation.
AGMAINEFFECT procedure
Generates designs to estimate main effects of two-level factors
(R.W. Payne)
Options
ANALYSE = string Controls whether or not to analyse the design, and produce a skeleton analysis-of-variance table using ANOVA (no, yes); default is to ask if this is unset in an interactive run, and not to analyse if it is unset in a batch run
FOLDED = string Whether to include an extra "folded" replicate with the levels of each factor interchanged (no, yes); default no
SEED = scalar Seed to be used to randomize each design; default 0 implies no randomization
Parameter
TREATMENTFACTOR = factors Treatment factors
Description
AGMAINEFFECT generates designs for estimating main effects of factors with two levels, using a minimum number of experimental units; see Plackett & Burman (1946). The numbers of treatment factors for which designs are available can be printed by setting option PRINT=catalogue. They are, however, all expressible as 4n-1 for some integer n. The treatment factors are listed using the TREATMENTFACTOR parameter. If this is omitted in an interactive run, you will be asked how many factors you want and their names.
The basic design allows the main effects to be estimated, but has no residual degrees of freedom. This is fine if you merely want to screen the main effects to identify the largest. Otherwise you can generate a design for more factors than are needed, and then use the degrees of freedom of the unnecessary factors to provide the residual. Alternatively, if you set option FOLDED=yes, AGMAINEFFECT will include a "folded" replicate of the design: this is identical to the initial replicate except that the levels of the factors are swapped (level one instead of level two and vice versa). This particular arrangement has the advantage that no main effect is aliased with any first-order interaction.
The SEED option allows you to specify a seed for randomization, with a default of zero indicating no randomization. The PRINT option can be set to design to print the plan of the design. By default, if you are running Genstat in batch, the plan is not printed. If you do not set PRINT when running interactively, AGMAINEFFECT will ask whether or not you wish to print the design. Similarly, the ANALYSE option governs whether or not AGMAINEFFECT produces a skeleton analysis-of-variance table (containing just source of variation, degrees of freedom and efficiency factors). Again AGMAINEFFECT assumes that this is not required if ANALYSE is unset in a batch run, and asks whether it is required if ANALYSE is unset in an interactive run.
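The effect of the folded replicate can be checked numerically. The Python sketch below builds the 8-unit design for 7 = 4x2 - 1 two-level factors from a commonly published cyclic first row, then appends the folded replicate. It is an illustration only: the first row is an assumption taken from the general literature rather than from the Hadamard matrices stored in the Genstat design system, and the two levels are coded +1 and -1 instead of one and two.

```python
from itertools import combinations

# Assumed first row of the 8-unit design; not read from the Genstat design file.
GENERATOR = [1, 1, 1, -1, 1, -1, -1]

def plackett_burman_8():
    """Seven cyclic shifts of the first row plus a final row of -1s."""
    n = len(GENERATOR)
    rows = [GENERATOR[n - s:] + GENERATOR[:n - s] for s in range(n)]
    rows.append([-1] * n)
    return rows

def fold(design):
    """Append a replicate with the two levels of every factor interchanged."""
    return design + [[-x for x in row] for row in design]

def main_effects_orthogonal(design):
    k = len(design[0])
    return all(sum(r[i] * r[j] for r in design) == 0
               for i, j in combinations(range(k), 2))

def clear_of_two_factor_interactions(design):
    """True if no main effect is correlated with any two-factor interaction."""
    k = len(design[0])
    return all(sum(r[i] * r[j] * r[l] for r in design) == 0
               for i in range(k)
               for j, l in combinations(range(k), 2)
               if i not in (j, l))

base = plackett_burman_8()
folded = fold(base)
print(len(base), clear_of_two_factor_interactions(base))
print(len(folded), clear_of_two_factor_interactions(folded))
```

In the unfolded design some main effects are completely confounded with two-factor interactions, whereas in the folded design every such contrast sums to zero, at the cost of doubling the number of units.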
Options: PRINT, ANALYSE, FOLDED, SEED.
Parameters: TREATMENTFACTOR.
Method
The designs are based on the Hadamard matrices that are stored in the design system for forming balanced incomplete block designs (see procedure AGBIB). The QUESTION directive is used to obtain the necessary details of the design, and this is then generated by the standard Genstat manipulation directives and procedure AGBIB.
Reference
Plackett, R.L. & Burman, J.P. (1946). The design of optimum multifactorial experiments. Biometrika 33, 305-325 & 328-332.
AGNEIGHBOUR procedure
Generates neighbour-balanced designs
(R.W. Payne)
Options
METHOD = string Type of design, n-1 blocks of n plots, or n blocks of n-1 plots (N_1BLOCKS, NBLOCKS); if unset in an interactive run, AGNEIGHBOUR will ask about the type of design; in batch the default is assumed to be n blocks of n-1 plots
Parameters
LEVELS = scalars Number of treatments
SEED = scalars Seed for randomization; in batch there is a default of 12345
TREATMENTS = factors Identifier for the treatment factor
BLOCKS = factors Identifier for the factor to index the blocks within replicates
UNITS = factors Identifier for the factor to index the units within each block, or the periods of a cyclic change-over design
LEFTNEIGHBOUR = factors To save the treatment on the left neighbouring unit
RIGHTNEIGHBOUR = factors To save the treatment on the right neighbouring unit
Description
In experiment designs it is often necessary to allow for the possibility that a treatment may have an effect on neighbouring plots, as well as on its own plot. For example, in variety trials, tall varieties may shade their neighbours. Likewise, in experiments on insecticides and fungicides, there may be cross infection from plots receiving control or ineffective treatments to neighbouring plots. In both of these examples the neighbour effect may depend on direction (for example of prevailing wind or of sunlight), so it is usual to distinguish between left and right neighbours. To avoid bias when comparing the effects of treatments in these situations, it is important to ensure that no treatment is unduly disadvantaged by its neighbours. This is best done by using a neighbour-balanced design. Here the allocation of treatments is such that every treatment occurs equally often with each other treatment as a right neighbour, and as a left neighbour.
The table below shows a design for five treatments in 5 blocks of size 4. Notice that in addition to the experimental plots, the design also needs a line of treated border plots on each side. These provide the neighbouring treatments for plots 1 and 4, but do not provide yields or other response variables. The border plots are not included in the generated factor values.
Plot    border |  1  2  3  4 | border
Block
1          5   |  2  3  1  5 |   2
2          3   |  5  4  1  3 |   5
3          4   |  2  5  3  4 |   2
4          1   |  4  3  2  1 |   4
5          4   |  5  1  2  4 |   5
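The neighbour balance of the table above can be verified by counting, for every experimental plot, which treatment lies to its left and to its right. The Python sketch below does this with the treatment numbers copied from the table; the data layout and function name are hypothetical conveniences for the check, and the two counters correspond in spirit to the LEFTNEIGHBOUR and RIGHTNEIGHBOUR factors.

```python
from collections import Counter

# One row per block: left border, plots 1-4, right border (as in the table).
DESIGN = [
    [5, 2, 3, 1, 5, 2],
    [3, 5, 4, 1, 3, 5],
    [4, 2, 5, 3, 4, 2],
    [1, 4, 3, 2, 1, 4],
    [4, 5, 1, 2, 4, 5],
]

def neighbour_counts(design):
    """Count (treatment, left neighbour) and (treatment, right neighbour) pairs."""
    left, right = Counter(), Counter()
    for row in design:
        for pos in range(1, len(row) - 1):          # experimental plots only
            left[(row[pos], row[pos - 1])] += 1
            right[(row[pos], row[pos + 1])] += 1
    return left, right

left, right = neighbour_counts(DESIGN)
print(len(left), set(left.values()))
print(len(right), set(right.values()))
```

Each of the 20 ordered pairs of distinct treatments occurs exactly once as a left-neighbour pair and exactly once as a right-neighbour pair, which is the balance property described above.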
Methods of constructing and randomizing neighbour-balanced designs for n treatments in either n blocks of n-1 plots or in n-1 blocks of n plots are described by Azais, Bailey & Monod (1993), together with generators for 3 <= n <= 16 (other than for n=4 or 6 with n-1 blocks of size n, for which no designs are available). AGNEIGHBOUR uses these methods and generators, together with some further generators for blocks of n-1 plots formed using the method of Azais (1987).
AGNEIGHBOUR is easiest to use interactively. It then asks questions to determine the necessary information to form the design, and indicates the numbers of treatments for which designs are available. The options and parameters allow you to anticipate questions, or to define all the necessary information if you want to use AGNEIGHBOUR in batch.
The first question, which can be anticipated by setting the METHOD option, determines the type of design: n blocks of n-1 plots (METHOD=nblocks) or n-1 blocks of n plots (METHOD=n_1blocks). The default in batch is n_1blocks. The PRINT option controls printed output, with setting design to print a plan of the design, and catalogue to print a list of the available designs. By default, if you are running Genstat in batch, nothing is printed. If you do not set PRINT when running interactively, AGNEIGHBOUR will ask whether or not you wish to print the design, after it has been generated.
The number of treatments can be defined using the LEVELS parameter. This can be set to zero to avoid constructing a design, as may be required if you merely wish to print the catalogue. The SEED parameter allows you to specify a seed to be used to randomize the design. If you do not set SEED when running interactively, AGNEIGHBOUR will ask for a seed. In batch there is a default of 12345. Parameters TREATMENTS, BLOCKS and UNITS allow you to specify identifiers to save the treatment, the block and unit-within-block factors. If these are not specified in a batch run, AGNEIGHBOUR will use identifiers that are local within the procedure and thus lost at the end of the procedure. If you are running interactively, AGNEIGHBOUR will ask you to provide identifiers, and these will remain available after AGNEIGHBOUR has finished running. There are also parameters LEFTNEIGHBOUR and RIGHTNEIGHBOUR to allow you to save the treatments on the left and right neighbouring plots.
Some of the designs are such that each ordered pair of treatments occurs the same number of times as the left and right neighbours of some other treatment; the design is then said to be neighbour-balanced at distance 2. These designs have the further advantage that they are balanced if analysed by ANOVA with
BLOCKSTRUCTURE BLOCKS / UNITS
TREATMENTSTRUCTURE TREATMENTS + LEFTNEIGHBOUR + RIGHTNEIGHBOUR
Options: PRINT, METHOD.
Parameters: LEVELS, SEED, TREATMENTS, BLOCKS, UNITS, LEFTNEIGHBOUR, RIGHTNEIGHBOUR.
Method
The generation methods are described by Azais, Bailey & Monod (1993). The QUESTION directive is used to obtain the necessary details of the design, and this is then generated by the standard Genstat manipulation directives.
References
Azais, J.-M. (1987). Design of experiments for studying intergenotypic competition. Journal of the Royal Statistical Society, Series B 49, 334-345.
Azais, J.-M., Bailey, R.A. & Monod, H. (1993). A catalogue of efficient neighbour designs with border plots. Biometrics 49, 1252-1261.
AGRAPH procedure
Plots one- or two-way tables of means from ANOVA
(R.W. Payne)
Options
GRAPHICS = string Type of graph (highresolution, lineprinter); default high
METHOD = string What to plot: means to plot just the means, lines to plot the means joined by lines, and data to plot lines joining the mean values and points representing the original data (means, lines, data); default mean
XFREPRESENTATION = string How to label the x-axis (levels, labels); default labels uses the XFACTOR labels, if available
SAVE = ANOVA save structure Save structure to provide the table of means if the MEANS parameter is unset; default uses the save structure from the most recent ANOVA
Parameters
XFACTOR = factors Factor providing the x-values for each plot; by default this is chosen automatically
GROUPS = factors Factor identifying the different lines from a two-way table; by default chosen automatically
MEANS = tables Table of means to be plotted; default obtains the table from the structure specified by the SAVE option or, if this too is unset, from the most recent ANOVA
BAR = scalars Length of error bar to be plotted to indicate the variability of the means; default calculates the standard error for differences between means, if possible, from the SAVE structure (or from the most recent ANOVA)
NEWXLEVELS = variates Values to be used for XFACTOR instead of its existing levels
Description
AGRAPH plots tables of means. In its simplest form, if none of the options or parameters is specified, AGRAPH plots a high-resolution graph for the first two-way table of means in the most recent ANOVA, or for the first one-way table if there were no two-way tables. Usually, each mean is represented by a point (using pens 1, 2, and so on for each level in turn of the second factor). However, with high-resolution plots, the METHOD option can be set to lines to draw lines between the points, or to data to draw the lines and also plot the original data values. The GRAPHICS option controls whether a high-resolution or a line-printer graph is plotted; by default GRAPHICS=high.
The MEANS parameter allows the table of means to be specified explicitly. If this is not set, AGRAPH obtains the table automatically from ANOVA - from the save structure specified by the SAVE option or from the most recent analysis if this too is unset. The BAR parameter can specify the size of an error bar (such as a standard error for differences between the means) to be plotted on the graph; if this is not specified, AGRAPH will calculate an average standard error of difference, if possible, for the body of the table from the SAVE structure (or from the most recent ANOVA).
The XFACTOR parameter indicates the factor against whose levels the means are plotted. With a two-way table, a separate line will be plotted for every level of a second factor. If the table is being obtained automatically, this second factor can be specified explicitly using the GROUPS parameter. If neither XFACTOR nor GROUPS is specified with a two-way table, AGRAPH will select the XFACTOR according to the following criteria (in decreasing order of importance): that the factor has no labels, that it has levels that are not the default integers 1 upwards, or that it has more levels than the other factor. Two-way tables can be obtained from ANOVA and plotted, even if the analysis contained only the main effects of the two factors; however, the lines will then all be parallel. The NEWXLEVELS parameter enables different levels to be supplied for XFACTOR if the existing levels are unsuitable. If XFACTOR has labels, these are used to label the x-axis unless option XFREPRESENTATION=levels.
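The automatic choice of XFACTOR can be read as a lexicographic comparison of the three criteria. The following Python sketch illustrates that reading only; it is not the Genstat implementation. The `Factor` class, the function names and the tie-breaking rule (keep the first factor) are assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Factor:
    """Hypothetical stand-in for a Genstat factor (illustration only)."""
    name: str
    levels: Sequence[float]
    labels: Optional[Sequence[str]] = None


def xfactor_priority(f: Factor) -> tuple:
    """Criteria in decreasing order of importance, as a sortable tuple."""
    has_no_labels = f.labels is None
    default_levels = list(f.levels) == list(range(1, len(f.levels) + 1))
    return (has_no_labels, not default_levels, len(f.levels))


def choose_xfactor(a: Factor, b: Factor) -> Factor:
    """Pick the factor that ranks higher; ties keep the first (an assumption)."""
    return b if xfactor_priority(b) > xfactor_priority(a) else a


dose = Factor("Dose", levels=[0, 10, 20])
variety = Factor("Variety", levels=[1, 2, 3, 4], labels=["A", "B", "C", "D"])
chosen = choose_xfactor(variety, dose)
```

Here Dose is chosen because it has no labels, even though Variety has more levels.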
Options:
GRAPHICS, METHOD, XFREPRESENTATION, SAVE.
Parameters:
XFACTOR, GROUPS, MEANS, BAR, NEWXLEVELS.
Method
To set the various defaults (if necessary), AGRAPH uses the GET directive to obtain the ANOVA save structure, AKEEP to obtain the TREATMENT formula, FCLASSIFICATION to obtain the first two factors and GETATTRIBUTE to discover which have labels and so on.
AKAIKEHISTOGRAM procedure
Prints histograms with improved definition of groups
(A. Keen)
Options
CHANNEL
= scalar Channel number of output file; default is the current output file
TITLE
= text General title; default 'Histogram of ...', where ... is the identifier of the structure specified by DATA
LOWER
= scalar Lowest class limit
WIDTH
= scalar Interval width
SCALE
= scalar Number of units represented by each symbol; default 1 (or more if the page width is not sufficient)
Parameters
DATA
= identifiers Data for the histograms (variate, table, factor or matrix)
NOBSERVATIONS
= tables One-way table to save numbers in the groups
GROUPS
= factors Factor to save the groups that are defined, with levels equal to the midpoints of the intervals and labels holding the same values as text
SYMBOLS
= texts Characters to be used to represent the bars of each histogram
DESCRIPTION
= texts Annotation for key
Description
The procedure AKAIKEHISTOGRAM has been designed as an alternative to the Genstat directive HISTOGRAM, for cases where the default settings are not optimal. Such cases may arise because of the following disadvantages of HISTOGRAM:
- The default number of groups equals the square root of the number of observations, irrespective of the shape of the distribution. In some situations (for instance if the number of observations is large) the number of groups is unnecessarily large; in other situations (for instance if the shape of the distribution is complex) the number of groups can be too small. If the number of groups is too large, then differences in numbers of observations between neighbouring classes may be just random fluctuations, while if the number of groups is too small, valuable information is lost.
- Specifying your own class limits (in a variate) can be rather cumbersome, especially if many histograms have to be produced.
AKAIKEHISTOGRAM aims to avoid these disadvantages of HISTOGRAM. By default an "optimal" number of groups is determined using Akaike's Information Criterion.
Alternatively, your own class limits can be specified using options LOWER and WIDTH instead of the option LIMITS of HISTOGRAM. In a FOR loop different values for the lower limit and/or for the interval width can be specified for different quantitative structures. Scalars with missing values can be used to specify default values for these options. Option LOWER is especially important if the observations have a "natural" lower limit, for example the value 0; then 0 is taken as the lower limit of the first group and the first group has the same interval width as the following groups.
The option TITLE and the parameters of HISTOGRAM have been transferred to AKAIKEHISTOGRAM. However, options NGROUPS and LABELS from HISTOGRAM have been omitted, because they are not in line with the style of AKAIKEHISTOGRAM.
Options:
CHANNEL, TITLE, LOWER, WIDTH, SCALE.
Parameters:
DATA, NOBSERVATIONS, GROUPS, SYMBOLS, DESCRIPTION.
Method
The optimality criterion used is Akaike's Information Criterion (AIC), which is twice the difference between the number of free parameters of the model (that is, the number of groups minus 1) and the maximal log likelihood of the observations under the multinomial model. The starting histogram is a histogram with equal-length intervals and more than sufficient groups. From this histogram, new histograms are derived with interval length r times the interval length of the starting histogram, for r = 2, 3 and so on. The "optimal" histogram is the one with minimal AIC. The basic idea for the method is taken from Sakamoto, Ishiguro & Kitagawa (1986); see also Taylor (1987).
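The comparison of pooled histograms can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure's own code: it uses the standard definition AIC = 2k - 2 log L, drops the multinomial coefficient (which is the same for every r), pads the starting histogram with empty cells so that its length is a multiple of r, and treats the r cells pooled into one group as equally probable. The function name `pooled_aic` is hypothetical.

```python
import math


def pooled_aic(counts, r):
    """AIC of the histogram formed by pooling every r adjacent starting cells."""
    n = sum(counts)
    # Pad with empty cells so the number of cells is a multiple of r (assumption).
    padded = list(counts) + [0] * (-len(counts) % r)
    loglik = 0.0
    for start in range(0, len(padded), r):
        pooled = sum(padded[start:start + r])
        if pooled > 0:
            # Each of the r starting cells in a group has probability pooled/(n*r).
            loglik += pooled * math.log(pooled / (n * r))
    n_groups = len(padded) // r
    # Free parameters: number of groups minus 1.
    return 2 * (n_groups - 1) - 2 * loglik


def best_pooling(counts, max_r):
    """Pooling factor r in 1..max_r with the smallest AIC."""
    return min(range(1, max_r + 1), key=lambda r: pooled_aic(counts, r))
```

With perfectly flat counts the criterion favours heavy pooling, whereas sharply peaked counts favour keeping the fine cells.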
The starting histogram is obtained as follows. First the range of the observations is divided into five equal-length intervals, from which the apparent number of observations Na is calculated as five times the number of observations in the interval with the largest frequency. Na is then used as the number of observations instead of the true number, and the number of groups Ng is calculated as five times the number obtained from Sturges' formula (see, for example, Sakamoto, Ishiguro & Kitagawa 1986, page 117):
Ng = 5 × ( 1 + log10( Na/2 ))
The final limits of the starting histogram are obtained by a relatively strong rounding-off of the class limits (comparable with that in HISTOGRAM), where the width is always a multiple of the rounding-off interval.
Action with RESTRICT
The structures in DATA can be restricted, and in different ways; AKAIKEHISTOGRAM uses only those units that are not excluded by their respective restrictions.
References
Sakamoto, Y., Ishiguro, M. & Kitagawa, G. (1986). Akaike Information Statistics. D. Reidel Publishing Company, Dordrecht.
Taylor, C.C. (1987). Akaike's Information Criterion and the Histogram. Biometrika, 74, 636-639.
AKEY procedure
Generates values for treatment factors using the design key method
(R.W. Payne)
Options
BLOCKFACTORS
= factors Defines the block factors for the design; default is to take those in the formula already specified by the BLOCKSTRUCTURE directive, in the order in which they occur there
KEY
= matrix Matrix (number of treatment factors × number of block factors) key for the design
BASEVECTOR
= variate Base vector (length = number of treatment factors) for the design; default is a variate of zeros
ROWPRIMES
= variate Prime numbers for the rows of the KEY matrix
COLPRIMES
= variate Prime numbers for the columns of the KEY matrix
ROWMAPPINGS
= variate Mappings from the rows of the KEY to the TREATMENTFACTORS
COLMAPPINGS
= variate Mappings from the columns of the KEY to the BLOCKFACTORS
Parameter
TREATMENTFACTORS
= factors Defines the treatment factors for the design; default is to take those in the formula already specified by the TREATMENTSTRUCTURE directive, in the order in which they occur there
Description
AKEY generates the values of the block factors, if necessary, in systematic order and then generates the treatment factors from the block factors using a design key. It then allows you to print the design.
The design key method, described by Patterson (1976) and Patterson & Bailey (1978), provides a very flexible way of specifying the allocation of treatments in an experimental design. The method assumes that the units are identified by a set of what are termed "plot" factors. Generally these will be the same factors that are used in the block formula. Thus, in the procedure, they are specified by an option called BLOCKFACTORS which will take the factors from the formula already set by the BLOCKSTRUCTURE directive (outside the procedure) as its default. However, if any of these factors has a non-prime number of levels, it may need to be specified instead as the combination of two or more (pseudo) factors: for example, in a block design with blocks of size eight, the plots might need to be indexed by three factors with two levels (see Example 4). The method can also be used to set up pseudo-factors for use in the treatment formula, and then the "plot" factors may be the treatment factors themselves (Example 3). If these "plot" factors do not already have values, they will be generated in "standard order" using the GENERATE directive.
The factors whose values are to be generated are specified by the TREATMENTFACTORS parameter. Again this can be omitted, and AKEY will take the factors from the existing setting of the TREATMENTSTRUCTURE directive, in the order in which they occur there.
The generated values of the factors can be printed by setting option PRINT=design. The other options define how the values are generated. The KEY option specifies a matrix known as the design key, which indicates how the values of each treatment factor are to be calculated from the plot factors. The matrix has a row for each treatment factor and a column for each plot factor; below, kij represents the element in row i and column j. (This is the transpose of the form used by Patterson 1976, but in Genstat it seems more convenient to specify the treatments by rows.) There is also an option called BASEVECTOR, which can specify a variate with an element bi for each treatment factor to allow the levels of the factor to be shifted cyclically; by default this is a variate of zeros.
The calculation assumes that the values of the plot factors are represented by the integers zero upwards (and AKEY will perform this mapping automatically if necessary). The value q[i]u in unit u of treatment factor i is then given by
q[i]u = ( bi + ki1 × p[1]u + ki2 × p[2]u + ... + kin × p[n]u ) modulo ti
where p[1]u ... p[n]u are the values of the plot factors in unit u, and ti is the number of levels of treatment factor i. The calculated values are integers in the range 0, 1 ... ti-1, but AKEY will again map these to the defined levels if necessary. However, all this takes place behind the scenes, within AKEY. The numbers of levels ti must be prime numbers. They need not all be equal, but the key must be zero in any element where the row and column factors have different numbers of levels: that is, each treatment factor must be generated only from "plot" factors with the same number of levels as the treatment factor itself.
To illustrate the process, the treatments to be allocated (before randomization) to the plots of an N × N Latin square may be calculated as
Latin-factor-value = ( Row-factor-value + Column-factor-value ) modulo N
The values of the extra factor in a Graeco-Latin square can then be formed as
Graeco-factor-value = ( Row-factor-value + 2 × Column-factor-value ) modulo N
The design key thus has rows (1,1) and (1,2); as shown in Example 1, this generates the following 5 × 5 Graeco-Latin square.
Column      0     1     2     3     4
Row
  0        0 0   1 2   2 4   3 1   4 3
  1        1 1   2 3   3 0   4 2   0 4
  2        2 2   3 4   4 1   0 3   1 0
  3        3 3   4 0   0 2   1 4   2 1
  4        4 4   0 1   1 3   2 0   3 2
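The modular arithmetic behind the design key can be checked with a few lines of Python. This is a minimal sketch of the formula given above, not the AKEY implementation; the function name `apply_design_key` and its argument layout are assumptions made for the illustration.

```python
def apply_design_key(key, base, plot_values, n_levels):
    """Treatment-factor values for one unit.

    key[i][j]   : element of the design key (row i = treatment factor i)
    base[i]     : element of the base vector for treatment factor i
    plot_values : values of the plot factors in the unit, coded 0 upwards
    n_levels[i] : number of levels of treatment factor i
    """
    return [
        (b + sum(k * p for k, p in zip(row, plot_values))) % t
        for row, b, t in zip(key, base, n_levels)
    ]


N = 5
key = [[1, 1],   # Latin factor  = Row + Column
       [1, 2]]   # Graeco factor = Row + 2 * Column
square = [
    [tuple(apply_design_key(key, [0, 0], (row, col), [N, N])) for col in range(N)]
    for row in range(N)
]
for cells in square:
    print("   ".join(f"{latin} {graeco}" for latin, graeco in cells))
```

Each printed cell holds the Latin value followed by the Graeco value, so the rows can be compared directly with the square shown above.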
If any of the block or treatment factors has a non-prime number of levels, it must be specified as the combination of two or more (pseudo) factors: for example, in a block design with blocks of size eight, the plots would need to be specified by three factors with two levels (see Example 4). Thus the COLPRIMES option allows you to supply a variate listing the prime numbers for each column of the key, and the COLMAPPINGS option then indicates the "plot" factor corresponding to each column. So, in Example 4, where we have
AKEY [BLOCKFACTORS=Block,Plot; KEY=HRkey; \
COLPRIME=!(4(2)); COLMAP=!(1,2,2,2)]
COLPRIME specifies that the prime for each column is 2, and COLMAP specifies that the first column corresponds to the first "plot" factor (Block in the example) and that columns 2-4 correspond to the second "plot" factor (Plot in the example). The default for COLMAP is a variate containing the integers 1 up to the number of "plot" factors, so it can be omitted if no pseudo-factors are required. If COLPRIME is omitted, the primes for the columns are provided by the numbers of levels of the "plot" factors, as already explained. Options ROWPRIME and ROWMAP similarly allow you to specify pseudo-factors to generate the treatment factors.
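The idea of replacing a factor with eight levels by three two-level pseudo-factors can be illustrated as a mixed-radix decomposition. The Python sketch below is an illustration only: the function name `to_pseudo_factors` is hypothetical, and the digit ordering (the last pseudo-factor varies fastest) is an assumption that may differ from the convention used inside AKEY.

```python
def to_pseudo_factors(value, primes):
    """Split a 0-based level into one digit per prime (mixed-radix digits)."""
    digits = []
    for prime in reversed(primes):
        value, digit = divmod(value, prime)
        digits.append(digit)
    # The last pseudo-factor varies fastest (an assumed ordering).
    return tuple(reversed(digits))


# Eight plots per block, indexed by three pseudo-factors with two levels each.
plot_pseudo = {plot: to_pseudo_factors(plot, [2, 2, 2]) for plot in range(8)}
```

Every plot receives a distinct combination of the three pseudo-factor values, so the pseudo-factors together carry the same information as the original eight-level factor.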
The design key thus provides a very convenient way of defining treatment factors. Patterson & Bailey (1978) show a range of examples of keys, which are used to form the worked examples below. Essentially, the key identifies each factor i with the set of contrasts (in the usual terminology)
and the skill when forming a design is in selecting the best set for each factor. The Genstat design system has a repertoire of keys, and these are used by procedures DESIGN and AGDESIGN to generate a range of designs, including factorials, fractional factorials, Latin squares and lattices.
Options:
Parameter:
TREATMENTFACTORS.
Method
The FCLASSIFICATION and FORMULA directives are used, if necessary, to form lists of factors from the block or treatment formulae. The factor levels are then generated using the standard Genstat facilities for calculations and manipulation.
Action with RESTRICT
If any of the factors is restricted, only the part of the design not excluded by the restriction will be generated.
References
Patterson, H.D. (1976). Generation of factorial designs. J. R. Statist. Soc. B, 38, 175-179.
Patterson, H.D. & Bailey, R.A. (1978). Design keys for factorial experiments. Applied Statistics 27, 335-343.
ALIAS procedure
Finds out information about aliased model terms in analysis of variance
(R.W. Payne)
Options
TREATMENTSTRUCTURE
= formula Treatment model for the design
BLOCKSTRUCTURE
= formula Block model for the design
FACTORIAL
= scalar Value used in the FACTORIAL option of ANOVA if not the default
DESIGN
= pointer Design structure for the analysis
Parameter
TERM
= factors Factors defining the aliased model term
Description
When a term is aliased in an analysis of variance, it is listed in the Information summary (produced by ANOVA [PRINT=information]) under the heading "Aliased model terms" (see the Genstat 5 Reference Manual, Section 9.7.1). However, ANOVA does not indicate the terms with which it is aliased. This information can be obtained using procedure ALIAS.
The aliased term is specified by setting the TERM parameter to the list of factors that define it. The structure of the design can be specified by options BLOCKSTRUCTURE and TREATMENTSTRUCTURE (together with option FACTORIAL, if necessary); alternatively, you can save the design structure from the original analysis and supply this using the DESIGN option - this is the only way of specifying the design if there are weights or if the analysis is restricted. If an undeclared structure is specified for DESIGN (and BLOCKSTRUCTURE and TREATMENTSTRUCTURE are also specified), it will be set to the design structure for the analysis.
Options:
TREATMENTSTRUCTURE, BLOCKSTRUCTURE, FACTORIAL, DESIGN. Parameter: TERM.
Method
The procedure calculates a set of dummy effects for the aliased model term, and then forms and analyses a variate in which only these effects are present. The analysis detects the model terms to which the term is aliased as those that have non-zero sums of squares.
Action with RESTRICT
None of the options or parameters is a vector. To indicate that the analysis is of a restricted set of units, you must use the DESIGN option to specify the design structure from the original analysis.
AMERGE procedure
Merges extra units into an experimental design
(R.W. Payne)
Option
SORT
= string Whether to sort the factors afterwards (no, yes); default no
Parameters
FACTOR
= factors Factors to which the new units are to be added
NEWUNITS
= factors, variates or scalars Extra units to be added to each factor
Description
AMERGE provides a convenient way of adding extra units into an experimental design. In the simplest case, this can be used to add control treatments to an already generated factorial design. More complicated uses may join together two completely different designs, for example a randomized block design to a balanced incomplete block design. These are both illustrated in the example.
The factors of the design which is to be augmented are specified using the FACTOR parameter, and the units that are to be added to each one are specified by the NEWUNITS parameter. The same number of units must be added to every FACTOR, and their levels (and labels) will be extended, if necessary, according to those defined on the units that are added. New units of a factor that are to receive different levels should be specified in a factor or a variate. Alternatively, if every new unit is to receive the same level of the FACTOR, NEWUNITS can be set to a scalar.
The SORT option can be set to yes to request that the FACTOR values are sorted after the new units have been added. Otherwise, they are simply placed at the end of the existing values.
Option:
SORT.
Parameters:
FACTOR, NEWUNITS.
Method
AMERGE uses the standard Genstat manipulation facilities.
Action with RESTRICT
Any restrictions on the vectors are ignored.
ANTMVESTIMATE procedure
Estimates missing values in repeated measurements
(M.G. Kenward & R.W. Payne)
Options
GROUPS
= factor Factor indicating the plot on which each sequence of observations was made
ORDER
= scalar Order of ante-dependence structure (i.e. number of past times for which to adjust)
Parameters
DATA
= variates Observations at each time
NEWDATA
= variates Data variates with missing observations replaced by their estimates
MEANPROFILE
= tables Estimated mean profiles at each time
Description
Suppose that we have a set of experimental units, or plots, within which observations are made in several locations at a sequence of times. Data from some of the locations may be missing at various times. The observed data values are specified in separate variates, one for each time point, using the DATA parameter. The factor identifying the experimental unit on which each observation was made is specified using the GROUPS option.
ANTMVESTIMATE assumes that the data have an ante-dependence (AD(r)) covariance structure whose order can be specified using the ORDER option; if this is not set, ANTMVESTIMATE takes the maximum possible order, which is the number of times minus one. Using this assumption, ANTMVESTIMATE estimates the missing values and calculates the mean profiles for each unit. These can be saved, in tables indexed by the GROUPS factor, using the MEANPROFILE parameter, or printed by setting the PRINT option to meanprofiles. Also, the NEWDATA parameter allows new variates to be saved with the missing values replaced by their estimates.
Options:
PRINT, GROUPS, ORDER. Parameters: DATA, NEWDATA, MEANPROFILE.
Method
The algorithm in the procedure is a first-order approximation to maximum likelihood estimation which has the advantage of requiring only one pass through the data. At each time point, current plot means are estimated using the equations of maximum likelihood under an AD(r) covariance structure. The calculations required are simply those of analysis of covariance with previous individual measurements as covariates. Where previous measurements are missing they are replaced by previously estimated mean values and if there are no previous missing values the estimated plot means are full maximum likelihood estimates. The procedure uses a single pass through the time points. If the whole cycle were iterated to convergence joint maximum likelihood estimates of all the plot means would be obtained. Full details are given by Kenward (1994).
Action with RESTRICT
Any restriction on the data variates will be cancelled and a warning printed.
Reference
Kenward, M.G. (1994). The estimation of mean plot profiles and the identification of atypical plots using incomplete sequences of porous cup nitrate levels. Rothamsted Technical Report written for ADAS Biometric Unit, Cheltenham.
ANTORDER procedure
Assesses order of ante-dependence for repeated measures data
(M.S. Ridout & R.W. Payne)
Options
TREATMENTSTRUCTURE
= formula Treatment formula for the model at each time; if this is not set, the default is taken from the setting (which must already have been defined) of the TREATMENTSTRUCTURE directive
BLOCKSTRUCTURE
= formula Block formula for the model at each time; if this is not set, the default is taken from any existing setting specified by the BLOCKSTRUCTURE directive and if neither has been set the design is assumed to be unstratified (i.e. to have a single error term)
MAXORDER
= scalar Maximum order against which to test; default is maximum possible order
FACTORIAL
= scalar Limit on the number of factors in a treatment term
Parameter
DATA
= variates Data variates (observed at successive times) for an analysis
Description
A repeated measures experiment is one in which the same set of units, or subjects, is observed at a sequence of times to investigate treatment effects over a period of time. The set of variates observed at the successive times is said to have an ante-dependence structure of order r if each ith variate (i>r), given the preceding r, is independent of all further preceding variates (Gabriel 1961, 1962). Procedure ANTORDER calculates statistics to assist in the selection of an appropriate order of ante-dependence structure for sets of repeated measures data, using the method of Kenward (1987). Once the order of ante-dependence structure has been established, the variates can be analysed individually by analysis of covariance, adjusting for the r previous variates, to assess the times at which treatment effects occurred. Also, procedure ANTTEST can be used to perform overall tests of treatment effects.
The data variates are supplied, as a list, by the DATA parameter. The variates may contain missing values, provided these are "dropouts": that is, once a unit becomes missing it should remain missing at all subsequent times.
The model for the design is specified by options of the procedure. TREATMENTSTRUCTURE specifies a model formula to define the treatment terms in the analysis; if this is unset, ANTORDER will use the model already defined by the TREATMENTSTRUCTURE directive, or will fail if that too has not been set. BLOCKSTRUCTURE defines the underlying structure of the design, and ANTORDER will use the model (if any) previously defined by the BLOCKSTRUCTURE directive if this is not set; these can both be omitted if there is only one error term (i.e. if the design is unstratified). Option MAXORDER specifies the maximum order of ante-dependence structure to be tested; by default, this is taken as the maximum possible order (the smaller of the number of times minus one and the number of residual degrees of freedom at each time; see Kenward 1987). The FACTORIAL option can be used to set a limit on the number of factors in the terms generated from the treatment formula.
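Two of the conditions described above, the "dropout" pattern of missing values and the default maximum order, are simple enough to check before running an analysis. The Python sketch below is an illustration only: the function names are hypothetical, missing values are represented by None, and each sequence holds the values of one unit across the successive DATA variates.

```python
def is_dropout_pattern(series):
    """True if, once a value is missing (None), all later values are missing."""
    seen_missing = False
    for value in series:
        if value is None:
            seen_missing = True
        elif seen_missing:
            # An observed value after a missing one is not a dropout pattern.
            return False
    return True


def max_antedependence_order(n_times, residual_df):
    """Smaller of the number of times minus one and the residual d.f."""
    return min(n_times - 1, residual_df)


units = [
    [4.1, 4.8, 5.3, 6.0],     # complete
    [3.9, 4.4, None, None],   # dropout after the second time
    [4.2, None, 5.1, 5.8],    # intermittent missing value
]
acceptable = [is_dropout_pattern(unit) for unit in units]
```

In this example the third unit has a missing value followed by observed values, so it does not satisfy the dropout condition.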
Options:
TREATMENTSTRUCTURE, BLOCKSTRUCTURE, MAXORDER, FACTORIAL.
Parameter:
DATA.
Method
The procedure uses the method of Kenward (1987) to calculate the statistics using residual sums of squares from analysis of covariance. For further details of ante-dependence see Gabriel (1961, 1962).
Action with RESTRICT
Any restriction on the DATA variates will be applied to all of them.
References
Gabriel, K.R. (1961). The model of ante-dependence for data of biological growth. Bulletin Institut International Statistique (Paris), 39, 253-264, (33rd session).
Gabriel, K.R. (1962). Ante-dependence analysis of an ordered set of variables. Ann. Math. Stat., 33, 201-212.
Kenward, M.G. (1987). A method for comparing profiles of repeated measurements. Applied Statistics, 36, 296-308.
ANTTEST procedure
Calculates overall tests based on a specified order of ante-dependence
(R.W. Payne & M.S. Ridout)
Options
TREATMENTSTRUCTURE
= formula Treatment formula for the model at each time; if this is not set, the default is taken from the setting (which must already have been defined) of the TREATMENTSTRUCTURE directive
BLOCKSTRUCTURE
= formula Block formula for the model at each time; if this is not set, the default is taken from any existing setting specified by the BLOCKSTRUCTURE directive and if neither has been set the design is assumed to be unstratified (i.e. to have a single error term)
ORDER
= scalar Number of past times for which to adjust; default is maximum possible order
FACTORIAL
= scalar Limit on the number of factors in a treatment term
Parameter
DATA
= variates Data variates (observed at successive times) for an analysis
Description
A repeated measures experiment is one in which the same set of units, or subjects, is observed at a sequence of times to investigate treatment effects over a period of time. The set of variates observed at the successive times is said to have an ante-dependence structure of order r if each ith variate (i>r), given the preceding r, is independent of all further preceding variates (Gabriel 1961, 1962). Procedure ANTTEST calculates overall tests of treatment terms based on a specified order of ante-dependence structure (see Kenward 1987).
The data variates are supplied, as a list, by the DATA parameter. The variates may contain missing values, provided these are "dropouts": that is, once a unit becomes missing it should remain missing at all subsequent times.
The model for the design is specified by options of the procedure. TREATMENTSTRUCTURE specifies a model formula to define the treatment terms in the analysis; if this is unset, ANTTEST will use the model already defined by the TREATMENTSTRUCTURE directive, or will fail if that too has not been set. BLOCKSTRUCTURE defines the underlying structure of the design, and ANTTEST will use the model (if any) previously defined by the BLOCKSTRUCTURE directive if this is not set; these can both be omitted if there is only one error term (i.e. if the design is unstratified). Option ORDER specifies the order of ante-dependence structure to be assumed for the tests; by default, this is taken as the maximum possible order (the smaller of the number of times minus one and the number of residual degrees of freedom at each time). A suitable order can be established using the ANTORDER procedure. The FACTORIAL option can be used to set a limit on the number of factors in the terms generated from the treatment formula.
Options:
TREATMENTSTRUCTURE, BLOCKSTRUCTURE, ORDER, FACTORIAL.
Parameter:
DATA.
Method
The procedure uses the method of Kenward (1987) to calculate the statistics using residual sums of squares from analysis of covariance. For further details of ante-dependence see Gabriel (1961, 1962).
Action with RESTRICT
Any restriction on the DATA variates will be applied to all of them.
References
Gabriel, K.R. (1961). The model of ante-dependence for data of biological growth. Bulletin Institut International Statistique (Paris), 39, 253-264, (33rd session).
Gabriel, K.R. (1962). Ante-dependence analysis of an ordered set of variables. Ann. Math. Stat., 33, 201-212.
Kenward, M.G. (1987). A method for comparing profiles of repeated measurements. Applied Statistics, 36, 296-308.
AONEWAY procedure
Provides one-way analysis of variance for inexperienced users
(R.W. Payne)
Options
GROUPS
= variate, text or factor Defines the groups for the analysis (which, for example, have been given different treatments)
HOMOGENEITY
= string Whether or not to check homogeneity of variances (no, yes); default no
PLOT
= strings Which residual plots to provide (fittedvalues, normal, halfnormal, histogram); default * i.e. none
EXPLAIN
= string Whether or not to print an explanation of the Genstat instructions (no, yes); default no
Parameter
Y
= variates Each of these contains the data values for an analysis
Description
This procedure provides one-way analysis of variance, avoiding the technicalities of the Genstat analysis-of-variance directives, but with extra options for residual plots and to check homogeneity of variances. Also, by setting option EXPLAIN=yes, the procedure can be requested to print an explanation of the statements required for the analysis, helping users to learn how to run the analysis directly in Genstat, and go beyond the facilities in the procedure; see the Genstat 5 Release 3 Reference Manual, Chapter 9, or Genstat 5: an Introduction, Chapter 7. This illustrates the sort of procedures that might be provided to assist inexperienced users.
The data values to be analysed are specified by the parameter of the procedure, in a variate. The groups to be compared are indicated using the GROUPS option. The easiest way of doing this is to give either a variate or a text, each of whose distinct values represents one of the groups; a more efficient method is to use a factor.
Printed output from the analysis of variance is requested by listing the required components in the PRINT option. The relevant possibilities are: aovtable to print the analysis-of-variance table, means to print the table of means, effects to print the effects (means minus grand mean), %cv to print the coefficient of variation, and missingvalues to print estimates for missing values (if any).
Homogeneity of the variances within the different groups can be tested by setting option HOMOGENEITY=yes, and settings of the PLOT option allow various residual plots to be requested: fittedvalues for a plot of residuals against fitted values, normal for a Normal plot, halfnormal for a half-Normal plot, and histogram for a histogram of residuals.
Residuals and fitted values cannot be saved using the procedure, but the directive AKEEP can be used to get those from the analysis of the last data variate to be analysed by the procedure. For example,
AKEEP [RESIDUALS=Res; FITTEDVALUES=Fit]
saves the residuals in a variate called Res and the fitted values in a variate called Fit (Genstat 5 Release 3 Reference Manual, page 507).
Options:
PRINT, GROUPS, HOMOGENEITY, PLOT, EXPLAIN. Parameter: Y.
Method
If the setting of the GROUPS option is a variate or a text, the GROUPS directive is used to form a factor, with one level for each distinct value. The factor is then used to define a treatment model with a single factor, for example
TREATMENTS GROUPS
(Genstat 5 Release 3 Reference Manual, page 466). The data variate is then analysed using ANOVA (page 469), for example
ANOVA Y
Homogeneity of variance can be tested using procedure VHOMOGENEITY with option GROUPS set to the factor and parameter DATA to the y-variate. Plots of residuals are provided by procedure APLOT, for example
APLOT fittedvalues,normal,halfnormal,histogram
Action with RESTRICT
If the Y variate is restricted, only the units not excluded by the restriction will be analysed.
APLOT procedure
Plots residuals from an ANOVA analysis
(A.D. Todd)
Option
SAVE
= ANOVA save structure Specifies the analysis from which the residuals and fitted values are to be taken; by default they are taken from the most recent ANOVA
Parameter
METHOD
= strings Type of graph to be plotted (fittedvalues, normal, halfnormal, histogram); default fitt
Description
Procedure APLOT provides four types of plots of residuals from an ANOVA analysis. These are selected using the METHOD parameter with settings: fittedvalues for residuals versus fitted values, normal for a Normal plot, halfnormal for a half-Normal plot, and histogram for a histogram of residuals.
The residuals and fitted values are accessed automatically from the structure specified by the SAVE option. If the SAVE option is not set, they are taken from the SAVE structure of the last y-variate to have been analysed by ANOVA.
Option:
SAVE. Parameter: METHOD.
Method
Residuals and fitted values are accessed, using AKEEP, from the latest ANOVA, or from that specified by the SAVE option. For a Normal plot, the Normal quantiles are calculated as follows:
qi = NED( (i-0.375) / (n+0.25) )
while for a half-Normal plot they are given by
qi = NED( 0.5 + 0.5 × (i-0.375) / (n+0.25) )
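The two quantile formulae above can be sketched in Python as follows. This is only an illustration, not the procedure's own code: it assumes that NED is the inverse of the standard Normal cumulative distribution function, and that the sorted residuals (or sorted absolute residuals for the half-Normal plot) are the values plotted against qi. The function name normal_plot_quantiles is hypothetical.

```python
from statistics import NormalDist


def normal_plot_quantiles(n, half=False):
    """Return q_1..q_n for a Normal plot, or a half-Normal plot if half=True.

    Assumption: NED is the inverse standard Normal distribution function.
    """
    ned = NormalDist().inv_cdf
    quantiles = []
    for i in range(1, n + 1):
        p = (i - 0.375) / (n + 0.25)
        if half:
            # Move the probability into the upper half of the distribution.
            p = 0.5 + 0.5 * p
        quantiles.append(ned(p))
    return quantiles


residuals = [0.4, -1.2, 0.1, 2.3, -0.6]
n = len(residuals)

# Normal plot: sorted residuals paired with q_i.
normal_pairs = list(zip(normal_plot_quantiles(n), sorted(residuals)))

# Half-Normal plot: sorted absolute residuals paired with the half-Normal q_i.
half_pairs = list(zip(normal_plot_quantiles(n, half=True),
                      sorted(abs(r) for r in residuals)))
```

With an odd number of residuals the middle Normal quantile is zero, and all half-Normal quantiles are positive because the probabilities lie strictly between 0.5 and 1.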
Action with RESTRICT
If the y-variate in the ANOVA was restricted, only the units not excluded by the restriction will be included in the graphs.
APPEND procedure
Appends a list of vectors of the same type
(R.W. Payne)
Options
NEWVECTOR
= vector Vector to store the appended values; by default uses the first vector of the OLDVECTOR list
FREPRESENTATION
= string How to match the values of old factors (levels, labels, ordinals, renumbered); default leve
GROUPS
= factor Factor to represent the vector to which each unit originally belonged
Parameter
OLDVECTOR
= vectors Vectors whose values are to be appended together
Description
APPEND provides a convenient way of putting the values from several vectors into a single vector. The new vector will contain all the values of the first vector, then all those from the second vector, and so on. The vectors can thus contain different numbers of values but they must all be of the same type: all variates, all factors or all texts.
The vectors whose values are to be appended together are specified by the OLDVECTOR parameter, and the NEWVECTOR option supplies the vector to store the appended values. If NEWVECTOR is omitted, the values are placed into the first OLDVECTOR.
For factors, the FREPRESENTATION option indicates how the levels are to be matched amongst the old vectors. If this is set to labels and the levels of the old factors are compatible (that is, if each label corresponds to the same level in all the old factors), then the level definitions are transferred to the new factor; if not, the levels are defined to be the default values 1, 2... and a warning is printed. Similarly, with the default setting FREPRESENTATION=levels, the labels are retained if they are compatible, but no warning is printed if they are not. For FREPRESENTATION=ordinals, the levels of all the factors are taken as the ordinal values 1, 2... (and no labels are defined). Finally, the renumbered setting assumes that the old factors all have independent sets of levels, and renumbers these from one upwards for the first factor, from number of levels of the first factor plus one upwards for the second factor, and so on; the new factor will thus have a different level for every level of the original factors.
The GROUPS option allows a factor to be formed indicating the OLDVECTOR to which each unit of the appended vector originally belonged. The levels are labelled by the identifier of the corresponding OLDVECTOR. This factor could be used in the CONDITION option of the SUBSET procedure subsequently to recover the original vectors.
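As a rough illustration of the behaviour described above, the following Python sketch appends several lists, builds a groups list playing the role of the GROUPS factor, and mimics the renumbered setting for factors held as ordinal values 1..k. The function names and the list representation are assumptions made for the example; they are not part of Genstat.

```python
def append_vectors(old_vectors, names):
    """Append lists end to end and record which list each unit came from."""
    new_vector, groups = [], []
    for name, values in zip(names, old_vectors):
        new_vector.extend(values)
        groups.extend([name] * len(values))
    return new_vector, groups


def renumber_levels(old_factors, level_counts):
    """Mimic FREPRESENTATION=renumbered for factors stored as ordinals 1..k.

    Levels of the second factor start at (levels of first factor + 1), etc.
    """
    new_factor, offset = [], 0
    for values, nlevels in zip(old_factors, level_counts):
        new_factor.extend(v + offset for v in values)
        offset += nlevels
    return new_factor


v, g = append_vectors([[1.5, 2.0, 3.5], [4.0, 5.5]], ["V1", "V2"])

# The groups list is enough to recover an original vector afterwards.
recovered_v2 = [x for x, name in zip(v, g) if name == "V2"]

f = renumber_levels([[1, 2, 2], [1, 3]], level_counts=[2, 3])
```

Here the second factor has three levels, so its levels 1 and 3 become 3 and 5 in the appended factor.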
Options:
NEWVECTOR, FREPRESENTATION, GROUPS. Parameter: OLDVECTOR.
Method
APPEND defines the lengths and all other relevant attributes of the NEWVECTOR and then uses EQUATE to transfer the values.
Action with RESTRICT
Any restrictions on the vectors are ignored.
APRODUCT procedure
Forms a new experimental design from the product of two designs
(R.W. Payne)
Options
ANALYSE
= string Whether to analyse the design by ANOVA (yes, no); default no
METHOD
= string How to combine the designs (cross, nest); default nest
BF1
= formula Block formula for design 1
TF1
= formula Treatment formula for design 1
BF2
= formula Block formula for design 2
TF2
= formula Treatment formula for design 2
No parameters
Description
APRODUCT forms an experimental design by taking the product of two other designs. The METHOD option controls whether the product is formed by nesting the second design within the first, or by crossing the two designs together. For example, suppose that the first design has a single factor Units in the block structure and a single treatment factor A, while the second design is a Latin square with block structure Rows*Columns and treatment factor B. If we nest the second design within the first, we would obtain a design with block structure Units/(Rows*Columns) in which each unit of the first design has been subdivided into a row by column array of subplots to contain a Latin square of the sort defined by the second design. Nesting is thus useful when you want to subdivide the units of a design and apply further treatments (in this case those defined by the factor B) to the resulting subplots. Similarly, if we cross the two designs, the new design will have a block structure of Units*(Rows*Columns), or Units*Rows*Columns, in which we have duplicated the second design for every level of Units. Crossing is useful if you need to introduce a new blocking structure into an existing design. For example, the Units factor might represent different time periods or different locations in which the Latin square design was to be used, and the factor A the different systematic conditions that might apply on each occasion.
With both nesting and crossing, the new design will contain a unit for every combination of the block factors in the two original designs, and so every combination of the treatment factors in the first design will occur with every combination of the treatment factors in the second design. The treatment structure is thus defined for the new design by crossing the treatment structures of the two original designs, to estimate all the original treatment terms and their interactions. So, in the example above, the treatment structure is defined to be A*B.
APRODUCT redefines the values of the factors as required for the new design, and executes BLOCKSTRUCTURE and TREATMENTSTRUCTURE directives with the new block and treatment formulae. The new formulae can then be accessed, outside the procedure, using the GET directive or procedure ASTATUS. The PRINT option can be set to design to print the new design, and the ANALYSE option can be set to yes to produce a skeleton analysis of variance from ANOVA. Options BF1, TF1, BF2, and TF2 define the block structure and treatment structure of the first and then the second design.
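The unit structure of the product design in the example can be sketched in Python. This is an illustration only: the allocation of A to the units and the particular 3 x 3 Latin square are hypothetical, and the sketch shows just the factor values, which contain one unit for every combination of the block factors whether the designs are nested or crossed (only the block formula differs).

```python
from itertools import product

# Design 1: block factor Units with treatment A on each unit (hypothetical).
units = [1, 2, 3]
a_of_unit = {1: "a1", 2: "a2", 3: "a3"}

# Design 2: a 3 x 3 Latin square; treatment B is indexed by (row, column).
latin = [["b1", "b2", "b3"],
         ["b2", "b3", "b1"],
         ["b3", "b1", "b2"]]

# Product design: one unit for every Units x Rows x Columns combination.
design = []
for unit, row, col in product(units, range(1, 4), range(1, 4)):
    design.append({
        "Units": unit,
        "Rows": row,
        "Columns": col,
        "A": a_of_unit[unit],            # A is constant within a unit
        "B": latin[row - 1][col - 1],    # B follows the Latin square
    })

# Every level of A meets every level of B, so A*B can be estimated.
ab_combinations = {(u["A"], u["B"]) for u in design}
```

The resulting design has 27 units and all 9 combinations of A and B, each occurring three times.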
Options:
PRINT, ANALYSE, METHOD, BF1, TF1, BF2, TF2. Parameters: none.
Method
APRODUCT uses the standard Genstat manipulation directives such as FCLASSIFICATION, CALCULATE and DUPLICATE.
Action with RESTRICT
None of the factors should be restricted; any existing restrictions will be cancelled.
ARANDOMIZE procedure
Randomizes and prints an experimental design
(R.W. Payne)
Options
BLOCKSTRUCTURE
= formula Defines the block factors according to which the randomization is to be carried out; default takes the existing specification as defined by the BLOCKSTRUCTURE directive
EXCLUDE
= factors (Block) factors whose levels are not to be randomized
SEED
= scalar Seed to generate the random numbers used to define the randomization; default 0
LPERMUTE
= string Whether to randomly permute treatment factor levels (no, yes); default no
Parameters
OLDVECTOR
= factors or variates Vectors whose values are to be randomized; default is to use the factors occurring in the formula (if any) specified by the most recent TREATMENTSTRUCTURE directive
NEWVECTOR
= factors or variates Vectors to store the randomized values; by default these overwrite the values in the original vectors
Description
ARANDOMIZE provides a convenient way of randomizing the treatment allocations in an experimental design. It has several advantages over the RANDOMIZE directive (which is used inside the procedure).
First of all, the BLOCKSTRUCTURE option, which (as in RANDOMIZE) specifies the block model formula to indicate how the randomization is to take place, will use any setting that has already been defined by the BLOCKSTRUCTURE directive as its default. Moreover, the formula need not index all the units of the design, as would be required by RANDOMIZE; if necessary ARANDOMIZE will set up an extra factor _units_ similar to the factor *units* used by ANOVA.
ARANDOMIZE allows the original (unrandomized) values to be retained. There are two parameters: OLDVECTOR to specify the factors or variates to be randomized, and NEWVECTOR to allow new structures to be supplied to store the randomized values. If no NEWVECTOR is specified, the randomized values replace the original values of the corresponding OLDVECTOR. By default, OLDVECTOR is assumed to contain the list of factors in the model formula (if any) specified by the previous TREATMENTSTRUCTURE directive.
The levels of the treatment factors can be randomized by setting option LPERMUTE=yes; ARANDOMIZE then randomly permutes the numbering of the levels of each treatment factor on the units of the design. There is also a PRINT option which can be set to design to print the design. The other two options, EXCLUDE and SEED, are as in RANDOMIZE. EXCLUDE lists block factors whose levels are not to be permuted during the randomization; for example the period factor might need to be excluded in the randomization of a trial to study carry-over effects. SEED defines the seed used to generate the random numbers used for the randomization; the default of 0 ensures that a seed will be chosen at random if SEED is not set.
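The idea of randomizing according to a block formula can be sketched for the simplest nested case, a Blocks/Plots structure. The Python below is an illustration under stated assumptions, not a description of how RANDOMIZE works internally: the function name is hypothetical, seed=None stands in for the "seed chosen at random" behaviour of SEED=0, and exclude_blocks=True stands in for listing the block factor in EXCLUDE.

```python
import random


def randomize_blocks(treatments_by_block, seed=None, exclude_blocks=False):
    """Randomize a Blocks/Plots design held as a list of lists.

    treatments_by_block: systematic layout, one inner list per block.
    seed: None lets Python pick the seed; an integer gives a repeatable result.
    exclude_blocks: if True, the order of the blocks is left unchanged.
    """
    rng = random.Random(seed)
    # Copy so that the original (unrandomized) layout is retained.
    blocks = [list(block) for block in treatments_by_block]
    if not exclude_blocks:
        rng.shuffle(blocks)          # permute the blocks
    for block in blocks:
        rng.shuffle(block)           # permute the plots within each block
    return blocks


layout = [["A", "B", "C"], ["A", "B", "C"], ["A", "B", "C"]]
randomized = randomize_blocks(layout, seed=12345)
```

Each block of the randomized design still contains every treatment once, and the original layout is left untouched, much as when a separate NEWVECTOR is supplied.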
Options:
PRINT, BLOCKSTRUCTURE, EXCLUDE, SEED, LPERMUTE.
Parameters:
OLDVECTOR, NEWVECTOR.
Method
The GET directive is used to access any existing settings defined by the BLOCKSTRUCTURE or TREATMENTSTRUCTURE directives. AFUNITS, if necessary, forms the extra _units_ factor, and DUPLICATE generates new copies of the original vectors, if these are to be kept, before RANDOMIZE is used to produce the randomized values. Finally, if required, PDESIGN is used to print the design.
Action with RESTRICT
RESTRICT can be used, as usual, to restrict the set of units to be randomized.
AREPMEASURES procedure
Produces an analysis of variance for repeated measurements
(R.W. Payne)
Options
APRINT
= strings Printed output from the analysis of variance (as for the ANOVA PRINT option); default *
TREATMENTSTRUCTURE
= formula Defines the treatments given to the subjects; if this is not set, the default is taken from any existing setting defined by the TREATMENTSTRUCTURE directive
BLOCKSTRUCTURE
= formula Defines any block structure over the subjects; if this is not set, the default is taken from any existing setting defined by the BLOCKSTRUCTURE directive
COVARIATE
= variates Specifies any covariates on the subjects; if this is not set, the default is taken from any existing setting defined by the COVARIATE directive
FACTORIAL
= scalar Limit on the number of factors in the terms generated from the TREATMENTSTRUCTURE formula
Parameter
DATA
= variates List of variates, one for each time, containing the data observations
Description
A repeated-measures design is one in which subjects (animals, people, plots, etc) are observed several times. Each subject receives a randomly allocated treatment, either at the outset, or repeatedly through the experiment. The subjects are observed at successive occasions to see how the treatment effects develop.
The design might thus seem analogous to a split-plot design, with subjects corresponding to whole plots, and the occasions of observation to the sub-plots. There are, however, some important differences between the two situations. With repeated measurements, there is likely to be a greater correlation between observations that are made at adjacent time points than between those that are more widely spaced. Furthermore, the Times factor cannot, by its very nature, be allocated at random to the occasions within subjects. In the customary split-plot situation we can usually assume that there is an equal correlation between the sub-plots of each whole plot and, even if this were not so, the sub-plot treatment should have been allocated at random to the sub-plots within each whole plot. The formal conditions for the validity of the split-plot analysis will be discussed in more detail below, together with advice on how to proceed if they do not hold.
It is worth pointing out first, though, that this problem affects only the Subjects.Times stratum. The Subjects stratum contains an analysis of variance of the measurements totalled over the times for each subject, and this part of the analysis will be valid whatever the within-subject correlation structure. A further point is that, when measurements are taken on only two occasions, the analysis in the Subjects.Times stratum will also be valid; there can then be only one within-subject correlation, and the analysis in the Subjects.Times stratum is of the difference between the observations at time 2 and time 1 on each subject.
Another potential problem arising from the systematic nature of the Times factor is that effects arising from the "length of treatment time" will be confounded with any effects arising from the duration of the experiment, such as age of subject (which may be important with short-lived material such as aphids), season of year, time of day, and so on. This does not affect the validity of the analysis, and some of the confusion may be capable of being unravelled by running the experiment during more than one period. Nevertheless, care needs to be taken in drawing conclusions about time-effects.
The Subjects.Times information, describing the way in which the treatment effects change differentially with time, is often the aspect of most interest in the study. The formal requirement for the validity of the analysis in the sub-plot stratum of a split-plot design is that all the normalised contrasts in that stratum have an equal variance. The only practical arrangement of covariances between times that satisfies this condition would have a single variance down the diagonal and a single covariance off-diagonal. This pattern is known as a uniform covariance structure or, equivalently, the matrix is said to show compound symmetry; Box (1950) describes how this can be tested. In the usual split-plot analysis, the Subjects.Times sum of squares is assumed to be distributed as σ² × χ²_r where σ² is a constant and χ²_r has a chi-square distribution on r degrees of freedom. Similarly, under the assumption that there is no Treatments.Times interaction, the Treatments.Times sum of squares is assumed to be distributed as σ² × χ²_t where χ²_t has a chi-square distribution on t degrees of freedom. If the variance-covariance structure does not exhibit compound symmetry, it is possible to show that the distributions can still be approximated by chi-square distributions, but the degrees of freedom are instead epsilon × r and epsilon × t.
The correction factor epsilon lies between one, which would give the ordinary split-plot analysis, and 1/(number of times minus one), which would leave just one degree of freedom within each subject (remember that when there are only two observations on each subject, and thus just one within-subject degree of freedom, the analysis is valid); epsilon can be estimated by maximum likelihood, as described by Greenhouse & Geisser (1959).
The printing of information about the covariances is controlled by the strings listed for the PRINT option: vcovariance to print the variance-covariance matrix, correlation to print the correlation matrix, epsilon to print the Greenhouse-Geisser epsilon, and test to print the test for compound symmetry.
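To make the role of epsilon concrete, the sketch below computes it from a covariance matrix between times using the usual double-centring form of the Box / Greenhouse-Geisser formula. It is a minimal illustration, assuming the matrix is supplied as a list of lists; the function name is hypothetical and this is not claimed to be the procedure's own code.

```python
def greenhouse_geisser_epsilon(cov):
    """Epsilon for a p x p covariance matrix between times (list of lists).

    The matrix is double-centred (row and column means removed); epsilon is
    trace^2 / ((p - 1) * sum of squared elements) of the centred matrix.
    """
    p = len(cov)
    row_means = [sum(row) / p for row in cov]
    col_means = [sum(cov[i][j] for i in range(p)) / p for j in range(p)]
    grand_mean = sum(row_means) / p
    centred = [[cov[i][j] - row_means[i] - col_means[j] + grand_mean
                for j in range(p)] for i in range(p)]
    trace = sum(centred[i][i] for i in range(p))
    sum_squares = sum(x * x for row in centred for x in row)
    return trace ** 2 / ((p - 1) * sum_squares)


# Compound symmetry: one variance on the diagonal, one covariance elsewhere.
uniform = [[2.0, 1.0, 1.0],
           [1.0, 2.0, 1.0],
           [1.0, 1.0, 2.0]]
eps_uniform = greenhouse_geisser_epsilon(uniform)

# An extreme case in which all within-subject variation lies in one contrast.
extreme = [[1.0, 0.0, -1.0],
           [0.0, 0.0, 0.0],
           [-1.0, 0.0, 1.0]]
eps_extreme = greenhouse_geisser_epsilon(extreme)
```

The first matrix gives epsilon equal to one (the ordinary split-plot analysis) and the second gives the lower bound 1/(number of times minus one), here 0.5, matching the limits quoted in the text.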
The output from the analysis of variance is controlled by the APRINT option, with settings identical to those in the PRINT option of the ANOVA directive.
The treatments applied to the subjects can be specified (as a model formula) using the TREATMENTSTRUCTURE option, the block structure (if any) on the subjects can be specified by the BLOCKSTRUCTURE option, and the COVARIATE option can be used to list any covariates. If any of these options is unset, the default is taken from any existing setting defined by the directives TREATMENTSTRUCTURE, BLOCKSTRUCTURE or COVARIATE, respectively. The FACTORIAL option can be used to set a limit on the number of factors in the terms generated from the TREATMENTSTRUCTURE option.
The observed data for the procedure should be specified in a set of variates, each one containing the measurements made on the subjects at one of the occasions on which they were observed, and input using the DATA parameter.
Options: PRINT, APRINT, TREATMENTSTRUCTURE, BLOCKSTRUCTURE, COVARIATE, FACTORIAL.
Parameter: DATA.
Method
The procedure uses the standard Genstat directives for calculations and manipulation to obtain the various matrices and tests. Formulae for these are given by Box (1950), Winer (1962, pages 523 and 594-599, although note that eqn {1} on page 595 should contain N′ and n′_i, not N and n_i), and Greenhouse & Geisser (1959). It then extends the factors and covariates, temporarily, to length number of subjects × number of times in order to produce the analysis of variance.
Action with RESTRICT
The procedure does not allow for restrictions, and will cancel any that have been applied.
References
Box, G.E.P. (1950). Problems in the analysis of growth and wear curves. Biometrics, 6, 362-389.
Greenhouse, S.W. & Geisser, S. (1959). On methods in the analysis of profile data. Psychometrika, 24, 95-112.
Winer, B.J. (1962). Statistical Principles in Experimental Design (2nd edition). McGraw-Hill, New York.
ASTATUS procedure
Provides information about the settings of ANOVA models and variates
(R.W. Payne)
Option
Parameters
Y
= pointers Pointer of length 1 to save the identifier of the y-variate of the most recent ANOVA or that used to form INSAVE
TREATMENTSTRUCTURE
= formulae Saves the current setting of TREATMENTSTRUCTURE or the setting used to form INSAVE
BLOCKSTRUCTURE
= formulae Saves the current setting of BLOCKSTRUCTURE or the setting used to form INSAVE
COVARIATE
= pointers Saves the current COVARIATE setting or the setting used to form INSAVE
SAVE
= asave structures Saves the save structure from the most recent ANOVA
INSAVE
= asave structures Provides a save structure from which to save Y, TREATMENTSTRUCTURE, BLOCKSTRUCTURE and COVARIATE; default * uses the current settings
Description
ASTATUS allows information to be printed and saved about the model settings and other information involved in an ANOVA analysis.
By default ASTATUS prints the current settings defined by the directives TREATMENTSTRUCTURE, BLOCKSTRUCTURE and COVARIATE. This is governed by the default setting, model, of the PRINT option. The other setting y prints the name of the y-variate from the most recent ANOVA. Alternatively, if the INSAVE parameter is set to the save structure from an ANOVA analysis, the y-variate and model settings will be those used to form the save structure.
If the INSAVE parameter is not set, the Y parameter can be used to save the identifier of the y-variate most recently analysed by ANOVA, in a pointer of length one. The TREATMENTSTRUCTURE parameter saves the current setting defined by the TREATMENTSTRUCTURE directive (in a formula structure), and the BLOCKSTRUCTURE parameter similarly saves the current setting defined by the BLOCKSTRUCTURE directive. The COVARIATE parameter saves the current setting defined by the COVARIATE directive (in a pointer). Alternatively, if INSAVE is set to an ANOVA save structure, the parameters Y, TREATMENTSTRUCTURE, BLOCKSTRUCTURE and COVARIATE save the settings used to form INSAVE.
The SAVE parameter saves the save structure from the most recent ANOVA (regardless of the setting of INSAVE).
Options:
PRINT.
Parameters:
Y, TREATMENTSTRUCTURE, BLOCKSTRUCTURE, COVARIATE, SAVE, INSAVE.
Method
ASTATUS uses the GET directive to obtain the current settings of BLOCKSTRUCTURE, TREATMENTSTRUCTURE and COVARIATE, and the save structure from the most recent ANOVA. It uses specialist knowledge of the ANOVA save structure to obtain information from a save structure supplied using the INSAVE parameter.
ASWEEP procedure
Performs sweeps for model terms in an analysis of variance
(R.W. Payne)
Options
TERM
= formula Model term (or terms) involved in the sweep (this need not be specified if EMETHOD=calculated); default is to sweep for the grand mean
EFFICIENCY
= scalar Efficiency factor of the term(s)
EMETHOD
= string Source of the effects (calculated, given); default calc
RMETHOD
= string Method to be used to obtain the residual variate (subtract, replace); default subt
Parameters
Y
= variate Working variates to be swept
EFFECTS
= table Estimated effects
RESIDUALS
= variate New working variates, following the sweep
SS
= scalars Sum of squares due to the term(s)
RSS
= scalars Sum of squares of the working variate after the sweep
Description
The analysis-of-variance algorithm in the Genstat ANOVA directive involves a series of sweep operations performed on a working variate which initially contains the data values and finally contains the residuals. Sweeps may have two parts. The first involves the estimation of the effects of a particular term. For a term that is orthogonal to the terms that precede it in the model, the effects are estimated simply by the tables of means for that term, calculated from the working variate; for non-orthogonal terms, the effects are the means divided by an efficiency factor. In the second part, the working variate is modified. Usually this involves subtracting the estimated effects. Alternatively there is a special sweep, known as a pivot, which is used to initiate the analysis within a stratum. In this, the value in each unit of the working variate is replaced by the corresponding effect of the term. Further details can be found in the Genstat 5 Release 3 Reference Manual, page 523, or in the paper by Payne & Wilkinson (1977). Procedure ASWEEP is provided as a research tool for studying the algorithm and its properties.
The values initially in the working variate are specified by the Y parameter. The procedure can sweep for a single term or, if several terms have the same efficiency factor, these can all be swept together. The TERM option specifies the term (or terms) and the efficiency factor is defined by the EFFICIENCY option. The EFFECTS parameter allows the estimated effects of the term(s) to be stored if option EMETHOD=calculated, or to be supplied if EMETHOD=given. The values in the working variate after the sweep can be saved using the RESIDUALS parameter, and the RMETHOD option indicates whether these are to be formed by an ordinary sweep (RMETHOD=subtract) or by a pivot (RMETHOD=replace). The SS parameter saves the sum of squares due to the term(s), and the RSS parameter saves the sum of squares of the working variate after the sweep.
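The sweep described above can be sketched for a single factor in Python. This is a simplified illustration, assuming the working variate and the factor are plain lists of equal length; the function name and argument names are hypothetical (they merely echo the EFFICIENCY and RMETHOD options), and the sum of squares due to the term is deliberately left out.

```python
def sweep(y, factor, efficiency=1.0, rmethod="subtract"):
    """Sweep a working variate for one factor.

    Effects are the level means of y divided by the efficiency factor.
    rmethod="subtract" removes the effects; rmethod="replace" is a pivot,
    replacing each unit by the effect of its level.
    Returns (effects, new working variate, its sum of squares).
    """
    totals, counts = {}, {}
    for value, level in zip(y, factor):
        totals[level] = totals.get(level, 0.0) + value
        counts[level] = counts.get(level, 0) + 1
    effects = {lev: totals[lev] / counts[lev] / efficiency for lev in totals}
    if rmethod == "subtract":
        residuals = [v - effects[lev] for v, lev in zip(y, factor)]
    elif rmethod == "replace":
        residuals = [effects[lev] for lev in factor]
    else:
        raise ValueError("rmethod must be 'subtract' or 'replace'")
    rss = sum(r * r for r in residuals)
    return effects, residuals, rss


y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
grand = [1] * len(y)            # a single level: sweep for the grand mean
a = [1, 1, 1, 2, 2, 2]          # an orthogonal two-level factor

mean_effect, work, rss_after_mean = sweep(y, grand)
a_effects, work, rss_after_a = sweep(work, a)
```

Sweeping for the grand mean and then for the factor leaves the within-level deviations in the working variate, so that its final sum of squares is the residual sum of squares for this one-way layout.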
Options:
TERM, EFFICIENCY, EMETHOD, RMETHOD.
Parameters:
Y, EFFECTS, RESIDUALS, SS, RSS.
Method
The procedure uses the standard Genstat directives for analysis of variance, calculations and manipulation.
Action with RESTRICT
If the working variate (specified by the Y parameter) is restricted, the sweep will use only the units not excluded by the restriction.
Reference
Payne, R.W. & Wilkinson, G.N. (1977). A general algorithm for analysis of variance. Applied Statist., 26, 251-260.
AUDISPLAY procedure
Produces further output for an unbalanced design (after AUNBALANCED)
(R.W. Payne)
Options
PFACTORIAL
= scalar Limit on number of factors in printed tables of predicted means; default 3
FPROBABILITY
= string Printing of probabilities for variance ratios in the analysis-of-variance table (yes, no); default no
PLOT
= strings Which residual plots to provide (fittedvalues, normal, halfnormal); default * i.e. none
Parameter
SAVE
= identifiers Save structure (from AUNBALANCED) containing details of the analysis for which further output is required; if omitted, output is from the most recent use of AUNBALANCED
Description
This procedure can be used, following the use of procedure AUNBALANCED, to produce further output for the analysis of variance of an unbalanced design.
The output to be printed is controlled by the PRINT option, with settings: aovtable to print the analysis-of-variance table, effects to print the effects (as estimated by Genstat regression), residuals to print residuals and fitted values, means to print tables of predicted means with standard errors, and %cv to print the coefficient of variation. The default is to print the analysis-of-variance table and tables of means.
The model is fitted sequentially, first any covariates and then the treatments. Thus, any sums of squares produced for covariates are ignoring treatments (not after eliminating treatments, as with the ANOVA directive). Tables of means are produced with the PREDICT directive using the default weighting (Genstat 5 Release 3 Reference Manual, page 395).
The PFACTORIAL option limits the number of factors in terms for which predicted means are printed. Probabilities can be printed for variance ratios by setting option FPROBABILITY=yes. Finally, there is a PLOT option which allows various residual plots to be requested: fittedvalues for a plot of residuals against fitted values, normal for a Normal plot, and halfnormal for a half-Normal plot.
The SAVE parameter can be set to the save structure from the analysis for which further output is required. If SAVE is not set, output will be produced for the most recent analysis from AUNBALANCED; however, none of the Genstat regression directives (MODEL, TERMS, FIT, ADD, DROP and so on) must then have been used in the interim.
Options:
PRINT, PFACTORIAL, FPROBABILITY, PLOT. Parameter: SAVE.
Method
The output is produced mainly using the directives RKEEP and PREDICT.
Action with RESTRICT
If the y-variate originally analysed by AUNBALANCED was restricted, only the units not excluded by the restriction will have been analysed.
AUNBALANCED procedure
Performs analysis of variance for unbalanced designs
(R.W. Payne)
Options
FACTORIAL
= scalar Limit on number of factors in a treatment term; default 3
PFACTORIAL
= scalar Limit on number of factors in printed tables of predicted means; default 3
FPROBABILITY
= string Printing of probabilities for variance ratios in the analysis-of-variance table (yes, no); default no
PLOT
= strings Which residual plots to provide (fittedvalues, normal, halfnormal); default * i.e. none
Parameters
Y
= variates Data values to be analysed
RESIDUALS
= variates Variate to save the residuals from each analysis
FITTEDVALUES
= variates Variate to save the fitted values from each analysis
SAVE
= identifiers To save details of each analysis to use subsequently with the AUDISPLAY procedure
Description
This procedure carries out analysis of variance using the regression directives in Genstat. It is particularly useful for designs that are unbalanced and which thus cannot be analysed by the ANOVA directive.
The method of use is similar to that for ANOVA. The treatment terms to be fitted must be specified, before calling the procedure, by the TREATMENTSTRUCTURE directive. Similarly, any covariates must be indicated by the COVARIATE directive. However, the procedure does not cater for blocking, so any settings of the BLOCKSTRUCTURE directive are ignored.
The parameters of the procedure are identical to those of ANOVA. The variates to be analysed are specified by the Y parameter. Residuals and fitted values can be saved using the RESIDUALS and FITTEDVALUES parameters respectively. Finally, the SAVE parameter allows details of the analysis to be saved so that further output can be obtained using the AUDISPLAY procedure. (Note that this is a regression save structure, not an ANOVA structure, so it cannot be used with the directives ADISPLAY or AKEEP.)
Printed output is controlled by the PRINT option, with settings: aovtable to print the analysis-of-variance table, effects to print the effects (as estimated by Genstat regression), residuals to print residuals and fitted values, means to print tables of predicted means with standard errors, and %cv to print the coefficient of variation. The default is to print the analysis-of-variance table and tables of means.
The model is fitted sequentially, first any covariates and then the treatments. Thus, any sums of squares produced for covariates are ignoring treatments (not after eliminating treatments, as with the ANOVA directive). Tables of means are produced with the PREDICT directive using the default weighting (Genstat 5 Release 3 Reference Manual, page 395).
The FACTORIAL option sets a limit on the number of factors that a higher-order term, such as an interaction, can contain; any terms with more factors are deleted from the analysis. Similarly, the PFACTORIAL option limits the number of factors in terms for which predicted means are printed. Probabilities can be printed for variance ratios by setting option FPROBABILITY=yes. Finally, there is a PLOT option which allows various residual plots to be requested: fittedvalues for a plot of residuals against fitted values, normal for a Normal plot, and halfnormal for a half-Normal plot.
Options:
PRINT, FACTORIAL, PFACTORIAL, FPROBABILITY, PLOT.
Parameters:
Y, RESIDUALS, FITTEDVALUES, SAVE.
Method
The y-variate is specified using the MODEL directive, along with any variates to save residuals and fitted values. The current settings of the TREATMENTSTRUCTURE and COVARIATE directives are recovered using the GET directive, and used to define the terms in the analysis (using the TERMS directive). The model is then fitted (using FIT), and AUDISPLAY is called to print the output and produce any plots of residuals.
Action with RESTRICT
If the Y variate is restricted, only the units not excluded by the restriction will be analysed.
A2PLOT procedure
Plots effects from two-level designs with robust s.e. estimates
(Eric D. Schoen & Enrico A.A. Kaul)
Options
CHANNEL
= scalar What channel to use for anova and line-printer output; default * i.e. the current output channel
FACTORIAL
= scalar Limit for factorial expansion of TREATMENT formula; default 3
STRATUM
= formula Error strata from which Yates effects are to be plotted; if unset, plots are made for all the strata
GRAPHICS
= string What type of graphics (highresolution, lineprinter); default high
TITLE
= strings Separate titles for each of the plots
METHOD
= string Whether to make half-Normal or Normal plots (halfnormal, normal); default half
ROBUSTNESS
= string Robustness of scale estimators against contamination with active effects (low, medium, high); default medi
ALPHA
= scalar Type I error (0.20, 0.15, 0.10, 0.05, 0.01); default 0.05
EXCLUDE
= scalars How many of the largest effects to withhold from each of the half-Normal plots; default 0
Parameters
Y
= variates Data to be analysed
EFFECTS
= pointers To save a variate for each error stratum containing the (sorted) Yates effects estimated there
SE
= pointers To save a scalar with the standard error of the Yates effects for each error stratum
SIGNIFICANT
= pointers To save formulae containing the significant Yates effects in each stratum
Description
Daniel (1959) shows how contrasts from two-level experiments in single or fractional replication can be evaluated through half-Normal plotting. Box et al. (1978) emphasize Normal plotting of the Yates effects. They suggest making separate plots for each error stratum. The Yates definition ensures that the effects from the same error stratum share a common variance. When there is sparsity of effects and Normality of error, most effects will come from a Normal distribution with zero mean and unknown variance. Inactive effects, plotted against quantiles of the Normal or half-Normal distribution, are roughly on a straight line through the origin. Effects not compatible with this line are designated active. Thus (half-)Normal plots will separate the few active effects from the inactive ones.
A well-known problem with the technique is the subjectivity as to which effects constitute the null-line. Many authors, therefore, have developed procedures for getting robust estimates of the standard errors of the Yates effects from unreplicated two-level experiments; see Haaland & O'Connell (1995) for an overview. Based on simulation results for 2^4 experiments (15 effects in the plot) the latter authors recommend three estimators according to a-priori ideas on the likely number of active effects (1-3, 4-6, and 7-8, respectively). The estimators are formed by (1) calculating an initial estimator of the standard error as a quantile of the full set of effects, multiplied by a consistency constant determined from the Normal distribution; (2) stripping off potentially active effects by retaining only effects smaller than a constant times the initial scale estimate; (3) multiplying some function of the remaining effects by a simulated consistency constant. One of the three recommended estimators is based on the median of the full set and the sum of squares of the retained effects; it is called the Adaptive Standard Error (ASE). The other two estimators are based on the median and the 45th percentile, respectively, of the full set; these are Pseudo Standard Errors (PSE). Both use the median of the retained effects. In general, ASE is less robust against contamination with active effects than PSE, because it uses all the effects below the cut-off point. The median-based PSE is obviously less robust than the PSE based on the 45th percentile.
Haaland & O'Connell (1995) suggest judging t-values from the effects and the calculated scale estimate against critical values determined by simulation. They present consistency constants for two of the recommended estimators and critical values for one of them, each for 7, 11, 15, 17, 23 and 31 effects, respectively. We have extended their results to the whole range from 7 up to 127 effects and all three estimators.
The treatment effects to be studied should be specified using the TREATMENTSTRUCTURE directive before using A2PLOT. They are grouped according to the error strata as specified by a previous BLOCKSTRUCTURE statement. Normal or half-Normal plots, according to the METHOD option, are made in either lineprinter or high-resolution quality (option GRAPHICS). By default plots are made for each error stratum. Alternatively, option STRATUM can be set to a formula defining the strata from which the Yates effects are to be plotted. The EXCLUDE option specifies the number of largest effects to be excluded from half-Normal plots (the option does not work with Normal plots). The titles of the plots can be provided using option TITLE. Setting GRAPHICS=* suppresses the plots. Options FACTORIAL, PRINT and CHANNEL are as in ADISPLAY. Note, however, that effects are printed as Yates effects, and that CHANNEL also controls the line-printer graphics.
When the number of effects in the plot is in the range 7 to 127, robust estimators are calculated for the standard error of the effects. The robustness of the estimators against contamination with active effects is specified through option ROBUSTNESS. A vertical line in the plot indicates the least significant Yates effect (LSE). The type I error is controlled by option ALPHA. Effects larger than the LSE are labelled in the plot.
The data variates are specified using the Y parameter. The EFFECTS parameter can save a pointer holding a variate of effects, sorted from small to large, for each error stratum. Effects are either the usual Yates effects (METHOD=normal) or their absolute values (METHOD=halfnormal). Parameter SIGNIFICANT can save a formula with the joint significant effects of all the strata. Parameter SE holds scalars with the standard errors of the effects in the respective strata.
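The labelling rule can be pictured with a short Python sketch. It assumes that the LSE is the product of a critical value and the robust standard error, which seems the natural reading of the t-value comparison described above but is not stated explicitly. The function name and the argument critical_value are hypothetical; in the procedure the critical value comes from simulation and depends on ALPHA and on the number of effects, and no such value is reproduced here.

```python
def significant_effects(effects, se, critical_value):
    """Flag effects whose absolute value exceeds the LSE (illustrative sketch).

    effects        : dict mapping term names to estimated effects
    se             : robust standard error of the effects in this stratum
    critical_value : hypothetical placeholder for the simulated critical value
    """
    # Assumption: least significant effect = critical value times standard error.
    lse = critical_value * se
    flagged = {name: e for name, e in effects.items() if abs(e) > lse}
    return lse, flagged


# Made-up numbers, for illustration only.
lse, flagged = significant_effects(
    {"A": 5.0, "B": -0.4, "A.B": 0.9}, se=1.0, critical_value=2.0
)
```

Effects whose absolute value is equal to or smaller than the LSE are left unlabelled in this sketch.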
Options:
Parameters: Y, EFFECTS, SE, SIGNIFICANT.
Method
A2PLOT accesses the current BLOCKSTRUCTURE and TREATMENTSTRUCTURE settings using the GET directive. If the STRATUM option is unset, separate plots for each of the strata are to be produced. A2PLOT checks, therefore, whether all strata are set explicitly. If this is not the case it augments the current BLOCKSTRUCTURE with a bottom stratum using procedure AFUNITS. If no BLOCKSTRUCTURE is set, it generates an explicit Units stratum and sets the BLOCKSTRUCTURE and STRATUM options to this stratum.
Yates effects for each stratum are saved using AKEEP. They are ordered and plotted against either Normal or half-Normal quantiles. Normal quantiles are calculated as
q_i = NED( (i-0.375) / (n+0.25) ), i = 1...n
Half-Normal quantiles are calculated as
q_i =
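The quantile calculation can be sketched in Python, taking NED to be the inverse of the standard Normal distribution function. The Normal quantiles follow the plotting positions (i-0.375)/(n+0.25) given in the text. The half-Normal version is an assumption: it uses the usual folded form, which maps the same plotting positions onto the upper half of the distribution, and may differ from the formula used by the procedure. The function names are hypothetical.

```python
from statistics import NormalDist

# Inverse of the standard Normal distribution function (the role of NED).
_ned = NormalDist().inv_cdf


def normal_quantiles(n):
    """Normal quantiles q_i = NED((i - 0.375) / (n + 0.25)), i = 1...n."""
    return [_ned((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def half_normal_quantiles(n):
    """Half-Normal quantiles, ASSUMED to use the folded plotting positions
    0.5 + 0.5 * (i - 0.375) / (n + 0.25); not taken from the text."""
    return [_ned(0.5 + 0.5 * (i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


# Quantiles for the 15 effects of a 16-run experiment.
q_normal = normal_quantiles(15)
q_half = half_normal_quantiles(15)
```

The sorted effects (or their absolute values) would then be plotted against q_normal (or q_half).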
For ROBUSTNESS=low, ASE based standard errors are calculated with the initial standard error calculated from the median of all effects, a cut-off of 2.5 times this value, and a final standard error from the sum of squares of the remaining effects. For ROBUSTNESS=medium, PSE based standard errors are calculated with the same cut-off as for ASE and a final standard error is calculated from the median of the remaining effects. For ROBUSTNESS=high, PSE based standard errors are calculated using the 45th percentile instead of the median for the initial estimate, and 1.25 instead of 2.5 as a multiplication factor to establish the cut-off. The final estimate uses the median of the retained effects.
Significant Yates effects are labelled in the half-Normal plots using the factor names from the TREATMENT statement.
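The three estimators can be outlined with the following Python sketch, which follows the steps described above under several assumptions. The final consistency constants are simulated and are not reproduced in this document, so final_constant is a hypothetical argument that the caller must supply. The quantile interpolation rule and the use of a root mean square as the "sum of squares" function for ASE are also assumptions, as are all the function names.

```python
from math import sqrt
from statistics import NormalDist, median


def _quantile(sorted_values, p):
    """Linearly interpolated quantile (assumed definition)."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def robust_se(effects, robustness, final_constant):
    """Sketch of the ASE (low) and PSE (medium, high) scale estimators."""
    abs_effects = sorted(abs(e) for e in effects)
    # Quantile and cut-off multiplier for each ROBUSTNESS setting.
    p, multiplier = (0.45, 1.25) if robustness == "high" else (0.50, 2.5)
    # Step 1: initial estimate, made consistent for Normal errors:
    # the p-quantile of |Z| is NED((1 + p) / 2).
    initial = _quantile(abs_effects, p) / NormalDist().inv_cdf((1 + p) / 2)
    # Step 2: retain only effects below the cut-off.
    kept = [a for a in abs_effects if a < multiplier * initial]
    if not kept:
        raise ValueError("no effects retained below the cut-off")
    # Step 3: function of the retained effects times a consistency constant.
    if robustness == "low":
        scale = sqrt(sum(a * a for a in kept) / len(kept))  # ASE
    else:
        scale = median(kept)  # PSE
    return final_constant * scale


# Made-up effects with one clearly active effect; constant set to 1 for illustration.
effects = [0.5, -1.0, 1.5, -2.0, 2.5, 3.0, 40.0]
se_medium = robust_se(effects, "medium", final_constant=1.0)
```

In this made-up example the large effect is removed at step 2, so it does not influence the final estimate under any of the three settings.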
Acknowledgements
The authors thank Peter Lane for suggesting and sketching procedure A2PL_EXPAND.
Action with RESTRICT
AFUNITS (which may be called by A2PLOT if the STRATUM option is unset and no explicit bottom error stratum is specified in the current BLOCKSTRUCTURE setting) requires that none of the blocking factors be restricted.
References
Box, G.E.P., W.G. Hunter & J.S. Hunter (1978), Statistics for Experimenters. New York, Wiley.
Daniel, C. (1959), Use of half-normal plots in interpreting factorial two-level experiments. Technometrics, 1, 311-342.
Haaland, P.D. & M.A. O'Connell (1995), Inference for effect-saturated fractional factorials. Technometrics, 37, 82-93.