DCLEAR directive
Clears a graphics screen.
Options
DEVICE
= scalar Device whose screen is to be cleared; default is to clear the screen of the current graphics device
ENDACTION
= string Action to be taken after clearing the screen (continue, pause); default * uses the setting from the last DEVICE statement
No parameters
Description
DCLEAR clears the screen of a graphics device so that the next plot produced on this device by any of the high-resolution graphics directives or procedures will be drawn onto an empty screen. The DEVICE option indicates the device to be cleared; by default this is the current graphics device (as set by the DEVICE directive). The ENDACTION option controls what happens after clearing the screen. The default action is the setting specified by the most recent DEVICE statement.
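As a minimal sketch of the two options, the statement below clears the screen of one device and then pauses. The device number 1 is an assumption made for illustration; the numbers available depend on your installation.

```genstat
"Clear the screen of device 1 (an assumed device number), then pause"
DCLEAR [DEVICE=1; ENDACTION=pause]
```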
DCONTOUR directive
Draws contour plots on a plotter or graphics monitor.
Options
INTERVAL
= scalar or variate Contour interval for scaling (scalar) or positions of the contours (variate); default * i.e. determined automatically
TITLE
= text General title; default *
WINDOW
= scalar Window number for the plots; default 1
KEYWINDOW
= scalar Window number for the key (zero for no key); default 2
LOWERCUTOFF
= scalar Lower cut-off for array values; default *
UPPERCUTOFF
= scalar Upper cut-off for array values; default *
SCREEN
= string Whether to clear the screen before plotting or to continue plotting on the old screen (clear, keep); default clea
KEYDESCRIPTION
= text Overall description for the key
ENDACTION
= string Action to be taken after completing the plot (continue, pause); default * uses the setting from the last DEVICE statement
Parameters
GRID
= identifiers Pointers (of variates representing the columns of a data matrix), matrices, or two-way tables specifying values on a regular grid
PEN
= scalars or variates Pen number to be used for the contours of each grid (use of a variate allows every nth contour to be highlighted; the first n-1 units should contain the number of the standard pen, and the nth unit the number of the highlighting pen); default * uses pens 1, 2, and so on for the successive grids
DESCRIPTION
= texts Annotation for key
Description
The surface to be plotted is represented by a grid of z-values or heights. The grid can be a rectangular matrix, a two-way table, or a pointer to a set of variates; the y-dimension is represented by the rows of the structure and the x-dimension by the columns. In each case there must be at least three rows and three columns of data (after allowing for any restrictions on a set of variates). Missing values are not permitted; that is, only complete grids can be displayed. If the grid is supplied as a table with margins, these will be ignored when plotting the surface. The orientation of the contour plot puts element (1,1) of the grid at the point (1,1); that is, the bottom left-hand corner of the plot. Normally the data will lie on a regular grid but you can also specify an irregular grid by using a matrix and setting the ROWS and COLUMNS options of the MATRIX directive to variates containing the appropriate y- and x-values when the matrix is declared. By specifying a list of structures in the GRID parameter you can produce several superimposed contour plots.
The WINDOW option defines the window where the contour plot is drawn, and the KEYWINDOW option similarly specifies where the key should appear. The grid axes are scaled so that the y- and x-dimensions (rows and columns respectively) will match the dimensions of the specified window: if you wish to preserve the "shape" of the grid you should use the FRAME directive to define a window whose y- and x-dimensions are in the same proportions as the grid dimensions. Titles can be added to these windows using the TITLE and KEYDESCRIPTION options. The SCREEN option controls whether the graphical display is cleared before the contour plot is drawn and the ENDACTION option controls whether Genstat pauses at the end of the plot.
By default, the contour lines are plotted at heights that are determined automatically from the range of the data. A constant interval is used, such that the contour heights are round numbers, up to a maximum of ten contour lines. This can be changed by setting the INTERVAL option to a scalar containing the required interval between contours; the number of contours will then depend on the range of the data. Alternatively, you can specify the actual contour heights by setting INTERVAL to a variate containing the required values. If the resulting number of contours is less than two, no contours are drawn. However, if the number is very large, the contours may be difficult to interpret and take a long time to plot. You can also set the LOWERCUTOFF and UPPERCUTOFF options to truncate the grid values; the default interval and contour heights are then adjusted accordingly.
The contour lines are labelled by integers, and the translation from contour number to the actual height is provided in the key. Contour lines that are very short will not be labelled but their height can be determined from adjacent contours. Each line of the key occupies a space of height 0.02 (in normalized device coordinates), and the key window by default has room for a heading and nine contour levels. If necessary, the size of the window can be redefined using the FRAME directive.
The way in which the contour lines are drawn for each grid is determined by the pen that has been defined for that grid, using the PEN parameter of DCONTOUR. If the PEN parameter is not set, Genstat uses the pens in turn, pen 1 for the first grid, pen 2 for the second grid, and so on, so that the different grids can easily be distinguished. The relevant aspects of the pens should be set in advance, if required, using the METHOD, COLOUR, LINESTYLE, and THICKNESS parameters of the PEN directive.
If the PEN directive is not used, the plotting method will be line, so that individual contours are made up of straight line segments. If curves are required, METHOD should be set to monotonic to use the method of Butland (1980), or open (or closed) to use the method of McConalogue (1970). Both these methods produce curves that are fitted to independent sets of interpolated points and can thus produce contour lines that cross, particularly if the supplied grid of data is coarse or in a region where the contour height is changing rapidly. If METHOD is set to other values, straight lines will be used to draw the contours.
The PEN parameter of DCONTOUR can also be set to a variate containing a list of pen numbers, which allows highlighting of particular contours. The first value specifies the pen for the first (lowest) contour, the second value for the second contour, and so on. The list is recycled if there are too few values for the number of contours to be plotted. For example, the statement
DCONTOUR Matrix; PEN=!(1,1,2)
will produce a contour plot where every third contour is drawn by pen 2. The contours drawn by pen 2 may be highlighted in various ways. The default attributes of the pens, which will be in place unless the PEN directive has been used to specify otherwise, will often be satisfactory. By default, on a colour device, the pens will be defined to use different colours, while on a monochrome device they will use different line styles.
By default, the axis bounds are determined from the grid. Normally the lower bound for each axis will be 1.0 and the upper bound will be the number of rows of the grid for the y-axis, and the number of columns for the x-axis. If a matrix is used to specify the grid, its row and column labels can be set to variates whose values will then be used to determine the axis bounds. If more than one grid is specified, the axes are derived from the first grid and subsequent grids are plotted relative to these axes. The AXES directive can be used to control how the axes are drawn or, by setting STYLE=none, to suppress them altogether.
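The following sketch pulls together several of the points above: a small grid held in a matrix, curved contours, a fixed contour interval, and highlighting of every third contour. The identifier Height and its values are hypothetical, and the VALUES option of MATRIX is assumed here to work in the same way as the VALUES option of VARIATE, with the values given row by row.

```genstat
"Hypothetical grid of heights with 4 rows (y-dimension) and 5 columns (x-dimension)"
MATRIX [ROWS=4; COLUMNS=5; VALUES=1,2,3,2,1, 2,4,6,4,2, \
   2,5,8,5,2, 1,3,4,3,1] Height
"Use the curve method of Butland (1980) for pens 1 and 2"
PEN 1,2; METHOD=monotonic
"Contours at intervals of 1; every third contour is drawn by pen 2"
DCONTOUR [INTERVAL=1; TITLE='Height'] GRID=Height; PEN=!(1,1,2)
```

Because the curved methods can produce contours that cross on a coarse grid such as this one, it may be worth comparing the result with the default straight-line method.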
References
Butland, J. (1980). A method of interpolating reasonably-shaped curves through any data. Proceedings of Computer Graphics 80, 409-422.
McConalogue, D.J. (1970). A quasi-intrinsic scheme for passing a smooth curve through a discrete set of points. Computer Journal 13, 392-396.
DDISPLAY directive
Redraws the current graphical display.
Options
DEVICE
= scalar Device on which to redraw the display (on some systems it may only be possible to redisplay the picture on an interactive graphics device); default uses the current graphics device
ENDACTION
= string Action to be taken after completing the plot (continue, pause); default * uses the setting from the last DEVICE statement
No parameters
Description
This directive is provided to allow additional control of some interactive devices. In some of these, such as PCs, the screen can operate in either text mode or graphics mode. Genstat will automatically switch the screen into the appropriate mode when starting or finishing a graph. Having returned to text mode after examining a graph you may later wish to have another look at the graph that was plotted. DDISPLAY will switch the screen back to graphics mode, thus re-displaying the graph. The ENDACTION option controls what happens after re-displaying the graph; normally with this type of device you would want to pause. The default action for DDISPLAY is the setting specified by the most recent DEVICE statement.
DDISPLAY has no effect when output is directed to a graphics file. For devices that do not operate in this dual-mode fashion, for example a graphics window under X-windows, DDISPLAY has no effect on the graphical display itself. It will however generate a pause if ENDACTION is set to request one.
Note that DDISPLAY does not actually re-plot the graphical output; it merely switches the screen into graphics mode, and assumes that your system has preserved the graphics image.
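A typical use on a dual-mode screen might look like the sketch below, which re-displays the last graph on the current device and waits before returning to text mode.

```genstat
"Switch back to graphics mode and pause so that the graph can be examined"
DDISPLAY [ENDACTION=pause]
```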
DEBUG directive
Puts an implicit BREAK statement after the current statement and after every NSTATEMENTS subsequent statements, until an ENDDEBUG is reached.
Options
CHANNEL
= scalar Channel number; default 1
NSTATEMENTS
= scalar Number of statements between breaks; default 1
FAULT
= string Whether to invoke DEBUG only at the next fault (yes, no); default no
No parameters
Description
The straightforward use of DEBUG causes an immediate break, and then further breaks at regular intervals until you issue an ENDDEBUG statement. Alternatively, by setting option FAULT=yes, you can arrange for Genstat to continue until the next fault diagnostic, and then break. The interval before each further break is specified by the NSTATEMENTS option; by default, breaks take place after every statement.
During the breaks, Genstat takes statements from the channel specified by the CHANNEL option; by default they are taken from channel 1. Each individual break is terminated by an ENDBREAK, exactly like a break invoked explicitly by the BREAK directive.
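The two ways of invoking DEBUG described above can be sketched as follows; the interval of five statements is chosen purely for illustration.

```genstat
"Break now, and then after every fifth statement"
DEBUG [NSTATEMENTS=5]
"... statements being examined; each break is ended by ENDBREAK ..."
ENDDEBUG

"Alternatively, run normally and break only at the next fault"
DEBUG [FAULT=yes]
```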
DECLARE directive
Declares one or more customized data structures.
Options
TYPE
= text Single-valued text defining the type of structure to declare
MODIFY
= string Whether to modify (instead of redefining) existing structures (yes, no); default no
Parameters
IDENTIFIER
= identifiers Identifiers of the structures
VALUES
= pointers Values for each structure
EXTRA
= texts Extra text associated with each identifier
Description
DECLARE is used to set up compound data structures of a customized type. The form of each customized type is defined using the STRUCTURE directive. So, for example, after defining a complex_number type, by
STRUCTURE [NAME='complex_number'] 'real','imaginary'; \
TYPE='scalar'
we can declare a complex number C by
DECLARE [TYPE='complex_number'] C; VALUES=!p(3,2)
The VALUES parameter allows values to be defined for the structure, similarly to the VALUES parameter of the POINTER directive. So, here, the real part of the number C['real'] is given the value 3, and the imaginary part C['imaginary'] has the value 2. The EXTRA parameter is also used as in the POINTER directive, allowing extra text to be associated with the structure for annotation, and the MODIFY option allows an existing structure to be modified.
DELETE directive
Deletes the attributes and values of structures.
Options
REDEFINE
= string Whether or not to delete the attributes of the structures so that the type etc can be redefined (yes, no); default no
LIST
= string How to interpret the list of structures (inclusive, exclusive, all); default incl
PROCEDURE
= string Whether the list of identifiers is of procedures instead of data structures (yes, no); default no
Parameter
identifiers Structures whose values (and attributes, if requested) are to be deleted
Description
The DELETE directive allows values and attributes of data structures to be deleted so that Genstat can recover the space that they occupy. This may also make the program execute more efficiently as Genstat will then need to keep track of less information. By default only the values are deleted but, if the REDEFINE option is set to yes, the attributes of the structures are also deleted. The only information that is still stored is then the identifier and the internal reference number of the structure. This may be worthwhile merely to save further space. However, the main advantage is that the structures can then be redefined to be of different types.
For example, suppose we have defined a variate Dose by
VARIATE [VALUES=0,0,2,2,4,4] IDENTIFIER=Dose
This gives Dose the values 0, 0, 2, 2, 4, and 4. If we then put
DELETE Dose
only the values of Dose are deleted; so we could now assign a new set: for example
READ Dose
2 4 0 4 2 0 :
Dose remains a variate but now has the values 2, 4, 0, 4, 2, and 0.
Alternatively, if we set REDEFINE=yes in the above example, we could then redefine Dose as (for example) a text with seven values.
DELETE [REDEFINE=yes] Dose
TEXT [VALUES=none,double,standard,double,\
none,standard,none] Dose
Once you have defined the type of a structure in a job (as variate, factor, or whatever), you cannot redeclare it as a structure of any other type unless you have first used DELETE to delete its values and attributes. The only exception to this rule is that the GROUPS directive also has a REDEFINE option, which allows a variate or text to be redefined as a factor.
The LIST option defines how the parameter list is to be interpreted. With the default setting, LIST=inclusive, attributes or values are deleted only for the structures in the list. LIST=exclusive means that the parameter list is the complement of the set of structures that are deleted: that is, all named structures that are not in the list are deleted. LIST=all causes the attributes or values of all structures to be deleted. Thus, if LIST=all, any parameter list is ignored; and LIST=exclusive with no parameter is equivalent to LIST=all.
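The settings of LIST can be sketched with the variate Dose from the example above; the statements are alternatives rather than a sequence to be run together.

```genstat
"Delete the values of every named structure except Dose"
DELETE [LIST=exclusive] Dose

"Delete the values and attributes of all structures; any parameter list would be ignored"
DELETE [REDEFINE=yes; LIST=all]
```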
Each time that DELETE is used, Genstat will also remove any unnamed structures that are no longer required and recover any space that has been used for temporary storage. This sort of tidying of workspace will happen automatically if Genstat sees in time that the space is becoming short. However, to avoid unnecessary computation, this does not occur after every statement. Thus, if the space appears to be exhausted, it may be worth using DELETE, even if you have no named structures to delete.
DEVICE directive
Switches between (high-resolution) graphics devices.
No options
Parameters
NUMBER
= scalar Device number
ENDACTION
= string Action to be taken after completing each plot (continue, pause)
ORIENTATION
= string Orientation of the pictures, if relevant (landscape, portrait); default * retains the current setting for this device
PALETTE
= string How to represent colour (monotone, greyscale, grayscale, colour); default * retains the current setting for this device
SIZE
= string Size of page for each screen (A4, A3); default * retains the current setting for this device
Description
High-resolution graphics can be generated principally in two forms by Genstat: either on a screen that can operate in graphics mode or by sending output to a file. The screen-based operation is for use in interactive sessions, whereas file output is designed for later use outside Genstat: either to produce hard-copy on a plotter or laser-printer, or to re-display graphics on the screen, if appropriate software is available. Usually there is a choice of various kinds of screen type or file format. Each type of output, whether screen or file, is referred to as a device; thus, the first step in producing graphical output is selecting a device within Genstat that is appropriate for the hardware that you have available. Genstat has built-in interfaces to a number of different graphics devices and the Users' Note will contain a list of those included for any particular version. This information is also available from the HELP directive, by typing the statement
HELP environment,pictures,possible
(or HELP e,p,p for short).
Source code for additional interfaces is provided to allow Genstat to be linked to standard graphical subroutine libraries, such as GKS, to provide access to a wider variety of devices where these are available. Further details of use of these interfaces to extend the graphical facilities are contained in the Installers' Note and Graphical Extension Guide.
The output device is selected by the DEVICE statement. The devices are numbered, so that for example
DEVICE 4
will select the fourth available device. The device numbers to use for the different kinds of output will depend on the version of Genstat that you are using, and how it has been configured. The Users' Note contains details of the available devices; in addition the Genstat help system can be used to display this list as described above.
If you have selected a file-based device you will also have to open a file to receive the output, using the OPEN directive. This can be done before or after selecting the device, so long as the file has been opened before any output is generated. You can close the file when the graphics are complete; if you want to store separate items of graphical output in individual files you can use a sequence of OPEN and CLOSE statements. When opening or closing files for graphical output the CHANNEL parameter of the OPEN and CLOSE statements should be set to the device number specified by the DEVICE statement. For example:
OPEN 'PLOT.HPGL'; CHANNEL=4; FILETYPE=graphics
DEVICE 4
DGRAPH Y; X
CLOSE 4; FILETYPE=graphics
The default device, selected automatically when you start Genstat, is device 1: sometimes you may be able to specify an alternative device number and associated output file on the command line used to start Genstat (the Users' Note should explain if this is possible).
You may get strange results if you try to generate graphics on a screen that is not designed for displaying graphics, or if you specify the wrong device type, as Genstat is not always able to detect the type of device or screen.
This is intended to be a general description of the facilities available and so it is inevitable that some details may vary according to the capabilities of the equipment you are using. For example, when using a PC, the screen will switch from text mode to graphics mode in order to display a graph, whereas in a windowed environment, such as a Unix workstation running X-windows, the graphical output will appear in a window that is independent of the one that contains the Genstat input and text output. There may also be differences in the way that the keyboard or special keys are defined to control the graphics. To obtain the best results from some terminals particular modifications may be required to their settings. For this reason some device-specific information is contained in the Users' Note.
Other than this, there should be little difference in the use of Genstat graphics on different machines, as all the plotting symbols, brush styles, and character output are software-generated by default, using built-in graphics definitions and font files that are supplied with Genstat. The aspects of graphical output that may depend on particular capabilities of the graphics device are identified in the later parts of this section; for example, different defaults may apply to colour and monochrome devices. It may sometimes be advantageous to use particular features of the hardware or additional graphics software (like GKS); for example, other fonts may be available. These device-specific features are usually selected by negative parameter settings (for example, SYMBOL=-3). Naturally, selection of device-specific attributes may lead to some differences in appearance of the output on different devices.
The ENDACTION parameter, with settings continue and pause, controls the action taken by default at the end of each plot. When using a graphics terminal interactively it may be convenient to pause at the end of a plot to examine the screen. When you are ready to continue, pressing carriage-return or some equivalent key will switch the terminal back to text mode and the Genstat prompt will appear. The precise details will vary according to the device and underlying graphical package; the Users' Note should provide the full information. For some interactive devices, for example workstations with separate graphics windows, it may not be necessary to pause. Each device is initialized to either pause or continue when you start Genstat, according to the particular implementation. If you are running in batch mode the default will always be to continue.
You can repeat the DEVICE statement and set ENDACTION to pause or continue at any time that you wish to change the default action. Alternatively, each graphical directive has an ENDACTION option that controls the device at the end of that directive, without altering the general default setting. For example, if you wish to build up a complex display using several DGRAPH statements with option SCREEN=keep, you could set ENDACTION=continue in the DEVICE statement, then put ENDACTION=pause in the final DGRAPH statement.
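The approach described in the last example can be sketched as follows. The device number and the variates Y1, Y2, Y3, and X are hypothetical.

```genstat
"Do not pause after each plot on device 1 (an assumed device number)"
DEVICE 1; ENDACTION=continue
DGRAPH Y1; X
DGRAPH [SCREEN=keep] Y2; X
"Pause only once the complete display has been built up"
DGRAPH [SCREEN=keep; ENDACTION=pause] Y3; X
```

Setting ENDACTION as an option of the final DGRAPH affects only that statement, so the default for the device remains continue.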
The ORIENTATION parameter can be used to specify landscape or portrait orientation of graphical output on PostScript and Interacter raster devices; portrait is the default. PALETTE can be set to monotone, to force all colours to be mapped to colour 1; this is the default for PostScript. Alternatively, PALETTE=colour produces colour PostScript output, and enables the use of the COLOUR directive to specify exactly the composition of the colours. The additional setting PALETTE=greyscale is as for monotone except that filled areas (as in histograms) are shaded in grey tones, using the RED parameter of COLOUR to define the grey intensity. The SIZE parameter allows you to specify A3 or A4 page sizes for HPGL devices.
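For instance, assuming that device 5 is a PostScript device in your version (the number is hypothetical; the Users' Note lists the actual device numbers), landscape output with grey shading could be requested by:

```genstat
"Device 5 is assumed here to be a PostScript device"
DEVICE 5; ORIENTATION=landscape; PALETTE=greyscale
```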
DGRAPH directive
Draws graphs on a plotter or graphics monitor.
Options
TITLE
= text General title; default *
WINDOW
= scalar Window number for the graphs; default 1
KEYWINDOW
= scalar Window number for the key (zero for no key); default 2
SCREEN
= string Whether to clear the screen before plotting or to continue plotting on the old screen (clear, keep); default clea
KEYDESCRIPTION
= text Overall description for the key; default *
ENDACTION
= string Action to be taken after completing the plot (continue, pause); default * uses the setting from the last DEVICE statement
Parameters
Y
= identifiers Vertical coordinates
X
= identifiers Horizontal coordinates
PEN
= scalars or variates or factors Pen number for each graph (use of a variate or factor allows different pens to be defined for different sets of units); default * uses pens 1, 2, and so on for the successive graphs
DESCRIPTION
= texts Annotation for key
YLOWER
= identifiers Lower values for vertical bars
YUPPER
= identifiers Upper values for vertical bars
XLOWER
= identifiers Lower values for horizontal bars
XUPPER
= identifiers Upper values for horizontal bars
Description
The DGRAPH directive draws high-resolution graphs, containing points, lines, or shaded polygons. The graph is produced on the current graphics device, which can be selected using the DEVICE directive. The WINDOW option defines the window, within the plotting area, in which the graph is drawn; by default this is window 1.
The Y and X parameters specify the coordinates of the points to be plotted; they must be numerical structures (scalars, variates, factors, matrices, or tables) of equal length. If any of the variates or factors is restricted, only the subset of values specified by the restriction will be included in the graph. The restrictions are applied to the Y and X variates or factors in pairs, and do not carry over to all the variates or factors in a list. For example, suppose the variate Y1 is restricted but the variate Y2 is not. The statement
DGRAPH Y1,Y2; X
will plot the subset of values of Y1 against X, but all the values of Y2 against X. Conversely, if X were restricted the subset would be plotted for both Y1 and Y2. Any associated structures, like variates specified by the PEN parameter or factors used to provide labels for the points, must be of the same length as Y and X.
Each pair of Y and X structures has an associated pen, specified by the PEN parameter. By default, pen 1 is used for the first pair, pen 2 for the second, and so on. The type of graph that is produced is determined by the METHOD setting of that pen. This can be point, to produce a point plot or scatterplot; line to join the points with straight lines; monotonic, open, or closed to plot various types of curve through the points; or fill to produce shaded polygons. In the initial graphics environment, all the pens are defined to produce point plots. This can be modified using the METHOD parameter of the PEN directive. Other attributes of the pen can be used to control the colour, font, symbols, and labels.
With METHOD=fill, the points defined by the Y and X variates are joined by straight lines to form one or more polygons which are then filled using the brush style specified for the pen. The JOIN parameter of PEN determines the order in which the points are joined; with the default, ascending, the data are sorted into ascending order of x-values, while with JOIN=given they are left in their original order. There should be at least three points when using this method.
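As a small sketch of this method, the statements below shade a single triangle. The variates Xpoly and Ypoly and the choice of pen 3 are hypothetical; JOIN=given is used so that the vertices are joined in the order supplied.

```genstat
"Three vertices of a triangle, in the order in which they are to be joined"
VARIATE [VALUES=0,1,0.5] Xpoly
VARIATE [VALUES=0,0,1] Ypoly
"Pen 3 fills polygons, keeping the points in their original order"
PEN 3; METHOD=fill; JOIN=given
DGRAPH Ypoly; Xpoly; PEN=3
```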
A warning message is printed if the data contain missing values. The effect of these depends on the type of graph being produced, as follows. If the method is point there will be no indication on the graph itself that any points were missing (but obviously none of the points with missing values for either the y- or x-coordinate can be included in the plot). If a line or curve is plotted through the points there will be a break wherever a missing value is found; that is, line segments will be omitted between points that are separated by missing values. When using METHOD=fill missing values will, in effect, define subsets of points, each of which will be shaded separately. Note, however, that the position of the missing values within the data will differ according to whether or not the data values have been sorted; this is controlled by the JOIN parameter of PEN, as described above. If the data are sorted, units with missing x-values are moved to the beginning.
The PEN parameter can also be set to a variate or factor, to allow different pens to be used for different subsets of the units. With a factor, the units with each level are plotted separately, using the pen defined by the level concerned. If PEN is set to a variate, its values similarly define the pen for each unit. For example, if you fit separate regression lines to some grouped data, you can easily plot the fitted lines in just two statements, one to set up the pens and one to plot the data:
PEN 1...Ngroups; METHOD=line; SYMBOL=0
DGRAPH Fitted; X; PEN=Groups
By default, Genstat calculates bounds on the axes that are wide enough to include all the data; the range of the data is extended by five percent at each end, and the axes are drawn on the left-hand side and bottom edge of the graph. This can all be changed by the AXES directive using the YLOWER, YUPPER, XLOWER, and XUPPER parameters to set the bounds, and YORIGIN and XORIGIN to control the position of the axes. Other parameters allow you to control the axis labelling and style. If the axis bounds are too narrow, some points may be excluded from the graph, so that clipping occurs. If the plotting method is point, Genstat ignores points that are out of bounds. For other settings of METHOD, lines are drawn from points that are within bounds towards points that are out of bounds, terminating at the appropriate edge. Clipping may also occur if the method is monotonic, open, or closed and you have left Genstat to set default axis bounds, because these methods fit curves that may extend beyond the boundaries. If this occurs you should use the AXES directive to provide increased axis bounds. When you use several DGRAPH statements with SCREEN=keep to build up a complex graph, the axes are drawn only the first time, and the same axes bounds are then used for the subsequent graphs. You should thus define axis limits that enclose all the subsequent data. Axes are drawn only if SCREEN=clear, or the specified window has not been used since the screen was last cleared, or the window has been redefined by a FRAME statement.
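When building up a graph in this way, the bounds can be fixed in advance so that they enclose all the data to be plotted. The sketch below assumes that the first parameter of AXES is the window number (WINDOW), and uses hypothetical variates Y1, Y2, and X whose values lie within the stated bounds.

```genstat
"Fix the bounds for window 1 before the first graph is drawn"
AXES WINDOW=1; YLOWER=0; YUPPER=50; XLOWER=0; XUPPER=10
DGRAPH Y1; X
"The axes are not redrawn; the same bounds are used for Y2"
DGRAPH [SCREEN=keep] Y2; X
```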
DGRAPH can also be used to add error bars to the plot. You might want to use these, for example, to show confidence limits on points that have been fitted by a regression. Error bars are requested by setting the YLOWER and YUPPER parameters to variates defining the lower and upper values for the error bar to be drawn at each point. For example, if you know the standard error for each point, you could calculate and plot the bounds as follows:
CALCULATE Barlow = Y - 1.96 * Err
& Barhigh = Y + 1.96 * Err
DGRAPH Y; X; YLOWER=Barlow; YUPPER=Barhigh
The error bar is drawn from the lower point to the upper point at the associated x-position; the bar will be drawn even if the corresponding y-value (or y-variate) is missing. If the lower value is missing, or the YLOWER parameter is not set, only the upper section of the bar is drawn; likewise if the upper value is missing only the lower section is drawn. The same pen is used to draw the error bars as is used for the y- and x-values. If you want to use a different pen for the error bars you can plot them separately: for example
DGRAPH Y,*; X,X; YLOWER=*,Barlow; YUPPER=*,Barhigh; PEN=2,6
Similarly, parameters XLOWER and XUPPER allow you to plot horizontal bars at each point.
The KEYWINDOW option specifies the window in which the key appears; by default this is window 2. Alternatively, you can set KEYWINDOW=0 to suppress the key. The key contains a line of information for each pair of Y and X structures, written with the associated pen. This will indicate the symbol used, the line style (for a plotting method of line or curve) or a shaded block to illustrate the brush style (when METHOD=fill), the name of the structure (if any) defined by the LABELS parameter of PEN, and a description indicating the identifiers of the data plotted (for example Residuals v Fitted). Alternatively, you can supply your own key, using the DESCRIPTION parameter, and you can specify a title for the key using the KEYDESCRIPTION option. If you draw several graphs using SCREEN=keep and the same key window, each new set of information is appended to the existing key, until the window is full.
If you have set the PEN parameter to a variate or factor in order to plot independent subsets of the data, the key will contain information for each subset. If the LABELS parameter of PEN has been used to specify labels for the points, each line of the key will contain the label corresponding to the first value of the subset, rather than the identifier of the labels structure itself.
The TITLE option can be used to provide a title for the graph. You can also put titles on the axes by using the YTITLE and XTITLE parameters of the AXES directive. The SCREEN option controls whether the graphical display is cleared before the graph is plotted and the ENDACTION option controls whether Genstat pauses at the end of the plot.
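A sketch of a graph with its own title and key annotation is given below; the variates Y1, Y2, and X and all the text strings are hypothetical.

```genstat
"Two graphs against the same x-values, with a user-supplied key"
DGRAPH [TITLE='Growth curves'; KEYDESCRIPTION='Treatment'] \
   Y1,Y2; X; DESCRIPTION='Control','Treated'
```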
DHISTOGRAM directive
Draws histograms on a plotter or graphics monitor.
Options
TITLE
= text General title; default *
WINDOW
= scalar Window number for the histograms; default 1
KEYWINDOW
= scalar Window number for the key (zero for no key); default 2
LIMITS
= variate Variate of group limits for classifying variates into groups; default *
NGROUPS
= scalar When LIMITS is not specified, this defines the number of groups into which a DATA variate is to be classified; default is the integer value nearest to the square root of the number of values in the variate
LABELS
= text Group labels; default *
APPEND
= string Whether or not the bars of the histograms are appended together (yes, no); default no
SCREEN
= string Whether to clear the screen before plotting or to continue plotting on the old screen (clear, keep); default clear
KEYDESCRIPTION
= text Overall description for the key; default *
ENDACTION
= string Action to be taken after completing the plot (continue, pause); default * uses the setting from the last DEVICE statement
Parameters
DATA
= identifiers Data for the histograms; these can be either a factor indicating the group to which each unit belongs, a variate whose values are to be grouped, or a one-way table giving the height of each bar
NOBSERVATIONS
= tables One-way table to save numbers in the groups
GROUPS
= factors Factor to save groups defined from a variate
PEN
= scalars Pen number for each histogram; default * uses pens 1, 2, and so on for the successive structures specified by DATA
DESCRIPTION
= texts Annotation for key
Description
The data for DHISTOGRAM can be specified in several ways. You can supply variates, which are sorted into groups defined by the LIMITS option or, when LIMITS is not specified, determined automatically from the number of groups given by NGROUPS. Alternatively, the groups can be defined by factors. Finally, the data can be supplied as a one-way table giving the height of each bar. In DHISTOGRAM the table is not constrained to contain positive integers: it may contain any numbers, positive or negative, thus allowing bar charts to be drawn. Details of the groups can be saved using the NOBSERVATIONS and GROUPS parameters.
The WINDOW option defines the window where the histogram is plotted, and the KEYWINDOW option similarly specifies where the key should appear. You can set either of these to zero if you want to suppress the corresponding output. Titles can be added to the histogram and key using the TITLE and KEYDESCRIPTION options respectively.
The SCREEN option controls whether the graphical display is cleared before the histogram is plotted and the ENDACTION option controls whether Genstat pauses at the end of the plot.
The APPEND option controls the form of display to be used when the DATA parameter specifies a list of structures. These parallel histograms can be produced in one of two styles. By default (APPEND=no), the histogram contains a set of bars for each structure, drawn in parallel groups. Alternatively, if you set APPEND=yes, the bars for the structures are concatenated into a single bar for each group. The top portion of each bar then corresponds to the first structure, and the bottom to the last structure.
The bars for each structure are all shaded according to the pen that has been specified for that structure, using the PEN parameter. If the PEN parameter is not set, Genstat uses the pens in turn, pen 1 for the first structure, pen 2 for the second structure, and so on, so that a different shading is used for each of the structures. The relevant aspects of the pens should be set in advance, if required, using the BRUSH and COLOUR parameters of the PEN directive. Often, however, the default attributes of the pens will be satisfactory.
The bars are drawn with equal width and the length of each bar is proportional to the number of values that it represents, irrespective of the width of the corresponding group. Thus, the areas of the bars will represent the relative frequencies of the groups only if the groups are of equal width.
The axes of the histogram are formed automatically from the data. By default, the upper bound of the y-axis is set to be five percent greater than the height of the longest bar. If any of the bars is of negative height the lower bound is adjusted in a similar way, otherwise it is set to zero. When the histogram is formed from a variate, the x-axis markings are set to indicate the limits of each bar or set of bars; when the data are provided in a factor the factor labels or levels are used to label the histogram bars, and when the bar heights are provided directly in a table the classifying factor of the table is used. You can control the form of the axes by using the AXES directive to set the required attributes before the DHISTOGRAM directive is used.
The WINDOW parameter of AXES should be set to the window in which the histogram is to be plotted (controlled by the WINDOW option of DHISTOGRAM). The STYLE parameter then controls which axes are drawn: x-axis only (by specifying x or none), x- and y-axes (y, xy, or grid), or x- and y-axes with a box (box). The YTITLE, YLOWER, YUPPER, YMARKS, and YLABELS parameters control annotation of the y-axis. The YUPPER parameter is particularly useful when you are plotting a series of histograms; by setting YUPPER to a value larger than any of the bars in any of the histograms, you can ensure that they are all plotted on the same scale. However, Genstat ignores the setting of this parameter if the longest bar is greater than the value supplied and, when checking this, you must be careful to allow for the effect of the APPEND option (see above). The x-axis bounds are defined by the data and cannot be altered. However, you can use the LABELS option (of DHISTOGRAM) to specify labels, and XMPOSITION and XLPOSITION to control the positioning of tick marks and labels.
The histogram key consists of the title, if set by KEYDESCRIPTION, followed by a legend for each structure plotted. This consists of a small rectangle that is drawn in the same colour and brush style as that used in the histogram, followed by the identifier name or the piece of text specified by the DESCRIPTION parameter.
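As a hedged illustration of the options and parameters described above, the following sketch draws two variates first as parallel histograms and then with the bars appended, on a common y-scale. The data values and the identifiers Before, After, Nbefore and Nafter are invented for this example, and the YUPPER setting assumes that no bar (including an appended bar) is longer than 10.

```genstat
"Hypothetical data: two samples to be compared"
VARIATE [VALUES=12,15,11,18,21,14,16,19,13,17] Before
VARIATE [VALUES=14,18,15,22,25,17,20,23,16,21] After

"Fix the upper bound of the y-axis in window 1, so that both plots share a scale"
AXES WINDOW=1; YUPPER=10

"Parallel bars (APPEND=no is the default), using common group limits"
DHISTOGRAM [TITLE='Before and after'; WINDOW=1; KEYWINDOW=2; LIMITS=!(12,16,20,24)] \
   DATA=Before,After; NOBSERVATIONS=Nbefore,Nafter; PEN=1,2; \
   DESCRIPTION='Before treatment','After treatment'

"The same data with the bars for each group appended into a single bar"
DHISTOGRAM [TITLE='Before and after'; APPEND=yes; LIMITS=!(12,16,20,24)] \
   DATA=Before,After; PEN=1,2

"Inspect the saved numbers in each group"
PRINT Nbefore
PRINT Nafter
```

Supplying the same LIMITS for both structures keeps the groups comparable. When APPEND=yes, the value given to YUPPER has to allow for the combined bar lengths rather than the individual ones.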
DIAGONALMATRIX directive
Declares one or more diagonal matrix data structures.
Options
ROWS
= scalar, vector, or pointer Number of rows, or labels for rows (and columns); default *
VALUES
= numbers Values for all the diagonal matrices; default *
MODIFY
= string Whether to modify (instead of redefining) existing structures (yes, no); default no
Parameters
IDENTIFIER
= identifiers Identifiers of the diagonal matrices
VALUES
= identifiers Values for each diagonal matrix
DECIMALS
= scalars Number of decimal places for printing
EXTRA
= texts Extra text associated with each identifier
MINIMUM
= scalars Minimum value for the contents of each structure
MAXIMUM
= scalars Maximum value for the contents of each structure
Description
Diagonal matrices are square matrices that have zero entries except on their leading diagonals: for example,
2 0 0
0 1 0
0 0 3
Another example is the identity matrix, which has a diagonal of values equal to 1. To save space, Genstat has a special structure for diagonal matrices, and these can be declared using the DIAGONALMATRIX directive.
Because a diagonal matrix is square, Genstat requires you to specify only the number of rows. This is done using the ROWS option. The simplest method is to use a scalar to define the number of rows explicitly. Alternatively, you can set ROWS to a variate, text, or pointer, whose length then defines the number of rows and whose values will then be used as labels, for example when the matrix is printed. Finally, if you specify a factor, the number of levels defines the number of rows and the labels if available, or otherwise the levels, are used for labelling.
When you give the values of a diagonal matrix, either in a declaration or when its values are read, you should specify only the diagonal elements. (Genstat does not store the off-diagonal elements, but assumes them to be zero.) Similarly, when a diagonal matrix is printed it appears as a column of numbers; Genstat omits the off-diagonal zeros. For example:
DIAGONALMATRIX [ROWS=3; VALUES=2,1,3] D
declares the diagonal matrix D and gives it the values shown above.
Values can be assigned to the diagonal matrices by either the VALUES option or the VALUES parameter. The option defines a common value for all the matrices in the declaration, while the parameter allows them each to be given a different value. If both the option and the parameter are specified, the parameter takes precedence.
If the MODIFY option is set to yes any existing attributes and values of the diagonal matrices are retained (if still appropriate); otherwise these are lost.
The DECIMALS parameter allows you to define a number of decimal places to be used by default when each diagonal matrix is printed. You can associate a text of extra annotation with each diagonal matrix using the EXTRA parameter. The MINIMUM and MAXIMUM parameters allow you to define lower and upper limits on the values in each diagonal matrix. Genstat then prints warnings if any values outside that range are allocated to the matrix.
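The difference between the VALUES option and the VALUES parameter can be sketched as follows. The identifiers Dose, Ident1, Ident2, D1 and D2 and all the values are invented for the illustration.

```genstat
"Row labels held in a text; its three values also define the number of rows"
TEXT [VALUES=low,medium,high] Dose

"VALUES option: both matrices receive the same diagonal (here an identity matrix)"
DIAGONALMATRIX [ROWS=Dose; VALUES=1,1,1] Ident1,Ident2

"VALUES parameter: each matrix receives its own diagonal; DECIMALS sets the default print precision"
DIAGONALMATRIX [ROWS=3] D1,D2; VALUES=!(2,1,3),!(0.5,0.25,0.125); DECIMALS=0,3

PRINT Ident1
PRINT D1
PRINT D2
```

Each matrix should print as a single column of three numbers, since the off-diagonal zeros are not stored.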
DISPLAY directive
Prints, or reprints, diagnostic messages.
Options
CHANNEL
= identifier Channel number of file, or identifier of a text to store output; default current output file
FAULT
= text Specifies the fault message to print (for example, FAULT='VA 4' prints the message "Values not set"); default is to print the last diagnostic message
No parameters
Description
By default, DISPLAY reprints the most recent diagnostic. Alternatively, you can use the FAULT option of DISPLAY to print any particular Genstat diagnostic. The CHANNEL option controls where the information is printed; the default is the current output file.
DISTRIBUTION directive
Estimates the parameters of continuous and discrete distributions.
Options
CBPRINT
= strings Printed output required from a fit combining all the input data (parameters, samplestatistics, fittedvalues, proportions, monitoring); default *
DISTRIBUTION
= string Distribution to be fitted (Poisson, geometric, logseries, negativebinomial, NeymanA, PolyaAeppli, PlogNormal, PPascal, Normal, dNvequal, dNvunequal, logNormal, exponential, gamma, Weibull, b1, b2, Pareto); default * i.e. fit nothing
CONSTANT
= string Whether to estimate a location parameter for the gamma, logNormal, Pareto, or Weibull distributions (estimate, omit); default omit
LIMITS
= variate Variate to specify or save upper limits for classifying the data into groups; default *
NGROUPS
= scalar When LIMITS is not specified, this defines the number of groups (of approximately equal size) into which the data are to be classified; default is the integer value nearest to the square root of the number of data values
XDEVIATES
= variate Variate to specify points up to which the CUMPROPORTIONS are to be estimated
JOINT
= string Requests joint estimates from the combined fit to be used for a re-fit to the separate data sets (dispersion, variancemeanratio, Poissonindex); default *
PARAMETERS
= variate Estimated parameters from the combined fit
SE
= variate Standard errors for the estimated parameters of the combined fit
VCOVARIANCE
= symmetric matrix Variance-covariance matrix for the estimated parameters of the combined fit
CUMPROPORTIONS
= variate Estimated cumulative proportions of the combined distribution up to the values specified by the XDEVIATES option
MAXCYCLE
= scalar Maximum number of iterations; default 30
TOLERANCE
= scalar Convergence criterion; default 0.0001
Parameters
DATA
= variates or tables Data values either classified (table) or unclassified (variate)
NOBSERVATIONS
= tables One-way table to save the data classified into groups
RESIDUALS
= tables Residuals from each (individual) fit
FITTEDVALUES
= tables Fitted values from each fit
PARAMETERS
= variates Estimated parameters from each fit
SE
= variates Standard errors of the estimates
VCOVARIANCE
= symmetric matrices Variance-covariance matrix for each set of estimated parameters
CUMPROPORTIONS
= variates Estimated cumulative proportions of each distribution up to the values specified by the XDEVIATES option
CBRESIDUALS
= tables Residuals from the combined fit
CBFITTEDVALUES
= tables Fitted values from the combined fit
STEPLENGTH
= variates Initial step lengths for each fit
INITIAL
= variates Initial values for each fit
Description
The DISTRIBUTION directive is used to fit an observed sample of data to a theoretical distribution function, in order to obtain maximum-likelihood estimates of the parameters of the distribution and test the goodness of fit. The data consist of observations $x_i$ of a random variable $X$, which has a distribution function $F(x)$ defined by $F(x) = \Pr(X \le x)$. A selection of both discrete and continuous distributions is available; full details are given below.
For discrete distributions $X$ may take non-negative integer values only, except for the log-series distribution where only positive integer values are allowed. For continuous distributions the random variable $X$ may take any values, subject to constraints for certain distributions; for example, data values must be strictly positive in order to fit a log-Normal distribution. Constraints are detailed with the individual distributions described below.
The data can be supplied to DISTRIBUTION as a variate or as a one-way table of counts. If the raw data are available, then these should be supplied (as a variate), since the raw data contain more information than grouped data.
If raw data are not available, then a one-way table of counts, or frequencies, should be given. The factor classifying the table must have its levels vector declared explicitly, since the levels are used to indicate the boundary values of the raw data used to create the grouping. For example, if the discrete variable X takes the values 0...8, with numbers of observations 2,6,7,4,2,1,0,1,0 respectively, a table of counts can be declared by
FACTOR [LEVELS=!(0...8)] F
TABLE [CLASSIFICATION=F; VALUES=2,6,7,4,2,1,0,1,0] T
The factor levels do not have to specify single data values: often it will be desirable to group certain values together, and indeed for continuous data this is the only sensible way to proceed. In general, for a classifying factor with levels $l_1, l_2, \ldots, l_f$, the count $n_k$ for the $k$th cell of the table will be the number of observations $x_i$ such that
$$
\begin{array}{ll}
x_i \le l_1, & k = 1 \\
l_{k-1} < x_i \le l_k, & 2 \le k \le f-1 \\
l_{f-1} < x_i, & k = f
\end{array}
$$
This means that for all except the last cell of the table, the factor level represents the upper limit on values in that cell. The final class of the table is termed the tail; it is formed by combining the frequencies for all values of $X$ greater than $l_{f-1}$, and the upper limit on values in the tail is infinity. For continuous distributions with no lower bound, the first class will be the lower tail. You will often want to form the tail(s) by amalgamating groups with low numbers of counts. In the example above, you might amalgamate the groups for values 6-8:
FACTOR [LEVELS=!(0...5,99)] F2
TABLE [CLASSIFICATION=F2; VALUES=2,6,7,4,2,1,1] T2
Note that the final factor level, for the tail, can be given a dummy value of 99 to indicate that it has no upper limit, since this value is never used in calculations.
When data is supplied as a table instead of as a variate, the computed log-likelihood is only an approximation to the full log-likelihood and the solution obtained will depend to some extent on the choice of class limits. More reliable results will be achieved with a larger number of classes, since this gives more information on the data distribution, so only classes with very few observations should be amalgamated. In general, care should be taken to choose class limits that give a reasonable number of counts in each class, but with none of the individual classes holding a disproportionately large number of observations.
The DISTRIBUTION option should be set to indicate which distribution is to be fitted to the data. The following distributions are available:
| Discrete | Continuous |
|---|---|
| Binomial (as a special case | Normal |
| Poisson | Double Normal (unequal variances) |
| Geometric | Log-Normal |
| Log-series | Exponential |
| Negative binomial | Gamma |
| Neyman type A | Weibull |
| Pólya-Aeppli | Beta type I |
| Poisson-log-Normal | Beta type II |
| Poisson-Pascal | Pareto |
The first step of the fitting process is to compute and print various sample statistics. Examining these may help in the selection of appropriate distributions for fitting; properties of the various distributions are listed at the end of this section. The setting DISTRIBUTION=* can be used to produce this output without any model fitting. The following sample statistics are calculated:
| Statistic | Definition | Applies to |
|---|---|---|
| Sample size | $n$ | |
| Sample mean | $m = \sum x_i / n$ | |
| Sample variance | $s^2 = \sum x_i^2 / n - m^2$ | discrete distributions |
| | $s^2 = \sum (x_i - m)^2 / (n-1)$ | continuous distributions |
| Sample skewness | $g_1 = \sum (x_i - m)^3 / \{(n-1) s^3\}$ | |
| | $\phantom{g_1} = m_3 / s^3$ | |
| Sample kurtosis | $g_2 = \sum \{(x_i - m)^4 / ((n-1) s^4)\} - 3$ | continuous distributions only |
| Sample quartiles | $x_p : F(x_p) = p$ | |
| Poisson index | $(s^2 - m) / m^2$ | discrete distributions only |
| Negative binomial index | $m (m_3 - 3 s^2 + 2m) / (s^2 - m)^2$ | discrete distributions only |
If the original data are not available, the sample statistics are calculated by substituting class mid-points in place of the data. For the lower tail, the class "mid-point" is taken to be $l_1 - \tfrac{1}{2}(l_2 - l_1)$ and for the upper tail, $l_{f-1} + \tfrac{1}{2}(l_{f-1} - l_{f-2})$. No corrections are made for groupings. When a distribution has been fitted to data, the relevant theoretical statistics of that distribution are printed for comparison with the sample statistics, as a check on the appropriateness of the model for the data.
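As a worked illustration with invented limits, suppose a table has $f = 4$ classes whose factor levels give the upper limits $l_1 = 10$, $l_2 = 20$ and $l_3 = 30$ (the level for the upper tail being a dummy value). The tail "mid-points" then follow from the two expressions for the tails. The mid-points of the two interior classes are assumed here to be the ordinary averages of their limits, which the text does not state explicitly.

```latex
\begin{aligned}
\text{lower tail:}\quad & l_1 - \tfrac{1}{2}(l_2 - l_1) = 10 - \tfrac{1}{2}(20 - 10) = 5 \\
\text{interior classes (assumed):}\quad & \tfrac{1}{2}(l_1 + l_2) = 15, \qquad \tfrac{1}{2}(l_2 + l_3) = 25 \\
\text{upper tail:}\quad & l_{f-1} + \tfrac{1}{2}(l_{f-1} - l_{f-2}) = 30 + \tfrac{1}{2}(30 - 20) = 35
\end{aligned}
```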
A summary is given of the fit: the parameter estimates are printed with their standard errors and correlations, including the working parameters, which are stable functions of the parameters defining the distribution and are used in the internal algorithm. The goodness of fit to the chosen distribution is indicated by the residual deviance which has an asymptotic chi-squared distribution with the specified degrees of freedom. The deviance is also the preferred statistic for comparison of nested models, for example the double Normal distribution with equal and unequal variances. This is followed by a table of observed and fitted values (expected frequencies), together with weighted residuals. If raw data are supplied, by default this table is formed by dividing the data into $\sqrt{n}$ groups of approximately equal observed frequency, which are therefore likely to be of unequal widths. The NGROUPS option may be used to set the number of groups for this table. If data are supplied as a table, the fitted values use the classification from that table. In either case the LIMITS option may be used to supply a different set of limits, with the constraint that if tabulated data are analysed these limits should be a subset of the original limits so that the new groups are formed by aggregation.
The NOBSERVATIONS, RESIDUALS, and FITTEDVALUES parameters can be used to save the number of observations in each cell, the residual, and the fitted number respectively (all in tables). The parameter estimates and their standard errors can be saved in variates specified by PARAMETERS and SE. The variance-covariance matrix for the estimated parameters can be saved as a symmetric matrix using the VCOVARIANCE parameter.
Having fitted the required distribution, the estimated cumulative distribution function (CDF) can be evaluated at specified values of X. These are defined using the XDEVIATES option. The values of the CDF can be printed (by selecting PRINT=proportions) or saved in a variate by setting the CUMPROPORTIONS parameter.
If you have several sets of data you may be interested in fitting the distribution individually to each set; this can be done by setting the DATA parameter to a list of identifiers. A separate analysis is then performed for each set of data, but of course any option settings are common to all the data sets. The data sets should all be specified in the same way, either as raw data or as tabulated counts. For tabulated counts, the same categories must be used for defining every table. You can also carry out one final fit to the combined data set, in order to investigate whether the data can be adequately modelled as coming from a single population. This combined fit is produced if any of the options relating to the combined fit have been set (that is, options CBPRINT, PARAMETERS, SE, VCOVARIANCE, or CUMPROPORTIONS, which print or save information from the combined analysis). For each individual data set you can also save fitted values and residuals based on the parameters estimated from the combined data set, using the CBRESIDUALS and CBFITTEDVALUES parameters. The JOINT option can be used to specify that certain parameters should be held constant at their estimated values from the combined analysis during refits to the individual data sets. For continuous distributions only, a common dispersion parameter can be requested; for discrete distributions a common value can be requested for either the Poisson index or the ratio of variance to mean. An analysis of deviance is printed to compare the nested models.
If the original data are available, the full log-likelihood is used in the optimization algorithm. Otherwise, an approximate log-likelihood is optimized, using representative values for each class. For some distributions, it is necessary to use stable working parameters in the optimization algorithm (Ross 1990), and the defining parameters for the distribution are then evaluated by a simple transformation.
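The separate and combined fits described above might be requested along the following lines. This is only a sketch: the second table T3 and all of the save identifiers are invented, and the counts are illustrative rather than real data.

```genstat
"Grouped counts for two hypothetical samples, sharing one classifying factor"
FACTOR [LEVELS=!(0...5,99)] F2
TABLE [CLASSIFICATION=F2; VALUES=2,6,7,4,2,1,1] T2
TABLE [CLASSIFICATION=F2; VALUES=3,5,8,5,2,1,1] T3

"Separate Poisson fits to each table, saving estimates, fitted values and residuals"
DISTRIBUTION [DISTRIBUTION=Poisson] DATA=T2,T3; PARAMETERS=Est2,Est3; \
   SE=Se2,Se3; FITTEDVALUES=Fit2,Fit3; RESIDUALS=Res2,Res3
PRINT Est2,Se2
PRINT Est3,Se3

"Setting CBPRINT requests a final fit to the combined data as well"
DISTRIBUTION [DISTRIBUTION=Poisson; CBPRINT=parameters,fittedvalues] DATA=T2,T3; \
   CBFITTEDVALUES=Cbfit2,Cbfit3; CBRESIDUALS=Cbres2,Cbres3
```

Both tables use the same factor F2, in line with the requirement that tabulated data sets share the same categories.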
The deviance and corresponding degrees of freedom that are printed as part of the model summary are based on the table of fitted values, and thus may be affected by the choice of limits. The residuals computed are deviance residuals (McCullagh and Nelder 1989), and the deviance is therefore the sum of squared residuals. The degrees of freedom are $n-p-1$, where $n$ is the number of cells in the table of fitted values and $p$ is the number of parameters estimated in the model. The default limits for grouping the raw data are designed to avoid small expected frequencies (for example in the tail cells) which can have an inflationary effect on the deviance; however, if the tails are important, because of the origin of the data, it may be important to specify the limits explicitly.
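For instance, if a Poisson distribution (one estimated parameter, so $p = 1$) is fitted to the seven-cell table T2 declared earlier, and the table of fitted values keeps that classification (so $n = 7$), the degrees of freedom and the deviance $D$ would be as below. The symbol $r_k$, denoting the deviance residual for cell $k$, is notation introduced here for the illustration.

```latex
\text{d.f.} = n - p - 1 = 7 - 1 - 1 = 5,
\qquad
D = \sum_{k=1}^{7} r_k^2
```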
An iterative Gauss-Newton optimization method is used to estimate the parameters of the distribution. The parameterization is chosen for each model so that the optimization is stable, but if there are any problems with particular data sets it may be necessary to control this process. The MAXCYCLE and TOLERANCE options allow you to increase the number of iterations and alter the convergence criterion for data sets that fail to converge. You can also specify initial values and step lengths for the parameters for each set of data using the INITIAL and STEPLENGTH parameters. These parameters should be set to variates of length appropriate for the distribution being fitted; for example, if DISTRIBUTION=Poisson they should have just one value. Another use of INITIAL and STEPLENGTH is to constrain a parameter to a particular value; for example when fitting a double Normal the proportion parameter p could be fixed at 0.5 by setting the initial value to 0.5 and the step length to 0, thus fitting a double Normal in equal proportions. Note that the degrees of freedom are not adjusted to take account of this.
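A minimal sketch of controlling the iteration for a one-parameter model is given below. The data and the identifiers Ncaught, Start, Step, Est and Ese are invented, and the example assumes that the initial value is supplied on the scale of the Poisson mean; check the description of the Poisson parameterization before relying on this.

```genstat
"Hypothetical raw counts per sampling unit"
VARIATE [VALUES=0,1,1,2,0,3,1,2,4,1,0,2,1,1,3,2] Ncaught

"The Poisson has one parameter, so INITIAL and STEPLENGTH each hold one value"
"(assumption: the initial value is on the scale of the mean)"
VARIATE [VALUES=1.5] Start
VARIATE [VALUES=0.1] Step

"Allow more iterations and a tighter convergence criterion than the defaults"
DISTRIBUTION [DISTRIBUTION=Poisson; MAXCYCLE=60; TOLERANCE=0.00001] \
   DATA=Ncaught; INITIAL=Start; STEPLENGTH=Step; PARAMETERS=Est; SE=Ese
PRINT Est,Ese
```

Setting a value of Step to 0 would instead hold the corresponding parameter at its initial value, as described for the double Normal proportion.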
References
McCullagh, P. and Nelder, J.A. (1989). Generalized linear models (second edition). Chapman and Hall, London.
Ross, G.J.S. (1990). Nonlinear estimation. Springer-Verlag, New York.
DKEEP directive
Saves information from the last plot on a particular device.
No options
Parameters
DEVICE
= scalars The devices for which information is required; if the scalar is undefined or contains a missing value, this returns the current device number
WINDOW
= scalars Window about which the information is required; default * gives information about the last window
YLOWER
= scalars Lower bound for the y-axis in last graph in the specified device and window
YUPPER
= scalars Upper bound for the y-axis in last graph in the specified device and window
XLOWER
= scalars Lower bound for the x-axis in last graph in the specified device and window
XUPPER
= scalars Upper bound for the x-axis in last graph in the specified device and window
FILE
= scalars Returns the value 1 or 0 to indicate whether a file is required for this device
DESCRIPTION
= texts Description of the device
DREAD
= scalars Returns the value 1 or 0 to indicate whether graphical input is possible from this device
ENDACTION
= texts Returns the current ENDACTION setting ('continue' or 'pause')
Description
DKEEP provides information that can be used in general programs and procedures to control the graphical output. For the specified device you can determine whether it generates screen output or uses a file, whether graphical input is possible, a description of the device (as printed by HELP), the current ENDACTION setting, and details of the axis bounds.
The device for which the information is required is specified by the DEVICE parameter. If you specify a scalar containing a missing value, this will be set to the number of the current graphics device. You can then test whether an output file is needed and open one accordingly.
When writing a procedure you can find out if axes bounds have been set explicitly, using the SAVE parameter of AXES. This information may then be used when setting up the axes for other graphs. However, if the bounds were not set, but have been evaluated from the data (or if the axes have subsequently been redefined) the information in the save structure will not be of any use. The actual values used when plotting are recorded internally, for each window of each device, and can be accessed using the YLOWER, YUPPER, XLOWER, and XUPPER parameters of DKEEP.
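The following sketch shows one way these parameters might be combined in a program. All of the identifiers (Dev, Needfile, Caninput, Devname, Endact, Ylo, Yup, Xlo, Xup) are invented for the example, and it assumes that a graph has already been drawn in window 1 of the current device.

```genstat
"A scalar declared without a value is missing, so DKEEP returns the current device in it"
SCALAR Dev
DKEEP DEVICE=Dev; FILE=Needfile; DREAD=Caninput; DESCRIPTION=Devname; ENDACTION=Endact
PRINT Dev,Needfile,Caninput
PRINT Devname,Endact

"Recover the bounds actually used for the last graph in window 1 of that device"
DKEEP DEVICE=Dev; WINDOW=1; YLOWER=Ylo; YUPPER=Yup; XLOWER=Xlo; XUPPER=Xup
PRINT Ylo,Yup,Xlo,Xup

"Reuse the y-bounds so that the next graph in window 1 is drawn on the same scale"
AXES WINDOW=1; YLOWER=Ylo; YUPPER=Yup
```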
DPIE directive
Draws a pie chart on a plotter or graphics monitor.
Options
TITLE
= text General title; default *
WINDOW
= scalar Window number for the pie chart; default 1
KEYWINDOW
= scalar Window number for the key (zero for no key); default 2
SCREEN
= string Whether to clear the screen before plotting or to continue plotting on the old screen (clear, keep); default clear
KEYDESCRIPTION
= text Overall description for the key
ENDACTION
= string Action to be taken after completing the plot (continue, pause); default * uses the setting from the last DEVICE statement
Parameters
SLICE
= scalars Amounts in each of the slices (or categories)
PEN
= scalars Pen number for each slice; default * uses pens 1, 2, and so on for the successive slices
DESCRIPTION
= texts Description of each slice
Description
A pie chart is formed by taking the values of the scalars in the SLICE parameter, in order, and representing them by segments of a circle starting at "three o'clock" and working in an anti-clockwise direction. The angle subtended by each segment (and thus the area of the segment) is proportional to the value of the corresponding scalar. The values may be raw data or can be expressed as percentages (by ensuring they total 100).
The brush style used for each segment can be controlled using the PEN parameter. By default, pen 1 is used for the first segment, pen 2 for the second segment, and so on. The default attributes of the pens are device specific, so that on a colour display the segments will be solid-filled using different colours, and on a monochrome device different hatching styles will be used. These can be modified using the PEN directive.
Individual segments can be displaced outwards from the centre, to obtain an "exploded" pie chart. The chosen segments are indicated by setting the corresponding scalars in the SLICE parameter list to negative values.
The WINDOW and KEYWINDOW options specify the windows in which the pie chart and key are to be displayed. The shape of the pie chart is determined by the dimensions of the window; if it is not square the resulting pie chart will be elliptical.
Titles can be added using the TITLE and KEYDESCRIPTION options. The key produced for the pie chart is similar to that produced by the DHISTOGRAM directive: a shaded block is drawn for each segment, followed by the identifier name or the piece of text specified by the DESCRIPTION parameter.
The SCREEN option controls whether the graphical display is cleared before the pie chart is plotted and the ENDACTION option controls whether Genstat pauses at the end of the plot.
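An "exploded" pie chart of the kind described above could be requested as in this sketch. The scalars Housing, Food, Travel and Other, their values (percentages totalling 100 in absolute value) and the titles are all invented for the illustration.

```genstat
"Hypothetical percentages; the negative value displaces the Food slice outwards"
SCALAR Housing,Food,Travel,Other; VALUE=40,-25,20,15

"Pie chart in window 1 with its key in window 2"
DPIE [TITLE='Household spending'; WINDOW=1; KEYWINDOW=2; KEYDESCRIPTION='Category'] \
   SLICE=Housing,Food,Travel,Other; PEN=1,2,3,4; \
   DESCRIPTION='Housing costs','Food and drink','Travel','Other items'
```

Omitting DESCRIPTION would leave the identifier names of the scalars in the key instead.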
DREAD directive
Reads the locations of points from an interactive graphical device.
Options
CHANNEL
= scalar Number of the graphics device from which to read; default * takes the current graphics device
WINDOW
= scalar Window from which to read; default 1
CURSORTYPE
= scalar Type of cursor; default 1
SETNVALUES
= string Whether to set number of values of structures from the number of values read (yes, no); default no causes the number of values to be set only for structures whose lengths are not defined already
ENDACTION
= string Action to be taken after completing the plot (continue, pause); default * uses the setting from the last DEVICE statement
Parameters
Y
= variates Variate to receive the y-values that have been read
X
= variates Variate to receive the x-values that have been read
YGIVEN
= variates Y-coordinates of points that may be located on the graph
XGIVEN
= variates X-coordinates of points that may be located
SAVESET
= variates Unit numbers of the located points
PEN
= scalars Pen number to use to echo points; default 0
YSAVE
= variates Variate to receive the y-coordinates of the located points
XSAVE
= variates Variate to receive the x-coordinates of the located points
Description
The
DREAD directive allows you to input information about the positions of points on interactive graphical terminals. The exact details of how this directive operates will vary slightly from one system to another, so this section attempts to outline the basic principles involved. If you encounter any difficulties using DREAD you should refer to the Users' Note supplied with your version of Genstat.When you type
DREAD, a cursor should appear on the graphics screen. This can be moved to the chosen position by using the cursor keys or a mouse; the coordinates of this point can then be read by pressing a key or mouse button (normally the left hand mouse button). The cursor can then be moved to another position to read the next point. You can use graphical input within any window that contains a graph or contour plot, but you cannot input data from an "empty" window or one containing other forms of graphical output. In addition you can identify particular points from those plotted on an existing graph and you can mark the points that you have read.
The CHANNEL and WINDOW options are used to specify the device and the window from which the information is to be read; the default is to read from window 1 of the current device. The values that are read are converted to the scale of the data that was previously plotted in that window, and are then stored in the pair of variates specified by the Y and X parameters.
Any number of points may be read in one DREAD statement. If the required number of points is known in advance, the Y and X variates can be declared with the appropriate length, and the input will terminate automatically when sufficient points have been read. Alternatively, if the lengths of the variates have not been defined in advance, points are read until you terminate the input, and the variates are defined accordingly. This action can be requested explicitly by setting option SETNVALUES=yes; the existing variate lengths are then ignored and points are read until the input is terminated. Graphical input can usually be terminated in two ways, either by pressing a mouse button (usually the right button) or a key that has been specifically defined for this purpose, or by attempting to read a point lying outside the current axes. In case of difficulty you should refer to the Users' Note which will explain how to terminate the input on specific devices. The final point read as a terminator is not included in the Y and X variates. If you try to terminate input prematurely when a set number of values is to be read, the corresponding Y and X values are set to missing values.
The PRINT option of DREAD is similar to the PRINT option of READ. Putting PRINT=data lists the y- and x-values of the points that have been read, while PRINT=summary generates the usual summary of mean, minimum, maximum, and number of values.
Several types of cursor may be available; again this will depend on the graphics device. The cursor is selected by setting the CURSORTYPE option to an integer between 1 and 10. Normally cursors 1, 2, and 3 are different graphics cursors; for example, large cross-hair, arrow, and small cross. Cursors 4 and 5 may be set up to provide special functions called rubber-band and rubber-rectangle.
A rubber-band cursor works by reading one point in the normal way (as if CURSORTYPE was set to 1). This defines an anchoring point for a line whose other end is attached to the cursor. As you move the cursor, the line will change direction and contract or expand, but always linking the fixed point to the current cursor position: hence the term "rubber-band". When you read the next point this will become the anchor point for a new rubber-band segment which you use whilst locating a third point, and so on until the required number of points have been read.
The rubber-rectangle works in a similar way, with the first point being read with a normal cursor. This defines the fixed point and the cursor is now regarded as being attached to the diagonally opposite corner of a rectangle which will contract and expand as you move the cursor around the screen. Reading the second point terminates the input; with a rubber-rectangle cursor Genstat will always read exactly two values, ignoring the SETNVALUES option and any predefined length of Y and X.
The rubber-band and rubber-rectangle types of cursors may not be available on all devices, in which case setting CURSORTYPE to 4 or 5 will use one of the simpler cursors. However, setting CURSORTYPE to 5 will always read just two points, regarded as being diagonally opposite corners of a rectangle, whether or not the rubber-rectangle appears on the screen.
Some devices may have more than one method of manipulating the graphics cursor, for example by use of a joystick or mouse. In this case, cursor-types 1 to 5 will be set up as described above for the joystick, say, and types 6 to 10 will be the same types of cursor but controlled by the mouse. Usually, however, there will be only one method of control, in which case cursor-types 6 to 10 will be the same as types 1 to 5.
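The following is a minimal sketch of the two ways in which the number of points can be fixed, not a definitive recipe. It assumes that variates Y and X already exist and that DGRAPH is used to plot them in window 1 of the current device; the identifiers Ypos, Xpos, Ycorner and Xcorner are hypothetical names chosen for illustration.

```genstat
" Plot the data so that window 1 contains a graph to read from "
DGRAPH Y; X
" Lengths declared in advance: input stops after three points "
VARIATE [NVALUES=3] Ypos,Xpos
DREAD [PRINT=data; WINDOW=1; CURSORTYPE=1] Ypos; Xpos
" CURSORTYPE=5 always reads exactly two points (opposite corners) "
DREAD [PRINT=data; CURSORTYPE=5] Ycorner; Xcorner
```

In the second DREAD statement the lengths of Ycorner and Xcorner need not be declared, since a rubber-rectangle cursor ignores SETNVALUES and any predefined lengths.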
The PEN parameter of DREAD can be used to specify a pen which will be used to plot each point as its position is read. The various attributes of this pen determine how the points are plotted; these can be modified, in the usual way, using the PEN directive. If the pen method is set to line, monotonic, open, or closed, then straight line segments will be drawn between the points; otherwise just the points themselves are plotted. If the points are to be joined by lines and a rubber-rectangle cursor is being used, the rectangle will be drawn rather than the diagonal line. If labels are set for the pen, they will be used in turn to mark the points as they are read; if the number of points exceeds the number of labels the labels will be recycled.
The YGIVEN and XGIVEN parameters allow you to identify points that have been plotted in an existing graph. They should be set to the y- and x-variates that were plotted on the graph. Each point that is read by DREAD is then located within this pair of variates, by finding the original point that is physically nearest to the new point, ignoring any differences in the scales of the y- and x-values. The unit number of the located points can be saved in a variate specified by the SAVESET parameter, and their coordinates in a pair of variates supplied by the YSAVE and XSAVE parameters. The length of the variates is defined in the same way as for the Y and X variates. The variates saved by YSAVE and XSAVE contain the actual coordinates of the plotted points that were selected by DREAD; whereas the Y and X variates contain the coordinates of the exact position of the cursor. The SAVESET variate indicates the unit numbers of the selected points. This information could be used, for example, in CALCULATE or RESTRICT statements to refer to the units that have been identified on the graph. For example,
DREAD U; V; YGIVEN=Y; XGIVEN=X; SAVESET=SS
RESTRICT Y,X; .NOT.EXPAND(SS; NVALUES(X))
would have the effect of excluding the points identified by DREAD; in this example the exact cursor locations recorded in U and V are not of interest.
When the PEN parameter is being used to mark the points that are read, you may want to pause at the end of the read so that you can inspect the modified graph. This is controlled by the ENDACTION parameter.
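As a hedged illustration of marking points while identifying them, the sketch below joins the points with pen 2 and saves both the selected plotted points and their unit numbers. It assumes that Y and X are the variates already plotted in the window, and that the pen method is set through a METHOD parameter of the PEN directive (an assumption here; the parameter name is not given in this section). Ysel and Xsel are hypothetical identifiers.

```genstat
" Assumed syntax: set the method of pen 2 so that points are joined by lines "
PEN 2; METHOD=line
DREAD [PRINT=data] U; V; PEN=2; YGIVEN=Y; XGIVEN=X; \
  YSAVE=Ysel; XSAVE=Xsel; SAVESET=SS
" Ysel and Xsel hold the plotted points nearest to each cursor position "
PRINT SS,Ysel,Xsel
```

Here U and V record the exact cursor positions, while Ysel and Xsel record the coordinates of the original points that were located.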
DROP directive
Drops terms from a linear, generalized linear, generalized additive, or nonlinear model.
Options
NONLINEAR = string How to treat nonlinear parameters between groups (common, separate, unchanged); default unch
CONSTANT = string How to treat the constant (estimate, omit, unchanged); default unch
FACTORIAL = scalar Limit for expansion of model terms; default * i.e. that in previous TERMS statement
POOL = string Whether to pool ss in accumulated summary between all terms fitted in a linear model (yes, no); default no
DENOMINATOR = string Whether to base ratios in accumulated summary on rms from model with smallest residual ss or smallest residual ms (ss, ms); default ss
NOMESSAGE = strings Which warning messages to suppress (dispersion, leverage, residual, aliasing, marginality, df, inflation); default *
FPROBABILITY = string Printing of probabilities for variance and deviance ratios (yes, no); default no
TPROBABILITY = string Printing of probabilities for t-statistics (yes, no); default no
SELECTION = strings Statistics to be displayed in the summary of analysis produced by PRINT=summary, the first four are relevant only for a Normally distributed response, and the last only for a gamma-distributed response (%variance, %ss, adjustedr2, r2, seobservations, dispersion, %cv); default %var,seob if DIST=normal, %cv if DIST=gamma, and disp for other distributions
Parameter
formula List of explanatory variates and factors, or model formula
Description
DROP deletes terms from the current regression model, which may be linear, generalized linear, generalized additive, standard curve, or nonlinear. It is best to give a TERMS statement before investigating sequences of models using DROP, in order to define a common set of units for the models that are to be explored. If no model is fitted after the TERMS statement, the current model is taken to be the null model.
The model fitted by DROP will include a constant term if the previous model included one, and will not include one if the previous model did not. You can, however, change this using the CONSTANT option.
The options of DROP are the same as those of the FIT directive, but with the extra NONLINEAR option which is relevant when fitting curves. For example, if we have a variate Dilution and a factor Solution, the program below will fit curves with separate linear and nonlinear parameters for the different solutions.
MODEL Density
TERMS Dilution * Solution
FITCURVE [PRINT=model,estimates; CURVE=logistic; \
NONLINEAR=separate] Dilution * Solution
If we then put
DROP [NONLINEAR=common]
the curves will be constrained to have common nonlinear parameters, but all linear parameters will still be estimated separately for each group.
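For an ordinary linear regression, the recommended sequence of TERMS, then a fit, then DROP might look like the sketch below. The response Yield and the explanatory variates Nitrogen, Phosphate and Potash are hypothetical and are assumed to have been declared and given values already.

```genstat
MODEL Yield
" TERMS defines a common set of units for all the models explored "
TERMS Nitrogen + Phosphate + Potash
FIT Nitrogen + Phosphate + Potash
" Remove Potash and print probabilities for the variance ratios "
DROP [FPROBABILITY=yes] Potash
" Remove Phosphate and also omit the constant term "
DROP [CONSTANT=omit] Phosphate
```

Each DROP statement modifies the current model, so the second statement leaves a model containing only Nitrogen, with no constant.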
DSURFACE directive
Produces perspective views of two-way arrays of numbers.
Options
TITLE = text General title; default *
WINDOW = scalar Window number for the plots; default 1
ELEVATION = scalar The elevation of the viewpoint relative to the surface; default 25 (degrees)
AZIMUTH = scalar Rotation about the horizontal plane; the default of 225 degrees ensures that, with a square matrix M, the element M$[1;1] is nearest to the viewpoint
DISTANCE = scalar Distance of the viewpoint from the centre of the grid on the base plane; default * gives a distance of 25 times the number of y points in the grid
ZORIGIN = scalar Defines the origin of the diagram along the z-axis; default * takes the value defined by LOWERCUTOFF
ZSCALE = scalar Defines the scaling of the z-axis relative to the horizontal (x-y) axes; default 1
LOWERCUTOFF = scalar Lower cut-off for array values; default *
UPPERCUTOFF = scalar Upper cut-off for array values; default *
SCREEN = string Whether to clear the screen before plotting or to continue plotting on the old screen (clear, keep); default clea
ENDACTION = string Action to be taken after completing the plot (continue, pause); default * uses the setting from the last DEVICE statement
Parameters
GRID = identifier Pointer (of variates representing the columns of a data matrix), matrix, or two-way table specifying values on a rectangular grid
PEN = scalar Pen number to be used for the plot; default 1
Description
The DSURFACE directive produces a perspective (or conical) projection of a surface, showing the view from a particular viewpoint. The surface is represented by a grid of z-values or heights. The grid can be a rectangular matrix, a two-way table, or a pointer to a set of variates; the y-dimension is represented by the rows of the structure and the x-dimension by the columns. In each case there must be at least three rows and three columns of data (after allowing for any restrictions on a set of variates). Missing values are not permitted; that is, only complete grids can be displayed. If the grid is supplied as a table with margins, these will be ignored when plotting the surface.
The position of the viewpoint is specified in polar coordinates, using the options ELEVATION, DISTANCE, and AZIMUTH. These define the angle of elevation, in degrees, above the base plane of the surface, distance from the centre of this plane, and angular position relative to the vertical z-axis, respectively.
The default settings of ELEVATION, DISTANCE, and AZIMUTH have been chosen to produce a reasonable display of most surfaces; but if, for example, some parts of the surface are obscured by high points they can be modified to obtain a better view. Altering the value of AZIMUTH will, in effect, rotate the surface in the horizontal plane about a vertical axis drawn through the centre of the grid; the default value of 225 degrees ensures that the element in the first row and column of the grid is at the corner nearest the viewpoint.
The LOWERCUTOFF and UPPERCUTOFF options specify lower and upper bounds for the z-axis. You can use these to truncate the grid or, alternatively, you can set them to values outside the range of the data to obtain compatible scales when you are plotting several grids. The ZORIGIN option allows the origin of the z-axis to be specified. By default, LOWERCUTOFF and UPPERCUTOFF will be set to the minimum and maximum grid values, and ZORIGIN to the value of LOWERCUTOFF.
The ZSCALE option specifies a scaling factor for the z-axis (or vertical axis) of the plotted surface. Generally values between 0.5 and 2.0 are most successful; large values result in a flatter surface, while smaller values produce a steep surface, accentuating changes in the data.
The TITLE, WINDOW, SCREEN, and ENDACTION options are used to specify a title, the plotting window, whether the screen should be cleared first, and whether there should be a pause once the plotting is finished; as in other graphics directives.
The PEN parameter specifies the pen to be used to plot the surface (by default, pen 1). The PEN directive can be used to modify the colour and the thickness of the pen, but the other attributes of the pen are ignored.
Simple axes are drawn to indicate the directions in which x and y increase. The YTITLE and XTITLE parameters of the AXES directive can be used to add further annotation.
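A short sketch of how the viewpoint options might be combined is given below. The matrix Height and its values are invented purely for illustration, and the option settings are arbitrary choices rather than recommended values.

```genstat
" A 4 x 4 grid of heights: rows give the y-dimension, columns the x-dimension "
MATRIX [ROWS=4; COLUMNS=4; VALUES=1,2,3,2, 2,4,5,3, 3,5,6,4, 2,3,4,3] Height
" Default viewpoint "
DSURFACE Height
" Higher viewpoint, rotated about the vertical axis, with a different z-scaling "
DSURFACE [TITLE='Height surface'; ELEVATION=40; AZIMUTH=200; ZSCALE=1.5] \
  GRID=Height; PEN=2
```

Comparing the two plots is a simple way to judge whether a change of ELEVATION or AZIMUTH reveals parts of the surface that were obscured in the default view.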
DUMMY directive
Declares one or more dummy data structures.
Options
VALUE = identifier Value for all the dummies; default *
MODIFY = string Whether to modify (instead of redefining) existing structures (yes, no); default no
Parameters
IDENTIFIER = identifiers Identifiers of the dummies
VALUE = identifiers Value for each dummy
EXTRA = texts Extra text associated with each identifier
Description
The IDENTIFIER parameter lists the identifiers of the dummies that are to be declared. Dummies store the identifiers of other structures. These are particularly useful when you want the same series of statements to be used with several different data structures. By using a dummy structure within the statements, you can make them apply to whichever structure you require. The dummy structure is like a plug which can be connected to the structure that you need; the important point is that you can then connect another structure without changing the statements themselves. In nearly all identifier lists Genstat will replace a dummy by the identifier that it stores. The only exceptions are the IDENTIFIER parameter of the DUMMY directive itself, the STRUCTURE parameter of ASSIGN, the parameters of FOR, and the UNSET function in expressions. (The most obvious occasions where this is useful are in loops and procedures, and there the dummies are declared automatically.)
Values can be assigned to the dummies by either the VALUE option or the VALUE parameter. The option defines a common value for all the structures in the declaration, while the parameter allows the structures each to be given a different value. If both the option and the parameter are specified, the parameter takes precedence.
You can associate a text of extra annotation with each dummy using the EXTRA parameter.
If MODIFY is set to yes any existing attributes and values of the dummies are retained; otherwise these are lost.
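The "plug" idea can be sketched as follows; the variates A and B and the dummy D are hypothetical. The same PRINT statement is used twice, but it applies to a different structure each time because the dummy has been reconnected in between.

```genstat
VARIATE [VALUES=1,2,3] A
VARIATE [VALUES=10,20,30] B
" D stores the identifier A, so PRINT D prints the values of A "
DUMMY D; VALUE=A
PRINT D
" Reconnect D to B; the PRINT statement itself is unchanged "
DUMMY D; VALUE=B
PRINT D
```

The second DUMMY statement redefines D rather than A, because the IDENTIFIER parameter of DUMMY is one of the places where a dummy is not replaced by the identifier that it stores.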
DUMP directive
Prints information about data structures, and internal system information.
Options
CHANNEL = identifier Channel number of file, or identifier of a text to store output; default current output file
INFORMATION = strings What information to print for each structure (brief, full, extended); default brie
TYPE = strings Which types of structure to include in addition to those in the parameter list (all, diagonalmatrix, dummy, expression, factor, formula, LRV, matrix, pointer, scalar, SSPM, symmetricmatrix, table, text, TSM, variate); default * i.e. none
COMMON = strings Which internal Fortran commons to display (all, banks, fdg, ich, iin, iot, jdd, jix, jrt, lcp, lfn, opr, out, ucs, usy, uws); default * i.e. none
SYSTEM = string Whether to display Genstat system structures (yes, no); default no
UNNAMED = string Whether to display unnamed structures (yes, no); default no
Parameter
identifiers or numbers Identifier or reference number of a structure whose information is to be printed
Description
The structures for which the information is to be displayed are specified by the parameter of DUMP. The PRINT option indicates what is to be presented: you can ask for just the identifiers, or values and identifiers, or attributes (the identifier is itself an attribute), or for all three. For example, to get all three for the structures A and B you would put:
DUMP [PRINT=attributes,values] A,B
If the CHANNEL option is set to a scalar, this specifies the output channel to which the information is sent. Alternatively, if you specify the identifier of a text structure, the lines of information will be stored in the text instead of being printed; likewise if you specify the identifier of a structure that has not yet been declared, it will be defined automatically as a text to store the information. If CHANNEL is not specified, the information is displayed on the current output channel.
The INFORMATION option selects which attributes are presented. The default setting brief selects only the most important ones. The setting full causes all the attributes to be presented, and the setting extended also gives details of the structures associated with listed structures.
Some of the attributes may be set to unnamed structures. You can obtain further information about any of these by giving its (negative) reference number (as displayed by DUMP when indicating its association with another structure) in the parameter list. This is likely to be useful mainly to advanced users.
The TYPE option lets you display, in addition, lists of all structures of a particular type, or of several types. For example, if you had forgotten the identifier of a factor, you could give the statement
DUMP [TYPE=factor; PRINT=identifiers]
This lists all the current factors. When PRINT=attributes or values (or both), the setting TYPE=all provides a list of all named and unnamed structures, except system structures. Setting PRINT=identifiers with TYPE=all lists only named structures.
The COMMON option is provided to allow those developing or extending Genstat to display useful internal information. Similarly, the SYSTEM option allows all the system structures to be dumped: there are many of these, so it is not a good idea to set this option frivolously.
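The CHANNEL and INFORMATION options can be combined as in the sketch below, which uses only settings described above. The variate A and the text Info are hypothetical identifiers; Info is not declared beforehand, so it should be defined automatically as a text.

```genstat
VARIATE [VALUES=1,2,3,4,5] A
" Display all the attributes of A on the current output channel "
DUMP [PRINT=attributes; INFORMATION=full] A
" Store the same lines of information in the text Info instead of printing "
DUMP [CHANNEL=Info; PRINT=attributes; INFORMATION=full] A
PRINT Info
```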
DUPLICATE directive
Forms new data structures with attributes taken from an existing structure.
Option
ATTRIBUTES
= strings Which attributes to duplicate (all, nvalues, values, nlevels, levels, labels (of factors or pointers), extra, decimals, characters, rows, columns, classification, margins, suffixes, minimum, maximum, restriction); default all
Parameters
OLDSTRUCTURE = identifiers Data structures to provide attributes for the new structures
NEWSTRUCTURE = identifiers Identifiers of the new structures
VALUES = identifiers Values for each new structure
DECIMALS = scalars Number of decimals for printing numerical structures
CHARACTERS = scalars Number of characters for printing texts or labels of a factor
EXTRA = texts Extra text associated with each identifier
MINIMUM = scalars Minimum value for numerical structures
MAXIMUM = scalars Maximum value for numerical structures
Description
The DUPLICATE directive allows you to define new data structures with attributes like those of existing structures. The attributes to be duplicated are defined by the ATTRIBUTES option. The structures from which the attributes are to be taken are specified by the OLDSTRUCTURE parameter, while the structures that are to be defined are specified by the NEWSTRUCTURE parameter. The other parameters allow some of the more important attributes to be reset at the same time. For example, here the factor Species2 takes its levels (and thus its number of levels) from the factor Species1. However, the labels are not transferred, and other values are defined using the VALUES parameter.
FACTOR [LEVELS=!(0,1); LABELS=!T(absent,present); \
VALUES=0,1,1,0,0,0,1] Species1
DUPLICATE [ATTRIBUTES=levels] Species1; \
NEWSTRUCTURE=Species2; VALUES=!(1,0,1,1,0,1,0)
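A second, simpler sketch relies on the default setting ATTRIBUTES=all, so that every attribute, including the values, is taken from the old structure, while the DECIMALS parameter resets one attribute at the same time. The variates Old and New are hypothetical.

```genstat
VARIATE [VALUES=1.25,2.5,3.75; DECIMALS=2] Old
" New copies all the attributes of Old but is printed with one decimal "
DUPLICATE Old; NEWSTRUCTURE=New; DECIMALS=1
PRINT Old,New
```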
D3HISTOGRAM directive
Produces three-dimensional histograms.
Options
WINDOW = scalar Window number for the plots; default 1
TITLE = text General title; default *
ELEVATION = scalar The elevation of the viewpoint relative to the surface; default 25 (degrees)
AZIMUTH = scalar Rotation about the horizontal plane; the default of 225 degrees ensures that, with a square matrix M, the element M$[1;1] is nearest to the viewpoint
DISTANCE = scalar Distance of the viewpoint from the centre of the grid on the base plane; default * gives a distance of 25 times the number of y points in the grid
LOWERCUTOFF = scalar Lower cut-off for array values; default *
UPPERCUTOFF = scalar Upper cut-off for array values; default *
SCREEN = string Whether to clear the screen before plotting or to continue plotting on the old screen (clear, keep); default clea
ENDACTION = string Action to be taken after completing the plot (continue, pause); default * uses the setting from the last DEVICE statement
Parameters
GRID = identifier Pointer (of variates representing the columns of a data matrix), matrix, or two-way table specifying values on a regular grid
PEN = scalar Pen number to be used for the plot; default 1
Description
D3HISTOGRAM plots a surface as a three-dimensional (or bivariate) histogram. The surface is represented by a grid of z-values or heights. The grid can be a rectangular matrix, a two-way table, or a pointer to a set of variates; the y-dimension is represented by the rows of the structure and the x-dimension by the columns. In each case there must be at least three rows and three columns of data (after allowing for any restrictions on a set of variates). Missing values are not permitted; that is, only complete grids can be displayed. If the grid is supplied as a table with margins, these will be ignored when plotting the surface.
The position of the point from which the histogram is viewed is specified in polar coordinates, using the options ELEVATION, DISTANCE, and AZIMUTH. These define the angle of elevation, in degrees, above the base plane of the surface, distance from the centre of this plane, and angular position relative to the vertical z-axis, respectively, similarly to the DSURFACE directive.
The LOWERCUTOFF and UPPERCUTOFF options specify lower and upper bounds for the z-axis. You can use these to truncate the grid or, alternatively, you can set them to values outside the range of the data to obtain compatible scales when you are plotting several grids. The ZORIGIN option allows the origin of the z-axis to be specified. By default, LOWERCUTOFF and UPPERCUTOFF will be set to the minimum and maximum grid values, and ZORIGIN to the value of LOWERCUTOFF.
The ZSCALE option specifies a scaling factor for the z-axis (or vertical axis) of the plotted surface. Generally values between 0.5 and 2.0 are most successful; large values result in a flatter surface, while smaller values produce a steep surface, accentuating changes in the data.
The TITLE, WINDOW, SCREEN, and ENDACTION options are used to specify a title, the plotting window, whether the screen should be cleared first, and whether there should be a pause once the plotting is finished; as in other graphics directives.
The PEN parameter specifies the pen to be used to plot the histogram (by default, pen 1). The PEN directive can be used to modify the colour and the thickness of the pen, but the other attributes of the pen are ignored.
Simple axes are drawn to indicate the directions in which x and y increase. The YTITLE and XTITLE parameters of the AXES directive can be used to add further annotation.
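A minimal sketch of a call is shown below, using only options that appear in the option list for D3HISTOGRAM. The matrix Counts and its values are invented for illustration; setting UPPERCUTOFF above the largest value is intended to give a z-axis that could be shared with other grids.

```genstat
" A 3 x 4 grid: the minimum size is three rows and three columns "
MATRIX [ROWS=3; COLUMNS=4; VALUES=2,5,3,1, 4,9,6,2, 1,3,2,1] Counts
D3HISTOGRAM [TITLE='Bivariate counts'; ELEVATION=35; UPPERCUTOFF=10] \
  GRID=Counts; PEN=1
```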
EDIT directive
Edits text vectors.
Options
CHANNEL = scalar or text Text structure containing editor commands or a scalar giving the number of a channel from which they are to be read; default is the current input channel
END = text Character(s) to indicate the end of the commands read from an input channel; default is the character colon (:)
WIDTH = scalar Limit on the line width of the text; default *
SAVE = text Text to save the editor commands for future use; default *
Parameters
OLDTEXT = texts Texts to be edited
NEWTEXT = texts Text to store each edited text; if any of these is omitted, the corresponding OLDTEXT is used
Description
The EDIT directive edits each text in the OLDTEXT list, storing the results in the corresponding structure in the NEWTEXT list. It both edits and stores each text before moving on to the next. If you have not already declared any of the texts in the NEWTEXT list, it will be declared implicitly. If you give a missing identifier (*) in the NEWTEXT list, the edited version simply replaces the values of the original: thus the old text will be overwritten by the new text. You can also omit a text from the OLDTEXT list; you might do this if you wanted to form the values of the new text entirely from within the editor. If any of the old texts are restricted, they must all be restricted to exactly the same set of units. Then only those units will be involved in the edit. When a restriction is in force, you cannot add or delete any units (or lines).
The CHANNEL option tells Genstat where to find the editing commands. A scalar specifies the number of an input channel from which the commands are to be read. Alternatively, you can specify a text structure containing the commands. In either case the commands should be terminated by the string specified by the END option. The end string can be more than one character; the default is the single character colon (:). Genstat gives a warning if you have forgotten to specify the end string in a text of commands. The default for the CHANNEL option is to take input from the current input channel.
The WIDTH option specifies the maximum line length for vectors of commands and of text, the default being 80 and the maximum being 255.
The SAVE option allows you to specify a text structure to store the edit commands, so that you can save them for future EDIT statements.
Commands for the editor can be given in either upper or lower case. You can put as many commands as you like on a line, subject only to the width restriction set by the WIDTH option. Commands must be separated by at least one space. You cannot put spaces into the middle of a command, unless they are part of a character string (or part of a sequence of commands).
The character that separates the parts of a command is written here as /, but you can use any character for this other than a space or a digit.
Genstat puts the lines from the old text into an internal buffer, where they are modified according to the commands that you specify. While you are editing, Genstat moves a notional marker around the buffer. The marker can be moved backwards or forwards along a line or between lines. So you can move around the text and modify the lines in any order. Some commands move the marker automatically, as explained in the definitions below. If the marker is before the first line of text it is at the [start] position; if it is after the last line of text it is at the [end] position. The line that currently contains the marker is called the current line. Genstat does not write anything to the new text until the edit has been completed (so if you use the Q command, the new text is left unaltered).
Some commands allow you to specify a number: for example Dn deletes the next n lines. Genstat gives a warning message if this number is zero or is not an integer.
The commands are as follows.
A Insert the next line of text from the buffer, immediately after the marker within the current line.
B Break the current line at the marker position. Text before the marker is written as a new line to the internal buffer and text after the marker becomes the new current line with the marker at character position 1.
C Cancel edits performed on the current line by restoring it to the form in which it was most recently read from the buffer. Note that if you have previously edited the line and then moved to some other line, it is the previously edited form that will be given, not the form as originally in the old text; also, if you have given any A or B commands during your modification of the current line, their effects are not negated, so for example any lines that have been inserted into the current line by A will be lost.
D Delete the current line, and make the next line the current line with the marker at character position 1.
Dn Delete the next n lines (including the current line), making the next line after that the current line with the marker at position 1.
D+n Synonymous with Dn.
D+ Is a synonym for D or D+1.
D+* Delete from the current line to end of text. The current line is then [end].
D* Synonymous with D+*.
D- Delete the current line, making the previous line the current line with the marker at character position 1. Is a synonym for D-1.
D-n Delete the current and previous n lines, making the line before that the current line with the marker at character position 1.
D-* Delete the current line and all previous lines; the current line is then [start].
D/s/ Delete from the current line to the line with the next occurrence of the character string s. The marker is placed immediately before the character string s in the located line. If s occurs after the marker on the current line, the marker is moved up to s and no lines are deleted.
D-/s/ The same as D/s/, except that it moves backwards through the text, deleting all lines from and including the current one until the first occurrence of a line containing the character string s. The marker is placed immediately before the located character string s. If s occurs before the marker on the current line, the marker is placed before s and no lines are deleted.
F/i/ Inserts the contents of the text structure with identifier i immediately before the current line. The marker is not moved.
G+/s/t/ Substitutes string s for all occurrences of string t found after the marker on the current and subsequent lines, and moves the marker to the end of the text.
G/s/t/ Is a synonym for G+/s/t/.
G-/s/t/ Substitutes string s for all occurrences of string t found before the marker on the current and previous lines, and moves the marker to the start of the text.
I/s/ Inserts the string s as a new line immediately before the current line. The marker is not moved.
L Moves the marker to the start of the next line, which can be [end].
Ln Moves the marker to the start of the nth line after the current line. So L1 gives the next line.
L+n Is synonymous with Ln.
L+ Is synonymous with L or L+1.
L+* Moves the marker to [end].
L* Is a synonym for L+*.
L-n Moves the marker to the start of the nth line before the current line, which can be [start]. L-1 gives the line immediately before the current line.
L- Is synonymous with L-1.
L-* Moves the marker to [start].
L+/s/ Moves the marker to the position immediately before the next occurrence of the character string s after the current marker position; this occurrence need not be on the current line. If the string s is not found, the marker will be located at [end].
L-/s/ Moves the marker to the position immediately before the first occurrence of the string s before the current marker position; this occurrence need not be on the current line. If the string s is not found, the marker will be located at [start].
P Moves the marker one character to the right along the current line.
P+n Moves the marker n characters to the right of the current position within the current line. You cannot move the marker beyond the maximum line length (which will vary between computers, but is normally the same as the width of your local line-printer).
P+ Is a synonym for P or P+1.
P+* Moves the marker to the position immediately after the last non-blank character in the current line. This can be to the left of the current marker position.
P-n Moves the marker n characters to the left of the current position within the current line. The marker cannot be moved to the left of character position 1.
P- Is a synonym for P-1.
P-* Moves the marker to the position immediately before the first non-blank character after character position 1. This can be to the right of the current marker position.
Pn Moves the marker to the character position n within the current line, counting from the left and starting at 1. The maximum value of n varies between computers but is normally the same as the width of your local line-printer.
Q Abandons the current edit, leaving the original text unaltered.
R+/s/t/ Substitutes character string t for the next occurrence of character string s after the marker on the current or subsequent lines, and moves the marker to the position immediately after t.
R/s/t/ Is a synonym for R+/s/t/.
R-/s/t/ Substitutes string t for the nearest occurrence of string s before the marker on the current or previous lines; the marker moves to be immediately before string t.
S/s/t/ Substitutes the string t for the next occurrence of string s after the marker within the current line. The marker is moved to the character position immediately after the last character in t. If s is null (when the command is S//t/) then t is inserted immediately after the marker. If t is null (when the command is S/s//), then s is deleted from the line.
V Turns on the verification mode. Then, if you are working interactively, the current line will be displayed each time that Genstat prompts you for commands. By default the marker is indicated by the character > but you can change this by the command Vc or V+c.
Vc Turns on the verification mode (see V), and changes the marker character to c.
V+c Is synonymous with Vc.
V- Turns verification mode off (see V).
(cseq)n Repeats the command sequence, cseq, n times. The command sequence cseq can be any valid combination of editing commands, each separated by at least one space. The complete sequence, including brackets and repeat count, must all be on a single line. You can nest sequences up to a depth of 10.
(cseq)* Repeats the command sequence cseq until [end] or [start] is encountered. In all other respects (cseq)* behaves exactly as (cseq)n; so it would be equivalent to putting n equal to some very large number.
ELSE directive
Introduces the default set of statements in block-if or in multiple-selection control structures.
No options or parameters
Description
The use of ELSE in block-if structures is explained in the description of the IF directive. Its use in multiple-selection control structures is explained in the description of CASE.
ELSIF directive
Introduces a set of alternative statements in a block-if control structure.
No options
Parameter
expression Logical expression to indicate whether or not the set of statements is to be executed.
Description
A block-if structure consists of one or more alternative sets of statements. The first of these is introduced by an IF statement. There may then be further sets introduced by ELSIF statements. Then you can have a final set introduced by an ELSE statement, and the whole structure is terminated by an ENDIF statement. Full details are given in the description of the IF directive.
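The layout of a block-if structure can be sketched as follows. This is only an illustration: the scalar Score, its value and the printed strings are hypothetical and are not taken from the IF description.

```genstat
"Hypothetical scalar, used only to drive the block-if structure"
SCALAR Score; VALUE=72
IF Score.GE.70
  "first set of statements: executed when the IF expression is true"
  PRINT 'High'
ELSIF Score.GE.50
  "alternative set: considered only if the IF expression was false"
  PRINT 'Medium'
ELSE
  "default set: executed when no earlier expression was true"
  PRINT 'Low'
ENDIF
```

With the value assumed here, only the first set of statements should be executed.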
ENDBREAK directive
Returns to the original channel or control structure and continues execution.
No options or parameters
Description
ENDBREAK ends breaks instigated by the BREAK directive, where more details are given.
ENDCASE directive
Indicates the end of a "multiple-selection" control structure.
No options or parameters
Description
A multiple-selection control structure consists of several alternative blocks of statements. The first of these is introduced by a CASE statement. This has a single parameter, which is an expression that must yield a single number. Subsequent blocks are each introduced by an OR statement. There can then be a final block, introduced by an ELSE statement, as in the block-if structure. The whole structure is terminated by an ENDCASE statement. Full details are given in the description of the CASE directive.
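A minimal sketch of the layout is given below. The scalar Choice and the printed strings are hypothetical. The comments assume that the number given by the CASE expression selects the block in that position, and that the ELSE block is used when no block corresponds; the description of CASE is the authority on this.

```genstat
"Hypothetical scalar whose value selects one of the blocks"
SCALAR Choice; VALUE=2
CASE Choice
  "block 1"
  PRINT 'First block'
OR
  "block 2: the one expected to run when Choice is 2"
  PRINT 'Second block'
OR
  "block 3"
  PRINT 'Third block'
ELSE
  "default block, for values that match no other block"
  PRINT 'Default block'
ENDCASE
```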
ENDDEBUG directive
Cancels a DEBUG statement.
No options or parameters
Description
ENDDEBUG ends a debugging session instigated by the DEBUG directive, where more details are given.
ENDFOR directive
Indicates the end of the contents of a loop.
No options or parameters
Description
Loops are introduced by the FOR directive, where full details are given.
ENDIF directive
Indicates the end of a block-if control structure.
No options or parameters
Description
A block-if structure consists of one or more alternative sets of statements. The first of these is introduced by an IF statement. There may then be further sets introduced by ELSIF statements. Then you can have a final set introduced by an ELSE statement, and the whole structure is terminated by an ENDIF statement. Full details are given in the description of the IF directive.
ENDJOB directive
Ends a Genstat job.
No options or parameters
Description
A Genstat program can be split up into individual jobs. These can each be terminated by the ENDJOB directive. Full details are given in the description of the JOB directive, which can be used to start a new job.
ENDPROCEDURE directive
Indicates the end of the contents of a Genstat procedure.
No options or parameters
Description
ENDPROCEDURE ends the definition of a Genstat procedure. Full details are given in the description of the PROCEDURE directive.
ENQUIRE directive
Provides details about files opened by Genstat.
No options
Parameters
CHANNEL
= scalars Channel numbers to enquire about; for FILETYPE=input or output, a scalar containing a missing value will be set to the number of the current channel of that type and a negative value can be used to check the existence of a file that is not yet connected to a channel
FILETYPE
= strings Type of each file (input, output, unformatted, backingstore, procedurelibrary, graphics); default inpu
OPEN
= scalars To indicate whether or not the corresponding channels are currently open (0=closed, 1=open)
NAME
= texts External name of the file, if channel is open
EXIST
= scalars To indicate whether files on corresponding channels currently exist (0=not yet created, 1=exist)
WIDTH
= scalars Maximum width of records in each file (only relevant for input and output files, set to * for other types)
PAGE
= scalars Number of lines per page (relevant only for output files)
ACCESS
= texts Allowed type of access: set to 'readonly', 'writeonly' or 'both'
LINE
= scalars Number of the current line (input files only)
Description
ENQUIRE allows you to ascertain whether a particular channel is already in use and, if so, what properties are defined for aspects like the width of each line or the number of lines per page. This is likely to be of most use within general programs and procedures.
You specify the channel using the parameters CHANNEL and FILETYPE; the other parameters allow you to save the required information in data structures of the appropriate type.
ENQUIRE can also be used to discover whether a file exists. You simply set the CHANNEL parameter to a negative number. The result of the enquiry is supplied by the EXIST parameter. So, for example
ENQUIRE CHANNEL=-1; NAME='lost.dat'; EXIST=Found
will set the scalar Found to one if the file lost.dat exists, or to zero otherwise.
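As a further sketch, the statements below enquire about an input channel and then act on the result. The channel number 2 and the identifiers Isopen, Fname and Wid are hypothetical names chosen for this illustration.

```genstat
"Ask whether input channel 2 is open, saving its file name and record width"
ENQUIRE CHANNEL=2; FILETYPE=input; OPEN=Isopen; NAME=Fname; WIDTH=Wid
IF Isopen
  "OPEN returned 1: the channel is in use, so report its details"
  PRINT Fname,Wid
ELSE
  "OPEN returned 0: the channel is closed"
  PRINT 'Input channel 2 is not open.'
ENDIF
```

Testing the OPEN scalar before using the other results keeps the later statements from relying on a name that is supplied only when the channel is open.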
EQUATE directive
Transfers data between structures of different sizes or types (but the same modes i.e. numerical or text) or where transfer is not from single structure to single structure.
Options
OLDFORMAT
= variate Format for values of OLDSTRUCTURES; within the variate, a positive value n means take n values, -n means skip n values and a missing value means skip to the next structure; default * i.e. take all the values in turn
NEWFORMAT
= variate Format for values of NEWSTRUCTURES; within the variate, a positive value n means fill the next n positions, -n means skip n positions and a missing value means skip to the next structure; default * i.e. fill all the positions in turn
FREPRESENTATION
= string How to interpret factor values (labels, levels, ordinals); default leve
Parameters
OLDSTRUCTURES
= identifiers Structures whose values are to be transferred; if values of several structures are to be transferred to one item in the NEWSTRUCTURES list, they must be placed in a pointer
NEWSTRUCTURES
= identifiers Structures to take each set of transferred values; if several structures are to receive values from one item in the OLDSTRUCTURES list, they must be placed in a pointer
Description
The EQUATE directive copies values from one set of data structures to another. For example, you may wish to copy the values from a one-way table into a variate, or from a matrix into a set of variates (one variate for each row, or for each column), or the other way round, from variates into a matrix. Alternatively, you may want to append values from several data structures into a single one. The only constraint is that the structures in the respective sets must all contain the same kind of values.
The general idea with EQUATE is that the values in the structures in the OLDSTRUCTURES list are copied into the structures in the NEWSTRUCTURES list. Each item in the OLDSTRUCTURES list specifies a single data structure, or a single set of data structures, containing the values to be transferred. A single structure can be a factor, or a text, or any one of the structures that contain numbers (scalar, variate, rectangular matrix, diagonal matrix, symmetric matrix, or table). If you want to give a set of structures you must put them into a pointer. As already mentioned, all the structures in the set must contain the same kind of values: that is, they must all be texts, or all factors, or must all contain numbers (but they need not all be the same kinds of numerical structure; they could, for example, be a mixture of variates and matrices).
The corresponding entry in the NEWSTRUCTURES list indicates where the transferred data are to be placed. It is either a single structure or a pointer to a set of structures; the structures must be of a type suitable to store the values to be transferred.
Unless you supply a format, Genstat ignores where each structure within a set from the OLDSTRUCTURES list ends and another one begins: that is, it treats the set as being a concatenated list of values. Similarly, it treats the structures in each NEWSTRUCTURES set as an unstructured list of positions that are to receive values. The old values are repeated as often as is necessary to traverse all the new positions.
You can use the OLDFORMAT and NEWFORMAT options to control how the old values and new positions are traversed. The setting for each of these is a variate whose values are interpreted as follows:
(a) a positive integer n means take the next n values (or fill the next n positions);
(b) a negative integer -n means skip the next n values or positions;
(c) a missing value * means skip to the end of the structure.
As usual, Genstat recycles when it runs out of values. That is, if the contents of one of the format variates are exhausted before all the NEWSTRUCTURES positions have either been filled or skipped, then that variate is repeated.
If you are transferring values between factors, Genstat will check that each value to be transferred is valid for the factor in the NEWSTRUCTURES list. By default, Genstat will try to match the values using the levels of the factors, but you can set option FREPRESENTATION=labels to match by their labels, or FREPRESENTATION=ordinals to match them merely according to the ordinal position in the levels vector of each factor.
The values of factors that have labels can be copied into texts. In addition, values of texts can be copied into factors, provided all the strings are valid labels for the factor concerned.
Factor values can also be copied into variates; the FREPRESENTATION option controls whether Genstat uses the levels or the ordinal values.
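The two most common uses, appending structures and traversing with a format, can be sketched as follows. All identifiers (A, B, C, V, W) and their values are hypothetical and chosen only for this illustration.

```genstat
"Append two variates into one; the pointer groups the old structures"
VARIATE [VALUES=1,2,3] A
VARIATE [VALUES=4,5] B
VARIATE [NVALUES=5] C
EQUATE OLDSTRUCTURES=!P(A,B); NEWSTRUCTURES=C
PRINT C

"Format with values 2 and -1: take two values, skip one, then recycle"
VARIATE [VALUES=1...12] V
VARIATE [NVALUES=8] W
EQUATE [OLDFORMAT=!(2,-1)] OLDSTRUCTURES=V; NEWSTRUCTURES=W
PRINT W
```

Following the rules given above, C should receive the values 1, 2, 3, 4, 5, and W should receive 1, 2, 4, 5, 7, 8, 10, 11, because every third value of V is skipped.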
ESTIMATE directive
Estimates parameters in Box-Jenkins models for time series.
Options
LIKELIHOOD
= string Method of likelihood calculation (exact, leastsquares, marginal); default exac
CONSTANT
= string How to treat the constant (estimate, fix); default esti
RECYCLE
= string Whether to continue from previous estimation (yes, no); default no
WEIGHTS
= variate Weights; default *
MVREPLACE
= string Whether to replace missing values by their estimates (yes, no); default no
FIX
= variate Defines constraints on parameters (ordered as in each model, tf models first): zeros fix parameters, parameters with equal numbers are constrained to be equal; default *
METHOD
= string Whether to carry out full iterative estimation, to carry out just one iterative step, to perform no steps but still give parameter standard deviations, or only to initialize for forecasting by regenerating residuals (full, onestep, zerostep, initialize); default full
MAXCYCLE
= scalar Maximum number of iterations; default 15
TOLERANCE
= scalar Criterion for convergence; default 0.0004
SAVE
= identifier To name save structure, or supply save structure with transfer-functions; default * i.e. transfer-functions taken from the latest model
Parameters
SERIES
= variate Time series to be modelled (output series)
TSM
= TSM Model for output series
BOXCOXMETHOD
= string How to treat transformation parameter in output series (fix, estimate); default fix
RESIDUALS
= variate To save residual series
Description
The main use of ESTIMATE is to fit parameters to time-series models, although you can also use it to initialize for the FORECAST directive, even when the model parameters are already known. You need to define a TSM structure beforehand, for use as input to the TSM parameter. You may also wish to give a TRANSFERFUNCTION statement, for example if you wish to specify explanatory variables for regression with ARIMA errors or to define transfer-function models. In many applications of estimating a univariate ARIMA model, you will need only a simple form of the directive, such as:
ESTIMATE Daylength; TSM=Erp
The SERIES parameter specifies the variate holding the time series data to which the model is to be fitted.
The TSM parameter specifies the ARIMA model that is to be fitted to the time-series data. This TSM must already have been declared and its ORDERS must have been set. If the LAGS parameter of the TSM has been set, the lags must have been given values. However, if the PARAMETERS of the TSM model have been set, these need not have been declared previously nor given values. When the parameter values are not set, default values are used: these are all zero, except for the transformation parameter, which is set to 1.0 if it is not to be estimated (see BOXCOXMETHOD and FIX below). Any parameter values that you do specify will be used as initial values for the parameters in the model; Genstat replaces any missing values by the default values. If any group of autoregressive or moving-average parameters do not satisfy the required conditions for stationarity or invertibility, all the parameters to be estimated are reset by Genstat to the default values. After ESTIMATE, the parameters of the TSM contain the estimated parameter values.
The BOXCOXMETHOD parameter allows you to estimate the transformation parameter $\lambda$.
The RESIDUALS parameter saves the estimated innovations (or residuals). The residuals are calculated for $t = t_0 \ldots N$, where $t_0 = 1+p+d-q$ for a simple ARIMA model. If $t_0 > 1$, missing values will be inserted for $t = 1 \ldots t_0-1$.
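The parameters described so far can be put together as in the sketch below. It assumes that the variate Daylength already holds the series; the orders (1,1,0), the PRINT and MAXCYCLE settings and the identifier Res are hypothetical choices made for this illustration.

```genstat
"Declare the model and set its orders p=1, d=1, q=0 (hypothetical values)"
TSM Erp; ORDERS=!(1,1,0)
"Fit the model to the series and save the residuals"
ESTIMATE [PRINT=model,summary,estimates; MAXCYCLE=30] \
  SERIES=Daylength; TSM=Erp; RESIDUALS=Res
PRINT Res
```

With these orders the start point is 1+p+d-q = 3, so by the rule above the first two units of Res should be missing values.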
The PRINT option controls printed output. If you specify monitoring, then at each cycle of the iterative process of estimation, Genstat prints the deviance for the current fitted model, together with the current estimates of model parameters. The format is simple with the minimum of description, to let you judge easily how quickly the process is converging. The other settings of PRINT control output at the end of the iterative process. If you specify model, the model is briefly described, giving the identifier of the series and the time-series model, together with the orders of the model. If you specify summary, the deviance of the final model is printed, along with the residual number of degrees of freedom. If you specify estimates, the estimates of the model parameters are printed in a descriptive format, together with their estimated standard errors and reference numbers. If you specify correlations, the correlations between estimates of parameters are printed, with reference numbers to identify the parameters.
The LIKELIHOOD option specifies the criterion that Genstat minimizes to obtain the estimates of the parameters: this is described in the next section. The default setting exact is recommended for most applications.
You can use the CONSTANT option to specify whether Genstat is to estimate the constant term c in the model. If CONSTANT=fix, the constant is held at the value given in the initial parameter values; this need not be zero.
The RECYCLE option allows a previous ESTIMATE statement to continue; this can save computing time. If RECYCLE=yes, the most recent ESTIMATE statement is continued, unless the SAVE option has been set to the save structure from some other ESTIMATE statement. The SERIES and TSM settings are then taken from this previous ESTIMATE statement: Genstat ignores any specified in the current statement. Most of the settings of other parameters and options are carried over from the previous statement, and new values are ignored. However, there are some exceptions. You can change the RESIDUALS variate, you can reset MAXCYCLE to the number of further iterations you require, and you can change the settings of TOLERANCE and PRINT. You can also change the values of the variate in the WEIGHTS option; you can thus get reweighted estimation. You can change the values of the SERIES itself, although you cannot change missing values; if the MVREPLACE option was previously set to yes, you must put the original missing values back into the SERIES variate before the new ESTIMATE statement.
The WEIGHTS option includes in the likelihood a weighted sum-of-squares term in which the weights $w_t$, $t = 1 \ldots N$, are provided by the WEIGHTS variate. The values of $w_t$ must be strictly positive. If $t_0 < 1$, where $t_0 = 1+d+p-q$, then $w_t$ is taken as 1 for $t < 1$.
The MVREPLACE option allows you to request any missing values in the time-series to be replaced by their estimates after estimation. Genstat will always estimate the missing values, irrespective of the setting of MVREPLACE; so you can also obtain these estimates later from TKEEP.
The FIX option allows you to place simple constraints on parameter values throughout the estimation. The units of the FIX variate correspond to the parameters of the TSM, excluding the innovation variance. The values of the FIX variate are used to define the parameter constraints and must be integers. If an element of the FIX variate is set to 0, the corresponding parameter is constrained to remain at its initial setting. If an element is not 0, and the value is unique in the FIX variate, the parameter is estimated without any special constraint. If two or more values are equal, the corresponding parameters are constrained to be equal throughout the estimation. The number that you give to a parameter by FIX will appear as the reference number of the parameter in the printed model and correlation matrix. This option overrides any setting of CONSTANT and BOXCOXMETHOD.
The MAXCYCLE option specifies the maximum number of iterations to be performed.
The TOLERANCE option specifies the convergence criterion. Genstat decides that convergence has occurred if the fractional reduction in the deviance in successive iterations is less than the specified value, provided also that the search is not encountering numerical difficulties that force the step length in the parameter space to be severely limited. You can use monitoring to judge whether, for all practical purposes, the iterations have converged. Genstat gives warnings if the specified number of iterations is completed without convergence, or if the search procedure fails to find a reduced value of the deviance despite a very short step length. Such an outcome may be due to complexities in the likelihood function that make the search difficult, but can be due to your specifying too small a value for TOLERANCE.
The SAVE option allows you to save the time-series save structure produced by ESTIMATE. You can use this in further ESTIMATE statements with RECYCLE=yes, or in FORECAST statements. It can also be used by the TDISPLAY and TKEEP directives. Genstat automatically saves the structure from the most recent ESTIMATE statement, but this is over-written when the next ESTIMATE statement is executed, unless you have used SAVE to give it an identifier of its own. You can access the current time-series save structure by the SPECIAL option of the GET directive, and reset it by the TSAVE option of the SET directive.
The METHOD option has four possible settings. The default setting is full, which gives the usual estimation to convergence or until the maximum number of iterations has been reached.
With the setting METHOD=initialize, ESTIMATE carries out only the residual regeneration steps (that is, calculation of $a_t$ for $t = t_0 \ldots N$) which are needed before FORECAST can be used. If the model has just been estimated using the default full setting, this is unnecessary. The setting initialize is useful when the time series is supplied with a known model and a minimal amount of calculation is wanted to prepare or initialize for forecasting. None of the model parameters are changed, and no standard errors of parameter estimates are available. Missing values in the series are estimated, so this setting provides an efficient way of getting their values when the time series model is known; they can then be obtained using TKEEP. The deviance value is also available from TKEEP. This setting is therefore useful for efficient calculation of deviance values when you want to plot the shape of the deviance as a function of parameter values.
With the setting METHOD=zerostep the effect is the same as for initialize except that ESTIMATE also calculates the standard errors of the parameters as if they had just been estimated. These can be used together with other quantities available from TKEEP to construct confidence intervals and carry out tests on the parameter values, which remain unchanged except that the innovation variance in the ARIMA model is replaced by its estimate conditional on all other parameters.
The setting METHOD=onestep gives the same results as specifying the option MAXCYCLE=1 in ESTIMATE. It is convenient for carrying out quick tests of model parameters.
To explain the LIKELIHOOD option, we need to describe the estimation of ARIMA models in more detail. You may want to skip this if you are doing fairly routine work.
The first step in deriving the likelihood for a simple model is to calculate
$$w_t = \nabla^d y_t - c, \qquad t = 1+d \ldots N$$
This has a multivariate Normal distribution with dispersion matrix $V\sigma_a^2$, where $V$ depends only on the autoregressive and moving-average parameters. The likelihood is then proportional to
$$\{\sigma_a^{2m}\,|V|\}^{-1/2} \exp\{-w' V^{-1} w / 2\sigma_a^2\}$$
where $m = N-d$. In practice Genstat evaluates this by using the formula
$$w' V^{-1} w = W + \sum_{t=t_0}^{N} a_t^2 = S$$
where $t_0 = 1+d+p-q$. The term $W$ is a quadratic form in the $p$ values $w_{1+d-q} \ldots w_{p+d-q}$: it takes account of the starting-value problem for regenerating the innovations $a_t$, and avoids losing information as would happen if the process used only a conditional sum-of-squares function. If $q > 0$, Genstat introduces unobserved values of $w_{1+d-q} \ldots w_d$ in order to calculate the sum $S$. Genstat uses linear least-squares to calculate these $q$ starting values for $w$, thus minimizing $S$. We shall call them back-forecasts, though if $p > 0$ they are actually computationally convenient linear functions of the proper back-forecasts. We shall call $S$ the sum-of-squares function: it is the sum of the quadratic form and the sum-of-squares term, and is identical to the sum-of-squares function that Box and Jenkins (1970) obtain using infinite back-forecasting.
The values $a_t$ for $t = t_0 \ldots N$ agree precisely with those of Box and Jenkins.
To clarify all this, consider examples with no differencing; that is, $d=0$. If $p=0$ and $q=1$ then $W=0$ and $t_0=0$, and one back-forecast $w_0$ is introduced. If $p=1$ and $q=0$ then $W=(1-\theta_1^2)w_1^2$ and $t_0=2$, and no back-forecasts are needed. If $p=q=1$ then $W=(1-\theta_1^2)w_0^2$ and $t_0=1$, and so one back-forecast $w_0$ is needed. In this case the proper back-forecast is in fact $w_0/(1-\phi_1\theta_1)$.
The value of $|V|$ is a by-product of calculating $W$ and the back-forecast. For example, if $p=0$ and $q=1$, then
$$|V| = 1 + \phi_1^2 + \cdots + \phi_1^{2N}$$
If $p=1$ and $q=0$,
$$|V| = 1 / (1 - \theta_1^2)$$
and if $p=q=1$,
$$|V| = 1 + (\theta_1 - \phi_1)^2 (1 + \phi_1^2 + \cdots + \phi_1^{2N-2}) / (1 - \theta_1^2)$$
Concentrating the likelihood over $\sigma_a^2$ by setting $\sigma_a^2 = S/m$ yields a value proportional to $\{|V|^{1/m} S\}^{-m/2}$.
The default setting of the LIKELIHOOD option is exact. In this case the concentrated likelihood is maximized, by minimizing the quantity
$$D = |V|^{1/m} S$$
which is called the deviance.
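To give a feel for the size of the factor that multiplies $S$ in the deviance, the following worked figures take the case $p=1$, $q=0$, $d=0$ from the examples above, so that $m=N$ and $|V| = 1/(1-\theta_1^2)$. The parameter value 0.5 and the two series lengths are arbitrary choices made for this illustration, not values from the text.

```latex
|V|^{1/m} = (1-\theta_1^2)^{-1/N}, \qquad \theta_1 = 0.5:
\qquad N = 100 \;\Rightarrow\; 0.75^{-1/100} \approx 1.0029,
\qquad N = 10 \;\Rightarrow\; 0.75^{-1/10} \approx 1.0292
```

Under these assumptions the deviance exceeds $S$ by about 0.3% for the longer series and by about 3% for the shorter one.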
The setting leastsquares specifies that Genstat is to minimize only the sum-of-squares term S. This criterion corresponds to the back-forecasting sum-of-squares used by Box and Jenkins (1970), and will in many cases give estimates close to those of the exact likelihood. However, some discrepancy arises if the series is short or the model is close to the invertibility boundary. This is because of limitations on the back-forecasting procedure, as described in the algorithms of Box and Jenkins (1970). The deviance value D that Genstat prints is, with this setting, simply S.
When you use exact likelihood, the factor $|V|^{1/m}$ reduces bias in the estimates of the parameters; you would get bias if you used leastsquares instead. However, $|V|^{1/m}$ is generally close to one, unless the series is short or the model is either seasonal or close to the boundaries of invertibility or stationarity. The leastsquares setting is therefore adequate for most long, non-seasonal sets of data; using it may reduce the computation time by up to 50%. When you specify that Genstat is to estimate the parameter $\lambda$ of the Box-Cox transformation, Genstat also includes the Jacobian of the transformation in the likelihood function. The result is an extra factor $G^{-2(\lambda-1)}$ in the definition of the deviance, $G$ being the geometric mean of the data.
Note that this is not included unless $\lambda$ is being estimated, even if $\lambda \neq 1$.
You can treat differences in $N\log(D)$ as a chi-squared variable in order to test nested models: this is supported by asymptotic theory, and by experience with models that have moderately large sample sizes. Similarly, you can select between different models by using $N\log(D)+2k$ as an information criterion, $k$ being the number of estimated parameters. But both of these test procedures are questionable if the estimated models are close to the boundaries of invertibility or stationarity. Provided all the models that are being compared have the same orders of differencing, with the differenced series being of length $m$, it is recommended that $m\log(D)$ be used rather than $N\log(D)$ in these tests since $m\log(D)$ is precisely minus two multiplied by the log-likelihood as defined above.
The setting marginal is relevant mainly when ESTIMATE is used for regression with ARIMA errors. (This requires a TRANSFERFUNCTION statement beforehand to specify the explanatory variables.) The likelihood for the model is defined as that of the univariate error series $e_t$, which is defined in general by
$$e_t = y_t - \beta_1 x_{1,t} - \cdots - \beta_m x_{m,t}$$
(the $x_i$ being $m$ explanatory variables). The constant term therefore appears in the model after any differencing of $e_t$; for example
∇ e_t = c + (1 -
You can get bias in the estimates of the parameters of an ARIMA model because the regression is estimated at the same time. You can guard against this by specifying LIKELIHOOD=marginal. This can be particularly important if the series are short or if you use many explanatory variables (Tunnicliffe Wilson 1989). The deviance is now defined as
$$D = S\,(|X' V^{-1} X|\,|V|)^{1/m}$$
where m is reduced by the number of regressors (including the constant term) and the columns of X are the differenced explanatory series: the other terms are as in the exact likelihood.
You can use the marginal setting also for univariate ARIMA modelling, when the constant term is the only explanatory term. Furthermore, Genstat deals with missing values in the response variate by doing a regression on indicator variates; these too are included in the X matrix. However, you cannot use marginal likelihood and estimate a transformation parameter in either the transfer-function model or an ARIMA model. Neither can you use it if you set the FIX option in ESTIMATE. In these cases Genstat automatically resets the LIKELIHOOD option to exact.
At every iteration with the setting LIKELIHOOD=marginal, the regression coefficients are the maximum-likelihood estimates conditional upon the estimated values of the parameters of the ARIMA model: these are also the generalized least-squares estimates, conditioned in the same way. This is so even if MAXCYCLE=0; that is, the coefficients of the regression are re-estimated even at iteration 0. Therefore you must not use the marginal setting with the option METHOD=initialize to initialize for FORECAST. You can compare deviance values that were obtained using marginal likelihood only for models with the same explanatory variables and the same differencing structure in the error model.
References
Box, G.E.P. and Jenkins, G.M. (1970). Time series analysis, forecasting and control. Holden-Day, San Francisco.
Tunnicliffe Wilson, G. (1989). On the use of marginal likelihood in time-series model estimation. Journal of the Royal Statistical Society, Series B 51, 15-27.
EXECUTE directive
Executes the statements contained within a text.
No options
Parameter
texts Statements to be executed
Description
EXECUTE allows a set of Genstat statements placed in a text to be executed, for example inside loops or procedures.
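A minimal sketch of the idea is given below. The identifiers X, Y and Cmds and the statements stored in the text are hypothetical, chosen only for this illustration.

```genstat
"Data used by the stored statements"
VARIATE [VALUES=1,2,3,4] X
"Each string of the text holds one line of Genstat statements"
TEXT [VALUES='CALCULATE Y = 2*X','PRINT X,Y'] Cmds
"Run the statements held in the text"
EXECUTE Cmds
```

Keeping the statements in a text means that they can be built up or altered by the program itself before they are executed.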
EXIT directive
Exits from a control structure.
Options
NTIMES
= scalar Number of control structures, n, to exit; default 1. If n exceeds the number of control structures of the specified type that are currently active, the exit is to the end of the outer one; while for n negative, the exit is to the end of the -n'th structure (in order of execution)
CONTROL
= string Type of control structure to exit (job, for, if, case, procedure); default for
REPEAT
= string Whether to go to the next set of parameters on exit from a FOR loop or procedure (yes, no); default no
EXPLANATION
= text Text to be printed if the exit takes place; default *
Parameter
expression Logical expression controlling whether or not an exit takes place
Description
Sometimes you may want simply to abandon part of a program: you may be unable to do any further calculations or analyses. For example, if you are examining several subsets of the units, you would wish to abandon the analysis of any subset that turned out to contain no observations. Another example would be if you wanted to abandon the execution of a procedure whenever an error diagnostic has appeared. The EXIT directive allows you to exit from any control structure.
In its simplest form EXIT has no parameter setting, and the exit is unconditional: Genstat will always exit from the control structure or structures concerned. You are most likely to use this as part of an ELSE block of a block-if or multiple-selection structure. For example
IF N.GT.0
  CALCULATE Percent = R * 100 / N
ELSE
  PRINT [IPRINT=*] 'Incorrect value ',N,' for N.'
  EXIT [CONTROL=procedure]
ENDIF
prints an appropriate warning message for a zero or negative value of N, and then exits from a procedure.
If the warning message is simply a text or string, the EXPLANATION option can be used to print it on exit. For example
EXIT [CONTROL=procedure; EXPLANATION='Incorrect value for N.'] \
  N.LE.0
CALCULATE Percent = R * 100 / N
has the same effect except that the actual value of N is no longer printed.
The CONTROL option specifies the type of control structure from which to exit. The default setting is for, causing an exit from a FOR loop. For the other settings: if causes an exit from a block-if structure (as introduced by the IF directive), case exits from a multiple-selection structure (as introduced by CASE), procedure exits from a procedure (see the PROCEDURE directive), and job causes the entire job to be abandoned (see JOB). Sometimes, to exit from one type of control structure, others must be left too. Exiting from the procedure in the above example requires Genstat to exit also from the block-if structure. Generally, Genstat does these nested exits automatically, as required. However, inside a procedure, you can exit only from FOR loops and block-if or multiple-selection structures that are within the procedure. You cannot put, for example,
EXIT [CONTROL=if]
within a part of the procedure where there is no block-if in operation, and then expect Genstat to exit both from the procedure and from a block-if structure in the outer program from which the procedure was called. Genstat regards a procedure as a self-contained piece of program.
The
NTIMES option indicates how many control structures of the specified type to exit from. If you ask Genstat to exit from more structures than are currently in operation in your program, it will exit from as many as it can and then print a warning. If NTIMES is set to zero or to missing value no exit takes place. If NTIMES is set to a negative value, say -n, the exit is to the end of the nth structure of the specified type, counting them in the order in which their execution began. Consider this example:FOR I=A[1...3]
FOR J=B[1...3]
FOR K=C[1...3]
FOR L=D[1...3]
"contents of the inner loop, including:"
EXIT [NTIMES=Nexit]
"amongst other statements"
ENDFOR "end of the loop over D[]"
ENDFOR "end of the loop over C[]"
ENDFOR "end of the loop over B[]"
ENDFOR "end of the loop over A[]"
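The NTIMES rule stated before this example can be checked against a small model. The following is a Python sketch, not Genstat code: the function name exit_target and its return convention are hypothetical, and the behaviour for a negative setting that points beyond the structures in operation is an assumption, since the text does not cover that case.

```python
def exit_target(active, ntimes):
    """Model of the NTIMES rule; returns (structure, warned).

    `active` lists the structures in operation in the order in which their
    execution began (outermost first). `structure` is the one whose end the
    exit goes to, or None when no exit takes place.
    """
    if ntimes is None or ntimes == 0:
        return None, False                  # zero or missing value: no exit
    if ntimes > 0:
        if ntimes > len(active):
            # exit from as many as possible, then warn
            return (active[0] if active else None), True
        return active[-ntimes], False       # count outwards from the innermost
    n = -ntimes
    if n > len(active):
        # Assumption: the text does not say what happens here.
        raise ValueError("no such structure in operation")
    return active[n - 1], False             # the nth structure to have started


loops = ["A", "B", "C", "D"]                # the four FOR loops of the example
print(exit_target(loops, 2))                # ('C', False)
print(exit_target(loops, -2))               # ('B', False)
```

A positive setting counts outwards from the innermost structure, whereas a negative setting counts from the first structure to have been started.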
If the scalar Nexit has the value 2, the exit is to the end of the loop over C[]; so the two exits are from the loop over D[] and the loop over C[]. But if Nexit has the value -2 the exit is to the end of the loop over B[], as this is the second loop to have been started.
A further possibility when EXIT is used within a FOR loop is that you can choose either to go right out of the loop and continue by executing the statement immediately after the ENDFOR statement, or to go to ENDFOR and then repeat the loop with the next set of parameter values. To repeat the loop, you need to set option REPEAT=yes. For example, suppose that variates Height and Weight contain information about children of various ages, ranging from 5 to 11. The RESTRICT statement causes the subsequent GRAPH statement to plot only those units of Height and Weight where the variate Age equals Ageval. The EXIT statement ensures that the graph is not plotted if there are no units of a particular age; the program then continues with Ageval taking the next value in the list.
FOR Ageval=5,6,7,8,9,10,11
RESTRICT Height,Weight; CONDITION=Age.EQ.Ageval
EXIT [REPEAT=yes] NVALUES(Height).EQ.0
GRAPH Height; X=Weight
ENDFOR
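For readers more familiar with other languages, the two ways of leaving a FOR loop correspond roughly to break and continue. The following Python sketch is only an analogy under that assumption: the function ages_plotted, the dictionary of heights and the data values are all made up for illustration, and the dictionary lookup merely stands in for RESTRICT.

```python
def ages_plotted(ages, height_by_age, repeat=True):
    """Return the ages for which a graph would be drawn."""
    plotted = []
    for ageval in ages:
        heights = height_by_age.get(ageval, [])  # stands in for RESTRICT
        if len(heights) == 0:                    # NVALUES(Height).EQ.0
            if repeat:
                continue   # like EXIT [REPEAT=yes]: next value of Ageval
            break          # like a plain EXIT: leave the loop altogether
        plotted.append(ageval)                   # stands in for GRAPH
    return plotted


# Made-up data with no units of age 7.
data = {5: [108.0], 6: [115.5, 117.0], 8: [127.0]}
print(ages_plotted([5, 6, 7, 8], data))                # [5, 6, 8]
print(ages_plotted([5, 6, 7, 8], data, repeat=False))  # [5, 6]
```

With the repeat behaviour, only the empty age group is skipped; without it, the remaining ages are never reached.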
The REPEAT option can also be used within a procedure to ask Genstat to call the procedure with the next set of parameter settings.
The example of the heights and weights of children also illustrates the use of the parameter of EXIT, to make the effect conditional. The parameter is an expression which must evaluate to a single number which Genstat interprets as a logical value. If the value is zero, the condition is false and no exit takes place; for other values the condition is true and the exit takes effect as specified. This is particularly useful for controlling the convergence of iterative processes: for example
CALCULATE Clim = X/10000
FOR [NTIMES=999]
CALCULATE Previous = Root
& Root = (X/Previous + Previous)/2
PRINT Root,Previous; DECIMALS=4
EXIT ABS(Previous-Root) < Clim
ENDFOR
will calculate the square root of X to four significant figures.
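The same conditional-exit pattern can be traced outside Genstat. This Python sketch mirrors the iteration above step by step; the function name sqrt_iter is hypothetical, and the starting value for Root, which the Genstat fragment does not show being set, is assumed to be supplied by the caller.

```python
def sqrt_iter(x, root, max_iter=999):
    """Repeat Root = (X/Previous + Previous)/2 until successive values agree."""
    clim = x / 10000                      # CALCULATE Clim = X/10000
    for _ in range(max_iter):             # FOR [NTIMES=999]
        previous = root
        root = (x / previous + previous) / 2
        if abs(previous - root) < clim:   # EXIT ABS(Previous-Root) < Clim
            break
    return root


# Assumed starting value of 1.0 for illustration.
print(sqrt_iter(2.0, 1.0))
```

The loop limit of 999 acts only as a safeguard; in this sketch the exit condition normally ends the loop after a handful of passes.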
EXPRESSION directive
Declares one or more expression data structures.
Options
VALUE = expression Value for all the expressions; default *
MODIFY = string Whether to modify (instead of redefining) existing structures (yes, no); default no
Parameters
IDENTIFIER = identifiers Identifiers of the expressions
VALUE = expression structures Expression data structures providing values for the expressions
EXTRA = texts Extra texts associated with the identifiers
Description
The IDENTIFIER parameter lists the identifiers of the expressions that are to be declared. The expression data structure stores a Genstat expression, for example
Hours = Minutes/60
Usually you will find it easiest to type out an expression like this explicitly whenever you need it. The main use, then, for this rather specialized data structure is to supply an expression as the argument of a procedure.
Values can be assigned to the expressions by either the VALUE option or the VALUE parameter. The option defines a common value for all the structures in the declaration, while the parameter allows the structures each to be given a different value. If both the option and the parameter are specified, the parameter takes precedence.
You can associate a text of extra annotation with each expression using the EXTRA parameter. If MODIFY is set to yes, any existing attributes and values of the expressions are retained; otherwise these are lost.
Here are two examples using the VALUE option:
EXPRESSION [VALUE=Length*Width*Height] Vcalc
EXPRESSION [VALUE=Dose=LOG10(Dose)] Dtrans
These put the expression Length*Width*Height into the identifier Vcalc, and the expression Dose=LOG10(Dose) into Dtrans. Both expressions could be declared simultaneously, using the VALUE parameter, by putting
EXPRESSION Vcalc,Dtrans; VALUE=!E(Length*Width*Height), \
!E(Dose=LOG10(Dose))
Here !E(Length*Width*Height), for example, is an unnamed expression.
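The idea of storing an expression so that it can be handed to a procedure has a loose counterpart in other languages. The following Python sketch is an analogy only and does not reflect how Genstat stores or evaluates expressions: the names vcalc and apply_expression are hypothetical, and a compiled code object plays the part of the expression structure.

```python
# A stored, unevaluated expression (rough analogue of Vcalc).
vcalc = compile("length * width * height", "<Vcalc>", "eval")


def apply_expression(expr, **values):
    """Evaluate a stored expression with the supplied variable values."""
    # Built-ins are disabled so that only the supplied names are visible.
    return eval(expr, {"__builtins__": {}}, values)


print(apply_expression(vcalc, length=2, width=3, height=4))  # 24
```

The caller decides which expression to pass, and the receiving function evaluates it without knowing its contents in advance.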