Missing Field Values

In this section:

Missing values within segment instances occur when the instances exist, but some of the fields lack values.

When fields in instances lack values, numeric fields are assigned the value 0, and alphanumeric fields, the value blank. These default values appear in reports and are used in all calculations performed by the SUM and COUNT display commands, DEFINE commands, and prefix operators such as MAX. and AVE.

To prevent the use of these default values in calculations (which might then give erroneous results), you can add the MISSING attribute to the field declaration in the Master File, for either a real or a virtual field. When the MISSING attribute is set to ON, the missing values are marked with a special internal code to distinguish them from blanks or zeros, and the missing values are ignored in calculations. In reports, the internal code is represented by the SET NODATA value, a period (.), by default. See Setting the NODATA Character String.

For example, missing data for a field in a segment instance may occur when the data values are unknown, as in the following scenario. Suppose that the employees recorded in the EMPLOYEE data source are due for a pay raise by a certain date, but the amount of the raise has not yet been determined. The company enters the date for each employee into the data source without the salary amounts; the salaries will be entered later. Each date is an individual instance in the salary history segment, but the new salary for each date instance is missing. Suppose further that a report request averages the SALARY field (SUM AVE.SALARY). The accuracy of the resulting average depends on whether the missing values for the SALARY field are treated as zeros (MISSING=OFF), or as internal codes (MISSING=ON).

Note: When all of the field values used in the calculation of a numeric summary value, such as a subtotal, are missing, the summary value is assigned the missing data value, not the value zero (0). This includes summary values produced by the operators ST. and CT. used in a subfooting.


Top of page

Example: Counting With Missing Values

Suppose the CURR_SAL field appears in 12 segment instances. In three of those instances, the field was given no value. Nevertheless, the display command

COUNT CURR_SAL

counts 12 occurrences of the CURR_SAL field. This occurs because the MISSING attribute is OFF by default, so the missing values are included in the count. If you wanted to exclude the missing data from the count, you could set MISSING ON.


Top of page

Example: Averaging With Missing Values

Suppose you have the following records of data for a field:

.
.
1
3

The numeric values in the first two records are missing (indicated by the periods). The last two records have values of 1 and 3. If you average these fields without the MISSING attribute (MISSING OFF), the value 0 is supplied for the two records that are missing values. Thus, the average of the records is (0+0+1+3)/4, or 1. If you use the MISSING ON attribute, the two missing values are ignored, calculating the average as (1+3)/2, or 2.


Top of page

x
MISSING Attribute in the Master File

In some applications, the default values (blanks and zeros) may represent valid data rather than the absence of information. However, if this is not the case, you can include the MISSING attribute after the field format in the Master File declaration for the field with the missing values. The MISSING attribute can be used with an actual field in the data source, or a virtual field that you are defining in the Master File.

For example, the following field declaration specifies the MISSING attribute for the RETURNS field:

FIELDNAME=RETURNS, ALIAS=RTN, FORMAT=I4, MISSING=ON,$

The next declaration specifies the MISSING attribute for a virtual field called PROFIT:

DEFINE PROFIT/D7 MISSING ON NEEDS SOME DATA = RETAIL_COST - DEALER_COST;$

To ensure that missing values are handled properly for virtual fields, you can set the MISSING attribute ON for the virtual field in the DEFINE command, and specify whether you want to apply the calculation if some or all values are missing. For related information on the SOME and ALL phrases, see How to Specify Missing Values in a DEFINE or COMPUTE Command.

When the MISSING attribute is set to ON in a field declaration, the field containing no data is marked with a special internal code, rather than with blanks or zeros. During report generation, the SUM and COUNT commands and all prefix operators (for example, AVE., MAX., MIN.) exclude the missing data in their computations. For related information about the MISSING attribute and field declarations, see the Describing Data With WebFOCUS Language manual.

Note:



Example: Handling Missing Values With the MISSING Attribute

This example illustrates the difference between a field with MISSING ON and one without. In it a virtual field, X_RETURNS, without the MISSING attribute, is set to equal a real field, RETURNS, with the MISSING attribute declared in the Master File. When the field with the MISSING attribute (RETURNS) is missing a value, the corresponding value of X_RETURNS is 0, since a data source field that is missing a value is evaluated as 0 (or blank) for the purpose of computation (see MISSING Attribute in a DEFINE or COMPUTE Command).

The following request defines the virtual field:

DEFINE FILE SALES
X_RETURNS/I4 = RETURNS;
END

Now issue the following report request:

TABLE FILE SALES
SUM CNT.X_RETURNS CNT.RETURNS AVE.X_RETURNS AVE.RETURNS
END

Remember that the field X_RETURNS has the same value as RETURNS except when RETURNS is missing a value, in which case, the X_RETURNS value is 0.

The output is:

The count for the RETURNS field is lower than the count for X_RETURNS and the average for RETURNS is higher than for X_RETURNS because the missing values in RETURNS are not part of the calculations.

For an illustration in which the MISSING attribute is set for a virtual field, see Handling Missing Values for Virtual Fields With SOME and ALL.


Top of page

x
MISSING Attribute in a DEFINE or COMPUTE Command

How to:

You can set the MISSING attribute ON in a DEFINE or COMPUTE command to enable a temporary field with missing values to be interpreted and represented correctly in reports.

An expression used to derive the values of the temporary field can contain real fields that have missing values. However, when used to derive the value of a temporary field, a data source field that is missing a value is evaluated as 0 or blank for computational purposes, even if the MISSING attribute has been set to ON for that field in the Master File.

To ensure that missing values are handled properly for temporary fields, you can set the MISSING attribute ON for the virtual field in the DEFINE or COMPUTE command, and specify whether you want to apply the calculation if some or all values are missing. See How to Specify Missing Values in a DEFINE or COMPUTE Command.



x
Syntax: How to Specify Missing Values in a DEFINE or COMPUTE Command
field[/format] MISSING {ON|OFF} [NEEDS] {SOME|ALL} [DATA] = expression;

where:

field

Is the name of the virtual field created by the DEFINE command.

/format

Is the format of the virtual field. The default is D12.2.

MISSING

ON enables the value of the temporary field to be interpreted as missing (that is, distinguished by the special internal code from an intentionally entered zero or blank), and represented by the NODATA character in reports.

OFF treats missing values for numeric fields as zeros, and missing values for alphanumeric fields as blanks. This is the default value.

NEEDS

Is optional. It helps to clarify the meaning of the command.

SOME

Indicates that if at least one field in the expression has a value, the temporary field has a value (the missing values of the field are evaluated as 0 or blank in the calculation). If all of the fields in the expression are missing values, the temporary field is missing its value. SOME is the default value.

ALL

Indicates that if all the fields in the expression have values, the temporary field has a value. If at least one field in the expression has a missing value, the temporary field also has a missing value.

DATA

Is optional. It helps to clarify the meaning of the command.

expression

Is a valid expression from which the temporary field derives its value.



Example: Handling Missing Values for a Virtual Field With MISSING OFF

The following request illustrates the use of two fields, RETURNS and DAMAGED, to define the NO_SALE field. Both the RETURNS and DAMAGED fields have the MISSING attribute set to ON in the SALES Master File, yet whenever one of these fields is missing a value, that field is evaluated as 0.

DEFINE FILE SALES
NO_SALE/I4 = RETURNS + DAMAGED;
END
TABLE FILE SALES
PRINT RETURNS AND DAMAGED AND NO_SALE
BY CITY BY DATE BY PROD_CODE
END

The output is:

CITY             DATE   PROD_CODE  RETURNS  DAMAGED  NO_SALE
----             ----   ---------  -------  -------  -------
NEW YORK         10/17  B10              2        3        5
                        B17              2        1        3
                        B20              0        1        1
                        C13              .        6        6
                        C14              4        .        4
                        C17              0        0        0
                        D12              3        2        5
                        E1               4        7       11
                        E2               .        .        0
                        E3               4        2        6
NEWARK           10/18  B10              1        1        2
                 10/19  B12              1        0        1
STAMFORD         12/12  B10             10        6       16
                        B12              3        3        6
                        B17              2        1        3
                        C13              3        0        3
                        C7               5        4        9
                        D12              0        0        0
                        E2               9        4       13
                        E3               8        9       17
UNIONDALE        10/18  B20              1        1        2
                        C7               0        0        0

Notice that the products C13, C14, and E2 in the New York section all show missing values for either RETURNS or DAMAGED, because the MISSING ON attribute has been set in the Master File. However, the calculation that determines the value of NO_SALE interprets these missing values as zeros, because MISSING ON has not been set for the virtual field.



Example: Handling Missing Values for Virtual Fields With SOME and ALL

The following request illustrates how to use the DEFINE command with the MISSING attribute to specify that if either some or all of the field values referenced in a DEFINE command are missing, the virtual field should also be missing its value.

The SOMEDATA field contains a value if either the RETURNS or DAMAGED field contains a value. Otherwise, SOMEDATA is missing its value. The ALLDATA field contains a value only if both the RETURNS and DAMAGED fields contain values. Otherwise, ALLDATA is missing its value.

DEFINE FILE SALES
SOMEDATA/I5 MISSING ON NEEDS SOME=RETURNS + DAMAGED;
ALLDATA/I5 MISSING ON NEEDS ALL=RETURNS + DAMAGED;
END
TABLE FILE SALES
PRINT RETURNS AND DAMAGED SOMEDATA ALLDATA
BY CITY BY DATE BY PROD_CODE
END

The output is:


Top of page

x
Testing for a Segment With a Missing Field Value

How to:

You can specify WHERE criteria to identify segment instances with missing field values.

You cannot use these tests to identify missing instances. However, you can set the ALL parameter to PASS to test for missing instances. See Handling a Missing Segment Instance.



x
Syntax: How to Test for a Segment With a Missing Field Value

To test for a segment with missing field values, the syntax is:

WHERE field {IS|EQ} MISSING

To test for the presence of field values, the syntax is:

WHERE field {NE|IS-NOT} MISSING

A WHERE criterion that tests a numeric field for 0 or an alphanumeric field for blanks also retrieves instances for which the field has a missing value.



Example: Testing for a Missing Field Value

The following request illustrates the use of MISSING to display grocery items (by code) for which the number of packages returned by customers is missing.

TABLE FILE SALES
PRINT RETURNS
BY CITY BY DATE BY PROD_CODE
WHERE RETURNS IS MISSING
END

The output is:

CITY             DATE   PROD_CODE  RETURNS
----             ----   ---------  -------
NEW YORK         10/17  C13              .
                        E2               .


Example: Testing for an Existing Field Value

The following request illustrates the use of MISSING to display only those grocery items for which the number of packages returned by customers is not missing.

TABLE FILE SALES
PRINT RETURNS
BY CITY BY DATE BY PROD_CODE
WHERE RETURNS IS-NOT MISSING
END

The output is:



Example: Testing for a Blank or Zero

The following request displays grocery items that either were never returned or for which the number of returned packages was never recorded:

TABLE FILE SALES
PRINT RETURNS
BY CITY BY DATE BY PROD_CODE
WHERE RETURNS EQ 0
END

The output is:



Example: Excluding Missing Values From a Test

To display only those items that have not been returned by customers, you need two WHERE criteria. The first to restrict the number of returns to 0, the other to exclude missing values, as in the following request.

TABLE FILE SALES
PRINT RETURNS
BY CITY BY DATE BY PROD_CODE
WHERE RETURNS EQ 0
WHERE RETURNS IS-NOT MISSING
END

The output is:


Top of page

x
Preserving Missing Data Values in an Output File

How to:

Reference:

The ability to distinguish between missing data and default values (blanks and zeros) in fields can be carried over into output files. If the retrieved and processed information displayed the NODATA string in a report, by default the NODATA string can be stored in the output file. You can also use the SET HOLDMISS command to store the missing values rather than the NODATA character in an output file. For related information, see Saving and Reusing Your Report Output.



x
Syntax: How to Distinguish Missing Data in an Extract File
ON TABLE {HOLD|SAVE|SAVB} MISSING {ON|OFF}

where:

HOLD

Creates an extract file for use in subsequent reports. The default for MISSING is ON.

SAVE

Creates a text extract file for use in other programs. The default for MISSING is OFF.

SAVB

Creates a binary extract file for use in other programs. The default for MISSING is OFF.

HOLD files can be created with both the MISSING and FORMAT ALPHA options, specified in any order. For example:

ON TABLE HOLD FORMAT ALPHA MISSING OFF
ON TABLE HOLD MISSING OFF FORMAT ALPHA


Example: Incorporating MISSING Values in an Extract File

The following request specifies MISSING ON in the HOLD phrase:

TABLE FILE SALES
SUM RETURNS AND HOLD FORMAT ALPHA MISSING ON
BY CITY BY DATE BY PROD_CODE
END

The MISSING=ON attribute for the RETURNS field is propagated to the HOLD Master File. In addition, the missing data symbols are propagated to the HOLD file for the missing field values:

FILENAME=HOLD    , SUFFIX=FIX     , $
  SEGMENT=HOLD, SEGTYPE=S3, $
    FIELDNAME=CITY, ALIAS=E01, USAGE=A15, ACTUAL=A15, $
    FIELDNAME=DATE, ALIAS=E02, USAGE=A4MD, ACTUAL=A04, $
    FIELDNAME=PROD_CODE, ALIAS=E03, USAGE=A3, ACTUAL=A03, $
    FIELDNAME=RETURNS, ALIAS=E04, USAGE=I3, ACTUAL=A03,
      MISSING=ON, $

With MISSING OFF in the HOLD phrase, the MISSING=ON attribute is not propagated to the HOLD Master File and the missing data symbols are replaced with default values.



x
Syntax: How to Store Missing Data in HOLD Files
SET HOLDMISS={ON|OFF}
ON TABLE SET HOLDMISS {ON|OFF}

where:

ON

Allows you to store missing data in a HOLD file. When TABLE generates a default value for data not found, it generates missing values.

OFF

Does not allow you to store missing data in a HOLD file. OFF is the default value.



x
Reference: Usage Notes for Holding Missing Values


Example: Holding Missing Values Using HOLDMISS
SET HOLDMISS=ON
TABLE FILE MOVIES
 SUM WHOLESALEPR
BY CATEGORY ACROSS RATING
 ON TABLE HOLD AS HLDM
END
TABLE FILE HLDM
 PRINT *
 END

The output is:

CATEGORY  WHOLESALEPR  WHOLESALEPR  WHOLESALEPR  WHOLESALEPR  WHOLESALEPR
--------  -----------  -----------  -----------  -----------  -----------
ACTION              .            .        20.98            .        34.48
CHILDREN        54.49        51.38            .            .            .
CLASSIC         40.99       160.80            .            .            .
COMEDY              .            .        46.70        30.00        13.75
DRAMA               .            .            .            .        10.00
FOREIGN         13.25            .        62.00            .        70.99
MUSICALS        15.00            .        13.99         9.99        13.99
MYSTERY             .         9.00        18.00         9.00        80.97
SCI/FI              .            .            .        35.99        43.53
TRAIN/EX            .        60.98            .            .             .

Top of page

x
Propagating Missing Values to Reformatted Fields in a Request

How to:

Reference:

When a field is reformatted in a request (for example, SUM field/format), an internal COMPUTE field is created to contain the reformatted field value and display on the report output. If the original field has a missing value, that missing value can be propagated to the internal field by setting the COMPMISS parameter ON. If the missing value is not propagated to the internal field, it displays a zero (if it is numeric) or a blank (if it is alphanumeric). If the missing value is propagated to the internal field, it displays the missing data symbol on the report output.



x
Syntax: How to Control Missing Values in Reformatted Fields
SET COMPMISS = {ON|OFF}

where:

ON

Propagates a missing value to a reformatted field. ON is the default value.

OFF

Displays a blank or zero for a reformatted field.

Note: The COMPMISS parameter cannot be set in an ON TABLE command.



Example: Controlling Missing Values in Reformatted Fields

The following procedure prints the RETURNS field from the SALES data source for store 14Z. With COMPMISS OFF, the missing values display as zeros in the column for the reformatted field value. (Note: Before trying this example, you must make sure that the SALEMISS procedure, which adds missing values to the SALES data source, has been run.)

SET COMPMISS = OFF         
TABLE FILE SALES        
PRINT RETURNS RETURNS/D12.2 AS 'REFORMATTED,RETURNS'
BY STORE_CODE              
WHERE STORE_CODE EQ '14Z' 
END

The output is:

                        REFORMATTED
STORE_CODE  RETURNS     RETURNS    
----------  -------     -----------
14Z               2            2.00
                  2            2.00
                  0             .00
                  .             .00
                  4            4.00
                  0             .00
                  3            3.00
                  4            4.00
                  .             .00
                  4            4.00

With COMPMISS ON, the column for the reformatted version of RETURNS displays the missing data symbol when a value is missing:

SET COMPMISS = ON         
TABLE FILE SALES        
PRINT RETURNS RETURNS/D12.2 AS 'REFORMATTED,RETURNS'
BY STORE_CODE              
WHERE STORE_CODE EQ '14Z'  
END

The output is:

                        REFORMATTED 
STORE_CODE  RETURNS     RETURNS     
----------  -------     ----------- 
14Z               2            2.00 
                  2            2.00 
                  0             .00 
                  .               . 
                  4            4.00 
                  0             .00 
                  3            3.00 
                  4            4.00 
                  .               . 
                  4            4.00 


x
Reference: Usage Notes for SET COMPMISS

If you create a HOLD file with COMPMISS ON, the HOLD Master File for the reformatted field indicates MISSING = ON (as does the original field). With COMPMISS = OFF, the reformatted field does NOT have MISSING = ON in the generated Master File.


WebFOCUS