Missing data refers to variables that have no data value in the current observation or record. By default, a missing or inapplicable value is indicated by a dot (.).
Some modeling algorithms cannot generate a score if any of the input parameters are missing and will return the score as missing. Other modeling algorithms can generate a score even if there are missing input parameter values.
Regression and clustering techniques will return a missing value for the score if any of the input parameters are missing.
Decision tree techniques will return a score even if there are missing input parameter values. If there is a missing value, the record is assigned to the majority class of the node in which the missing value occurs.
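The two behaviors described above can be illustrated with a minimal Python sketch. It is not WebFOCUS code and does not show how the generated scoring routines are implemented. The names linear_score, tree_score, and Node are hypothetical, None stands in for a missing value, and the tree is a toy with a single numeric split.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Sequence

MISSING = None  # stands in for a missing input value


def linear_score(inputs: Sequence[Optional[float]],
                 coefficients: Sequence[float],
                 intercept: float) -> Optional[float]:
    """Regression-style scoring: one missing input makes the score missing."""
    if any(value is MISSING for value in inputs):
        return MISSING
    return intercept + sum(c * v for c, v in zip(coefficients, inputs))


@dataclass
class Node:
    """Hypothetical tree node; a node without children is a leaf."""
    class_counts: Counter              # training records per class at this node
    split_index: int = 0               # which input the node tests
    threshold: float = 0.0             # go left when value <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def majority_class(self) -> str:
        return self.class_counts.most_common(1)[0][0]


def tree_score(node: Node, inputs: Sequence[Optional[float]]) -> str:
    """Tree-style scoring: always returns a class, even with missing inputs."""
    while node.left is not None and node.right is not None:
        value = inputs[node.split_index]
        if value is MISSING:
            # Stop at the node that needs the missing value and
            # assign the record to that node's majority class.
            return node.majority_class()
        node = node.left if value <= node.threshold else node.right
    return node.majority_class()


# Toy tree: one split on input 0 at 5.0.
root = Node(
    class_counts=Counter({"high": 6, "low": 4}),
    split_index=0,
    threshold=5.0,
    left=Node(class_counts=Counter({"low": 4, "high": 1})),
    right=Node(class_counts=Counter({"high": 5})),
)
```

In this sketch, linear_score returns None as soon as one input is None, while tree_score falls back to the majority class of the node that needs the missing value.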
In order for a scoring routine to recognize missing input parameter values and for the algorithm to derive the score appropriately, you must add the SET MISSINGTEST command to the procedure (fex) and the MISSING attribute to the individual calculated field.
To set MISSINGTEST, add the following SET command to the procedure (fex):
SET MISSINGTEST = SPECIAL
Alternatively, add the following ON TABLE SET command to the report request:
ON TABLE SET MISSINGTEST SPECIAL
To apply the MISSING attribute, include MISSING ON in the definition of the calculated field, for example:
COMPUTE PREDICTION/D20.8CM MISSING ON = W_REG_LINEAR(WRAIN, DEGREES_IN_C, HRAIN, TIME_SINCE_VINTAGE, CHATEAU, PREDICTION);
The following report request includes WebFOCUS syntax that calls the w_reg_linear function, a scoring routine built from a linear regression model. It is designed to account for the possibility that some of the input values may be missing.
Note: Scoring routine functions cannot be embedded in other formulas or expressions. The expression on the right side of the command must consist only of the function call.
SET MISSINGTEST=SPECIAL
FILEDEF W_REG_LINEAR DISK C:\IBI\APPS\_rstat\w_reg_linear.CSV
TABLE FILE W_REG_LINEAR
PRINT ID
CHATEAU AS 'Chateau'
WRAIN/D6.0 AS 'Winter,Rain,(inches)'
DEGREES_IN_C/D6.0 AS 'AvgTemp,(Celsius)'
HRAIN/D6.0 AS 'Harvest,Rain,(inches)'
TIME_SINCE_VINTAGE/I5 AS 'Years,Since,Vintage'
COMPUTE PREDICTION/D20.8CM MISSING ON = W_REG_LINEAR(WRAIN, DEGREES_IN_C, HRAIN, TIME_SINCE_VINTAGE, CHATEAU,PREDICTION);
HEADING
"Regression Linear"
ON TABLE SET PAGE-NUM OFF
ON TABLE NOTOTAL
ON TABLE PCHOLD AS W_REG_LINEAR.PDF FORMAT PDF
ON TABLE SET STYLE *
UNITS=IN, PAGESIZE='Letter', SQUEEZE=ON, ORIENTATION=LANDSCAPE,
$
TYPE=REPORT, FONT='TREBUCHET MS', SIZE=9, COLOR=RGB(66 70 73),
. . .
ENDSTYLE
END
The partial output is shown in the image below. By default, a missing value is represented by a dot (.) on the report output. You can change this character by using the SET NODATA command.
For more information on changing the missing data character, and for additional information and syntax on handling records with missing data in a report request, see the Handling Records With Missing Field Values chapter in the Creating Reports With WebFOCUS Language manual.