Defining the Model Data

Load your data, as detailed in Introducing WebFOCUS RStat.


Defining Model Sampling

RStat supports random sampling, which divides your data set into a training data set and a testing data set. The training data set is used to build the model. The testing data set, also called the evaluation data set, is used by the model evaluation techniques to test how well the model predicts.

Define the proportion of data to be included in each data set and the seed to be used to generate the random sample.

Sample check box
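
The idea behind a seeded random split can be sketched outside RStat as follows. This is only an illustration in Python, not RStat's own implementation: the function name `split_sample`, the 70 percent training proportion, and the seed value 42 are all assumptions chosen for the example.

```python
import random


def split_sample(n_rows, train_fraction=0.7, seed=42):
    """Randomly partition row indices into training and testing sets.

    The default proportion and seed are illustrative assumptions,
    not values taken from RStat.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")

    # A dedicated generator makes the split depend only on the seed.
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)

    # The first part of the shuffled rows trains the model,
    # the remainder is held back for evaluation.
    n_train = round(n_rows * train_fraction)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


train_idx, test_idx = split_sample(10, train_fraction=0.7, seed=42)
print(len(train_idx), "training rows,", len(test_idx), "testing rows")
```

Reusing the same seed reproduces the same split, which is why the seed is recorded alongside the proportion.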



Defining Variable Roles

For each of the variables within your data set, you can define the role it should play in the model by clicking the appropriate column within the Variable Grid.

RStat automatically assigns roles to variables based on the following variable prefixes.

Prefix    Role
ID        Identifier
IGNORE    Ignored
IMP       Imputed
RISK      Risk measure

You can have one Target and one Risk variable.
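
The prefix rule in the table above can be pictured with a short Python sketch. It is not RStat code. It assumes that matching is case-sensitive, that variables without a listed prefix default to an Input role, and that more than one Risk variable is rejected; the variable names are hypothetical.

```python
# Prefix-to-role mapping taken from the table above.
PREFIX_ROLES = {
    "ID": "Identifier",
    "IGNORE": "Ignored",
    "IMP": "Imputed",
    "RISK": "Risk measure",
}

# Assumption: variables with no listed prefix are treated as inputs.
DEFAULT_ROLE = "Input"


def default_role(name):
    """Return the role implied by a variable name's prefix."""
    for prefix, role in PREFIX_ROLES.items():
        if name.startswith(prefix):
            return role
    return DEFAULT_ROLE


def assign_roles(names):
    """Assign a default role to each variable and allow one Risk variable."""
    roles = {name: default_role(name) for name in names}
    risk_vars = [n for n, r in roles.items() if r == "Risk measure"]
    # Assumption: a second Risk variable is treated as an error here.
    if len(risk_vars) > 1:
        raise ValueError("only one Risk variable is allowed: %s" % risk_vars)
    return roles


# Hypothetical variable names, for illustration only.
roles = assign_roles(["ID_customer", "IMP_age", "RISK_amount", "income"])
for name, role in roles.items():
    print(name, "->", role)
```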

Target and Risk variable group box

You can override these default settings by clicking the appropriate role for each of your variables.



Setting Variable Roles for Groups

Input and Ignore option buttons

You can set a group of variables to a single role using the Input and Ignore buttons by:


Setting the Target Type

Target types

The data type of the target variable determines the type of modeling available and the specific algorithms that will be used within the modeling process. The data type is defined based on the type of data RStat identifies and the quantity of unique values found in the actual data. In RStat, data types are defined as:

Note: The target setting does not change the actual data within the data grid. It will change only the way the target data is used when the model is built.


Executing Data Settings

Executing Data Settings window

Once you have set or confirmed the sampling, variable roles, and target type, click Execute on the RStat toolbar to pass these settings to RStat.

Notice that the status bar will display the:

