Grouping Numeric Data Into Ranges
 In this section: Grouping Numeric Data Into TilesHow to:

When you sort a report using a numeric sort field, you can group the sort field values together and define the range of each group.

There are several ways of defining groups. You can define groups of:

• Equal range using the IN-GROUPS-OF phrase.

Each report request can contain a total of five IN-GROUPS-OF phrases plus IN-RANGES-OF phrases. The IN-GROUPS-OF phrase can only be used once per BY field. The first sort field range starts from the lowest value of a multiple of the IN-GROUPS-OF value, and the value displayed is the start point of each range.

• Equal range using the IN-RANGES-OF phrase.

Each report request can contain a total of five IN-GROUPS-OF phrases plus IN-RANGES-OF phrases. The IN-RANGES-OF phrase can only be used once per BY field, and it generates an additional internal sort phrase that must be counted in the total number of sort phrases. The first sort field range starts from the lowest value of a multiple of the IN-GROUPS-OF value. No message is generated if you specify a range of zero, but the values displayed on the report are unpredictable.

• Unequal range using the FOR phrase.
• Tiles. These include percentiles, quartiles, or deciles. For details, see

The FOR phrase is usually used to produce matrix reports and is part of the Financial Modeling Language (FML). However, you can also use it to create columnar reports that group sort field values in unequal ranges.

The FOR phrase displays the sort value for each individual row. The ranges do not have to be contiguous, that is, you can define your ranges with gaps between them. The FOR phrase is described in more detail in

Note: If there is not any data for a group, a row for the group still appears in the report.

 Top of page
Syntax: How to Define Groups of Equal Range
`{BY|ACROSS} sortfield IN-GROUPS-OF value [TOP limit]`

where:

sortfield

Is the name of the sort field. The sort field must be numeric: its format must be I (integer), F (floating-point number), D (decimal number), or P (packed number).

value

Is a positive integer that specifies the range by which sort field values are grouped.

limit

Is an optional number that defines the highest group label to be included in the report.

 Top of page
Example: Defining Groups of Equal Ranges

The following illustrates how to show which employees fall into which salary ranges, and to define the ranges by \$5,000 increments.

```TABLE FILE EMPLOYEE
PRINT LAST_NAME
BY CURR_SAL IN-GROUPS-OF 5000
END```

The output is:

```  CURR_SAL  LAST_NAME
--------  ---------
\$5,000.00  SMITH
GREENSPAN
\$10,000.00  STEVENS
SMITH
\$15,000.00  JONES
MCCOY
MCKNIGHT
\$20,000.00  ROMANS
BLACKWOOD
\$25,000.00  BANNING
IRVING
CROSS```
 Top of page
Syntax: How to Define Equal Ranges
`{BY|ACROSS} sortfield IN-RANGES-OF value [TOP limit]`

where:

sortfield

Is the name of the sort field. The sort field must be numeric: its format must be I (Integer), F (floating-point), D (double-precision), or P (packed).

value

Is an integer greater than zero indicating the range by which sort field values are grouped.

limit

Is an optional number that defines the highest range label to be included in the report. The range is extended to include all data values higher than this value.

Note: IN-RANGES-OF generates an internal sort phrase that must be counted in the total number of sort phrases.

 Top of page
Example: Defining Equal Ranges
```TABLE FILE EMPLOYEE
PRINT LAST_NAME
BY CURR_SAL IN-RANGES-OF 5000
END```

The output is:

 Top of page
Syntax: How to Define Custom Groups of Data Values
```FOR sortfield
begin1 TO end1 [OVER begin2 TO end2 ... ]```

where:

sortfield

Is the name of the sort field.

begin

Is a value that identifies the beginning of a range.

end

Is a value that identifies the end of a range.

 Top of page
Example: Defining Custom Groups of Data Values

The following request displays employee salaries, but it groups them in an arbitrary way. Notice that the starting value of each range prints in the report.

```TABLE FILE EMPLOYEE
PRINT LAST_NAME
FOR CURR_SAL
9000 TO 13500 OVER
14000 TO 19700 OVER
19800 TO 30000
END```

The output is:

 Top of page
Grouping Numeric Data Into Tiles
 How to:Group Numeric Data Into TilesReference: Usage Notes for Tiles

You can group numeric data into any number of tiles (percentiles, deciles, quartiles, etc.) in tabular reports. For example, you can group student test scores into deciles to determine which students are in the top ten percent of the class, or determine which sales representatives are in the top half of all sales representatives based on total sales.

Grouping is based on the values in the selected vertical (BY) field, and data is apportioned as equally as possible into the number of tile groups you specify.

The following occurs when you group data into tiles:

• A new column, labeled TILE by default, is added to the report output and displays the tile number assigned to each instance of the tile field. You can change the column heading with an AS phrase. For details on the AS phrase, see
• Tiling is calculated within all of the higher-level sort fields in the request, and restarts whenever a sort field at a higher level than the tile field value changes.
• Instances are counted using the tile field. If the request prints fields from lower level segments, there may be multiple report lines that correspond to one instance of the tile field.
• Instances with the same tile field value are placed in the same tile. For example, consider the following data, which is to be apportioned into three tiles:

1 5 5 5 8 9

In this case, dividing the instances into groups containing an equal number of records produces the following:

Group

Data Values

1

1,5

2

5,5

3

8,9

However, because all of the same data values must be in the same tile, the fives (5) that are in group 2 are moved to group 1. Group 2 remains empty. The final tiles are:

Tile Number

Data Values

1

1,5,5,5

2

3

8,9

 Top of page
Syntax: How to Group Numeric Data Into Tiles
```BY [ {HIGHEST|LOWEST} [k] ] tilefield [AS 'head1']
IN-GROUPS-OF n TILES [TOP m] [AS 'head2']```

where:

HIGHEST

Sorts the data in descending order so that the highest data values are placed in tile 1.

LOWEST

Sorts the data in ascending order so that the lowest data values are placed in tile 1. This is the default sort order.

k

Is a positive integer representing the number of tile groups to display in the report. For example, BY HIGHEST 2 displays the two non-empty tiles with the highest data values.

tilefield

Is the field whose values are used to assign the tile numbers.

Is a heading for the column that displays the values of the tile sort field.

n

Is a positive integer not greater than 32,767, specifying the number of tiles to be used in grouping the data. For example, 100 tiles produces percentiles, while 10 tiles produces deciles.

m

Is a positive integer indicating the highest tile value to display in the report. For example, TOP 3 does not display any data row that is assigned a tile number greater than 3.

Is a new heading for the column that displays the tile numbers.

Note:

• The syntax accepts numbers that are not integers for k, n, and m. On z/OS, values with decimals are rounded to integers; on UNIX and Windows they are truncated. If the numbers supplied are negative or zero, an error message is generated.
• Both k and m limit the number of rows displayed within each sort break in the report. If you specify both, the more restrictive value controls the display. If k and m are both greater than n (the number of tiles), n is used.

 Top of page
Example: Grouping Data Into Five Tiles

The following illustrates how to group data into five tiles.

```TABLE FILE EMPLOYEE
PRINT LAST_NAME FIRST_NAME
BY DEPARTMENT
BY CURR_SAL IN-GROUPS-OF 5 TILES
END```

The output is:

Note that the tiles are assigned within the higher-level sort field DEPARTMENT. The MIS category does not have any data assigned to tile 3. The PRODUCTION category has all five tiles.

 Top of page
Example: Displaying the First Three Tile Groups

In this example, the employees with the three lowest salaries are grouped into five tiles.

```TABLE FILE EMPLOYEE
PRINT LAST_NAME FIRST_NAME
BY DEPARTMENT
BY LOWEST 3 CURR_SAL IN-GROUPS-OF 5 TILES
END```

The output is:

Note that the request displays three tile groups in each category. Because no data was assigned to tile 3 in the MIS category, tiles 1, 2, and 4 display for that category.

 Top of page
Example: Displaying Tiles With a Value of Three or Less

In this example, the employees with the three lowest salaries are listed and grouped into five tiles, but only the tiles that are in the top 3 (tiles 1, 2, or 3) are displayed in the report. Also, the heading for the TILES field has been renamed (using the AS phrase) to DECILES.

```TABLE FILE EMPLOYEE
PRINT LAST_NAME FIRST_NAME
BY DEPARTMENT
BY LOWEST 3 CURR_SAL IN-GROUPS-OF 5 TILES TOP 3 AS DECILES
END```

The output is:

Because no data was assigned to tile 3 in the MIS category, only tiles 1 and 2 display for that category.

 Top of page
Reference: Usage Notes for Tiles
• If a request retrieves data from segments that are descendants of the segment containing the tile field, multiple report rows may correspond to one instance of the tile field. These additional report rows do not affect the number of instances used to assign the tile values. However, if you retrieve fields from multiple segments and create a single-segment output file, this flat file will have multiple instances of the tile field, and this increased number of instances may affect the tile values assigned. Therefore, when you run the same request against the multi-level file and the single-segment file, different tile assignments may result.
• Tiles are always calculated on a BY sort field in the request.
• Only one tiles calculation is supported per request. However, the request can contain up to five (the maximum allowed) non-tile IN-GROUP-OF phrases in addition to the TILES phrase.
• Comparisons for the purpose of assigning tile numbers use exact data values regardless of their display format. Therefore, if you display a floating-point value as D7, you may not be showing enough significant digits to indicate why values are placed in separate tiles.
• The tile field can be a real field or a virtual field created with a DEFINE command or a DEFINE in the Master File. The COMPUTE command cannot be used to create a tile field.
• Empty tiles do not display in the report output.
• In requests with multiple sort fields, tiles are supported only at the lowest level and only with the BY LOWEST phrase.
• Tiles are supported with output files. However, the field used to calculate the tiles propagates three fields to a HOLD file (the actual field value, the tile, and a ranking field) unless you set HOLDLIST to PRINTONLY.
• Tiles are not supported with BY TOTAL, TABLEF, FML, and GRAPH.
 WebFOCUS