Creating an Equijoin

In this section:

How to:

Reference:

The most common joined structures are based on real fields that have been declared in the Master Files of the host and cross-referenced data sources, respectively.


Top of page

x
Syntax: How to Join Real Fields

The following JOIN syntax requires that the fields you are using to join the files are real fields declared in the Master File. This join may be a simple one based on one field in each file to be joined, or a multi-field join for data sources that support this type of behavior. The following syntax describes the simple and multi-field variations:

JOIN [LEFT_OUTER|INNER] hfld1 [AND hfld2 ...] IN hostfile [TAG tag1]
     TO [UNIQUE|MULTIPLE]
     crfield [AND crfld2 ...] IN crfile [TAG tag2] [AS joinname]
END

where:

JOIN hfld1
Is the name of a field in the host file containing values shared with a field in the cross-referenced file. This field is called the host field.
AND hfld2...
Can be an additional field in the host file, with the caveats noted below. The phrase beginning with AND is required when specifying multiple fields.
  • When you are joining two FOCUS data sources you can specify up to four alphanumeric fields in the host file that, if concatenated, contain values shared with the cross-referenced file. You may not specify more than one field in the cross-referenced file when the suffix of the file is FOC. For example, assume the cross-referenced file contains a phone number field with an area code-prefix-exchange format. The host file has an area code field, a prefix field, and an exchange field. You can specify these three fields to join them to the phone number field in the cross-referenced file. The JOIN command treats the three fields as one. Other data sources do not have this restriction on the cross-referenced file.
  • For data adapters that support multi-field and concatenated joins, you can specify up to 16 fields. See your data adapter documentation for specific information about supported join features. Note that FOCUS data sources do not support these joins.
INNER
Specifies an inner join. If you do not specify the type of join in the JOIN command, the ALL parameter setting determines the type of join to perform.
LEFT_OUTER
Specifies a left outer join. If you do not specify the type of join in the JOIN command, the ALL parameter setting determines the type of join to perform.

Note that in a left outer join, host records with a missing cross-referenced instance are included in the report output. To control how tests against missing cross-referenced segment instances are processed, use the SET SHORTPATH command described in Handling a Missing Segment Instance.

IN hostfile
Is the name of the host file.
TAG tag1
Is a tag name of up to 66 characters (usually the name of the Master File), which is used as a unique qualifier for fields and aliases in the host file.

The tag name for the host file must be the same in all the JOIN commands of a joined structure.

TO [UNIQUE|MULTIPLE] crfld1
Is the name of a field in the cross-referenced file containing values that match those of hfld1 (or of concatenated host fields). This field is called the cross-referenced field.

Note: Unique returns only one instance and, if there is no matching instance in the cross-referenced file, it supplies default values (blank for alphanumeric fields and zero for numeric fields).

Use the MULTIPLE parameter when crfld1 may have multiple instances in common with one value in hfld1. Note that ALL is a synonym for MULTIPLE, and omitting this parameter entirely is a synonym for UNIQUE. See Unique and Non-Unique Joined Structures for more information.

AND crfld2...
Is the name of a field in the cross-referenced file with values in common with hfld2.

Note: crfld2 may be qualified. This field is only available for data adapters that support multi-field joins.

IN crfile
Is the name of the cross-referenced file.
TAG tag2
Is a tag name of up to 66 characters (usually the name of the Master File), which is used as a unique qualifier for fields and aliases in cross-referenced files. In a recursive join structure, if no tag name is provided, all field names and aliases are prefixed with the first four characters of the join name. For related information, see Usage Notes for Recursive Joined Structures.

The tag name for the host file must be the same in all the JOIN commands of a joined structure.

AS joinname
Is an optional name of up to eight characters that you may assign to the join structure. You must assign a unique name to a join structure if:
  • You want to ensure that a subsequent JOIN command does not overwrite it.
  • You want to clear it selectively later.
  • The structure is recursive. See Recursive Joined Structures.

Note: If you do not assign a name to the join structure with the AS phrase, the name is assumed to be blank. A join without a name overwrites an existing join without a name.

END
Required when the JOIN command is longer than one line. It terminates the command. It must be on a line by itself.


Example: Creating a Simple Unique Joined Structure

An example of a simple unique join is shown below:

JOIN JOBCODE IN EMPLOYEE TO JOBCODE IN JOBFILE AS JJOIN


Example: Creating an Inner Join

The following procedure creates three FOCUS data sources:

The procedure then adds an employee to EMPINFO named Fred Newman who has no matching record in the JOBINFO or EDINFO data sources.

TABLE FILE EMPLOYEE
SUM LAST_NAME FIRST_NAME CURR_JOBCODE
BY EMP_ID
ON TABLE HOLD AS EMPINFO FORMAT FOCUS INDEX EMP_ID CURR_JOBCODE
END
-RUN
 
TABLE FILE JOBFILE
SUM JOB_DESC
BY JOBCODE
ON TABLE HOLD AS JOBINFO FORMAT FOCUS INDEX JOBCODE
END
-RUN
 
TABLE FILE EDUCFILE
SUM COURSE_CODE COURSE_NAME
BY EMP_ID
ON TABLE HOLD AS EDINFO FORMAT FOCUS INDEX EMP_ID
END
-RUN
 
MODIFY FILE EMPINFO
FREEFORM EMP_ID LAST_NAME FIRST_NAME CURR_JOBCODE
MATCH EMP_ID
ON NOMATCH INCLUDE
ON MATCH REJECT
DATA
111111111, NEWMAN, FRED, C07,$
END

The following request prints the contents of EMPINFO. Note that Fred Newman has been added to the data source:

TABLE FILE EMPINFO
PRINT *
END

The output is:

EMP_ID     LAST_NAME        FIRST_NAME  CURR_JOBCODE
------     ---------        ----------  ------------
071382660  STEVENS          ALFRED      A07
112847612  SMITH            MARY        B14
117593129  JONES            DIANE       B03
119265415  SMITH            RICHARD     A01
119329144  BANNING          JOHN        A17
123764317  IRVING           JOAN        A15
126724188  ROMANS           ANTHONY     B04
219984371  MCCOY            JOHN        B02
326179357  BLACKWOOD        ROSEMARIE   B04
451123478  MCKNIGHT         ROGER       B02
543729165  GREENSPAN        MARY        A07
818692173  CROSS            BARBARA     A17
111111111  NEWMAN           FRED        C07

The following JOIN command creates an inner join between the EMPINFO data source and the JOBINFO data source.

JOIN CLEAR *
JOIN INNER CURR_JOBCODE IN EMPINFO TO MULTIPLE JOBCODE IN JOBINFO AS J0

Note that the JOIN command specifies a multiple join. In a unique join, the cross-referenced segment is never considered missing, and all records from the host file display on the report output. Default values (blank for alphanumeric fields and zero for numeric fields) display if no actual data exists.

The following request displays fields from the joined structure:

TABLE FILE EMPINFO
PRINT LAST_NAME FIRST_NAME JOB_DESC
END

Fred Newman is omitted from the report output because his job code does not have a match in the JOBINFO data source:

LAST_NAME  FIRST_NAME  JOB_DESC
---------  ----------  --------
STEVENS    ALFRED      SECRETARY
SMITH      MARY        FILE QUALITY
JONES      DIANE       PROGRAMMER ANALYST
SMITH      RICHARD     PRODUCTION CLERK
BANNING    JOHN        DEPARTMENT MANAGER
IRVING     JOAN        ASSIST.MANAGER
ROMANS     ANTHONY     SYSTEMS ANALYST
MCCOY      JOHN        PROGRAMMER
BLACKWOOD  ROSEMARIE   SYSTEMS ANALYST
MCKNIGHT   ROGER       PROGRAMMER
GREENSPAN  MARY        SECRETARY
CROSS      BARBARA     DEPARTMENT MANAGER


Example: Creating a Left Outer Join

The following JOIN command creates a left outer join between the EMPINFO data source and the EDINFO data source:

JOIN CLEAR *
JOIN LEFT_OUTER EMP_ID IN EMPINFO TO MULTIPLE EMP_ID IN EDINFO AS J1

The following request displays fields from the joined structure:

TABLE FILE EMPINFO
PRINT LAST_NAME FIRST_NAME COURSE_NAME
END

All employee records display on the report output. The records for those employees with no matching records in the EDINFO data source display the missing data character (.) in the COURSE_NAME column. If the join were unique, blanks would display instead of the missing data character.

LAST_NAME  FIRST_NAME  COURSE_NAME
---------  ----------  -----------
STEVENS    ALFRED      FILE DESCRPT & MAINT
SMITH      MARY        BASIC REPORT PREP FOR PROG
JONES      DIANE       FOCUS INTERNALS
SMITH      RICHARD     BASIC RPT NON-DP MGRS
BANNING    JOHN        .
IRVING     JOAN        .
ROMANS     ANTHONY     .
MCCOY      JOHN        .
BLACKWOOD  ROSEMARIE   DECISION SUPPORT WORKSHOP
MCKNIGHT   ROGER       FILE DESCRPT & MAINT
GREENSPAN  MARY        .
CROSS      BARBARA     HOST LANGUAGE INTERFACE
NEWMAN     FRED        .


Example: Creating Two Inner Joins With a Multipath Structure

The following JOIN commands create an inner join between the EMPINFO and JOBINFO data sources and an inner join between the EMPINFO and EDINFO data sources:

JOIN CLEAR *
JOIN INNER CURR_JOBCODE IN EMPINFO TO MULTIPLE JOBCODE IN JOBINFO AS J0
JOIN INNER EMP_ID IN EMPINFO TO MULTIPLE EMP_ID IN EDINFO AS J1

The structure created by the two joins has two independent paths:

         SEG01
 01      S1
**************
*EMP_ID      **I
*CURR_JOBCODE**I
*LAST_NAME   **
*FIRST_NAME  **
*            **
***************
 **************
       I
       +-----------------+
       I                 I
       I SEG01           I SEG01
 02    I KM        03    I KM
..............    ..............
:EMP_ID      ::K  :JOBCODE     ::K
:COURSE_CODE ::   :JOB_DESC    ::
:COURSE_NAME ::   :            ::
:            ::   :            ::
:            ::   :            ::
:............::   :............::
 .............:    .............:
 JOINED  EDINFO    JOINED  JOBINFO

The following request displays fields from the joined structure:

SET MULTIPATH=SIMPLE
TABLE FILE EMPINFO
PRINT LAST_NAME FIRST_NAME IN 12 COURSE_NAME JOB_DESC
END

With MULTIPATH=SIMPLE, the independent paths create independent joins. All employee records accepted by either join display on the report output. Only Fred Newman (who has no matching record in either of the cross-referenced files) is omitted:

LAST_NAME  FIRST_NAME  COURSE_NAME                     JOB_DESC
---------  ----------  -----------                     --------
STEVENS    ALFRED      FILE DESCRPT & MAINT            SECRETARY
SMITH      MARY        BASIC REPORT PREP FOR PROG      FILE QUALITY
JONES      DIANE       FOCUS INTERNALS                 PROGRAMMER ANALYST
SMITH      RICHARD     BASIC RPT NON-DP MGRS           PRODUCTION CLERK
BANNING    JOHN        .                               DEPARTMENT MANAGER
IRVING     JOAN        .                               ASSIST.MANAGER
ROMANS     ANTHONY     .                               SYSTEMS ANALYST
MCCOY      JOHN        .                               PROGRAMMER
BLACKWOOD  ROSEMARIE   DECISION SUPPORT WORKSHOP       SYSTEMS ANALYST
MCKNIGHT   ROGER       FILE DESCRPT & MAINT            PROGRAMMER
GREENSPAN  MARY        .                               SECRETARY
CROSS      BARBARA     HOST LANGUAGE INTERFACE         DEPARTMENT MANAGER

With MULTIPATH=COMPOUND, only employees with matching records in both of the cross-referenced files display on the report output:

LAST_NAME  FIRST_NAME  COURSE_NAME                     JOB_DESC
---------  ----------  -----------                     --------
STEVENS    ALFRED      FILE DESCRPT & MAINT            SECRETARY
SMITH      MARY        BASIC REPORT PREP FOR PROG      FILE QUALITY
JONES      DIANE       FOCUS INTERNALS                 PROGRAMMER ANALYST
SMITH      RICHARD     BASIC RPT NON-DP MGRS           PRODUCTION CLERK
BLACKWOOD  ROSEMARIE   DECISION SUPPORT WORKSHOP       SYSTEMS ANALYST
MCKNIGHT   ROGER       FILE DESCRPT & MAINT            PROGRAMMER
CROSS      BARBARA     HOST LANGUAGE INTERFACE         DEPARTMENT MANAGER

Top of page

x
Reference: Requirements for Cross-Referenced Fields in an Equijoin

The cross-referenced fields used in a JOIN must have the following characteristics in specific data sources:


Top of page

x
Reference: Restrictions on Group Fields

When group fields are used in a joined structure, the group in the host file and the group in the cross-referenced file must have the same number of elements:


Top of page

x
Reference: Usage Notes for Inner and Outer JOIN Command Syntax

Top of page

x
Joining From a Virtual Field to a Real Field Using an Equijoin

How to:

Reference:

You can use DEFINE-based JOIN syntax to create a virtual host field that you can join to a real cross-referenced field. The DEFINE expression that creates the virtual host field may contain only fields in the host file and constants. (It may not contain fields in the cross-referenced file.) You can do more than one join from a virtual field.

You can create the virtual host field in a separate DEFINE command or in a Master File. For information on Master Files, see the Describing Data With WebFOCUS Language manual.

The same report request can use JOIN-based virtual fields, and virtual fields unrelated to the join.

Note that if you are creating a virtual field in a DEFINE command, you must issue the DEFINE after the JOIN command, but before the TABLE request since a JOIN command clears all fields created by DEFINE commands for the host file and the joined structure. Virtual fields defined in Master Files are not cleared.

Tip: If a DEFINE command precedes the JOIN command, you can set KEEPDEFINES ON to reinstate virtual fields during the parsing of a subsequent JOIN command. For more information, see Preserving Virtual Fields Using KEEPDEFINES.



x
Syntax: How to Join From a Virtual Field to a Real Field

The DEFINE-based JOIN command enables you to join a virtual field in the host file to a real field in the cross-referenced file. The syntax is:

JOIN [LEFT_OUTER|INNER] deffld WITH host_field ... 
     IN hostfile [TAG tag1]
     TO [UNIQUE|MULTIPLE] 
     cr_field IN crfile [TAG tag2] [AS joinname]
END

where:

JOIN deffld

Is the name of a virtual field for the host file (the host field). The virtual field can be defined in the Master File or with a DEFINE command. For related information, see Notes on Using Virtual Fields With Joined Data Sources.

WITH host_field

Is the name of any real field in the host segment with which you want to associate the virtual field. This association is required to locate the virtual field.

Note: The WITH field referenced in the JOIN command must be in the same segment as the WITH field referenced in the DEFINE that creates the virtual field or no output will be produced.

The WITH phrase is required unless the KEEPDEFINES parameter is set to ON and deffld was defined prior to issuing the JOIN command.

To determine which segment contains the virtual field, use the ? DEFINE query after issuing the DEFINE command. See the Developing Reporting Applications manual for details about Query commands.

INNER

Specifies an inner join. If you do not specify the type of join in the JOIN command, the ALL parameter setting determines the type of join to perform.

LEFT_OUTER

Specifies a left outer join. If you do not specify the type of join in the JOIN command, the ALL parameter setting determines the type of join to perform.

IN hostfile

Is the name of the host file.

TAG tag1

Is a tag name of up to 66 characters (usually the name of the Master File), which is used as a unique qualifier for fields and aliases in host files.

The tag name for the host file must be the same in all JOIN commands of a joined structure.

TO [UNIQUE|MULTIPLE] crfld1

Is the name of a real field in the cross-referenced data source whose values match those of the virtual field. This must be a real field declared in the Master File.

Note: Unique returns only one instance and, if there is no matching instance in the cross-referenced file, it supplies default values (blank for alphanumeric fields and zero for numeric fields).

Use the MULTIPLE parameter when crfld1 may have multiple instances in common with one value in hfld1. Note that ALL is a synonym for MULTIPLE, and omitting this parameter entirely is a synonym for UNIQUE. See Unique and Non-Unique Joined Structures for more information.

IN crfile

Is the name of the cross-referenced file.

TAG tag2

Is a tag name of up to 66 characters (usually the name of the Master File), which is used as a unique qualifier for fields and aliases in cross-referenced files. In a recursive joined structure, if no tag name is provided, all field names and aliases are prefixed with the first four characters of the join name. For related information, see Usage Notes for Recursive Joined Structures.

The tag name for the host file must be the same in all JOIN commands of a joined structure.

AS joinname

Is an optional name of up to eight characters that you may assign to the joined structure. You must assign a unique name to a join structure if:

  • You want to ensure that a subsequent JOIN command does not overwrite it.
  • You want to clear it selectively later.
  • The structure is recursive, and you do not specify tag names. See Recursive Joined Structures.

If you do not assign a name to the joined structure with the AS phrase, the name is assumed to be blank. A join without a name overwrites an existing join without a name.

END

Required when the JOIN command is longer than one line. It terminates the command. It must be on a line by itself.



x
Reference: Notes on Using Virtual Fields With Joined Data Sources

Requests reading joined data sources can contain virtual fields that are defined either:



Example: Creating a Virtual Host Field for a Joined Structure

Suppose that a retail chain sends four store managers to attend classes. Each person, identified by an ID number, manages a store in a different city. The stores and the cities in which they are located are contained in the SALES data source. The manager IDs, the classes, and dates the managers attended are contained in the EDUCFILE data source.

The following procedure lists the courses that the managers attended, identifying the managers by the cities in which they work. Note the three elements in the procedure:

The procedure is:

JOIN ID_NUM WITH CITY IN SALES TO ALL EMP_ID IN EDUCFILE AS SALEDUC
DEFINE FILE SALES
ID_NUM/A9 = DECODE CITY ('NEW YORK' 451123478 'NEWARK' 119265415
                         'STAMFORD' 818692173 'UNIONDALE' 112847612);
END
TABLE FILE SALES
PRINT DATE_ATTEND BY CITY BY COURSE_NAME
END

The output is:

CITY                    COURSE_NAME                 DATE_ATTEND
----                    -----------                 -----------
NEW YORK                FILE DESCRPT & MAINT           81/11/15
NEWARK                  BASIC RPT NON-DP MGRS          82/08/24
STAMFORD                BASIC REPORT PREP DP MGRS      82/08/02
                        HOST LANGUAGE INTERFACE        82/10/21
UNIONDALE               BASIC REPORT PREP FOR PROG     81/11/16
                        FILE DESCRPT & MAINT           81/11/15

Top of page

x
Join Modes in an Equijoin

How to:

The JOIN_LENGTH_MODE (JOINLM) parameter controls processing of equality joined field pairs for record-based non-SQL Adapters, such as DFIX, VSAM, and FIX. This setting controls processing when two alphanumeric fields of different lengths or two numeric fields of different data types and precisions are joined.

For SQL data sources, joins are normally either optimized (sent to the SQL engine for processing) or managed to comply with SQL processing rules.

There are two supported modes of handling compatible, but not identical, joined fields:



x
Syntax: How to Control the Join Mode for Record-Based Data Sources
SET JOIN_LENGTH_MODE|JOINLM} = {SQL|RANGE}

where:

SQL

Sets SQL compliant mode. which assures strict equality between host and cross-referenced field values. This is the default value.

RANGE

Sets FOCUS reporting mode, which supports partial key joins.


Top of page

x
Data Formats of Shared Fields

Generally, the fields containing the shared values in the host and cross-referenced files must have the same data formats.

If you specify multiple host file fields, the JOIN command treats the fields as one concatenated field. Add the field format lengths to obtain the length of the concatenated field. You must observe the following rules:


Top of page

x
Joining Fields With Different Numeric Data Types

How to:

You can join two or more data sources containing different numeric data types. For example, you can join a field with a short packed decimal format to a field with a long packed decimal format, or a field with an integer format to a field with a packed decimal format. This provides enormous flexibility for creating reports from joined data sources.

Note:



x
Syntax: How to Enable Joins With Data Type Conversion

To enable joins with data type conversion, issue the command

SET JOINOPT = [GNTINT|OLD]

where:

GNTINT
Enables joins with data type conversion.
OLD
Disables joins with data type conversion. This value is the default.


Example: Issuing Joins With Data Type Conversion

Since you can join a field with a short packed decimal format to a field with a long packed decimal format, a join can be defined in the following Master Files:

FILE=PACKED,SUFFIX=FIX,$
  SEGNAME=ONE,SEGTYPE=S0
   FIELD=FIRST,,P8,P4,INDEX=I,$
FILE=PACKED2,SUFFIX=FIX,$
  SEGNAME=ONE,SEGTYPE=S0
   FIELD=PFIRST,,P31,P16,INDEX=I,$

The JOIN command might look like this:

JOIN FIRST IN PACKED TO ALL PFIRST IN PACKED2 AS J1

When joining packed fields, the preferred sign format of X'C' for positive values and X'D' for negative values is still required. All other non-preferred signs are converted to either X'C' or X'D'.


WebFOCUS