Viewing Data Profiling Characteristics

In this section:

Data Profiling provides data characteristics for the columns in a synonym. You can display the characteristics for all the columns in a synonym or segment, or for an individual column.

For alphanumeric columns, Data Profiling provides the segment, format, count of distinct values, total count, patterns count, maximum, minimum, and average length, minimum and maximum values, and number of nulls. Patterns count shows the number of patterns found in each alphanumeric column.

For numeric columns, Data Profiling provides the segment, format, count of distinct values, total count, maximum, minimum, and average values, and number of nulls.

Data Profiling for an individual column provides access to Statistics, Patterns, Values, and Outliers reports.


Top of page

x
Data Profiling a Synonym or Segment

How to:

Data Profiling provides information on all the columns in a synonym or segment. You can also drill down to the Values or Patterns reports for an individual column from a synonym or segment Data Profiling report.

Note: Data Profiling is also available from the navigation pane by right-clicking a synonym, selecting Data Profiling, and then clicking Statistics.



x
Procedure: How to View Data Profiling for a Synonym or Segment

To view the Data Profiling information for a synonym or segment:

  1. Open the synonym in the Synonym Editor by double-clicking it from the navigation pane or right-clicking it and selecting Open.

    The Synonym Editor opens to the Field View tab.

  2. Right-click the synonym or segment name, select Data Profiling, and then click Statistics.

    The Data Profiling information displays in the workspace. The last four columns are shown below the rest of the information for illustrative purposes only. The actual report runs across the workspace.

    You may use the Data Profiling Results toolbar to view server messages, print the report, copy data as text, and export the report.

  3. Optionally, you can click a column name or patterns count (for alphanumeric columns) to drill down to the Values or Patterns reports, respectively.

    This is a partial Values report produced by clicking a column name.

    The image below is an example of a phone number column.

    For pattern analysis, a 9 represents a digit, an A represents any uppercase letter, and an a represents any lowercase letter. All printable special characters are represented by themselves, and unprintable characters are represented by an X.

Note: Data Profiling is also available from the navigation pane by right-clicking a synonym.



x
Procedure: How to View Key Analysis for a Synonym or Segment

Key Analysis provides a report that shows which columns in a data source can be used individually, or in combination, to uniquely identify a row. The columns identified in this report are candidates for key column(s).

Note: Key Analysis is also available from the navigation pane by right-clicking a synonym, selecting Data Profiling, and then Key Analysis.

To view key analysis for a synonym or segment:

  1. Open the synonym in the Synonym editor.
  2. Right-click the synonym or segment name, select Data Profiling and then Key Analysis.

    If you selected a segment name, skip to step 4. All columns in the segment will be selected.

  3. Select the columns that you would like to analyze as potential keys to the source table. Then click on sample data.
  4. The report shows:
    segment

    The selected segment.

    name

    Name of the segment.

    format

    The format of each column.

    elements

    The number of elements (columns) shown.

    count

    The number of rows.

    distinct count

    The number of distinct rows.

    distinct percent

    The percentage of rows that are distinct. This value must be 100% for a combination of columns to be used as key.

    duplicate count

    The number of duplicate values.

    duplicate percent

    The percentage of duplicate values. This value must be 0% for a combination of columns to be used as key.



Example: Select Table DMHR and All of the Columns EXCEPT ID_NUM and Capture the Output

By default, the report is sorted by the number of elements so the first rows in the report show one element each. This enables you to determine if any single column could be used by itself as a key. The report then shows all combinations of two columns, three columns, and so on.

To see the values in the report, right-click on any row.

The duplicate rows option shows all duplicate values, which prevent the desired column combination from being used as a key.


Top of page

x
Data Profiling a Single Column

How to:

Data Profiling for an individual column provides access to four reports:

These reports are available by right-clicking a column in the Synonym Editor and selecting Data Profiling.



x
Procedure: How to View Data Profile Statistics

To view the Statistical Data Profiling information for a single column:

  1. Open the synonym in the Synonym Editor by double-clicking it from the navigation pane or right-clicking it and selecting Open.

    The Synonym Editor opens to the Field View tab.

  2. Right-click a column, select Data Profiling, and select Statistics.

    The Statistical Data Profiling information displays in the workspace.

  3. Optionally, you can click a column name or patterns count (for alphanumeric columns) to drill down to the Values or Patterns reports, respectively.


x
Procedure: How to View Data Profile Patterns

Data Profile Patterns shows patterns of letters, digits, and special characters, as well as counts. This is only available for alphanumeric columns.

To view the Patterns Data Profiling information for a single column:

  1. Open the synonym in the Synonym Editor by double-clicking it from the navigation pane or right-clicking it and selecting Open.

    The Synonym Editor opens to the Field View tab.

  2. Right-click a column, select Data Profiling, and select Patterns.

    The Patterns Data Profiling information displays.

For pattern analysis, a 9 represents a digit, an A represents any uppercase letter, and an a represents any lowercase letter. All printable special characters are represented by themselves, and unprintable characters are represented by an X.



x
Procedure: How to View Data Profile Values

Data Profile Values shows unique values.

To view the Values Data Profiling information for a single column:

  1. Open the synonym in the Synonym Editor by double-clicking it from the navigation pane or right-clicking it and selecting Open.

    The Synonym Editor opens to the Field View tab.

  2. Right-click a column, select Data Profiling, and select Values.

    The Values Data Profiling information displays.



x
Procedure: How to View Data Profile Outliers

Data Profile Outliers shows the 10 highest and lowest distinct values.

To view the Outliers Data Profiling information for a single column:

  1. Open the synonym in the Synonym Editor by double-clicking it from the navigation pane or right-clicking it and selecting Open.

    The Synonym Editor opens to the Field View tab.

  2. Right-click a column, select Data Profiling, and select Outliers.

    The Outliers Data Profiling information displays.

    Note: Outliers produce a maximum of 10 highest and lowest distinct values, if they exist.


iWay Software