iWay DataMigrator User's Guide

Feedback

Copyright © 2013 iWay Software

This documentation describes how to install and use iWay DataMigrator, formerly ETL Manager. It is intended for the person responsible for transforming data by designing and executing data and process flows.

In some instances DataMigrator is abbreviated as DM.

For up-to-the-minute information, please refer to the release notes.

How This Manual Is Organized

This manual includes the following chapters:

Chapter/Appendix

Contents

1

Introduction to DataMigrator

Provides background information about DataMigrator architecture, components, concepts, and uses.

2

Setting Up DataMigrator

Provides setup information for DataMigrator developers and administrators, including overviews for local and remote DataMigrator Server configurations.

3

Planning a DataMigrator Application

Offers conceptual and practical information to be considered before designing a DataMigrator application.

4

Working in the Data Management Console

Describes the Data Management Console (DMC) interface.

5

Designing a Data Flow

Provides detailed information about all aspects of Data Flow design.

6

Designing a Process Flow

Provides detailed information about all aspects of Process Flow design.

7

Tutorial: Creating Data and Process Flows With DataMigrator

Provides step-by-step instructions for building a Data Flow and a Process Flow using sample data.

8

Generating Reports, Viewing the Log File, and Printing Flows

Enables you to view reports on how flows run, what the DataMigrator Server environment looks like, and what error messages the DataMigrator Server generates.

9

Optimizing Performance

Provides tips for ensuring successful deployment of your data mart and data warehouse implementations.

10

Working With Synonyms

Provides information about creating synonyms in the Data Management Console and using the Synonym Editor.

11

Advanced DataMigrator Tips and Techniques

Provides a series of examples to demonstrate DataMigrator’s capabilities.

12

Change Data Capture

Describes how to use Change Data Capture to enable reading from database logs to determine what rows in a table have changed.

A

DataMigrator Log Statistics and Tables

Provides information about DataMigrator internal information.

B

Event-Based Scheduling

Describes event-based scheduling options that enable execution a flow outside of DataMigrator.

C

Problems, Errors, and Troubleshooting

Assists you in resolving problems that may occur during the development of your DataMigrator application.

D

Customizing Your Environment

Provides information about customizing the DataMigrator environment.

E

Source Management

Source control programs enable you to manage flows, stored procedures, and synonyms, typically providing file backup, change tracking, and version control

F

Adapter for AddressDoctor

Provides information on the Adapter for AddressDoctor.

Summary of New Features

This table describes new features and enhancements for Version 7.7.05.

New Feature

See...

DataMigrator Flows

Direct Load Flows

A Direct Load flow can now have a load type of Slowly Changing Dimensions. This means that a source synonym for table log records used for Change Data Capture can be used as input.

 

Flow Properties E-Mail

E-mail messages configured from Flow Properties, which include the log on completion or failure of the flow, can now have a customized subject line, message and importance.

Flow Properties Pane - E-Mail Attribute

Column Naming Strategy

The column Alias (for relational databases this is the actual name) can be used.

 

Prior to load - Truncate

This option is now available for DB2 and UDB tables on all platforms, except IBM i which still does not support it.

 

Show/Modify Data

Data in a table can be viewed and edited.

 

Default DBMS Error Limit Set

You can specify when to stop processing the number of DBMS errors received if the number of errors specified is exceeded

User Preferences - Run Options

Sending Messages in Process Flows

You can now send email messages to email addresses specificied in a procedure.

Flow Properties Pane - E-Mail Attribute

Data Management Console - General

New Files

The New menu now lets you create a file on the server. The Save As button or menu options lets you save it as any supported file type.

 

Traces

You can view traces of the application from the DMC.

Setting User Preferences

New Column Option

There is a new Index option that shows the name of any indexes on the underlying relational database table

How to Change the Information Displayed for Columns

Backup and Restore Server Configuration Files

There are new options to backup and restore server configuration files.

How to Create a Backup of Server Configuration Files

and

How to Restore Server Configuration Files

Star Schemas and SCD

Change Flag

A new SCD type of Change Flag for Type 1 changes records the date or date and time when a row is updated.

 

This table describes the new features and enhancements for Version 7.7.04.

New Feature

See...

DataMigrator Flows

Procedures in Parallel Group

A parallel group can now contain procedures in addition to flows.

 

Conditional Action After E-mail Objects

Multiple objects with different conditions can be used after an e-mail object.

 

Select Multiple Objects

You can select multiple objects with Ctrl-Click to move them as a group.

Moving Objects in the Workspace

Recreate DBMS Table

Recreate DBMS table creates tables with indexes when included in the synonym.

Navigation Pane: Synonyms

Parallelization and Partitioning

Quick ETL Copy for a single source table can now split the table into a user specified number of partitions based on a user selected numeric key column and run them in parallel.

Parallelization and Partitioning

Calculator

Where Condition Sample Values

The list of columns displayed in the where condition calculator can be expanded to show sample values retrieved from the data source or the synonym.

 

Function Assist for EXISTS

A new function assistant for the SQL EXISTS clause simplifies building this test for rows containing specified value(s).

 

Subquery in Select List

A subquery can be used in the list of columns in the select list.

Using Subqueries

HAVING Filter

A condition can be applied to restrict data retreived after aggregation has been performed with a HAVING filter.

 

Data Management Console - Browser

Additional File Types

The browser now displays additional types of files, including text and XML documents that can be managed, as well as opened and edited with the text editor.

 

Data Profiling - Row Count

When only the count of rows is needed, this new data profiling option is quicker than running the statistics report.

 

File Types

Additonal file types are now displayed in the browser. Files with text data can be edited in the text editor.

 

Data Management Console - General

Change Case in Text Editor

When editing files in the text editor, selected text can be changed to all upper case, all lower case, or the case can be inverted.

Text Editor Context Menu

Print Preview

When viewing a report, Print Preview now provides a more accurate image of what the report will look like when printed.

 

Core Engine Settings

Certain core engine settings can now be set from a user dialog without the need to edit a profile.

How to Use Core Engine Settings

Line Colors

You can now control the line colors using Tools Options.

Setting User Preferences

Recent Files

A list of recently used files is now retained and can be selected from the File menu choice Recent Files. The number of files to display can be selected.

 

Autosave

Open files can be automatically saved after a specified number of minutes have elapsed.

User Preferences - General

Synonym Editor

Indexes

Indexes, in addition to the primary key, are shown in the synonym when created. Additional indexes can be added to the synonym when creating a table from a synonym are created too.

 

Column Field Names

The maximum length of field names in a synonym and column names supported in an RDBMS table has been increased from 64 to 128 characters.

 

Undo/Redo

Editor actions can be undone by clicking the Undo button on the toolbar in the Synonym Editor or DataFlow Designer. The maximum number of undo operations can be specified.

User Preferences - Synonym Editor or User Preferences - Data Flow Designer

Database Default Values

If a database column has default value assigned in the database, and no value is assigned in Target Transformations, the database will now assign the default value. In previous releases a NULL value was always assigned for NULLABLE columns and space or zero for NOT NULL columns.

Unassigned Fields

DataMigrator Reporting

Printing Reports

A new Print Preview option gives a more accurate view of what the report output will look like when printed. Forward and Back buttons enable paging through the report.

The print dialog box allows printing the entire report or just the selected range.

Working With a Report

Report Output

Output from stored procedures that generate reports can now either be sent as text to the log as in prior release or formatted and sent to a special output queue.

Scheduler Configuration Window

Backup

When a backup is made of the DataMigrator Log and Statistics tables a synonym is also created that can be used to read the tables. A message appears in the Console Log with the tables locations and names.

DataMigrator Scheduler

Start or Stop Scanning

You can start or stop the scheduler from scanning from the Data Management Console.

How to Start or Stop the Scheduler

Run When Server Starts

Flows can be scheduled to start whenever the server is started. This is used for flows that need to run all the time, such as those using Change Data Capture or the File Listener capabilities.

 

File Listener

File Listener can now monitor a directory for files or sub-directories that are added, chanegd, or deleted.

See File Listener Attributes

This table describes the new features and enhancements for Version 7.7. to 7.7.03.

New Feature

See...

DataMigrator

Change Data Capture

Detect changes to source tables as they occur using Change Data Capture.

Change Data Capture

Star Schema Load

New sample files and a tutorial show how to load data into a Star Schema.

How to Create Sample Procedures and Data for Star Schema and Loading a Star Schema

Iterator

New iterator simplifies processing using multiple tables with the same layout.

How to Create Sample Procedures and Data for Iteration and Iterating Through Multiple Tables

File Listener

New adapter based File Listener facilitates processing files as they appear in a specified directory or FTP server.

Loading a Target Table Using the File Listener and File Listener Attributes

LOOPBACK Server

Improved diagnostic messages are displayed when attempting to start a local LOOPBACK server.

DataMigrator Flows

Validate Flow

You can validate a flow by clicking the Validate Flow button. Any errors found will appear in a dialog window with an error message.

How to Validate a Flow

Direct Load Flows

Direct Load flows now capture FIELD… IS TRUNCATED errors and continue processing, while adding the number of records to the count of Row(s) rejected due to format error and in &FORMAT.

 

Direct load flow sources

Direct Load Flow supports any source.

Creating a Direct Load Flow

Wait after parallel group

Proceed to next object regardless as to whether the wait was satisfied.

Wait Condition Type Dialog Box

Duplicate field names

For hierarchical source tables with repeated field names, the segment names are used to distinguish them.

Target Transformation Documentation

You can add a description to a target transformation.

How to Document Transformations

Copy and Paste Transformations

Transformations can now be copied from a target and pasted to the same, or another, target.

Source Transformations Context Menu

Adapter type display

The adapter type (such as Microsoft SQL Server) is optionally displayed on the workflow for sources and targets.

User Preferences - Data Flow Designer

New function assist for COALESCE

The function assist has been enhanced to support the SQL function COALESCE, which can take two or more arguments.

SQL Functions

New function assist for CASE

The function assist has been enhanced to support the SQL simple and searched CASE expressions.

 

E-mail notification

E-mail notification from within a process flow can now send both a message and an attached file.

How to Add an E-Mail Object to the Process Flow

Run as Stored Procedure

The option Execute as RPC has been renamed Run as Stored Procedure and a new option can be set so that all flows are added with this option is checked.

How to Maintain Continuous Processing in a Single Agent and User Preferences - Process Flow Designer

Nested Select Statements

Additional SQL objects can be added to a data flow. This allows joining to a select statement and avoids the need to create intermediate tables and use multiple flows.

Adding a SELECT to a Data Flow

Union

A UNION object can be added to a data flow. This allows concatenating the results of multiple select statements, avoiding the need for multiple flows. This object also supports the SQL operations MINUS and INTERSECT.

Adding a UNION to a Data Flow

Join Assist

When you add a synonym to a Join object that already has one table selected, the Select Synonym dialogue can now be filtered to show only those synonyms with matching columns. Candidate synonyms may have either matching key column names, or the synonym may contain a primary key/foreign key relationship.

Select Synonym Dialog Box

Help with Errors

Errors in a transformation can be viewed by selecting Errors details from the transformation.

Error message numbers in these and other dialogues are a hyperlink to the message text.

How to View Error Details

New XML Targets

DataMigrator can create new XML documents without requiring a predefined synonym, as name value pairs.

Target Properties for New XML Targets

Bulk Load

New bulk load options are available for Sybase, Teradata, and ORACLE.

Teradata Target Bulk Load Options and Oracle Target Bulk Load Options

Insert Records from Memory

This load option is available for IBM i V6R1 CLI and Teradata.

How to Set Options for Relational Targets

Where Condition with Sub-select

A where condition with an IN clause and a subselect can now use multiple columns.

SQL Functions

Calculator

 

DBCS Functions

New functions support operations on double-byte character data.

Using Functions

Function Assist

Additional column information is now displayed.

How to Use Function Assist

Relational Expression

The Relational Expression tab Values Selection now allows multiple values to be selected.

Data Management Console - Browser

Browser views

Separate tabs for procedures, tabs, and adapters have been replaced by folders in the browser.

Navigation Pane

Workspace Configuration

Server configuration, which previously required using the web console, can now be done from a new Workspace folder.

Configuring DataMigrator Server Operations

Descriptions

Descriptions can be stored and displayed for directories and stored procedures.

Directory Shortcut Menu Options

Home Application Directory

Each user can have a Home application directory, visible only to themselves and the Server Administrator.

Applications Settings and How to Create and Configure User Home Directory

Nested Application Directories

Application directories can now have subdirectories.

Applications Settings

Foccache Temporary Directory

A temporary directory is added to the application path on connection to the server that exists for the duration of the session.

Applications Settings

Explorer View

A new explorer view shows details of files and directories viewed in the browser window.

 

Schedule and Set E-mail

A new window allows adding schedule and E-Mail settings to a stored procedure, and editing for procedures and flows without opening the flow.

Navigation Pane: Procedures

Copy, Cut, and Paste Multiple Server Objects

Using standard Windows shortcut keys, you can select multiple server objects to copy, cut, and then paste in another directory or another server.

Copying, Cutting, and Deleting Procedures, Synonyms, and Other Files

Source Code Control

Source code control applications ChangeMan, Perforce, and SubVersion are now supported.

Improved support for all source code control systems, including use of private home directories for development and public directories for deployment, as well as a get latest version option. There is also support for operations in Visual SourceSafe, such as creating the folder during association.

Configuring Source Management and Setting Up Source Control

Data Management Console - General

Property Sheets

Property sheets are used instead of dialogues throughout the Data Management Console.

Improved Application Path Configuration

The user interface for configuring the application path has more options and allows configuring the path for specific users.

Managing Application Directories and Configuring the Application Path

Reusable Transformations

A new user interface simplifies creating and editing reusable transformations (functions).

Creating and Using Reusable Transformations

Replace in Text Editor

When editing flows, synonyms, or configuration files in the text editor, you can find a text string and replace it.

Text Editor

Aerial View

A new aerial view gives an overview of flows and synonyms in the modeling view. For large and complex flows this simplifies selecting an area of the object to view.

 

Save As Toolbar Button

All editors have a toolbar button Save as to save a file under a new name.

Ribbon Tabs

Close all but current window

A main menu option closes all open windows except for the current one.

 

Viewing TX Fields

Sample data, Test SQL, and Test Transformations for fields described as TX now displays the first 50 characters of those field values.

Guidelines for Variable Length Character and Text Columns in Synonyms

Synonym Editor

Enhanced Modeling View

A new concise segment display by default shows only column names. It can be customized to show additional information.

Creating, Enhancing, and Editing Synonyms Using Modeling View

Cluster Joins insert by Reference

A Cluster Join is a synonym that describes multiple tables which previously required a copy of a synonym for each table. Now synonyms can be included by reference to another synonym, thus ensuring that changes are propagated automatically.

Creating, Enhancing, and Editing Synonyms Using Modeling View

Variables in Synonyms

Variables can be added to a synonym to represent table names and connection names.

Using Variables in a Synonym

Support database generated auto increment columns

Most relational databases support a column that automatically increments when a row is inserted. These columns are now identified when synonyms are created. Tables can be created from a synonym that identifies such columns.

Segment Attributes Summary

Filename

For flat file data sources, a field with an ALIAS value of INSTANCE will contain the file name.

Column/Field Attribute Summary and List item.

Multiple Fields

Selected attributes for multiple fields can be changed in a single operation.

How to View and Edit Synonym Attributes

Connection

The connection entry is a combination box. You can enter the name of a connection or select it from a drop-down menu.

Lookup

DB_LOOKUP for FOCUS files

FOCUS files can now be used with DB_LOOKUP.

Cache for DB_LOOKUP

A new setting enables a cache for DB_LOOKUP to improve performance.

Improving Lookup Performance

Star Schemas and Slowly Changing Dimensions

Specify begin and end dates

For Slowly Changing Dimension Type II, changes override the default begin and end dates for active and/or inactive records.

Slowly Changing Dimension Variables

Support for Auto Increment Columns

Database-generated automatically incremented columns can be used as surrogate keys in dimension tables.

Column/Field Attribute Summary

DataMigrator Reporting

View Log and Statistics Reports Enhancements

The View Log and Statistics selection panel has been enhanced so that you can select a specific flow name and view scheduled or immediate flows.

DataMigrator Log, Statistics, and Summary Flow Reports

Format Errors

Format errors, reported only for direct load flows, now include truncation errors, in addition to conversion errors. For a delimited flat file source, where a field contains a longer value than specified as the ACTUAL length of the field, the record is rejected and the count of format errors is increased by one. Previously, processing would stop when such an error occurred, but now processing continues. If record logging is in effect for format errors, the record is logged to the specified file.

 

Flow Logging and Monitoring

You can control how often and the maximum number of lines are written to the LOG table for each flow.

A flow can now be monitored as log messages are written while the flow is running.

Scheduler Configuration Window

Send warning e-mail for ETLLOG table errors

An automatic e-mail message can be sent if an error occurs while writing to the ETLLOG file. Such an error could occur, for example, if the file is full.

How to Send an Email Message When There Is an Error Writing to the ETL Log

Maximum size of Log and Statistics tables increased

The ETLLOG and ELTSTATS files were previously limited to two gigabytes. The limit is now 16 gigabytes.

Flow Name

For flows saved in Release 7.7 or later, the name of the flow that appears in logs and the variable &&CM__REQEUST now includes the application directory.

DataMigrator Variables

DataMigrator Scheduler

Intraday Scheduling

Flows can be scheduled to run during a specified time period every day.

Schedule Attribute Pane

Restart Interval

You can specify the time between restart attempts for all flows with a restart.

Scheduler Configuration Window

Restart Count

You can specify the number of times a failed schedule agent should be restarted.

Scheduler Configuration Window

Scheduler ID

You can explicitly specify the user ID of the scheduler. This will control the application path that is scanned for scheduled flows to run.

Scheduler Configuration Window

Scheduled Events Report

The scheduled events report lets you specify a time range to view all jobs that are specified to start in that time period.

DM Scheduled Events Window

Run on Holidays

You can specify the option to run on Holidays.

How to Schedule a Process Flow to Run on Holidays

Running Multiple Instances

Running multiple instances of a flow can be prevented with new parameters for CMRUN and CMASAP.

Event-Based Scheduling

Variables

Assign Variables

A new variable object in a process flow simplifies assigning values for local and global variables.

Assigning and Using Variables in a Process Flow

Use local variables in a data flow

You can now use local variables in a data flow for where conditions and other instances. For running flows, a value can be set in a process flow.

Using a Parameter With a Flow and Using Variables in Data Flows

Test local variables

The DMC now prompts for a local variable if it is needed while testing a SQL or transformation.

How to Test a Local Variable

Use variables for target table name

A variable can be used for a target table name so that the name can be assigned at run time.

How to Create a New Data Target With a Variable Name and How to Submit or Schedule a Data Flow With a Variable Name

Adapters

Excel

Now only a single ODBC connection is required with a named adapter for Excel® that allows creating synonyms for tables in multiple worksheets.

An iWay adapter enabled for IWAF can be used as a source in a DataMigrator flow.

Managing Excel Metadata in the Adapter Administrationfor UNIX, Windows,OpenVMS, IBM i, and z/OS manual

IWAF - iWay Adapter Framework adapters

A new direct retrieval adapter for Excel can read worksheet files directly, on any platform, without using ODBC.

Managing IWAF Metadata in the Adapter Administrationfor UNIX, Windows,OpenVMS, IBM i, and z/OS manual

Flat and Delimited Flat Files

A flat file server source can reside on an FTP server.

The following synonyms of different types can be created by an adapter for a Delimited Flat File: Multiple Synonyms, Single Synonym for a file that resides on a server or on a local machine, Synonym for a collection of files with a one-time retrieval, and Synonym for a collection of files for a File Listener.

Multiple consecutive delimiters in a delimited flat file can be optionally treated as one. For example, multiple spaces can be used as a delimiter.

File Listener Attributes and Managing the Adapters for Flat and Delimited Flat Files in the Adapter Administrationfor UNIX, Windows,OpenVMS, IBM i, and z/OS manual

Documentation Conventions

The following conventions apply throughout this manual:

Convention

Description

THIS TYPEFACE

or

this typeface

Denotes syntax that you must enter exactly as shown.

this typeface

Represents a placeholder (or variable) in syntax for a value that you or the system must supply.

underscore

Indicates a default setting.

this typeface

Represents a placeholder (or variable), a cross-reference, or an important term. It may also indicate a button, menu item, or dialog box option that you can click or select.

Key + Key

Indicates keys that you must press simultaneously.

{  }

Indicates two or three choices. Type one of them, not the braces.

[  ]

Indicates a group of optional parameters. None are required, but you may select one of them. Type only the parameter in the brackets, not the brackets.

|

Separates mutually exclusive choices in syntax. Type one of them, not the symbol.

...

Indicates that you can enter a parameter multiple times. Type only the parameter, not the ellipsis (...).

.
.
.

Indicates that there are (or could be) intervening or additional commands.

Related Publications

Visit our Technical Documentation Library at http://documentation.informationbuilders.com. You can also contact the Publications Order Department at (800) 969-4636.

Customer Support

Do you have any questions about this product?

Join the Focal Point community. Focal Point is our online developer center and more than a message board. It is an interactive network of more than 3,000 developers from almost every profession and industry, collaborating on solutions and sharing tips and techniques, http://forums.informationbuilders.com/eve/forums.

You can also access support services electronically, 24 hours a day, with InfoResponse Online. InfoResponse Online is accessible through our website, http://www.informationbuilders.com. It connects you to the tracking system and known-problem database at the Information Builders support center. Registered users can open, update, and view the status of cases in the tracking system and read descriptions of reported software issues. New users can register immediately for this service. The technical support section of www.informationbuilders.com also provides usage techniques, diagnostic tips, and answers to frequently asked questions.

Call Information Builders Customer Support Service (CSS) at (800) 736-6130 or (212) 736-6130. Customer Support Consultants are available Monday through Friday between 8:00 a.m. and 8:00 p.m. EST to address all your questions. Information Builders consultants can also give you general guidance regarding product capabilities. Please be ready to provide your six-digit site code number (xxxx.xx) when you call.

To learn about the full range of available support services, ask your Information Builders representative about InfoResponse Online, or call (800) 969-INFO.

Information You Should Have

To help our consultants answer your questions most effectively, be ready to provide the following information when you call:

User Feedback

In an effort to produce effective documentation, the Technical Content Management staff welcomes your opinions regarding this document. Please use the Reader Comments form at the end of this document to communicate your feedback to us or to suggest changes that will support improvements to our documentation. You can also contact us through our website http://documentation.informationbuilders.com/connections.asp.

Thank you, in advance, for your comments.

iWay Software Training and Professional Services

Interested in training? Our Education Department offers a wide variety of training courses for iWay Software and other Information Builders products.

For information on course descriptions, locations, and dates, or to register for classes, visit our website (http://education.informationbuilders.com) or call (800) 969-INFO to speak to an Education Representative.

Interested in technical assistance for your implementation? Our Professional Services department provides expert design, systems architecture, implementation, and project management services for all your business integration projects. For information, visit our website (http://www.informationbuilders.com/support).