Improving Data Loading Performance

You can improve performance when loading data into the data target. Keep the following suggestions in mind.

Prior to load options. Before you load data into an existing table (for most RDBMS targets), DataMigrator gives you the option to Delete all rows from table or Truncate table. When deleting the rows, the relational database internally logs the rows deleted in case a ROLLBACK is issued, which for large tables can be slow. Truncating the table does not do any logging, so it is much quicker, but then a ROLLBACK cannot be issued.
Commit. For relational database and FOCUS targets, DataMigrator lets you specify the number of rows before a COMMIT (or CHECK) is issued. The default value is 1000. A higher value generally improves throughput, but this depends on the memory available on the server, the database, the number of columns loaded, and their sizes. Increasing the commit value too much can slow performance. Try different values to see what works best.
Insert/Update. This is the default load method. For a relational database target:
When include duplicates is selected each incoming row is sent to the target table without first checking to see if the key value is found. This generally provides the best loading time. If there is a duplicate key, the database will reject the row with a unique constraint violation. When reject or update duplicates is selected, the keys of each incoming row are matched against the target table. This additional step does slow processing, but allows for matched rows in the target table to be updated. In this case, care should be taken to use as few columns as keys as possible to uniquely identify the row. Note that if no columns are identified as key columns, the entire column is considered to be the key, which slows processing.
Using insert records from memory. This load option uses a database vendor bulk or array insert API when available. This enables inserting a block of rows at a time. The default block size is 1000 rows, but a higher or lower value may provide better throughput, depending on the memory available on the server, the database, the number of columns loaded and their sizes. Try different values to see what works best.
Note that incoming rows are not matched against the target table as with insert/update logic, so the number of rows inserted or rejected are not recorded. There are no flow statistics available.

This method loads faster than the default load logic. For most databases, this is the fastest method, when the query can use automatic passthrough (so no intermediate flat file is created).
Using Bulk Load Utility using a disk file. This load option uses a database vendor bulk load program. As the name implies, this method always creates an intermediate flat file. It is still faster than the default load logic. The options available depend on the database. For most databases, it is not quite as fast as Insert Records from Memory, when that option is available. However, for some databases designed for data warehousing (such as Sybase IQ and IBM Redbrick Warehouse), this is by far the fastest option. Additional tuning options are available depending on the database.

iWay Software