Using JOIN to Clean Multiple Addresses

Using JOIN allows cleaning multiple addresses contained in any data source for which you can create a synonym.


Top of page

Example: Cleaning a Table of Addresses

This example shows how to clean a table that contains address data by using a JOIN to pass parameters to AddressDoctor.

  1. Create a data flow. Right-click on an application directory and select New, Flow.
  2. Add the source table. Right-click on the left hand side of the Data Flow and select Add Source. In the Select Synonym window Look in pull-down, select ibisamp. Scroll down and select persinfo.
  3. Add the AddressDoctor synonym as a source. Expand the application directory where you created the AddressDoctor synonym, drag the synonym to the left side of the data flow.
  4. A Join object is automatically added. (If not in Tools, Options, Data Flow, check Add Join Object. Then delete the synonym and add it again).

    Flow

  5. Double click on the Join object. The Join Editor opens.
  6. Under Right Source, right-click on Name and select Segment and Field View.
  7. Under Left Source, click on PIN, and under Right Source, click on KEY1. Click on to add the fields to Join Conditions. Note that this field is not cleansed by AddressDoctor, it is used for identification.
  8. Repeat for each of the following pairs of fields:

    T1.STREETNO

    T2.STREET1

    T1.APT

    T2.STREET2

    T1.STATE

    T2.PROVINCE1

    T1.POSTALCODE

    T2.POSTALCODE1

    T1.CITY

    T2.LOCALITY1

    T1.COUNTRY

    T2.COUNTRY1

  9. In the Join Type pull-down, select Left Outer Join. When you are complete and window looks like this, click OK.

  10. Right-click on the SQL object and select WHERE Filter.
  11. The Where Filter Calculator opens. Enter this condition, then press Ok.
    T1.COUNTRY = 'USA'
  12. Right-click on the right side of the Data Flow, select Add Target, New.
  13. A new target object is added to the Data Flow. Right-click on the object and select Properties and change the database, synonym name, and other options as desired.
  14. Click on the run button and wait for the flow to complete.
  15. Right-click on the target object and select Toggle. Click on the Sample Data tab to see the results.

    Note that some of the addresses were corrected (PROCESSSTATUS of C4) others could not be (PROCESSSTATUS of I2 and I4). From this, we can conclude that this file has a lot of bad addresses.


iWay Software