A Step by Step Guide to Data Migration with Talend ETL

Revision as of 01:09, 28 May 2012 by Kittiu (Talk) (Start with Talend and ADempiere)

Note:

DISCLAIMER - This is an implementation guide written by Kittiu, from ecosoft Bangkok, Thailand. It is not an official guide, nor is it meant to be comprehensive yet. Other contributors are welcome to discuss how to improve it.

Status: Draft

Overview

In an ERP implementation, one of the major activities is data migration. This task is, no doubt, tedious and time-consuming, and no one wants to work on it. I have been looking for tools to help with this task since I started working with ADempiere, e.g., Excel, ADempiere's Data Import Tools, or even 2Pack. And yes, we provide templates and hope the customers will fill them in correctly, so that we can load them using ADempiere's Data Import Tools. But in reality, most data from customers is far from usable, and most of the time we need to modify or rewrite the whole import process ourselves.

I have been thinking for a while about using ETL tools (e.g., Pentaho Kettle) to ease the process. However, most ETL use cases import directly into database tables. Without the ability to call the ADempiere API, such an import is not very useful.

Only very recently did I find out that Talend Open Studio has been used with ADempiere, with the ability to call the ADempiere API, for some time already. Better late than never. So far, it is the best tool I have found for working with ADempiere. In this article, I will try to explain how to use Talend with ADempiere in a bit more detail.

Start with Talend and ADempiere

The good thing about Talend, and what makes it useful for our case (ADempiere), is the ability to write ADempiere connector and import components. The following picture gives an overview of how it works:

Talend overview1.jpg

  1. Initialize the database connection.
  2. Log in to ADempiere using the tAdempiereConnection component to receive the ADempiere context (i.e., Env.getCtx()). We are now considered logged in. This also triggers the main import subjob.
  3. This component initializes the delimited file (pointing to a local CSV file), reads it, and sends the columns and data (row by row) to the tMap component.
  4. The tMap component (see the following picture) maps the input schema to the output schema. We can use expressions to modify, filter, or look up data before sending it to the output.
  5. Provide a lookup to another table. In this case, we have the Org Name and we want to look up the AD_Org_ID.
  6. The output data is mapped to ADempiere's model using the tAdempiereOut component. As you can guess, the data is created using ADempiere's API (i.e., object.save()).
  7. Just for information, tLogRow_1 displays the raw output data.
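
For readers who prefer code to diagrams, steps 3 to 5 above can be sketched in plain Java, roughly as follows. This is only an illustration of the data flow, not what Talend generates and not the ADempiere API: the class ImportFlowSketch, the OutputRow type, the mapRows method, the ";" delimiter, the three-column layout (org name, name, symbol), and the sample IDs are all assumptions made up for this example. The save step (step 6) is left out on purpose, because it needs a running ADempiere context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the read -> map -> lookup flow. No Talend or ADempiere classes are used.
class ImportFlowSketch {

    /** Hypothetical output row, shaped like the schema on the output side of the mapping. */
    static final class OutputRow {
        final int adOrgId;
        final String name;
        final String symbol;

        OutputRow(int adOrgId, String name, String symbol) {
            this.adOrgId = adOrgId;
            this.name = name;
            this.symbol = symbol;
        }
    }

    /**
     * Splits each delimited line (step 3), trims the fields (step 4),
     * and resolves the org name to an ID through a lookup map (step 5).
     * Lines with the wrong column count or an unknown org go to "rejected".
     */
    static List<OutputRow> mapRows(List<String> lines,
                                   Map<String, Integer> orgIdByName,
                                   List<String> rejected) {
        List<OutputRow> output = new ArrayList<>();
        for (String line : lines) {
            String[] cols = line.split(";", -1); // assumed delimiter
            if (cols.length != 3) {
                rejected.add(line);
                continue;
            }
            Integer orgId = orgIdByName.get(cols[0].trim());
            if (orgId == null) {
                rejected.add(line); // lookup failed: no such org name
                continue;
            }
            output.add(new OutputRow(orgId, cols[1].trim(), cols[2].trim()));
        }
        return output;
    }

    public static void main(String[] args) {
        // Made-up lookup data standing in for the org table.
        Map<String, Integer> orgIdByName = new HashMap<>();
        orgIdByName.put("HQ", 11);
        orgIdByName.put("Store", 12);

        List<String> lines = Arrays.asList("HQ;Each;Ea", "Nowhere;Box;Bx", "Store; Hour ;h");
        List<String> rejected = new ArrayList<>();

        // Printing the rows plays the role of the log component in step 7.
        for (OutputRow row : mapRows(lines, orgIdByName, rejected)) {
            System.out.println(row.adOrgId + " | " + row.name + " | " + row.symbol);
        }
        System.out.println("Rejected: " + rejected);
    }
}
```

In this sample, the second line is rejected because "Nowhere" is not in the lookup map. In the real job, rows that pass the mapping would be handed to tAdempiereOut, which creates the records through ADempiere's API instead of printing them.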

TMap.jpg

Now, if you are convinced that Talend can help your project, we can move on (if not, and you have a better solution, please let me know too :).

Before you continue, I recommend reading the following list of articles. In this tutorial, I will not go into the details of why I do things the way I do, but rather provide you with useful examples.

Data Migration Use Cases

Case 1: Simple Data Import - Unit of Measure

Case 2: Complex Data Import - Business Partners

Case 3: Document Import with DocAction - Import and Complete Sales Order

Case 4: Import XML Data generated from 2Pack

Case 5: Logging Unsuccessful Imports