Difference between revisions of "A Step by Step Guide to Data Migration with Talend ETL"

From ADempiere
Jump to: navigation, search
This Wiki is read-only for reference purposes to avoid broken links.
(Setup Talend for ADempiere)
(Data Migration Use Cases)
Line 73: Line 73:
 
= Data Migration Use Cases =
 
= Data Migration Use Cases =
  
== Case 1: Simple Data Import - Unit of Measure ==
+
== Case 1: Simple Data Import - Business Partner Group ==
 +
 
 +
In this case, we will learn how to import data into 1 table. I.e., Business Partner Group. Though it is a simple case, you will see how other data, in this case, Accounting tab will be created as well.
 +
 
 +
Source Data: [[Media:bp_group.csv]]
 +
 
 +
'''Steps:'''
 +
* Create new job named "MBPGroup", click finish.
 +
** Double click on the new job to open the canvas.
 +
* Create database connection to adempiere (postgres)
 +
** On the left panel, > Metadata > Db Connections, right click > Create Connection
 +
** Create new connection named "adempiere", click next.
 +
** DB Type: "PostgreSQL", fill in all information, Login, Password, Server, Port, Database (i.e., adempiere), Schema (adempeire)
 +
** Click check button, it should show connection successful. Click finish.
 +
** Drag the newly created connection, i.e., adempeire, to the canvas.
 +
** Choose to create component as "tPostgresqlConnection"
 +
[[File:bp_group_db.jpg|500px]]
 +
* Create adempeire connection
 +
** From the right palette, drag component "tAdempiereConnection" into the canvas
 +
** Double click on the icon, you will see component properties in below panel. Fill in Adempiere connection as following,
 +
<pre>
 +
Property Type: Build-In
 +
Login:"SuperUser" Password:"System"
 +
Language:"English" Role:"GardenWorld Admin"
 +
Client:"GardenWorld" Organization:"HQ"
 +
Warehouse:"" Printer:""
 +
Property File:"adempiere.properties"
 +
</pre>
 +
* Test Adempiere Connection
 +
** Connect database connection with tAdempiereConnection, right click on "adempiere" (db connection) > Trigger > On Subjob OK, and drag it to tAdempiereConnection_1
 +
** On the Run tab, click Run, if connection is valid, it should show no exception error.
 +
[[File:test_adempiere_connection.jpg|500px]]
 +
** '''Note:''' Just like when connect Adempiere for the first time, it will show Adempiere Connection dialog.
 +
 
 +
* Connect to data source file (c_bp_group.csv)
 +
 
  
 
== Case 2: Complex Data Import - Business Partners ==
 
== Case 2: Complex Data Import - Business Partners ==

Revision as of 03:40, 28 May 2012

Note.gif Note:

DISCLAIMER - This is an implementation guide written by Kittiu, from ecosoft Bangkok, Thailand. It is not an official guide nor meant to be comprehensive yet. Other contributors are welcome to discuss on improving it.

Status: Draft

Overview

In ERP Implementation, one of the major activity is Data Migration. This task is no doubt, tedious and time consuming, and no one want to work on it. I have been looking for tools to help in this task since I start working with ADempiere, I.e., Excel, ADempiere's Data Import Tools, or even 2Pack. And yes, we provide templates and hoping the customers will fill them in correctly, so that we can load them using ADempiere's Data Import Tools. But in reality, most data from customers are far from usable. And most of the time, we need to modify or rewrite the whole import process by ourselves.

I have been thinking about tools like ETL (i.e., Pentaho Kettle, etc.) to ease the process for a while. While most use case, are direct import to tables. Without ability to call ADempiere API to importing data is not very useful.

Until very recently, I just found that Talend has been used for ADempiere with ability to call ADempiere API some time back at Talend Open Studio. Later is better than never. So far, it is the best tools to work with ADempiere I have found. And in this article, I will try to help explaining how to use Talend with Adempiere in a bit more detail.

Start with Talend and ADempiere

The good thing about Talend that make it useful for our cases (ADempiere), is ability to write ADempiere Connector and Import Components. Following picture depict the overview of how it will work for you,

Talend overview1.jpg

Following will briefly discuss how it works

  1. Initialize database connection
  2. Login to Adempiere using tAdempiereConnection component to receive ADempiere Context (i.e., Env.getCtx()). Now, we considered Logged In. This will also trigger the main import subjob.
  3. This component will will initialize Delimited File (point to local csv file), read the file and send columns and data (row by row) to to tMap component
  4. tMap component (the following picture), will map the input schema to the output schema. We can use expression to modify / filter / lookup data before sending to output.
  5. Provide lookup to other table, in this case, we have Org Name and we want to lookup for AD_Org_ID.
  6. Output data will be mapped to Adempiere's Model using tAdempiereOutput component. As you can guess, data will be created using Adempiere's API (i.e., object.save()).
  7. Just for information, the tLogRow_1 will display the raw output data.

TMap.jpg

Now, if you are convinced that Talend can help your project, we can move on (if not, and you have better solution, please let me know too. :).

Before continue reading, I recommend you to read the following list of articles. In this tutorial, I will not go into details of why I am doing the way I am doing, but rather provide you with useful examples.

Setup Talend for ADempiere

Note: You can also refer to Talend Open Studio as the original document.

  • Provide Adempiere core libraries
    • Copy ADEMPIERE_HOME/lib/Adempiere.jar and ADEMPIERE_HOME/lib/CCTools.jar to TOS_HOME/lib/java
    • Note: For customization.jar or other additional library, you can put it here as well. But be remembered that we will have to tell about it in the next Optional step.
  • Install Adempiere Talend's components
    • Download the Adempiere components Media:Adempiere_talend_0.2.zip
    • Extract components (tAdempiereXXX) and put it in any directory, i.e., TOS_HOME/custom_components
    • Point TOS to the directory containing the components.
      • Open TOS, > Window > Preferences, in tree, > Talend > Components
      • Update field "User component folder:" to your custom_component folder.
      • Apply the change. Now, if you create new job, you should now see Adempiere components in the Palette area under Business > Adempiere
  • (Optional), if you also installed customization.jar, make sure your Adempiere components know about it.
    • With TOS, > Window > Perspectives > Component's Designer
    • You should see 3 Adempiere components here, tAdempiereConnection, tAdempiereProcess, tAdempiereOutput. Each component will have a config file (tAdempiereXXX_java.xml).
    • Add new line for your customiizatoin.jar in <CODEGENERATION> part.
	<CODEGENERATION>
		<IMPORTS>
			<IMPORT NAME="Adempiere" MODULE="Adempiere.jar" REQUIRED="true" />
			<IMPORT NAME="Adempiere Libraries" MODULE="CCTools.jar" REQUIRED="true" />
			<IMPORT NAME="Customization" MODULE="customization.jar" REQUIRED="true" />
		</IMPORTS>
	</CODEGENERATION>
  • Install Samples, for samples job discussed in this tutorial.
    • Download the sample jobs Media:Adempiere_talend_example.zip
    • Switch back to perspective "Integration", on "Job Design" not, right click and "Import Items"
    • Select archive file to import.

Data Migration Use Cases

Case 1: Simple Data Import - Business Partner Group

In this case, we will learn how to import data into 1 table. I.e., Business Partner Group. Though it is a simple case, you will see how other data, in this case, Accounting tab will be created as well.

Source Data: Media:bp_group.csv

Steps:

  • Create new job named "MBPGroup", click finish.
    • Double click on the new job to open the canvas.
  • Create database connection to adempiere (postgres)
    • On the left panel, > Metadata > Db Connections, right click > Create Connection
    • Create new connection named "adempiere", click next.
    • DB Type: "PostgreSQL", fill in all information, Login, Password, Server, Port, Database (i.e., adempiere), Schema (adempeire)
    • Click check button, it should show connection successful. Click finish.
    • Drag the newly created connection, i.e., adempeire, to the canvas.
    • Choose to create component as "tPostgresqlConnection"

Bp group db.jpg

  • Create adempeire connection
    • From the right palette, drag component "tAdempiereConnection" into the canvas
    • Double click on the icon, you will see component properties in below panel. Fill in Adempiere connection as following,
Property Type: Build-In
Login:"SuperUser"		Password:"System"
Language:"English"		Role:"GardenWorld Admin"
Client:"GardenWorld"		Organization:"HQ"
Warehouse:""			Printer:""
Property File:"adempiere.properties"
  • Test Adempiere Connection
    • Connect database connection with tAdempiereConnection, right click on "adempiere" (db connection) > Trigger > On Subjob OK, and drag it to tAdempiereConnection_1
    • On the Run tab, click Run, if connection is valid, it should show no exception error.

Test adempiere connection.jpg

    • Note: Just like when connect Adempiere for the first time, it will show Adempiere Connection dialog.
  • Connect to data source file (c_bp_group.csv)


Case 2: Complex Data Import - Business Partners

Case 3: Document Import with DocAction - Import and Complete Sales Order

Case 4: Import XML Data generated from 2Pack

Case 5: Logging Unsuccessful Imports