Sponsored Development: Replication

From ADempiere

Revision as of 23:56, 17 September 2009

Plan

Implemented in stages.

  • 2 weeks. Research and Proof of Concept.

License

GPL2 - the same as Adempiere.

Project Team

Coordinator

Victor Perez

Functional Specs

Architect

Developers

Testers

  • Trifon Trifonov
  • Red1 together with own company to document it. (if any like to sponsor that pls do! But nah, its ok! :) )

Implementation

  • This functionality is included in Adempiere350

Review and Documentation

Sponsors

  • e-evolution paid US$3,600; a further US$3,600 will be paid when the functionality is working.
  • Red1 promised US$2,000 (now that others are encouraged, I would like to focus on good peer review practice such as a DB migration script)
    • Proposing sponsorship rules to ensure ethical conduct and certainty of delivery from all parties.
  • metas paid US$2,000 and contributes work to help this killer feature making ADempiere a killer application

Requirements

Finish off the beta Replication functionality inherited from Compiere.

  • Need replication master to master asynchronously.
    • Must work with Postgres!
    • 38 stores working with local servers must synchronize with one central server.
    • Connection between remote servers is 128 kb/s.
    • Records created on Central server: Product, Price, Term Credit, Business Partner.
    • Records created on Remote servers: Sales Order, Shipment, Invoice (Customer), Payment, Journal Cash, Business Partner and Inventory.
  • Must have migration script that affects DB changes (not 2pack or ADCK) (my sponsorship is going towards that - Red1)

Scenario

The company has 38 stores, each with a local Adempiere and PostgreSQL server and a 128 kb internet connection. The stores must be able to operate even when they are offline; when they are online, they replicate automatically to the central server.

Design

What we need for Replication

  • A. A way to export information from Adempiere.
    • I think there must be a mechanism inside Adempiere to define the Export message.

This is what I managed to develop with the 'Export Format' window. The user can define an XML format, and the XML format can have a tree structure.

  • B. A trigger which starts the Export process.
    • See 'When to trigger export event'
  • C. A place where all Export messages are stored while Adempiere is disconnected from the Internet.
    • I think this must be an external server with proven capabilities.
      • Like JMS. JMS can guarantee delivery of messages, so the task of Adempiere is just to deliver the message to the JMS server.
  • D. A place where all Incoming messages are stored until Adempiere processes them.
    • I think this must be an external server with proven capabilities.
      • Like JMS. JMS can guarantee delivery of messages, so the task of Adempiere is just to read the Incoming message from the JMS server.
  • E. A process which listens for new incoming messages.
    • I think this must also be a process inside Adempiere which connects to the Incoming message storage and processes messages (calls the Import functionality).
  • F. A way to import information into Adempiere.
    • I think this must also be functionality inside Adempiere.

This is what I managed to develop with ImportProcess.
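The six parts A to F can be read as one pipeline. The following is only a minimal sketch of that flow, not the ADempiere implementation: the helper names (`export_record`, `import_record`) are hypothetical, and two in-memory queues stand in for the outgoing and incoming JMS storage.

```python
import xml.etree.ElementTree as ET
from collections import deque


def export_record(table, client_value, fields):
    """A. Serialize one record to an XML message (hypothetical helper)."""
    root = ET.Element(table, {"AD_Client_Value": client_value})
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


def import_record(message, target):
    """F. Parse an XML message and store it in the target 'database' (a dict)."""
    root = ET.fromstring(message)
    fields = {child.tag: child.text for child in root}
    key = (root.tag, root.get("AD_Client_Value"), fields["Value"])
    target[key] = fields
    return key


# C and D: in-memory queues stand in for the outgoing and incoming message storage.
outgoing = deque()
incoming = deque()
central_db = {}

# B. Saving a record on the remote node triggers the export.
outgoing.append(export_record("C_BPartner", "GardenWorld",
                              {"Value": "GardenAdmin-7", "Name": "GardenAdmin BP"}))

# The transport moves messages once the connection is available.
while outgoing:
    incoming.append(outgoing.popleft())

# E. The listener drains the incoming storage and calls the import.
while incoming:
    import_record(incoming.popleft(), central_db)
```

In a real deployment the two queues would live in the message server, so messages survive a restart of either Adempiere instance.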

File:Replication.pdf
File:Replication-model.pdf

What is the way to implement Replication?

Based on my business issues:


  • Issue 1:
    • A Product is created in the Central Node, so a message is forwarded to each Remote Node. When a Remote Node receives the message, Adempiere creates the Product in that node. If a Remote Node does not receive the message, the broker tries to send it again until it is received. This way we ensure that all nodes have the product.
  • Solution:
    • It is necessary to have a broker (Central Node) that forwards the message to multiple consumers (Remote Nodes). The broker needs to ensure that the message is received by all the consumers.
  • Issue 2:
    • A Sales Order, Shipment, Invoice and Payment are created in a Remote Node. The broker of the Remote Node sends a message to the Central Node; the Central Node receives the message and the documents are created in the Central Node. Again, if a message is not received by the Central Node, the broker of the Remote Node sends the message again until it is received.
  • Solution:
    • It is also necessary to have a broker on each Remote Node, because each Remote Node needs to replicate its operational data (Orders, Shipments, Invoices, Payments) to the Central Node. This issue is easy to solve because we only need a peer-to-peer connection and must ensure that the Central Node received the message.
  • Issue 3:
    • A Business Partner is created in any Remote Node, and we then need to forward it to all the other Remote Nodes and also to the Central Node.
  • Solution:
    • This is the most complex business issue, because the message needs to be replicated to every node. Apache MQ has many ways to resolve this, but I am not sure which is the way to implement.
    • So, I think the broker of the Remote Node can broadcast the message to all nodes and have a way to ensure that all nodes receive the message.
    • I think the first thing we need is to define the method for distributed queues and which topologies we need to use for each business issue.
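The common thread in the three issues is "forward to every consumer and resend until receipt". A minimal sketch of that idea, assuming a hypothetical in-memory `Broker` class and node names such as `store-01`; a real message broker provides this through durable subscriptions rather than Python lists.

```python
class Broker:
    """Keeps each message pending per consumer until that consumer acknowledges it."""

    def __init__(self, consumers):
        self.pending = {name: [] for name in consumers}

    def publish(self, message, exclude=None):
        # Issues 1 and 3: one copy per consumer, optionally skipping the sender.
        for name, queue in self.pending.items():
            if name != exclude:
                queue.append(message)

    def deliver(self, name, receive):
        """Deliver pending messages in order; keep those that are not acknowledged."""
        queue = self.pending[name]
        while queue:
            if not receive(queue[0]):   # no receipt: stop and retry later
                break
            queue.pop(0)                # receipt confirmed: drop the message


received = []


def accept(message):
    received.append(message)
    return True


def offline(message):
    return False


broker = Broker(["store-01", "store-02"])
broker.publish("<M_Product><Value>P-1</Value></M_Product>")

broker.deliver("store-01", accept)    # delivered and acknowledged
broker.deliver("store-02", offline)   # no receipt, message stays pending
```

Calling `broker.deliver("store-02", accept)` later, once the store is back online, empties its pending list. For Issue 3 the sending store would call `publish(message, exclude="store-01")` so it does not receive its own Business Partner back.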

References

How do distributed queues work

Topologies

FAQs


Kind Regards

Victor Perez

www.e-evolution.com

When to trigger export event

When record is saved in ModelValidator
  • Low Hengsin's opinion:
I don't think using ModelValidator to generate the XML export file is a good approach.
Replication should be a background process that has little performance impact on normal transactions,
and the frequency of replication should also be user-configurable.
I think we should have a background process that reads the changelog and the replication configuration
to generate the replication message (XML or otherwise) and send it out.
    • My opinion:
I agree, but ModelValidator is one possible option.
At the same time Replication MUST be guaranteed, which means that a record should not be saved if its Change Message
can't be sent to Message Storage.
JMS guarantees delivery once a message is stored in a Queue or Topic.
I think each local Adempiere instance should have one JMS Server installed, which will collect all messages from local clients.
It will be the task of the JMS Server to transfer messages to other ADempiere instances once the Internet connection is online.
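The rule "a record should not be saved if its Change Message can't be sent" can be sketched as one database transaction that covers both the save and the hand-off. This is only an illustration under assumptions: SQLite stands in for the Adempiere database, and `save_with_message` and the `send` callback are hypothetical names, not ModelValidator API.

```python
import sqlite3


def save_with_message(conn, value, name, send):
    """Save a partner and hand its change message to `send` in one transaction.

    If `send` raises ConnectionError, the insert is rolled back, so no record
    exists without a queued message.
    """
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute("INSERT INTO c_bpartner (value, name) VALUES (?, ?)",
                         (value, name))
            send(f"<C_BPartner><Value>{value}</Value></C_BPartner>")
        return True
    except ConnectionError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE c_bpartner (value TEXT PRIMARY KEY, name TEXT)")
conn.commit()

sent = []


def broken_send(message):
    raise ConnectionError("local JMS server is down")


ok = save_with_message(conn, "BP-1", "First", sent.append)
failed = save_with_message(conn, "BP-2", "Second", broken_send)
count = conn.execute("SELECT COUNT(*) FROM c_bpartner").fetchone()[0]
```

The trade-off raised in the discussion is visible here: the user's save waits for the message hand-off, which is why a local message server on each instance keeps that step fast.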
At some predefined schedule & AD_ChangeLog
  • Extend the changelog (AD_ChangeLog) feature in Adempiere and use that to generate the replication message.

Also, it might be good to add versioning support for all tables, which will help in implementing conflict resolution.

For the cycle issue, if the changelog mechanism is used, we can add a way to turn change log management on or off during the save operation.

  • My opinion
I think that it is possible to do this!
In my view it requires some additional processing. We must take care to collect all Change Messages not yet sent to the JMS Server,
which duplicates the work of the JMS Server.
  • There are cases in which AD_ChangeLog does not work.
    • Create a new Business Partner. AD_ChangeLog does not contain information that a new record was created.

I think that this applies to all other tables.

    • Delete an existing Business Partner. AD_ChangeLog does not contain information that the record was deleted.

Use Workflow functionality and add new Node which will be responsible for sending XML Message

This will allow an Approval and user-configurable process, but will make the process slow.

Tasks


Existing functionality

  • Post from Carlos - here

There is a seed from JJ currently in windows "Replication Strategy" and "Replication"

    • Configuration: AD_ReplicationStrategy + AD_ReplicationTable + AD_Table
    • Execution: AD_Replication + AD_ReplicationRun + AD_ReplicationLog
    • There is a hidden field "Replication Type" in AD_Table.
    • There is code to manage different sequences in different installations.
    • There is code to replicate in ReplicationLocal.java -> looks like it manages the replication set based on the Updated column of replicated tables.
    • There is some code (looks unused) in ReplicationRemote.java - I suppose this code is intended to be used in the remote installations.

I'm not saying this is a complete solution; it must be enhanced, e.g. the Updated column must be guaranteed, or the code changed to use AD_ChangeLog instead. AD_ChangeLog is just another seed that needs to be enhanced. There are known flaw points.

Known Issue

  • Circular Link in some tables (i.e. C_Invoice - C_CashLine_ID & C_CashLine - C_Invoice_ID) --Armen 01:38, 10 July 2007 (EDT)
  • Many places use UPDATE SQL (instead of PO), resulting in an unreliable modified date --Armen 01:38, 10 July 2007 (EDT)

Tables which do not have IsIdentifier set

AD_Package_Imp_Detail
AD_Package_Imp_Proc
AD_UserBPAccess
CM_Container_URL
CM_MediaDeploy
CM_NewsItem
CM_WebAccessLog
C_TaxDeclarationAcct
C_TaxDeclarationLine
M_AttributeSetExclude
M_CostDetail
M_CostQueue
M_LotCtlExclude
M_SerNoCtlExclude

Issue during importing process

Example xml file which must be imported.

<?xml version="1.0" encoding="UTF-8"?>
<C_BPartner AD_Client_Value="GardenWorld">
    <AD_Client>
	<AD_Client_Value>GardenWorld</AD_Client_Value>
    </AD_Client>
    <Value>GardenAdmin-7</Value>
    <Name>GardenAdmin BP</Name>
</C_BPartner>

How to understand which column(s) is/are Unique for a given table?
  • Could add a sub tab in Export Format and add all columns for the given table which make records uniquely identifiable.
  • Unfortunately the AD (Application Dictionary) does not store any information which can be used.
    • For example: C_BPartner table. In the AD we have the Name column set as IsIdentifier, but the Unique columns at DB level are (AD_Client_ID + Value)
  • Could read meta information from the DB, but in this case we will need Oracle-specific and PostgreSQL-specific handling.
  • Export Format Line could contain an additional field: IsPartUniqueIndex. All columns which have this flag will form the Unique Key of the record.
  • Implemented Option 4.
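To illustrate option 4, the sketch below builds a record's unique key from the lines flagged IsPartUniqueIndex. The `FORMAT_LINES` list and the `unique_key` function are hypothetical stand-ins for Export Format Line records; only the flag name comes from the text above. Paths are relative to the root element, so a nested `Value` in another sub-element is not picked up by mistake.

```python
import xml.etree.ElementTree as ET

# Hypothetical export format lines: (path below the root element, IsPartUniqueIndex).
FORMAT_LINES = [
    ("AD_Client/AD_Client_Value", True),
    ("Value", True),
    ("Name", False),
]


def unique_key(xml_text, format_lines):
    """Collect the values of all flagged elements, in format-line order."""
    root = ET.fromstring(xml_text)
    key = []
    for path, is_part_unique_index in format_lines:
        if not is_part_unique_index:
            continue
        node = root.find(path)
        if node is None or node.text is None:
            raise ValueError(f"missing key element: {path}")
        key.append(node.text.strip())
    return tuple(key)


SAMPLE = """<C_BPartner AD_Client_Value="GardenWorld">
    <AD_Client>
        <AD_Client_Value>GardenWorld</AD_Client_Value>
    </AD_Client>
    <Value>GardenAdmin-7</Value>
    <Name>GardenAdmin BP</Name>
</C_BPartner>"""

key = unique_key(SAMPLE, FORMAT_LINES)
```

The importer can then use the resulting tuple to decide whether the incoming record updates an existing row or creates a new one.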
How the import process can FIND the proper Export Format

The importer takes an XML document as input. From the XML document the importer must FIND the proper Export Format, which means that all the necessary information must be kept inside the XML file.

  • 1. Upon save of the XML file, ExportModelValidator can add 2 XML attributes. Together, both attributes make the Export Format unique.
    • AD_Client_Value
    • EXP_Format_Value
  • 2. Upon save of the XML file, ExportModelValidator can add only 1 XML attribute. The root XML node name and the AD_Client_Value attribute together make the Export Format unique.
    • AD_Client_Value
  • Implemented option 2, as option 1 leads to duplication of information.
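Option 2 amounts to a lookup keyed by the client value and the root element name. A minimal sketch, assuming a hypothetical `EXPORT_FORMATS` registry in place of the real Export Format table:

```python
import xml.etree.ElementTree as ET

# Hypothetical registry keyed by (AD_Client_Value, root element name).
EXPORT_FORMATS = {
    ("GardenWorld", "C_BPartner"): "GardenWorld business partner format",
    ("GardenWorld", "C_Order"): "GardenWorld order format",
}


def find_export_format(xml_text, formats=EXPORT_FORMATS):
    """Return the format registered for this document's client and root element."""
    root = ET.fromstring(xml_text)
    client_value = root.get("AD_Client_Value")
    if client_value is None:
        raise ValueError("root element has no AD_Client_Value attribute")
    try:
        return formats[(client_value, root.tag)]
    except KeyError:
        raise LookupError(
            f"no export format for {client_value}/{root.tag}") from None


found = find_export_format(
    '<C_BPartner AD_Client_Value="GardenWorld"><Value>GardenAdmin-7</Value></C_BPartner>')
```

Because the root element name is already present in every document, only the AD_Client_Value attribute has to be added at export time, which is the duplication that option 1 would not avoid.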


Order of messages

One of the biggest problems in replication is that many times you CANNOT simply send the transactions grouped by table, but you must send the transactions in the same order as they happen (to avoid referential integrity problems in the target system). So, you have to make sure you can replicate in the same order (AD_ChangeLog.Created?)
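As a small illustration of the ordering concern, the sketch below sorts hypothetical change log rows by their creation timestamp instead of grouping them by table. The row layout is an assumption for the example, not the AD_ChangeLog schema; the timestamps are fixed-width strings, so plain string comparison matches chronological order.

```python
from operator import itemgetter

# Hypothetical change log rows: (created, table, record key).
change_log = [
    ("2007-08-05 21:21:39", "C_OrderLine", "SO-1/10"),
    ("2007-08-05 21:21:37", "C_BPartner", "Test-Replication-BP-3"),
    ("2007-08-05 21:21:38", "C_Order", "SO-1"),
]


def replication_order(rows):
    """Return rows oldest first; sorted() is stable, so ties keep log order."""
    return sorted(rows, key=itemgetter(0))


ordered_tables = [table for _, table, _ in replication_order(change_log)]
```

Sent in this order, the Business Partner reaches the target before the Order that references it, and the Order before its line. Grouping by table name would have sent C_Order and C_OrderLine before their dependencies were guaranteed to exist.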

Questions and Answers

Q.1 What is the Role of Export Processor Type?

Export Processor Type defines the Java class which is responsible for sending the Export Message.

In the case of JMS:

The 'Local JMS' server stores received messages and transfers them to the 'Remote JMS' server when the network connection is online. If the 'Local JMS' server is down, the 'Local Adempiere' instance will not be able to work, because sending JMS messages from the 'Local Adempiere' instance to the 'Local JMS' server will fail.


Two Export Processor Types are defined in default examples:

  • org.adempiere.server.rpl.exp.HDDExportProcessor
    • Uses shared file system to store exported messages.
  • org.adempiere.server.rpl.exp.TopicExportProcessor
    • Uses JMS server to send exported messages.
    • TopicExportProcessor is a JMS client that sends messages to Local JMS Server.
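The idea of a pluggable processor type can be sketched as one small interface with interchangeable implementations. The classes below are hypothetical Python analogues written for illustration; they are not the Java classes named above, and a plain list stands in for a JMS topic.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class ExportProcessor(ABC):
    """Common interface: take one exported message and send it somewhere."""

    @abstractmethod
    def process(self, message: str) -> None: ...


class FileExportProcessor(ExportProcessor):
    """Writes each message to its own file in a shared directory."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.counter = 0

    def process(self, message):
        self.counter += 1
        path = self.directory / f"message-{self.counter:06d}.xml"
        path.write_text(message, encoding="utf-8")


class ListExportProcessor(ExportProcessor):
    """Appends to a list that stands in for a JMS topic."""

    def __init__(self):
        self.topic = []

    def process(self, message):
        self.topic.append(message)


def export(processor: ExportProcessor, message: str) -> None:
    # The caller does not need to know which transport is configured.
    processor.process(message)


with tempfile.TemporaryDirectory() as shared_dir:
    file_processor = FileExportProcessor(shared_dir)
    export(file_processor, "<C_BPartner/>")
    written = sorted(p.name for p in Path(shared_dir).iterdir())
```

Choosing the processor by configuration keeps the export code identical whether messages go to a shared file system or to a message server.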

Q.2 Should I define an Export Processor for each store?

Yes.

Each store will be defined as a new organization in Adempiere. That's why we need a new Export Processor for each Adempiere Organization/Store.


Q.3 Do I need an Import Processor?

Yes.

Each Adempiere instance must have an Import Processor defined. We need to define a new Import Processor for each Adempiere Organization/Store. This Import Processor imports messages from the 'Local JMS Server'.


Q.4 What happens when record can't be replicated?

A record can't be replicated when the 'Local JMS Server' is not working. In this case Adempiere will show an error message to the user.

The answer to this question is the same as the answer to the question: 'What happens when my Database server stops working?'


Q.5 What happens if JMS server is down?

See answer of Question 4.


Q.6 How DB record is marked as replicated?

It is not necessary to mark a record as exported. Marking a record as exported can be done, but it is a redundant step. Once the JMS message is sent to the 'Local JMS Server' we are guaranteed that the record will be transferred to the 'Remote JMS Server'. The JMS protocol is responsible for handling this.

Of course we can create functionality which sends a confirmation from the 'Remote Server' to the 'Local Server' when the record is saved, but this will require additional development effort.


Q.6 Is the queue constructed with records or transactions?

Export can be configured as per record export or as per Document export or mixed. Queue is inside JMS Server. JMS messages are stored in Queue.


Q.7 What example transactions are provided?

- Create a Business Partner Group......... -- DONE.

- Create a Business Partner............... -- DONE.

- Create an Order......................... -- DONE.

- Create an OrderLine..................... -- DONE.

- Update the Order (complete) ............ -- DONE.

- Create an Invoice (based on the Order).. -- TO BE DONE.

- Create corresponding Invoice Lines...... -- TO BE DONE.

- Update the Invoice (complete)........... -- TO BE DONE.

Q.8 How are those events being sent to the queue?

All cases are possible; it depends on how the export format is defined.

>a) like single records?
>Insert BPGroup XML
>Insert BP XML
>Insert Order XML
>Insert OrderLines XML (one for each record?)
>Update Order XML
>Insert Invoice XML
>Insert InvoiceLines XML
>Update Invoice XML
>Update BPartner? -> maybe to update the openbalance because of the invoice completed?
>
>b) like transactions? 
>transaction inserting BPGroup 
>transaction inserting BPartner 
The export format must be defined to export only BPGroup/BPartner.

>transaction when completing the order sending the order + lines in one single XML?
The export format must be defined to export the Order and all lines.

>transaction when completing the invoice sending the invoice + lines + BP in one single XML?
The export format must be defined to export invoice + lines + BP in one single XML.


Q.9 What happens if a record fails to be replicated on slave?

i.e. in the previous example, what will happen if the BPGroup can't be inserted (e.g. because of a primary key violation)?

The record will stay on the JMS Server. A notification mechanism must be created to notify the administrator so that appropriate action can be taken.

>This question is important because:
>a) if the process continues then the rest of the records can fail (because of foreign key issues)
>b) if the process stops then it needs special attention to failures - because one failure stops all the replication process
>--> I suppose a) is better and safer

At the moment dependent transactions will fail.

=Excuse me, just rereading I noticed I gave wrong advice.
=
=I think it is SAFER to stop the replication process when a transaction fails (or at least make it configurable).
=
=Failure of dependent transactions is just the optimistic scenario.
=The worst scenario is that continuing the process can insert corrupted data.
=I mean, suppose the process fails to insert BP CarlosRuiz because the ID is already used for BP Trifon.
=If the next record is an invoice to CarlosRuiz in master, it could end up in slave as an invoice for Trifon (this is just a hypothetical example, but you can find more data corruption possibilities in other cases)
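The two behaviours discussed here, continue after a failure or stop at the first one, can be made a configuration flag. The following is a minimal sketch with hypothetical names (`run_import`, `import_one`), not the import processor's actual behaviour.

```python
def run_import(messages, import_one, stop_on_error=True):
    """Import messages in order.

    Returns (imported, failed, remaining). With stop_on_error=True the first
    failure halts the run and every later message stays in `remaining`.
    """
    imported, failed = [], []
    for index, message in enumerate(messages):
        try:
            import_one(message)
            imported.append(message)
        except Exception as error:
            failed.append((message, str(error)))
            if stop_on_error:
                return imported, failed, list(messages[index + 1:])
    return imported, failed, []


def import_one(message):
    # Simulates the BPGroup insert failing on the target.
    if message == "BPGroup":
        raise ValueError("primary key already used")


batch = ["BPGroup", "BPartner", "Order"]
strict = run_import(batch, import_one)                         # stops at BPGroup
lenient = run_import(batch, import_one, stop_on_error=False)   # keeps going
```

In the strict run the dependent BPartner and Order messages are left untouched for a retry after the administrator has fixed the BPGroup, which avoids the corrupted-data case described above.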

Q.10 How is the ID's issue on bidirectional replication resolved?

I mean, if you can for example import BPGroups on master and slave at the same time, you'll have conflicts with IDs.

A problem can arise when Value columns are duplicated. Replication does not transfer IDs. IDs are internal to the DB and I do not Export/Import them.

A conflict resolution process must be created.
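Because messages carry the Value rather than the ID, each node can keep its own ID sequence and resolve references locally. A minimal sketch of that lookup, assuming a hypothetical `Node` class and made-up starting IDs; it does not cover the conflict case where two nodes create different partners with the same Value.

```python
import itertools


class Node:
    """One Adempiere instance with its own ID sequence (illustrative only)."""

    def __init__(self, first_id):
        self._ids = itertools.count(first_id)
        self.partner_id_by_value = {}
        self.orders = []

    def import_partner(self, value):
        # Reuse the local ID when the Value is already known; otherwise assign one.
        if value not in self.partner_id_by_value:
            self.partner_id_by_value[value] = next(self._ids)
        return self.partner_id_by_value[value]

    def import_order(self, document_no, partner_value):
        # The message carries the partner's Value; the local ID is looked up here.
        partner_id = self.partner_id_by_value[partner_value]
        self.orders.append((document_no, partner_id))
        return partner_id


central = Node(first_id=1000000)
store = Node(first_id=5000000)

central.import_partner("Trifon")
central_id = central.import_partner("CarlosRuiz")
store_id = store.import_partner("CarlosRuiz")

central_order_partner = central.import_order("SO-1", "CarlosRuiz")
store_order_partner = store.import_order("SO-1", "CarlosRuiz")
```

The same partner has a different ID on each node, yet the order points at the right partner on both, because the reference is resolved through the Value at import time.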

Sorry for asking too much, I didn't find design details in wiki or requirements.txt. Obviously you're free to answer or not (I know I could simply download and review your code). These questions are trying to figure also the answer for the next one:

4 - What's the status and plan of this development?
a) what's the status? alpha? beta? release-candidate?
b) is Victor planning to include it in adempiere350?
c) are there plans to be included in trunk before 3.4 - we're on a freeze with possible votings for new functionalities
d) are there plans to document the steps needed to set up replication?
e) are there plans to document the scope of replication?

Example XML documents created by Export process

These examples are intended to show that the Adempiere XML Export/Import functionality is flexible enough to support any kind of XML structure.

  • XML documents are created by the Export process (ExportModelValidator class).
  • Examples 1, 2, 3 export the same record but in different XML formats.


First Example

<C_BPartner AD_Client_Value="GardenWorld" Version="3.2.0">
    <AD_Client>
	<AD_Client_Value>GardenWorld</AD_Client_Value>
    </AD_Client>
    <AD_Org>
	<Value>HQ</Value>
	<AD_Client_Value>
	    <AD_Client_Value>GardenWorld</AD_Client_Value>
	</AD_Client_Value>
    </AD_Org>
    <Value>Test-Replication-BP-3</Value>
    <Name>Test-Replication-BP-3</Name>
    <DUNS>Duns-3     </DUNS>
    <Created>2007-08-05 21:21:37.0</Created>
    <CreatedBy>
	<Name>SuperUser</Name>
	<AD_Client_Value>
	    <AD_Client_Value>SYSTEM</AD_Client_Value>
	</AD_Client_Value>
    </CreatedBy>
    <Updated>2007-08-05 21:21:37.0</Updated>
    <UpdatedBy>
	<Name>SuperUser</Name>
	<AD_Client_Value>
	    <AD_Client_Value>SYSTEM</AD_Client_Value>
	</AD_Client_Value>
    </UpdatedBy>
</C_BPartner>


Second example

The same document but different name of XML Elements.

<?xml version="1.0" encoding="UTF-8"?>
<C_BPartner AD_Client_Value="GardenWorld" Version="3.2.0">
    <AD_Client_ID>
		<AD_Client_Value>GardenWorld</AD_Client_Value>
    </AD_Client_ID>
    <AD_Org_ID>
		<Value>0</Value>
		<AD_Client_ID>
	    		<AD_Client_Value>SYSTEM</AD_Client_Value>
		</AD_Client_ID>
    </AD_Org_ID>
    <Value>GardenAdmin-17</Value>
    <Name>GardenAdmin BP-17</Name>
    <DUNS>Duns-----17</DUNS>
    <Created>2003-03-27 15:44:25.0</Created>
    <CreatedBy>
		<Name>System</Name>
		<AD_Client_ID>
	    		<AD_Client_Value>SYSTEM</AD_Client_Value>
		</AD_Client_ID>
    </CreatedBy>
    <Updated>2007-08-06 00:30:31.0</Updated>
    <UpdatedBy>
		<Name>SuperUser</Name>
		<AD_Client_ID>
	    		<AD_Client_Value>SYSTEM</AD_Client_Value>
		</AD_Client_ID>
    </UpdatedBy>
</C_BPartner>


Third example

The same document but stores IDs instead of record unique key.

<?xml version="1.0" encoding="UTF-8"?>
<C_BPartner AD_Client_Value="GardenWorld" Version="3.2.0">
    <AD_Client_ID>
		1000000
    </AD_Client_ID>
    <AD_Org_ID>
		100
    </AD_Org_ID>
    <Value>GardenAdmin-17</Value>
    <Name>GardenAdmin BP-17</Name>
    <DUNS>Duns-----17</DUNS>
    <Created>2003-03-27 15:44:25.0</Created>
    <CreatedBy>
		101
    </CreatedBy>
    <Updated>2007-08-06 00:30:31.0</Updated>
    <UpdatedBy>
		101
    </UpdatedBy>
</C_BPartner>

Links

sf.net forum posts


External links

[PostgreSQL Info]

[JMS Publish/Subscribe Messaging]

Screen shots

Menu

EE05Menu.png


Workflow

EE05Workflow.png

Export Format

1-ExportFormat.jpg


Export Format Line

2-ExportFormatLine.jpg


Export Format - Grid Mode

3-ExportFormat-GridMode.jpg


Example: Org_Value - Export Format window

4-ExportFormat-Org Value.jpg


Example: Org_Value - Export Format Line window

5-ExportFormatLine-GridMode-Org Value.jpg


Replication Case Study

Replication_HowTo Replication How To