Tuesday, June 21, 2011

Replication over WAN


I have a customer with a farm of Directory Servers deployed across multiple sites, so WAN replication is required to keep the data in sync.

One of the directory instances stores binary data, which results in huge entries (some as large as 20MB per attribute value).
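
To get a feel for why entries of this size are awkward over a WAN, here is a back-of-the-envelope sketch in Python. It only computes the ideal time to push one 20MB value across a link and ignores SSL and protocol overhead. The link speeds are hypothetical figures I picked for illustration, not measurements from the customer's network.

```python
def transfer_seconds(entry_bytes: int, link_bits_per_sec: float) -> float:
    """Ideal time to send one entry over the link, ignoring all overhead."""
    return entry_bytes * 8 / link_bits_per_sec


# 20MB attribute value, the largest size mentioned in this post.
ENTRY_BYTES = 20 * 1024 * 1024

# Hypothetical WAN link speeds in bits per second (assumptions, not measured).
LINKS = {
    "512 kbit/s": 512_000,
    "2 Mbit/s": 2_000_000,
    "10 Mbit/s": 10_000_000,
}

for name, bps in LINKS.items():
    seconds = transfer_seconds(ENTRY_BYTES, bps)
    print(f"{name}: about {seconds:.0f} s for a single entry")
```

If the real link is anywhere near the slow end of that range, a single update takes minutes, so any session or idle timeout shorter than that could plausibly cut the update off before it completes.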

They were happily using Sun Directory Server 5.x until they switched to Sun DSEE 6.x because DS 5.2 reached EOL. Replication then started to break for the instances with huge entries.

In this particular case, the customer was trying to replicate from a Windows 2003 box to a Solaris SPARC box.


We raised a Support Case with Oracle Support and were told the following:

I would consider your customer setup as a corner case which is a mixture of all the following:
* replication over SSL
* replication over a slow WAN link
* replication of huge entries
* replication topology mixing different versions
* replication not using compression
* replication timeouts not correctly set

AFAIK, the best performance of the Directory Server is obtained with the following combination:
* DS 7.0.1
* Solaris x86
* ZFS

Oracle kept suggesting that we bring in Professional Services people to "take a look and make suggestion".

But to the customer, the equation is simple:
* There was no change in topology, OS, or data
* DS 5.x works
* Why would DSEE 6.x break? Shouldn't 6.x be an enhanced version of 5.x?

It's definitely a limitation in DSEE 6.x which Oracle refuses to admit.



By the way, we always do our own homework. Below are our findings:

1. 5.2 SP6 (wins 2k) -> 5.2 SP6 (solaris sparc) ==> OK
2. 5.2 SP6 (wins 2k) -> 6.3.1 (solaris sparc) ==> NOT OK
3. 6.3.1 (windows) -> 6.3.1 (solaris sparc) ==> NOT OK
4. 6.3.1 (solaris) -> 6.3.1 (solaris sparc) ==> NOT OK
5. 7.0.1 (solaris) -> 6.3.1 (solaris sparc) ==> NOT OK
6. 7.0.1 (solaris) -> 7.0.1 (solaris sparc) ==> NOT OK


In the end, I called upon my friend (OpenDJ) for help. Looks good!



To the customer, slow replication is acceptable as long as the data is kept in sync. A simple requirement.

The next step is to deploy a test-bed in the customer's environment to confirm that replication over WAN works with OpenDJ. The past few months have been very miserable, especially talking to those people.

