-
Errors From Replication
I keep getting replication errors mailed to me from my replication target SAN. It is located at a different branch, connected via a 100MB link. The sync is asynchronous and runs from 7PM to 7AM the next morning. Here are the errors the SAN is e-mailing me, in chronological order:
First e-mail:
2009/07/21 19:00:37 Replication:Volume Replication: Device lv0000: Connection (from WFConnection to WFBitMapT). Mode local/remote (from Secondary/Unknown to Secondary/Primary). DataStatus local/remote (from Consistent/DUnknown to Consistent/Consistent).
Second e-mail:
2009/07/21 19:00:37 Replication:Volume Replication: Device lv0000: Connection (from WFConnection to WFBitMapT). Mode local/remote (from Secondary/Unknown to Secondary/Primary). DataStatus local/remote (from Consistent/DUnknown to Consistent/Consistent).
2009/07/21 19:01:00 Replication:Volume Replication: Device lv0000: Connection (from WFBitMapT to SyncTarget). DataStatus local/remote (from Consistent/Consistent to Inconsistent/Consistent).
Third e-mail:
2009/07/22 07:00:36 Replication:Volume Replication: Device lv0000: Connection (from SyncTarget to WFConnection). Mode local/remote (from Secondary/Primary to Secondary/Unknown). DataStatus local/remote (from Inconsistent/Consistent to Consistent/DUnknown).
Fourth e-mail:
2009/07/22 07:00:36 Replication:Volume Replication: Device lv0000: Connection (from SyncTarget to WFConnection). Mode local/remote (from Secondary/Primary to Secondary/Unknown). DataStatus local/remote (from Inconsistent/Consistent to Consistent/DUnknown).
Now, according to the task log on the source SAN, the volumes were Consistent/Consistent at the end of the replication. But that last e-mail says Consistent/DUnknown. Any ideas?
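For context: the state names in these mails (WFConnection, WFBitMapT, SyncTarget, DUnknown) look like DRBD connection/disk states, where DUnknown usually just means the peer's state can't be read while disconnected, not that the data is bad. A minimal sketch of a parser over the three distinct log lines quoted above, assuming the exact message format shown (the regex and helper name are my own, not an Open-E API):

```python
import re

# The three distinct replication mails quoted above, in order.
MAILS = [
    "2009/07/21 19:00:37 Replication:Volume Replication: Device lv0000: "
    "Connection (from WFConnection to WFBitMapT). Mode local/remote "
    "(from Secondary/Unknown to Secondary/Primary). DataStatus local/remote "
    "(from Consistent/DUnknown to Consistent/Consistent).",
    "2009/07/21 19:01:00 Replication:Volume Replication: Device lv0000: "
    "Connection (from WFBitMapT to SyncTarget). DataStatus local/remote "
    "(from Consistent/Consistent to Inconsistent/Consistent).",
    "2009/07/22 07:00:36 Replication:Volume Replication: Device lv0000: "
    "Connection (from SyncTarget to WFConnection). Mode local/remote "
    "(from Secondary/Primary to Secondary/Unknown). DataStatus local/remote "
    "(from Inconsistent/Consistent to Consistent/DUnknown).",
]

# Pull out the timestamp plus the connection and data-status transitions.
PATTERN = re.compile(
    r"^(?P<ts>\S+ \S+).*Connection \(from (?P<c_from>\w+) to (?P<c_to>\w+)\)"
    r".*DataStatus local/remote \(from (?P<d_from>\S+) to (?P<d_to>\S+)\)\."
)

def transitions(mails):
    """Return (timestamp, conn_from, conn_to, data_from, data_to) per mail."""
    out = []
    for mail in mails:
        m = PATTERN.search(mail)
        if m:
            out.append(m.group("ts", "c_from", "c_to", "d_from", "d_to"))
    return out

for t in transitions(MAILS):
    print(t)
```

Read in order, the transitions are a normal scheduled-sync lifecycle: connect and exchange the bitmap at 19:00, sync from 19:01, then disconnect at 07:00, at which point the remote side becomes DUnknown simply because the link is down.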
-
I would check the NIC section of the test.log (the output of ifconfig -a) to see if there are any packet errors that might be generating those Consistent, Inconsistent, and DUnknown transition messages.
-
Well, I just checked that. Here is a copy of the log from the source server's interface that connects to the second SAN. All the identifying info has been removed, of course.
eth6 Link encap:Ethernet HWaddr
inet addr: Bcast: Mask:
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1914064650 errors:0 dropped:0 overruns:0 frame:0
TX packets:3948037088 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:209010884275 (194.6 GiB) TX bytes:5901393956807 (5.3 TiB)
Here is a copy of the log for the interface on the second SAN:
Link encap:Ethernet HWaddr
inet addr: Bcast: Mask:
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:372937392 errors:0 dropped:0 overruns:0 frame:0
TX packets:205165134 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:541926660966 (504.7 GiB) TX bytes:21169516242 (19.7 GiB)
So near as I can tell there are no collisions, errors, drops or anything.
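For repeat checks, the interesting counters can be pulled out of the ifconfig text programmatically rather than eyeballed. A quick sketch, assuming the ifconfig output format pasted above (the helper name is mine):

```python
import re

# ifconfig-style statistics block, as pasted above (addresses removed).
SAMPLE = """\
RX packets:1914064650 errors:0 dropped:0 overruns:0 frame:0
TX packets:3948037088 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
"""

# Counters that indicate trouble on the link; txqueuelen is a setting, not a counter.
TROUBLE = ("errors", "dropped", "overruns", "frame", "carrier", "collisions")

def trouble_counters(ifconfig_text):
    """Sum each trouble counter (RX and TX combined) from ifconfig output."""
    totals = dict.fromkeys(TROUBLE, 0)
    for name, value in re.findall(r"(\w+):(\d+)", ifconfig_text):
        if name in totals:
            totals[name] += int(value)
    return totals

print(trouble_counters(SAMPLE))  # all zeros here, so the link itself looks clean
```

With every trouble counter at zero on both interfaces, packet loss on the NICs seems unlikely to be the cause.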
-
The only things I can think of other than the NICs are the RAID health, which can cause this, or there being more data than can be replicated within your time frame. But in your case that does not apply, as it looks like the volume was consistent.
Send in logs to support@open-e.com so the engineers can look into it.
-
Done. I also sent a copy to your e-mail. If a support ticket needs to be opened or anything like that, please let me know and I'll do it.
Thanks!
-
I will see the ticket if it is sent to support@open-e.com
-
What was the ticket # that you got from support?
-
Hi!
I get the same messages every day.
I have 4 async replication tasks and get 4 emails at start time and 4 at end time.
The task in the GUI shows consistent for every volume.
I thought the messages are produced because the volume becomes inconsistent/unknown as soon as the task stops?!
regards
Matthias
-
I am getting 8 of these each day. That's why I posted a feature request a while ago, since planned interruptions of a volume replication aren't really errors, but necessary:
http://forum.open-e.com/showthread.php?t=1394
Cheers,
budy
-
I'm afraid I haven't gotten a response just yet. No biggie, just don't have a number to give ya.