iSCSI-R3 Enterprise


Raudi
06-06-2007, 01:47 PM
Hello!

Does anybody have experience using the iSCSI-R3 Enterprise as central storage in a VMware ESX 3 environment?

I have the following environment:

2 ESX 3.0.1 Hosts
1 iSCSI-R3 Enterprise Storage

Both ESX Hosts are connected directly to a NIC of the iSCSI Storage.

All functions are working normally; I can use VMotion to migrate the VMs from one host to another...

My only problem is performance.

In the storage I have a 16-channel 3Ware S-ATA2 HBA and a RAID-5 of 6x 750 GB Seagate S-ATA2 HDDs.

I formatted the first 500 GB iSCSI target with VMFS for the VMs and used the rest as a Raw Device Mapping for one of the VMs.

By default the write-back cache is not enabled on the target. With this setting I got the following results with IOMeter:

32 KB Blocks - 100% Write - 100% Random: 2 MB/s - 60 IO/s

When I enable the write-back cache for the target I get these results:

32 KB Blocks - 100% Write - 100% Random: 23 MB/s - 734 IO/s
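As a quick sanity check, the two IOMeter results above are internally consistent: throughput is simply IO/s times block size. A minimal sketch in Python, assuming 1 MB = 1024 KB (the function name is only illustrative):

```python
# Cross-check of the IOMeter figures: throughput = IO/s * block size.
BLOCK_SIZE_KB = 32  # block size used in the test above


def throughput_mb_s(iops, block_kb=BLOCK_SIZE_KB):
    """Return throughput in MB/s (assuming 1 MB = 1024 KB) for a given IO/s rate."""
    return iops * block_kb / 1024


for label, iops in [("write-back off", 60), ("write-back on", 734)]:
    print(f"{label}: {iops} IO/s -> {throughput_mb_s(iops):.1f} MB/s")
```

This gives roughly 1.9 MB/s and 22.9 MB/s, which matches the reported 2 MB/s and 23 MB/s after rounding.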

But sometimes I get these errors on the iSCSI target:

Jun 6 11:28:19 iscsi kernel: execute_task_management(1236) 4ec5fe1 1 ca5fec04
Jun 6 11:28:19 iscsi kernel: cmnd_abort(1163) 4ec5fca 1 0 42 512 0 0
Jun 6 11:28:19 iscsi kernel: execute_task_management(1236) 4ec5fe2 2 ffffffff

On the VMware host I get these errors:

Jun 6 11:28:19 vmsrv02 vmkernel: 28:13:37:31.727 cpu3:1032)SCSI: 3753: AsyncIO timeout (5000); aborting cmd w/ sn 28117, handle aaa0/0x62036f0
Jun 6 11:28:19 vmsrv02 vmkernel: 28:13:37:31.727 cpu3:1032)LinSCSI: 3596: Aborting cmds with world 1024, originHandle 0x62036f0, originSN 28117 from vmhba40:0:0
Jun 6 11:28:19 vmsrv02 vmkernel: 28:13:37:31.727 cpu3:1032)LinSCSI: 3612: Abort failed for cmd with serial=28117, status=bad0001, retval=bad0001
Jun 6 11:28:19 vmsrv02 vmkernel: 28:13:37:31.727 cpu0:1062)iSCSI: session 0x3c5c3a78 sending mgmt 82599905 abort for itt 82599882 task 0x3c5c1370 cmnd 0x3c40a090 cdb 0x2a to (2 0 0 0) at 246825157
Jun 6 11:28:19 vmsrv02 vmkernel: 28:13:37:31.727 cpu5:1063)iSCSI: session 0x3c5c3a78 abort rejected (0x1) for mgmt 82599905, itt 82599882, task 0x3c5c1370, cmnd 0x3c40a090, cdb 0x2a
Jun 6 11:28:19 vmsrv02 vmkernel: 28:13:37:31.727 cpu0:1062)iSCSI: session 0x3c5c3a78 sending mgmt 82599906 abort task set to (2 0 0 0) at 246825157
Jun 6 11:28:19 vmsrv02 vmkernel: 28:13:37:31.727 cpu5:1063)iSCSI: session 0x3c5c3a78 abort task set success for mgmt 82599906, itt 82599882, task 0x3c5c1370, cmnd 0x3c40a090
Jun 6 11:28:19 vmsrv02 vmkernel: 28:13:37:31.727 cpu0:1062)iSCSI: session 0x3c5c3a78 (2 0 0 0) finished error recovery at 246825157
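To see how often these timeouts occur (for example during the backup window), the vmkernel log can be scanned for the "AsyncIO timeout" line. The following is only a sketch in Python: the regular expression is derived from the single log line quoted above, so other ESX builds may word it differently, and `find_timeouts` is a hypothetical helper name.

```python
import re

# Pattern derived from the vmkernel line quoted above (assumption: the
# wording is the same on other ESX builds, which is not guaranteed).
TIMEOUT_RE = re.compile(
    r"AsyncIO timeout \((?P<timeout>\d+)\); aborting cmd w/ sn (?P<sn>\d+)"
)


def find_timeouts(lines):
    """Return (timeout value, command serial number) for each timeout line."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append((int(match.group("timeout")), int(match.group("sn"))))
    return hits


sample = [
    "Jun 6 11:28:19 vmsrv02 vmkernel: 28:13:37:31.727 cpu3:1032)SCSI: 3753: "
    "AsyncIO timeout (5000); aborting cmd w/ sn 28117, handle aaa0/0x62036f0",
    "Jun 6 11:28:19 vmsrv02 vmkernel: 28:13:37:31.727 cpu0:1062)iSCSI: "
    "session 0x3c5c3a78 (2 0 0 0) finished error recovery at 246825157",
]
print(find_timeouts(sample))
```

Feeding it the lines of the real log file instead of `sample` would give one entry per aborted command.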

Support told me that this problem occurs because my storage is too slow: too many requests arrive, the queue fills up, and the storage can't write them out fast enough. I originally had 3 HDDs in my RAID-5; because of this problem I expanded it to 6 HDDs, but the problem is still there. I have already set the 3Ware HBA to performance mode, and the write cache is enabled.

Does anybody have experience with this? Perhaps there are parameters in VMware that I should change?

I would be glad if someone could give me some hints to solve this problem...
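For a rough feel of why going from 3 to 6 disks may not be enough, here is a back-of-the-envelope estimate of uncached random-write IO/s on RAID-5. It is only a sketch under stated assumptions: about 75 random IO/s per 7200 rpm disk (an assumed figure, not measured on this system) and the usual RAID-5 small-write penalty of 4 disk operations per write (read data, read parity, write data, write parity).

```python
# Rough RAID-5 random-write estimate. Both constants are assumptions,
# not values measured on the system described in this thread.
ASSUMED_DISK_IOPS = 75    # assumed random IO/s of one 7200 rpm SATA disk
RAID5_WRITE_PENALTY = 4   # read data + read parity + write data + write parity


def raid5_random_write_iops(disks, disk_iops=ASSUMED_DISK_IOPS):
    """Estimate array-level random-write IO/s without any write caching."""
    return disks * disk_iops / RAID5_WRITE_PENALTY


for disks in (3, 6):
    print(f"{disks} disks: ~{raid5_random_write_iops(disks):.0f} write IO/s")
```

Under these assumptions the estimate stays in the range of tens to low hundreds of IO/s even with 6 disks, so without effective write caching a burst of random writes could plausibly still fill the queue.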

Best regards

Stefan

To-M
06-06-2007, 02:38 PM
I have seen this error before. It is sometimes caused by initiator settings that do not match the target, or you may need to increase MaxRecvDataSegmentLength and MaxXmitDataSegmentLength to 65536 under Hardware Tools (ALT+CTRL+W) -> Tuning options -> iSCSI Daemon options. I have also seen it with bonding issues, but since your I/O increased after changing the WB feature for the target volume, that would rule out network-related issues.

Thanks for assisting with the LSI selection for Johhny;)

Raudi
06-06-2007, 03:02 PM
Thank you for the fast answer! I have changed the values. Are they active immediately after changing them, or must I do a restart?

Stefan

To-M
06-06-2007, 03:13 PM
Should be immediate, but maybe both need to be restarted. We are trying to work with VMware, but as you know this will take some time (they are a big company). So a lot of the testing will fall on users and our engineers. Thanks for helping! I also want to move all VMware postings to the VMware section of the forum, so I may move this thread there later, in case you need to find it.

Raudi
06-06-2007, 03:31 PM
OK, I will see a result tomorrow after the next backup of my system. I get the most errors during the backup and afterwards, when ARCserve updates its database.

Support already told me that you are working on a VMware certification. When you get it, that will be great! Then we can use it at our customers' sites. This is only a test environment, so a certification is not important here...

Stefan

To-M
06-06-2007, 03:34 PM
Thanks! Any information on this helps our engineers big time!!

Raudi
06-07-2007, 09:07 AM
During the backup at night I still got the errors. I will try again and see what happens after restarting the storage...

Stefan

fivefive1978
06-07-2007, 07:59 PM
I haven't tried using the VMware iSCSI software initiator, but we are having very good luck using the iSCSI HBAs from QLogic in the VMware host systems.

Search for my posts to see the setup I have.

Raudi
06-07-2007, 08:31 PM
I had already thought about this but didn't do it because:

- I have a dual-CPU quad-core Xeon system. Even if the software initiator uses one whole CPU, I have 7 remaining.
- I don't think Jumbo Frames would speed up my system, because access to the VMFS is mostly random.

I will think about it when I have some money for this...

Stefan

chrwei
02-04-2008, 07:41 PM
Did changing these settings and restarting solve the problem for you?

Raudi
02-04-2008, 08:56 PM
Not really, but it seems that Open-E has done something here. In Build 2998 I haven't seen this error...

Stefan

piwi
02-05-2008, 10:43 AM
Same problem here...
with DSS and VMware 3.5.

WB enabled on a 4 GB dual Xeon with an Areca RAID controller, 4 disks in RAID 10.
Super performance, but still those errors.
Latest Open-E version.