ESX 4.1 loses iSCSI connection after reboot
I'm running ESX 4.1 and planning to move the iSCSI storage from a single Windows Storage Server to two Open-E DSS V6 16TB Storage Systems, which will then run as a failover pair.
Right now I have a strange problem: after an ESX reboot, for example, the connection to the Open-E iSCSI LUN can't be restored, nor will the rebooted ESX host find any LUN from the Open-E system.
The only solution is to reset the iSCSI connections manually via the web GUI of the Open-E system.
That can't be the answer for an HA setup.
How to fix this?
Be sure to assign static paths to the targets for the HBA, pointing only to the virtual IP(s).
Hi, that's all done already.
After I changed from File I/O to Block I/O, this problem only appears with volumes larger than 1TB.
I also get a huge number of cmnd_abort(1143) errors.
What could be the reason?
All the best,
Those steps I've already tried...
Another strange behavior: an ESX host crashed today due to a mistake on my part, and this host was creating a VM on an iSCSI target of the Open-E DSS V6. This target was totally lost - ESX now only finds it with a size of 0.00 B.
So what is the advice for building a stable, well-running Open-E DSS V6 iSCSI failover cluster for VMware ESX 4.1?
You might want to check the VMware KB article about missing LUNs on the ESX side and how to get them back.
http://kb.vmware.com/selfservice/mic...0463&stateId=1 0 17600580
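If the LUN only reappears after the Open-E sessions are reset, it may also be worth forcing a rescan from the ESX service console once the storage node is back. A minimal sketch - vmhba33 is a placeholder for your software iSCSI adapter, not taken from this thread:

```shell
# Rescan the software iSCSI adapter (vmhba33 is a placeholder -
# check the real adapter name with: esxcfg-scsidevs -a)
esxcfg-rescan vmhba33

# List the SCSI devices ESX can see after the rescan
esxcfg-scsidevs -c
```

If the device shows up here but the datastore does not, the KB article above covers resignaturing/mounting.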
All the best,
Sadly the performance of Open-E DSS V6 is totally disappointing, as is the stability.
Now the question is: is this due to ESX 4.1 or to Open-E DSS V6?
Would you recommend upgrading the DSS V6 16TB to DSS V7 16TB and buying the active-active iSCSI package? (In the end this is a question of price - whether vSphere 5 or DSS V7 is cheaper - and of what you would say is the main reason for this trouble.)
Or could it be the hardware from the Open-E Systems?
I have two identical systems here, based on a Supermicro X7 mainboard.
Each system has two Xeon 5160 CPUs and 16GB of memory.
Per system there is an HP P400 with 512MB cache and BBU connected to 8x 300GB 15k SAS disks (2x 4-disk RAID5 arrays), plus an HP P800 with 512MB cache and BBU connected to 5x 2TB disks in RAID5 and 3x 1.5TB disks in RAID5. The 1.5TB disks will be moved out and replaced with 120GB SSDs as soon as they arrive here.
The onboard NICs are used for the web GUI and heartbeat.
Each system has 3x PCI-X Broadcom dual GbE NICs; four of these ports are used for iSCSI with MPIO and two are used for sync.
In the end I only get a total of about 600Mbit max network traffic across all 4 iSCSI MPIO NICs.
With a live CD I've already tested disk and network speed. Disk speed for the 4-disk SAS RAID5 arrays was ~330MB/s max, ~130MB/s min, and ~270MB/s avg; the 5-disk RAID5 reached ~320MB/s max and ~272MB/s min.
iSCSI MPIO benchmarked against a RAM-backed iSCSI disk reached ~450MB/s (using 4 NICs, which is quite nice).
So how come the performance drops so much with Open-E DSS V6?
Copying VMs to the Open-E system from an iSCSI server that easily handles ~3000 Mbit, I only get up to 600Mbit.
Copying between two "old" iSCSI servers I get about ~2500 Mbit.
So there has to be a bottleneck: either in Open-E itself, between VMware and Open-E DSS V6, or between the hardware and Open-E.
Because with Windows Storage Server 2003 x64 and a Win XP live CD everything runs as fast as it should.
It takes about 10 minutes to create 3% of a 512GB virtual disk for a VM on the Open-E iSCSI storage.
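To put those numbers in perspective, here is a quick back-of-the-envelope check (plain arithmetic only, using the figures quoted in the posts above and 1 GB = 1024 MB):

```python
# Sanity-check the throughput figures from this thread.

def mbit_to_mbyte(mbit):
    """Convert megabits per second to megabytes per second."""
    return mbit / 8.0

# Observed iSCSI throughput: ~600 Mbit/s over 4 GbE MPIO links
observed_mb_s = mbit_to_mbyte(600)            # 75 MB/s
link_capacity_mb_s = mbit_to_mbyte(4 * 1000)  # ~500 MB/s theoretical

# "10 min to create 3% of a 512 GB virtual disk"
written_mb = 0.03 * 512 * 1024                # ~15729 MB actually written
rate_mb_s = written_mb / (10 * 60)            # effective write rate

print(round(observed_mb_s), round(link_capacity_mb_s), round(rate_mb_s, 1))
# -> 75 500 26.2
```

So the links are running at roughly 15% of their theoretical capacity, and the virtual-disk creation is even slower (~26 MB/s), which points at something beyond raw network limits.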
Before stating this, let's take a look at the DSS log file to see if something else is happening. The log file can be obtained in the GUI under Status > Hardware > Log, then click the Download button. We believe you sent us an email inquiring about the features of the A/A package, so we can take a look at your system from the log files.
All the best,
You might also want to try tuning your targets:
1. From the console, press CTRL+ALT+W.
2. Select Tuning options -> iSCSI daemon options -> Target options.
3. Select the target in question.
4. Change the MaxRecvDataSegmentLength and MaxXmitDataSegmentLength values to the maximum required data size (check with the initiator to match).
Doing this will reset the iSCSI connections at each edit, so please pause any hosts connected to the LUNs. These adjustments need to be made on each node and at the initiators.
Did you look at this document? Keep in mind that ESX 4.x MPIO is different than in 5.0 and 5.1.
Last edited by To-M; 08-28-2013 at 10:30 PM.
All the best,
I already tried the tuning on the Open-E storage and did the same tuning for the iSCSI HBAs on the ESX hosts.
I certainly know how to handle Round Robin with ESX 4.1.
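For reference, on ESX 4.1 the Round Robin policy can be set per device from the service console roughly like this. The naa identifier is a placeholder, and the --iops 1 tweak is a common MPIO adjustment rather than anything Open-E prescribes:

```shell
# Show devices and their current path selection policy
esxcli nmp device list

# Set Round Robin for one LUN (replace naa.xxxx with the real device ID)
esxcli nmp device setpolicy --device naa.xxxx --psp VMW_PSP_RR

# Optionally switch paths every I/O instead of every 1000 I/Os
esxcli nmp roundrobin setconfig --device naa.xxxx --type iops --iops 1
```

Note the esxcli namespace changed in 5.x (esxcli storage nmp ...), so 5.0/5.1 documents won't match these commands.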
Since I last cleared the log - which must be about 12h ago - it shows only 407 errors:
407 cmnd_abort(1143) errors.
You offered that I send the log files via e-mail - where should I send them?