iSCSI R3 Enterprise performance tuning


Raudi
09-17-2007, 11:03 AM
Hello,

I still have performance problems. Yesterday I updated to the current version and deleted the complete unit to create new iSCSI volumes. After this I copied the vmdk files from my local VMFS back to the VMFS on the iSCSI volume. A 6 GB file takes 11:15 minutes, which works out to about 9 MB/s.
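The 9 MB/s figure can be reproduced with a short calculation. This is only a sketch: the function name is made up for illustration, and it assumes 1 GB = 1024 MB.

```python
def throughput_mb_per_s(size_gb: float, minutes: int, seconds: int) -> float:
    """Average throughput in MB/s, assuming 1 GB = 1024 MB."""
    elapsed_s = minutes * 60 + seconds
    return size_gb * 1024 / elapsed_s


# 6 GB vmdk copied to the iSCSI volume in 11 min 15 s
rate = throughput_mb_per_s(6, 11, 15)
print(f"{rate:.1f} MB/s")
```

A gigabit link carries at most 125 MB/s of raw data, so this copy uses well under a tenth of what the wire could deliver.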

If I remember right, copying the file in the other direction, to a local single S-ATA drive, was much faster. The restore from tape is running at the moment, so I don't think I would get a usable value if I tested it now. The backup to the LTO1 tape ran at 900 MB/min, and now the restore runs at only 550 MB/min (2 days remaining to restore all the data).

In the performance chart on the ESX host, the network card I use for the ESX iSCSI initiator never showed more than 30 MBps. The current value during the restore is 10 MBps.

When I use IOMeter in a VM on a raw-mapped 2 TB iSCSI volume, I get 35 MB/s for both read and write.

What can I do to optimise the performance?

My environment:

Storage
Supermicro PDSM4 Mainboard with a Pentium D 930 (3 GHz Dual Core) and 4 GB memory.
3Ware 9550SXU 16 Ch. S-ATA II (with BBU in performance mode) with 6x Seagate 750 GB HDDs

2x ESX Host
Supermicro X7DBE Mainboard with two Xeon E5320 (1.86 GHz Quad Core) and 6 GB memory.

For iSCSI communication I use direct cables from the two onboard NICs of the storage to each ESX host.

Best regards
Stefan

Raudi
09-17-2007, 12:15 PM
Note:
The Virtual Infrastructure Client seems to show the wrong value. There is a "0" missing.

The current Data Transmit Rate is shown as 12000 KBps, which would be 1.5 MB/s and can't be correct. It must be 120000 KBps, corresponding to the value I read on the check_sys page on the Open-E.

Regards
Stefan

Raudi
09-17-2007, 02:19 PM
Argh... KBps means "KB per second" and not "Kbit per second"!

But this doesn't change anything about my problem...
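To make the unit mix-up concrete, here is a small sketch that reads the same 12000 figure both ways. The helper names are invented for illustration and decimal units (1 MB = 1000 KB) are assumed.

```python
def kbytes_per_s_to_mb_per_s(value: float) -> float:
    """Interpret the reading as kilobytes per second."""
    return value / 1000


def kbits_per_s_to_mb_per_s(value: float) -> float:
    """Interpret the reading as kilobits per second (8 bits per byte)."""
    return value / 8 / 1000


reading = 12000  # value shown by the Virtual Infrastructure Client
as_bytes = kbytes_per_s_to_mb_per_s(reading)
as_bits = kbits_per_s_to_mb_per_s(reading)
print(f"as KB/s:   {as_bytes} MB/s")
print(f"as Kbit/s: {as_bits} MB/s")
```

Read as kilobytes, the chart value is 12 MB/s, so no digit is missing; read as kilobits it would be the implausible 1.5 MB/s.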

To-M
09-17-2007, 05:17 PM
Can you test without the VM iSCSI connection and just connect directly with an MS initiator (MS XP or 2003) to see if it is the iSCSI-R3? Also check the tests.log in the log file from Status > Hardware > Download Logs, then look for the hdparm -t /dev/sdx to see what speeds we are getting from the RAID set.

Raudi
09-17-2007, 05:46 PM
Now I canceled my restore to run some tests:

Copying an 8 GB file from a single 500 GB drive (connected to a MegaRaid SATA 150-6) to iSCSI takes 13 min 23 s. The other direction takes only 5 min 43 s.

The other information:

I made a new volume and mounted it from my Windows XP with the MS Initiator. With IOMeter I got the same values I got in the VM: 30-35 MB/s while reading and 30-40 MB/s while writing (32K, 100% seq. read or write).

From the tests.log:

--- Physical volume ---
PV Name /dev/sdb
VG Name vg+3ware
PV Size 3.41 TB / not usable 0
Allocatable yes
PE Size (KByte) 32768
Total PE 111757
Free PE 12
Allocated PE 111745

hdparm -t /dev/sdb

/dev/sdb:
Timing buffered disk reads: 1036 MB in 3.00 seconds = 345.31 MB/sec

This looks good. Is there a way to test only the network connection (like with iperf)?
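One way to take the disks out of the picture is a plain TCP transfer that never touches storage, which is what iperf does. The following is only a rough sketch of that idea in Python, not a replacement for iperf; the function names are made up, and the demo runs over loopback. To test a real link, the sink would run on one machine and measure() on the other.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call


def sink(server: socket.socket, result: dict) -> None:
    """Accept one connection and discard everything it sends."""
    conn, _ = server.accept()
    received = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            received += len(data)
    result["received"] = received


def measure(host: str, port: int, total_bytes: int) -> float:
    """Send total_bytes of zeros to host:port and return the rate in MB/s."""
    payload = b"\0" * CHUNK
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            n = min(CHUNK, total_bytes - sent)
            sock.sendall(payload[:n])
            sent += n
    elapsed = time.perf_counter() - start
    return sent / elapsed / 1e6


# Loopback demo: sink and sender in the same process.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

result = {}
worker = threading.Thread(target=sink, args=(server, result))
worker.start()
rate = measure("127.0.0.1", port, 32 * 1024 * 1024)
worker.join()
server.close()
print(f"{rate:.0f} MB/s, {result['received']} bytes received")
```

The sender's clock stops when the last byte has been handed to the kernel, so short runs overstate the rate slightly; a longer transfer gives a steadier number.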

Regards
Stefan

Raudi
09-17-2007, 07:47 PM
It must be a local problem on my iSCSI box. I have now restarted the restore, and while it was running I tested again with IOMeter from my workstation. The read value is now between 0 and 5 MB/s...

The workstation is connected to an Intel Gigabit Server Adapter which sits in a PCI-X slot. The ESX hosts are connected to both onboard Intel Gigabit NICs.

Can I run some other tests? Where is the bottleneck?

Regards
Stefan

To-M
09-17-2007, 11:08 PM
Check the tests.log and see what the netstat -s and ethtool ethx output has. Also, is the 3Ware firmware up to date? Is the iSCSI-R3 version 2.30, and was it upgraded from an older version? If it was updated from an older version, did you delete the Logical Volumes and VGs and then recreate them (per the release notes)?

Try to test with a different kernel. Go to the console screen, then Console Tools (CTRL + ALT + T), select Boot Options, then System Architecture, then "Multiprocessor server system (TESTING)".

Raudi
09-17-2007, 11:30 PM
After I installed 2.30 I went to the console, typed Ctrl-Alt-X and selected item 10, "remove partition".

The 3ware firmware is not the latest: it is 3.08.00.004 and the current one on the web is 3.08.02.005, which only adds support for new models and a fix for RAID6. I must connect a floppy to update the firmware. Will try tomorrow.

As for what the tests.log says for netstat -s and ethtool ethx, I don't see anything wrong there:

*-----------------------------------------------------------------------------*
netstat -s
*-----------------------------------------------------------------------------*

Ip:
939354255 total packets received
0 forwarded
0 incoming packets discarded
939354255 incoming packets delivered
324964503 requests sent out
Icmp:
35 ICMP messages received
0 input ICMP message failed.
ICMP input histogram:
destination unreachable: 31
echo requests: 4
35 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
destination unreachable: 31
echo replies: 4
Tcp:
60171 active connections openings
60306 passive connection openings
0 failed connection attempts
23 connection resets received
49 connections established
939343091 segments received
324956798 segments send out
221 segments retransmited
0 bad segments received.
686 resets sent
Udp:
10944 packets received
0 packets to unknown port received.
0 packet receive errors
7668 packets sent
TcpExt:
59746 TCP sockets finished time wait in fast timer
144181 delayed acks sent
275 delayed acks further delayed because of locked socket
Quick ack mode was activated 217840 times
50422 packets directly queued to recvmsg prequeue.
11156301 of bytes directly received from prequeue
864295031 packet headers predicted
22820 packets header predicted and directly queued to user
19737188 acknowledgments not containing data received
22681705 predicted acknowledgments
2 times recovered from packet loss due to fast retransmit
10 times recovered from packet loss due to SACK data
Detected reordering 2 times using time stamp
2 congestion windows fully recovered
61 congestion windows partially recovered using Hoe heuristic
60 congestion windows recovered after partial ack
0 TCP data loss events
1 timeouts after SACK recovery
10 fast retransmits
1 forward retransmits
202 other TCP timeouts
163 DSACKs sent for old packets
1 connections reset due to unexpected data

*-----------------------------------------------------------------------------*
ethtool eth0
*-----------------------------------------------------------------------------*

Settings for eth0:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: umbg
Wake-on: d
Current message level: 0x00000007 (7)
Link detected: yes

*-----------------------------------------------------------------------------*
ethtool eth1
*-----------------------------------------------------------------------------*

Settings for eth1:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: umbg
Wake-on: g
Current message level: 0x00000007 (7)
Link detected: yes

*-----------------------------------------------------------------------------*
ethtool eth2
*-----------------------------------------------------------------------------*

Settings for eth2:
Supported ports: [ TP ]
Supported link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Supports auto-negotiation: Yes
Advertised link modes: 10baseT/Half 10baseT/Full
100baseT/Half 100baseT/Full
1000baseT/Full
Advertised auto-negotiation: Yes
Speed: 1000Mb/s
Duplex: Full
Port: Twisted Pair
PHYAD: 1
Transceiver: internal
Auto-negotiation: on
Supports Wake-on: umbg
Wake-on: g
Current message level: 0x00000007 (7)
Link detected: yes

*-----------------------------------------------------------------------------*

I wrote a long mail to support and will call them tomorrow to talk about this problem. I hope we find a solution on the phone...

Regards
Stefan

To-M
09-18-2007, 01:34 AM
Correct, the netstat -s info is good. What are the iSCSI daemon options for the Target? (We did enable WB for the Target, correct? That would not be much of an improvement anyway, so don't worry about it.) Any errors in the logs (error.log, critical_errors or 2 .logs)?
If all is good then I would bet the issue is the RAID controller: check and verify the RAID health or do a firmware update.
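The claim that the netstat -s counters look healthy can be backed with a number. The sketch below copies a few Tcp lines from the log posted earlier in the thread and computes the share of retransmitted segments; the helper name is made up for illustration.

```python
import re

# Tcp counters copied from the netstat -s section of the posted tests.log
NETSTAT_TCP = """\
939343091 segments received
324956798 segments send out
221 segments retransmited
0 bad segments received.
"""


def counter(text: str, label: str) -> int:
    """Return the integer that precedes the given label in netstat output."""
    match = re.search(rf"(\d+) {re.escape(label)}", text)
    if match is None:
        raise ValueError(f"counter not found: {label}")
    return int(match.group(1))


sent = counter(NETSTAT_TCP, "segments send out")
retransmitted = counter(NETSTAT_TCP, "segments retransmited")
ratio = retransmitted / sent
print(f"retransmit ratio: {ratio:.2e}")
```

Fewer than one segment in a million was retransmitted, which fits the reading that packet loss on the link is not what limits throughput here.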

Raudi
09-18-2007, 09:15 AM
But how can this be a RAID controller issue when the Open-E operating system can read from it at 350 MB/s?

Regards
Stefan

Raudi
09-18-2007, 12:59 PM
I tested some things:

I used an additional Realtek Gigabit NIC, but the IOMeter read speed was only 7 MB/s and the write speed, like with the Intel NIC, was 40 MB/s.

I tested with the multiprocessor and single-processor testing kernels, but the results were all the same...

I updated the firmware of the 3Ware to the latest one, but no change.

Some questions:

- What is the better environment: direct links from the target to the initiator with a single cable, or a switch?

- Can I test the network performance from the initiator to the target while ignoring the performance of the RAID?

- What must I do to fix the NIC at 1000 full duplex, with no autodetection?

Regards
Stefan

To-M
09-18-2007, 02:39 PM
Maybe disabling flow control will help. Try changing this from the Console Tools (ALT+CTRL+T) -> Modify driver options function. Changing any option requires a system restart. Sometimes forcing the speed parameters may work for some NICs; also look to see what AutoNeg is set to (Intel NICs are set to 32).

Raudi
09-18-2007, 03:31 PM
But what must I enter in the fields for "FlowControl", "AutoNeg", "Duplex" and "Speed" to set them?

I don't think this will help, because a few hours ago I started my restore again. This data goes into the storage via eth1 and runs at 70 MBit/s (from check_sys). In ESX I saw a speed of 8 MB/s. Then I started IOMeter from my workstation, which is connected to eth0, and got 30-35 MB/s, but at that moment the speed in ESX went down to 500 KB/s. When I stop IOMeter the speed in ESX goes up again... But I will leave nothing untested.

The best would be a networking test, to see what performance I get on one NIC, then on two, and finally on all three NICs.

Regards
Stefan

Raudi
09-18-2007, 05:43 PM
I have now configured it with no FlowControl and fixed at 1000 Full Duplex, but no change...

Can this be a hardware issue?

I use this Supermicro Board: PDSM4 (http://www.supermicro.com/products/motherboard/PD/E7230/PDSM4.cfm)

Are there recommendations for which board I can use with a 3Ware RAID and some Intel NICs?

Regards
Stefan

To-M
09-18-2007, 05:51 PM
Try with the Intel Pro 1000 and tell me what you have for the driver settings. Here is the link for Intel settings:

http://www.intel.com/support/network/sb/cs-009209.htm?

Raudi
09-18-2007, 09:07 PM
I tried setting FlowControl to 0 and AutoNeg to 32.

From tomorrow I'm out of the country for one week...

Regards
Stefan

To-M
09-18-2007, 10:04 PM
Set Duplex to 2 (= full) and Speed to 1000.

Raudi
09-18-2007, 10:06 PM
From the description of AutoNeg: "When this parameter is used, the Speed and Duplex parameters must not be specified."

But I had these specified too...
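For reference, a small sketch of how the AutoNeg value can be read and how the conflict quoted above can be checked. The bit meanings are an assumption based on the Intel e1000 Linux driver documentation (AutoNeg as a bitmask of advertised modes) and should be verified against the linked Intel page; the function names are invented for illustration.

```python
# Assumed bit layout of the e1000 AutoNeg parameter (verify against Intel docs)
AUTONEG_BITS = {
    0x01: "10/Half",
    0x02: "10/Full",
    0x04: "100/Half",
    0x08: "100/Full",
    0x20: "1000/Full",
}


def decode_autoneg(mask: int) -> list:
    """List the link modes advertised by an AutoNeg bitmask."""
    return [name for bit, name in AUTONEG_BITS.items() if mask & bit]


def conflicts(options: dict) -> bool:
    """True if AutoNeg is combined with Speed or Duplex, which the
    driver description quoted above says must not be done."""
    return "AutoNeg" in options and ("Speed" in options or "Duplex" in options)


print(decode_autoneg(32))
print(conflicts({"FlowControl": 0, "AutoNeg": 32, "Duplex": 2, "Speed": 1000}))
print(conflicts({"FlowControl": 0, "AutoNeg": 32}))
```

Under that assumption, AutoNeg=32 (0x20) advertises only 1000/Full, so it already pins the link to gigabit full duplex without Speed and Duplex being set.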

Best Regards
Stefan

To-M
09-18-2007, 10:39 PM
Running out of options... Try changing the iSCSI daemon settings in the Tuning options. Did you enable WB on the Target? Not sure if we covered that area, checking...

Try using a different version of client initiator.

Some report good speeds from these settings:

MaxRecvDataSegmentLength=262144
MaxBurstLength=16776192
MaxXmitDataSegmentLength=262144
MaxOutstandingR2T=8
InitialR2T=No
ImmediateData=Yes


a) MaxRecvDataSegmentLength - Sets the maximum data segment length that can be received. This value should be set to multiples of PAGE_SIZE. Currently the maximum supported value is 64 * PAGE_SIZE, e.g. 262144 if PAGE_SIZE is 4kB. Configuring too large values may lead to problems allocating sufficient memory, which in turn may lead to SCSI commands timing out at the initiator host. The default value is 8192.

b) MaxBurstLength - Sets the maximum amount of either unsolicited or solicited data the initiator may send in a single burst. Any amount of data exceeding this value must be explicitly solicited by the target. This value should be set to multiples of PAGE_SIZE. Configuring too large values may lead to problems allocating sufficient memory, which in turn may lead to SCSI commands timing out at the initiator host. The default value is 262144.

c) MaxXmitDataSegmentLength - Sets the maximum data segment length that can be sent. The value actually used is the minimum of MaxXmitDataSegmentLength and the MaxRecvDataSegmentLength announced by the initiator. It should be set to multiples of PAGE_SIZE. Currently the maximum supported value is 64 * PAGE_SIZE, e.g. 262144 if PAGE_SIZE is 4kB. Configuring too large values may lead to problems allocating sufficient memory, which in turn may lead to SCSI commands timing out at the initiator host. The default value is 8192.

d) DataDigest <CRC32C|None> - If set to "CRC32C" and the initiator is configured accordingly, the integrity of an iSCSI PDU's data segment will be protected by a CRC32C checksum. The default is "None". Note that data digests are not supported during discovery sessions.

e) MaxOutstandingR2T <value> - Controls the maximum number of data transfers the target may request at once, each of up to MaxBurstLength bytes. The default is 1.

f) InitialR2T <Yes|No> - If set to "Yes" (default), the initiator has to wait for the target to solicit SCSI data before sending it. Setting it to "No" allows the initiator to send a burst of FirstBurstLength bytes unsolicited right after and/or (depending on the setting of ImmediateData) together with the command. Thus setting it to "No" may improve performance.

g) ImmediateData <Yes|No> - This allows the initiator to append unsolicited data to a command. To achieve better performance, this should be set to "Yes". The default is "No".

h) DataPDUInOrder <Yes|No> - Tells the initiator whether data PDUs have to be sent in order. The default is "Yes", which is also recommended.

i) DataSequenceInOrder <Yes|No> - Tells the initiator whether data sequences have to be sent in order. The default is "Yes", which is also recommended.

j) HeaderDigest <CRC32C|None> - If set to "CRC32C" and the initiator is configured accordingly, the integrity of an iSCSI PDU's header segments will be protected by a CRC32C checksum. The default is "None". Note that header digests are not supported during discovery sessions.

k) Wthreads - The iSCSI target employs several threads to perform the actual block I/O to the device. Depending on your hardware and your (expected) workload, the number of these threads may be carefully adjusted. The default value of 8 should be sufficient for most purposes.
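The size rules in a) to c) are easy to check mechanically. Below is a minimal sketch that tests the suggested values against the two constraints named in the descriptions (multiple of PAGE_SIZE, and at most 64 * PAGE_SIZE for the two segment lengths). It assumes PAGE_SIZE is 4 kB as in the examples above; the function name is made up, and the check is not part of any Open-E tool.

```python
PAGE_SIZE = 4096               # assumed, as in the "4kB" examples above
MAX_SEGMENT = 64 * PAGE_SIZE   # 262144

# Values from the suggested settings list
settings = {
    "MaxRecvDataSegmentLength": 262144,
    "MaxXmitDataSegmentLength": 262144,
    "MaxBurstLength": 16776192,
}

SEGMENT_KEYS = ("MaxRecvDataSegmentLength", "MaxXmitDataSegmentLength")


def check(name: str, value: int) -> list:
    """Return a list of rule violations for one setting (empty if none)."""
    problems = []
    if value % PAGE_SIZE != 0:
        problems.append("not a multiple of PAGE_SIZE")
    if name in SEGMENT_KEYS and value > MAX_SEGMENT:
        problems.append("exceeds 64 * PAGE_SIZE")
    return problems


for name, value in settings.items():
    print(name, value, check(name, value) or "ok")
```

With these assumptions the two segment lengths sit exactly at the 64 * PAGE_SIZE limit, while 16776192 is 1024 bytes short of 16 MiB and therefore not a whole number of 4 kB pages. The descriptions only say the value "should" be a multiple, so this is a point to double-check, not necessarily an error.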

Raudi
09-19-2007, 08:22 AM
Moin,

I tested these values but no change... Next week, when I'm back, I will install Windows with SANmelody or an OpenFiler to see if it is a hardware problem. If that works, it is an Open-E problem...

Best regards
Stefan

To-M
09-19-2007, 02:19 PM
Did you send the logs in to support? We did not receive them.

Raudi
09-26-2007, 01:52 PM
Hi Todd,

I'm back in town... Ask in Munich, they must have the logs; I sent them to support.

Regards

Stefan

paul2kl
09-26-2007, 09:54 PM
Hi Raudi,

Did you try SANmelody out? I was wondering how the iSCSI side compared. I've been looking at it using FC cards with ESX server, and so far it works without issues. So from my point of view it is the Open-E side that has issues with ESX server and FC cards, certainly not my hardware.

Raudi
09-26-2007, 10:21 PM
No, last week I was out of the country. Vacation, came back today.

With the iSCSI Enterprise I can't use FC. I was thinking about using an iSCSI HBA, but my ESX host is a dual Xeon quad core with 1.86 GHz, so there must be enough CPU for the software iSCSI...

A customer has an EqualLogic and gets 93 MB/s during sequential read and write, and he uses the software iSCSI too.

Regards
Stefan

Raudi
09-27-2007, 09:49 PM
Wow! I installed OpenFiler on my system and what a performance!

First I tested the Open-E again with IOmeter from a virtual machine:

1 Worker:

100% sequential, 100% read, 32 KB - 30 MB/sec
100% sequential, 100% write, 32 KB - 35 MB/sec

2 Workers:

100% sequential, 100% read, 32 KB - 10 MB/sec
100% sequential, 100% write, 32 KB - 60 MB/sec

Now the values when using OpenFiler on the same Hardware:

1 Worker:

100% sequential, 100% read, 32 KB - 43 MB/sec
100% sequential, 100% write, 32 KB - 30 MB/sec

2 Workers:

100% sequential, 100% read, 32 KB - 73 MB/sec
100% sequential, 100% write, 32 KB - 54 MB/sec

Finally I started a tape restore and then an IOmeter test from my workstation with 4 (!) workers. I got a read speed of 107 MB/sec and had no problems with the restore speed.

I can start IOmeter in a virtual machine with 4 workers, which read at 95 MB/sec, while the 4 workers on the workstation run at 105 MB/sec at the same time...

And now? My hardware works...

I will now test with the Open-E again. But first I will load factory defaults; perhaps there are problems in the settings after the x updates...

Regards
Stefan