Thread: Exomium S200 and DSS V6 and LSI 9265i and Cachecade -> Data corruption

  1. #1

    I don't believe this is true.

    We have thousands of V6 and V7 systems in production, and we would have updated the LSI MegaRAID SAS drivers per the LSI release notes if this were true.

    I have checked our support DB and found no support tickets associated with this issue. Which ticket was it where Open-E stated the same as LSI?

    We don't make the driver, and CacheCade is an add-on feature for certain LSI controllers. Normally the firmware controls CacheCade, along with some parts of the driver, to run the algorithms it uses for hot data.

    I would also like to know the LSI case where they stated this; you can submit a support ticket to Open-E with the contact information.

    We have several webinars about LSI CacheCade. I did one with the LSI engineer on March 28, 2012, and nothing was mentioned about an outdated driver that can cause corruption.

    Again, we don't make the drivers, the vendors do, so DSS would not be involved here. If there was corruption, it could be that the SSD cache was set up as RAID 0 rather than RAID 1. If you lose an SSD in a RAID 0 then yes, you can have corruption of the writes pending at the time the SSD went bad, but not of the reads.
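
    The RAID 0 versus RAID 1 point can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how the LSI firmware or CacheCade is implemented: the class, its methods, and the block numbers are all hypothetical, and "losing one SSD" is modelled as losing the whole striped cache volume.

```python
class WriteBackCache:
    """Toy write-back cache: writes land on the SSD volume first, HDDs later."""

    def __init__(self, mirrored):
        self.mirrored = mirrored  # True ~ RAID 1 cache volume, False ~ RAID 0
        self.backing = {}         # block -> data already on the HDD array
        self.dirty = {}           # block -> data that exists only in the cache
        self.clean = {}           # block -> cached copy of data also on the HDDs

    def write(self, block, data):
        self.clean.pop(block, None)
        self.dirty[block] = data

    def read(self, block):
        if block in self.dirty:
            return self.dirty[block]
        if block in self.clean:
            return self.clean[block]
        data = self.backing.get(block)
        if data is not None:
            self.clean[block] = data  # repopulate the read cache
        return data

    def flush(self):
        """Destage dirty blocks to the HDD array."""
        self.backing.update(self.dirty)
        self.clean.update(self.dirty)
        self.dirty.clear()

    def fail_one_ssd(self):
        """Return the blocks whose newest data is lost when one SSD dies."""
        if self.mirrored:
            return []             # the surviving mirror still holds everything
        lost = sorted(self.dirty)
        self.dirty.clear()        # unflushed writes are gone
        self.clean.clear()        # read cache is gone too, but HDDs still have it
        return lost


def demo(mirrored):
    cache = WriteBackCache(mirrored)
    cache.write(1, "old")
    cache.flush()                 # block 1 is now safely on the HDD array
    cache.read(1)                 # and also sits in the read cache
    cache.write(1, "new")         # pending write, not yet flushed
    cache.write(2, "only-in-cache")
    lost = cache.fail_one_ssd()
    return lost, cache.read(1), cache.read(2)


if __name__ == "__main__":
    print("RAID 0 cache:", demo(mirrored=False))
    print("RAID 1 cache:", demo(mirrored=True))
```

    In this model the striped cache loses the two pending writes (block 1 falls back to its stale HDD copy and block 2 is gone), while data that had only been cached for reads is simply re-read from the HDD array. The mirrored cache loses nothing.
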
    Last edited by To-M; 08-14-2013 at 10:34 PM.
    All the best,

    Todd Maxwell



  2. #2

    Hi Todd,

    I wrote a mail to Janusz Bak, with a copy to support@open-e.com, containing the statement from my hardware vendor.

    It's in German (can you read it too?); maybe Janusz can clarify this.

    Thanks again
    Henri

  3. #3

    Yes, I saw the email and saw what LSI stated: the release was in March of last year. So we did have the driver then, as I did the video with LSI on March 28 of last year. Again, we get the drivers from LSI and we have the small updates for them. It looks like you already got the small updates from Janusz.
    All the best,

    Todd Maxwell



  4. #4

    I've also had problems

    Hi Henri,

    I've also had massive problems with data corruption on a production system. My setup is two DSS 7 boxes as an iSCSI SAN in an active/active failover / replication cluster. I think I've now narrowed the cause down to LSI firmware v23.16.0-0018 having a problem with the CacheCade write cache. With the cache turned off or set to read-only, there is no issue. As soon as I set up a write cache, my VMs start catastrophically corrupting, particularly after write-intensive operations, e.g. Windows Updates. I lost about 20 production servers due to this, a complete disaster.

    Todd - I opened a support ticket #1035676 on 11th July with this issue so I'm surprised you say you aren't aware of this. I have had CacheCade up and running with no issues prior to this firmware upgrade, so it's true that it's generally very stable, but I'd strongly advise anyone running a CacheCade write cache not to upgrade to the latest firmware.

    I'd agree with Todd that it's unlikely to be a driver issue, as all the caching is handled by the RAID card and is invisible to the host system. However, it seems odd that if you Google "23.16.0-0018 corruption", this thread is one of only two hits you get, so I do wonder if there is something about the combination of this firmware and DSS that causes an issue. One would think that if the problem were more general, LSI would be aware of it and would have pulled the firmware by now.

    I'm still testing, but the problem does seem to go away if I revert to an older LSI firmware.

    Tim

  5. #5

    Henri,

    Your case is being investigated by the QA team. We contacted LSI here in the USA and they do not know of any issues; they also want us or you to reproduce this and let them know. We have a huge customer base using these controllers. I am not defending LSI, just saying that we can only inform them once we can prove it is the LSI firmware and reproduce the issue. Also, did you submit a support ticket with LSI about this? Again, we get the drivers from them, and until it is reproducible I can't say for sure that it is the driver or the firmware.
    All the best,

    Todd Maxwell



  6. #6

    Hi Todd,

    I can reproduce this. Roll back to the 23.12.0.0013 firmware and all is ok. If I move to the 23.16.0-0018 firmware the corruption starts happening again.
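
    For anyone who wants to compare firmware versions on their own hardware, a write-then-verify check along the following lines could help. This is an illustrative sketch, not the test actually used in this thread: the function names, the block size, and the idea of running it against a file on the affected volume are all assumptions.

```python
import hashlib
import os

BLOCK_SIZE = 1024 * 1024  # 1 MiB per block (arbitrary choice for the sketch)


def write_blocks(path, count, block_size=BLOCK_SIZE):
    """Write `count` random blocks to `path`; return each block's SHA-256."""
    digests = []
    with open(path, "wb") as f:
        for _ in range(count):
            data = os.urandom(block_size)
            digests.append(hashlib.sha256(data).hexdigest())
            f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out to the device
    return digests


def verify_blocks(path, digests, block_size=BLOCK_SIZE):
    """Re-read the file; return the indexes of blocks whose digest changed."""
    bad = []
    with open(path, "rb") as f:
        for index, expected in enumerate(digests):
            data = f.read(block_size)
            if hashlib.sha256(data).hexdigest() != expected:
                bad.append(index)
    return bad
```

    A typical run would call `write_blocks` on a file stored on the volume under test, keep the returned digests, and call `verify_blocks` later; an empty list means every block read back unchanged. Note that a read done straight after the write may be served from the host's own page cache rather than from the RAID controller, so the verify step is more meaningful after a reboot or from a different initiator.
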

    If there is anything I can do to help your investigations let me know ASAP, I'd be glad to help.

    Tim

  7. #7

    Tim,

    I am sending this link to the LSI guys. Did you show LSI your test results demonstrating this issue? It seems our guys, after testing, believe they have reproduced the same issue.
    All the best,

    Todd Maxwell



  8. #8

    Thanks for the update Todd.

    My conversation with LSI ended when they told me I should replace my SSDs with ones on their approved hardware list. Frankly, I've not found their first-line support very helpful. If their first-line support isn't listening to customers' problems and is instead looking for excuses not to help, then it doesn't surprise me that it takes them weeks to recognise a problem with their firmware. Hopefully you'll be able to escalate this to someone who will take notice.

    Glad you guys have been able to reproduce the issue, I was starting to think I was alone in having this problem.

    Would appreciate any updates on this.

    Thanks,

    Tim
