Disk problems
By Ryan on Friday 16 May 2008, 12:05 - Hosting - Permalink
Despite extremely low odds considering the type of hardware used, we lost 2 disks on a filer in under 3 minutes yesterday. While we do our best to warn you of any anticipated problems, we again stress the importance (and your contractual obligation) of maintaining an up-to-date backup of your server at an external location, in case just such an unlikely incident occurs.
At 7:50 PM (GMT), we lost a third disk and all of its data.
Our teams spent the night trying to recover the RAID 6 volume of filer 13, but this was eventually deemed impossible.
To be totally transparent, we are not going to hide behind the fact that we are still in the Beta testing phase, or that our hosting contract requires you to maintain a backup of your data on an external machine; the loss of your data is totally unacceptable. Even if what happened had almost no chance of occurring, it did, and those of you who were affected by this will be given a full refund.
Let us now go into detail about the changes we are making in the platform's disk architecture.
Right from the start, we have had rather frequent disk-related problems:
- sporadic disk loss (which went unnoticed because of RAID 6 - until yesterday),
- temporary freezing that occurred during disk access,
- the fact that a filer failure might lead to the loss of data for our customers.
We have therefore decided to extend the Beta testing phase until we have addressed all these issues. Our plan, even if it leads to additional costs, is to change the RAID structure so that your data is constantly replicated on 2 different filers. Work is advancing quite nicely towards this goal, and we hope to have good news for you soon.
In the meantime, what has occurred once can occur again, and we strongly advise you to keep up-to-date backups of your data on a local disk at your own location. For those of you who are uncomfortable with Linux, we are in the process of writing a tutorial that will help you do this.
You can follow the discussion and comment on this at the Gandi Bar.

