Learning from the fall of Distribute.IT
On 13 June, Australian web host Distribute.IT revealed that it had been the victim of a malicious hacking attack, one which had irrevocably annihilated the data of 4800-odd websites hosted by the company. Andrew Collins looks at what online storage customers and storage administrators can each learn from the incident.
Very few details have been made public on the disaster.
According to a statement from Distribute.IT, websites and emails stored on four of the host’s servers were destroyed. Key backups and snapshots of this data were also singled out for destruction by the interloper, such that many customers’ data simply disappeared without hope of recovery.
One commenter on the Sydney Morning Herald’s website said that Distribute.IT’s Melbourne office is now locked, sporting a sign saying “Watch for updates on Twitter”. No further information there.
Those of you who have read this month’s news pages will know that webhost Netregistry has acquired customers and assets of Distribute.IT and Click n Go (a Distribute.IT brand). On the topic of the attack, Netregistry’s website says: “… there was a targeted hacker attack on the Distribute.IT network - following which the hacker caused permanent damage to backup systems as well as core services, making them impossible to restore.”
One blogger suggested this implied a lack of off-site backups at Distribute.IT - but this was mere conjecture and should not be taken for fact.
These details may become clearer later down the line. But in the meantime, we can look at how to avoid a similar storage disaster, either as a customer of online storage or as an entity responsible for its own backup.
Lessons for storage customers
There’s one primary lesson that customers looking for a new web host or cloud-based storage provider should take from the fall of Distribute.IT: that they should take great care when vetting prospective service providers and make sure a provider has sufficient data protection measures before signing up.
Kevin McIsaac, analyst at IBRS, has a list of six questions to ask a prospective provider:
- Ask for their documentation of their backup and recovery process. If they don’t have any, this is a red flag, as recovery is about quality operations - and a provider with quality operations would have documentation and processes.
- Look at how they ensure your data can be recovered if they lose their major facility, ie, is the data stored off-site. If the backup is not stored off-site, this is another red flag.
- See how far back you can recover from. Typically you want last night’s backup or a few days back. Anything over a month is generally not of much value. If the vendor keeps less than a week, this is another red flag.
- Ask how they validate that the backup is correct and readable.
- Ask them how often they do a disaster recovery test. If they test less often than once every 12 months, this is another red flag.
- If the vendor uses disk-based backups, ask how they ensure the backups are not corrupted and if they are on a separate device from the primary.
Ian Raper, Regional Director ANZ at WAN optimisation vendor Riverbed Technology, says customers should ask about availability (including replication of data across multiple locations), performance, scalability and security.
“With regard to security, this includes encryption of data without the provider actually having access to the encryption keys and hence direct access to customers’ data,” Raper says.
And you shouldn’t just take a provider at their word - after all, some will promise you the world just to get you to sign on the dotted line. As such, McIsaac says reference checks are a must: “Ask the referees if they have ever had to recover files.”
Raper agrees that references are important, saying: “In terms of assessing risk, the acid test is to talk to a number of reference customers - ideally in similar industry segments or of similar size - using this provider. When doing such due diligence, it is important not to base the evaluation on published testimonials but to speak directly to reference customers and ask probing questions.”
Throughout your questioning of prospective vendors and their referees, there’s a long list of warning signs you should keep an eye out for. In addition to those mentioned in McIsaac’s list of tips, Raper adds: “Service unavailability, poor performance, inability or unwillingness of the service provider to allow regular testing or data restoration.”
If you spot any of these, you should ask yourself if the provider in question is really the one for you.
These tips are applicable not only to those searching for a new service provider, but also those already engaging the services of one. If you already use a web host, cloud-based storage or outsourced backup service, you should make sure their data protection measures are up to scratch.
“As in The Wizard of Oz, it’s important to pull back the curtain and fully understand the mechanics; customers need to be able to see the technology in use and compare it to industry best practices,” Raper says.
So if you don’t already know how your service provider takes care of your data, it’s time to give them a call and ask some hard questions.
And if your provider comes up wanting - simply move on. McIsaac explains: “Ask for a complete copy of your data and look to move to another vendor.”
But Raper maintains this is not as easy as it sounds.
“A customer can obviously elect to move to another service provider with a higher level of data protection but such a migration can be time-consuming, costly and disruptive. A recent alternative is for the customer to use new technology and another provider (as the data protection target) to add the required level of data protection themselves,” Raper says.
And while service providers must protect the data they hold, customers of these services also have a role to play in safeguarding their data.
“Even though you are outsourcing, you still need to take ownership for disaster recovery testing,” McIsaac says.
The analyst suggests undertaking one or two disaster recovery tests each year. And given the importance of this, he says, “This is something you should be prepared to pay for.”
There are two types of test you can request. The first is recovering a file from a specific date. This demonstrates that the provider can indeed supply a backup from a particular date.
And second, you can request to have a backup from the last few days recovered to a test system. This test demonstrates that the provider has a “point in time” copy that you can restore.
It’s also up to the customer to ensure that they’re making use of all the data protection features a service provider has to offer, as Brian Goodman-Jones, CTO at cloud computing service provider Ninefold, explains.
“The customer should also take advantage of service provider add-on services; such as recurring backups, recurring snapshots or multiple site locations, to ensure that they are able to meet their own recovery point objectives,” Goodman-Jones says.
And make sure you establish a service level agreement (SLA) with your provider that quantifies exactly what levels of data protection you can expect.
According to McIsaac, the basics that your SLA must cover are recovery time (how long it takes before your system comes back online) and recovery point objective (how much data is lost - do they recover to last night’s data, or the day before?).
A good SLA will also cover the availability of your data, according to Raper. The exact figures will depend on business needs. Some organisations are happy with an uptime of 99.9% (just under 9 hours’ downtime over the course of a year). Others, with more stringent availability requirements, would require 99.999% uptime (around 5 minutes’ downtime over a year). Less downtime will, of course, cost you.
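To see where these downtime figures come from, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name is made up for this example, and it assumes a 365-day year with downtime measured against every minute of that year, which a real SLA may define differently (for instance by excluding scheduled maintenance).

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def annual_downtime_minutes(uptime_percent: float) -> float:
    """Return the downtime per year, in minutes, that an uptime percentage allows."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - uptime_percent / 100)


if __name__ == "__main__":
    # Compare a few commonly quoted availability targets.
    for target in (99.9, 99.99, 99.999):
        minutes = annual_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes "
              f"({minutes / 60:.2f} hours) of downtime per year")
```

On these assumptions, 99.9% works out to roughly 526 minutes (about 8.8 hours) a year, and 99.999% to a little over 5 minutes.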
“The SLA should also include other metrics such as performance (eg, the ability to backup and restore 1 GB of data in a certain time period with a given amount of bandwidth), scalability (the ability to add capacity through simple licensing) and security (eg, data must be encrypted and the provider must not have access to the encryption keys or the unencrypted data),” Raper says.
Lessons for storage providers
For those responsible for backup and disaster recovery - storage administrators and storage service providers - the fundamental message to take from the Distribute.IT disaster is ridiculously simple: make frequent backups, keep them in multiple locations and test them regularly.
“Sometimes the simplest things turn out to be the easiest and best. Keeping to the basic principle of frequent backups across multiple devices (as well as regular testing of the restore process to ensure that the data you are backing up is recoverable and useable) is the place to start,” says Goodman-Jones.
This should all be “backed by documented processes and procedures (and appropriate data security)”, he says.
This sentiment is mirrored by Raper, who says: “It is as simple - and as complex - as that; the backups need to be in multiple locations, the solution needs to provide rapid backups and restores, and it needs to be economically viable.”
In terms of particular technologies, McIsaac says that “while disk-to-disk is fashionable, tape has a few benefits” to help organisations meet these principles: each backup tape is a physically separate entity, so if any of them are corrupted, the rest are unaffected; tapes can easily be taken offline, preventing them from being hacked; and tapes can easily be moved off-site, providing resilience against a site failure.
McIsaac says that organisations should use a robust backup tool that automates the backup processes, and validate that the backup is readable each time one is produced. If you’re relying on staff member Bob to manually hit the ‘backup’ button once a day, bad things may happen to your business if Bob chucks a sickie, quits or is simply absent-minded.
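As a rough illustration of what “automate and validate” can look like at the small end of the scale, here is a minimal Python sketch using only the standard library. It is not a substitute for a proper backup tool, and it is not McIsaac’s method: the function names and the choice of a gzipped tar archive with a SHA-256 checksum are assumptions made for this example.

```python
import hashlib
import tarfile
import zlib
from pathlib import Path

CHUNK = 1024 * 1024  # read files in 1 MB pieces to keep memory use flat


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_backup(source_dir: Path, archive_path: Path) -> str:
    """Archive source_dir as a .tar.gz and return the archive's checksum."""
    with tarfile.open(str(archive_path), "w:gz") as archive:
        archive.add(str(source_dir), arcname=source_dir.name)
    return sha256_of(archive_path)


def verify_backup(archive_path: Path, expected_sha256: str) -> bool:
    """Check the archive against its recorded checksum, then read every file in it."""
    if sha256_of(archive_path) != expected_sha256:
        return False
    try:
        with tarfile.open(str(archive_path), "r:gz") as archive:
            for member in archive:
                if member.isfile():
                    extracted = archive.extractfile(member)
                    while extracted.read(CHUNK):
                        pass  # reading to the end proves the data is readable
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        return False
    return True
```

A scheduler such as cron could call create_backup each night, store the returned checksum somewhere separate from the archive, and run verify_backup straight afterwards, so nobody has to remember to press a button. Note that this only proves the archive is intact and readable; it does not replace an actual restore test.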
And it doesn’t matter if you’re a one-man operation with a one terabyte consumer-grade hard drive or a multinational conglomerate with trillions of petabytes of data - Raper says these principles “absolutely” apply to all organisations, regardless of size.
McIsaac has a similar story, saying the only difference is “the amount of money you would spend on the infrastructure and the time spent testing/validating changes”.
And a risk register for all
There’s one snippet of advice that applies to storage users and storage administrators alike: create a ‘risk register’ that identifies the specific risks and consequences of losing your data.
“For each risk identified, you have the choice of either putting actions in place to mitigate the risk or accepting the risk. The choice to mitigate or accept a risk usually comes down to some form of cost-risk analysis. At this point, you now know your risks, mitigation strategies and exposures (if any),” Goodman-Jones says.
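One simple way to picture such a register is as a list of risks, each with a rough likelihood, a cost if it happens and a cost to mitigate. The Python sketch below is a hypothetical illustration, not Goodman-Jones’s method: the field names and dollar figures are invented, and it makes the simplifying assumption that mitigation removes the risk entirely, so the decision reduces to comparing the mitigation cost with the expected annual loss.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    annual_likelihood: float  # estimated chance of it happening in a year (0 to 1)
    impact_cost: float        # estimated cost to the business if it happens
    mitigation_cost: float    # estimated yearly cost of preventing it

    @property
    def expected_annual_loss(self) -> float:
        """Likelihood multiplied by impact: a crude yearly exposure figure."""
        return self.annual_likelihood * self.impact_cost

    def decision(self) -> str:
        """Mitigate when prevention costs less than the expected loss, else accept."""
        if self.mitigation_cost < self.expected_annual_loss:
            return "mitigate"
        return "accept"


# Illustrative entries only; every number here is made up.
register = [
    Risk("Primary site lost with no off-site backup", 0.02, 500_000, 4_000),
    Risk("Single archived file found unreadable", 0.10, 1_000, 2_500),
]

if __name__ == "__main__":
    for risk in register:
        print(f"{risk.name}: expected loss ${risk.expected_annual_loss:,.0f} a year, "
              f"mitigation ${risk.mitigation_cost:,.0f} a year -> {risk.decision()}")
```

In practice the estimates will be far rougher than the code suggests, and some risks (a business-ending data loss, say) may be worth mitigating whatever the arithmetic says. The point is simply that each risk ends up with an explicit decision recorded against it.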
He also has some sage advice for everyone concerned with data protection.
“The key thing to remember is to keep things as simple as possible. The more complexity, the more chance for something to go wrong in increasingly unpredictable ways. Everything boils down to keeping multiple copies independent of each other. Everything else is just the detail to make that happen. Try to achieve that goal in the simplest and most foolproof way and you should have fewer unpleasant surprises,” Goodman-Jones says.