Is direct-attached storage dead?
Several years ago, there was a concerted move from direct-attached storage (DAS) towards storage area networks (SAN) and network-attached storage (NAS). But emerging storage technologies could mean the resurgence of DAS within the next 10 years.
Two of the main drivers of the move from DAS to SAN and NAS were availability and virtualisation.
With traditional DAS, systems managers who needed to increase the amount of storage on, say, an Exchange server might shut down the server, install an additional drive, copy files to the new drive as necessary and restart Exchange - a relatively time-consuming process. (Yes, there are variations on the theme, but that is just an illustration.)
But with a storage array on a SAN, it became possible to ‘dial up’ additional storage for that server with little or no downtime. When a new physical drive is required, it can usually be installed without shutting down the array.
Another downside of DAS is that in the event of a failure the data is not available to other servers. Restoring service therefore involves fixing the server, transplanting its drives into another server or restoring the most recent backup plus the subsequent transactions to a replacement server. Again, it’s not a quick process. More sophisticated organisations might continuously replicate the data to a second server (possibly at a different location) to support rapid failover, but that is relatively expensive and may lack flexibility.
One of the big attractions of virtualisation is that it provides a way to move workloads between physical servers, whether that is for availability or for operational convenience (eg, if the utilisation of a physical server falls below a certain level, its virtual machines could be moved automatically so that server can be shut down).
That only works if the data is accessible by other servers. If the only non-backup copy is on disks that are physically attached to one server, there is no point moving the virtual machine to another server.
On the other hand, the input/output requirements of some demanding applications could only be met by DAS, so it never went away.
What (currently) needs DAS?
Kevin McIsaac, IT industry analyst at IBRS, said a requirement for very high IOPS (input/output operations per second) at a modest price is most easily met by DAS, especially by using flash storage (for speed) alongside spinning disks (for economy). That combination of requirements is particularly relevant to VDI (virtual desktop infrastructure) and Hadoop (big data analysis), he said.
Otherwise, the move to SAN was driven largely by the heterogeneity of hardware and workloads, and the ability to create consolidated pools of storage independent of the hardware and operating systems in use.
“DAS isn’t going away at all,” said John Marin, principal technologist at NetApp. “It’s seeing a resurgence for a variety of reasons.”
Historically, the model was one application per server and DAS. The marginal cost of storage was low and the administrators of those servers had control over their own storage. That changed with the advent of widespread virtualisation, which required shared storage and triggered big growth in the SAN and NAS markets - from which NetApp benefited.
While companies such as Facebook and Salesforce.com do a lot with generic hardware including DAS by building resilience into their software, this level of data management is not a core competence of most organisations so a rich, centralised data management system is compelling. Technologies such as those in NetApp’s Data Ontap Edge combine the benefits of DAS with those of a centralised system including backup and replication, Marin said.
Marin suggested DAS is appropriate for small sites with up to three servers and also for large installations where 20 or more servers are running the same application, but shared storage makes more sense in the middle ground, which accounts for most of the market.
David Holmes, programme manager for enterprise storage at Dell Asia Pacific, identified specific workloads where DAS fits the bill: high-performance computing (a large number of DAS spindles provide the required speed with low cost and complexity), non-virtualised Exchange servers and anything that needs ‘cheap and deep’ storage. The main benefit is that DAS is more cost-effective in these situations.
Adrian De Luca, chief technology officer for Asia Pacific at Hitachi Data Systems (HDS), observed “trends seem to skip a couple of decades before they come back again”, suggesting that is what is happening with DAS.
James Forbes-May, vice president for data management at CA Technologies, had a similar perspective: “Things go in cycles … there are different priorities over time.”
Early tests and trials with new-style DAS architectures are just beginning, De Luca said; for example, in small-scale VDI rollouts (as in hundreds rather than thousands of seats) where PCIe flash storage can deliver a significant benefit. Such scale-out architectures (where compute and storage are in the same box) are also suited to big data projects, he said.
Giving DAS the features needed to enable rapid recovery in the event of a disaster requires a lot of software capabilities and integration, said De Luca. “I don’t think any vendor has quite answered that question.”
HDS uses Fusion-io flash cards in some of its Unified Compute Platform products, with the addition of the ability to write data through to a SAN for protection. But De Luca said it is important to consider whether the application will tolerate the greater latency that occurs in the event of a failover resulting in it accessing data on remote storage. Another problem with write-through technology is that “there’s always a degree of risk because it’s an asynchronous operation”, he said.
Forbes-May was more bullish about DAS, saying that tier one servers are increasingly being used with DAS as a way of getting the required performance, with a variety of technologies such as host-based replication applied to ensure high availability. “Seconds to minutes of data loss” are acceptable to 95% of businesses, he observed, pointing out that if virtually zero downtime is required then more expensive approaches such as clustering can be employed. (Forbes-May noted that CA’s Project Oolong - a unified data protection system spanning conventional disk or tape backup through to instant failover for critical servers - is currently in ‘customer validation’ with a release version expected sometime in the next 12 months.)
Geoff Hughes, data management practice lead at Bridge Point Communications, said there has been a recent resurgence in the use of DAS. This started with Exchange 2010 (proven to be a huge success story in terms of failover between DAS-equipped servers, to the extent that it’s now rare to see Exchange deployments using SAN, he said), and has been further boosted by the emergence of software-defined storage products such as VMware’s Virtual SAN (VSAN) and the Nutanix Virtual Computing Platform (which combines processing and storage in a modular appliance).
Aaron Steppat, product marketing manager at VMware, said one of the big drawbacks of DAS was that it could only be managed from the server it was connected to. VSAN changes that by providing a single point of management for DAS across multiple servers. It offers highly redundant distributed storage - copies of the data are held on at least three nodes - while holding the data as close as possible to the compute resource that uses it. VSAN also accommodates tiered storage so solid-state devices can be used for cost-effective performance.
According to Hughes, the focus on making SAN storage more efficient through the use of technologies such as deduplication has largely stemmed from the cost of storage arrays, but inexpensive flash and SATA drives that can be used without complex controllers or proprietary software make DAS a cheaper proposition.
Workloads identified by Steppat as suitable for VSAN include test and development (to avoid having any impact on production systems, for example), big data (eg, in conjunction with vSphere 5.5 or VMware’s virtualisation-specific version of Hadoop known as Project Serengeti), situations where cost-effective disaster recovery is required (it avoids the need for an expensive storage array at both locations) and VDI. He noted that where conventional storage arrays may struggle to cope with hundreds or thousands of users logging in simultaneously (eg, at the start of a contact centre shift), VSAN plus direct-attached flash storage can deal with such a “boot storm”.
VDI is a good example of workloads suited to new-style DAS, said Hughes. “It’s usually hard to put that sort of workload on a SAN,” he said, though he conceded Pure Storage’s all-flash approach to shared storage is “interesting”. Ten years ago, SAN controllers called for expensive technology, but now general-purpose CPUs offer the required performance at much lower cost.
But contrary to Steppat’s observation, Hughes said SAN is still the order of the day in environments that require very high availability, especially with cross-site failover: “There’s no answer for that [using DAS and TCP/IP],” he said. SAN is also the way to go where mainframe connectivity is needed, he added.
Oracle’s Engineered Systems approach looks at storage in a different way. DAS is installed in the Exadata Database Machine, and because the Oracle Database runs in that same box, it is in full control and can guarantee that no data is lost during the write process, explained Peter Thomas, senior director, Exadata and strategic solutions, Oracle Australia and New Zealand. The architecture places both the database servers and database storage in the same machine and benefits from the low latency of DAS.
In addition, the Exadata storage software filters out unwanted records, returning only the required result set to the database server, something that is not possible if the database files are held on a conventional storage array. This could make the difference between transferring 200 KB or 200 GB of data across a network, he said. (In-memory databases get around that problem, but you still need to get all the data into memory in the first place and to replicate it to non-volatile storage.)
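To make the point about filtering at the storage layer concrete, here is a minimal Python sketch. It is not Oracle's implementation or API: the `Row` type, the two functions and the sample table are all hypothetical, and "rows transferred" simply stands in for network traffic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    # Hypothetical record layout, used only for this illustration.
    order_id: int
    region: str
    amount: float


def query_without_offload(table, predicate):
    """Conventional array: every row crosses the network, then the
    database server applies the filter."""
    transferred = list(table)  # all rows are shipped to the server
    result = [row for row in transferred if predicate(row)]
    return result, len(transferred)


def query_with_offload(table, predicate):
    """Storage-side filtering: the filter runs next to the data, so only
    matching rows cross the network."""
    transferred = [row for row in table if predicate(row)]
    return transferred, len(transferred)


# Made-up data: 100,000 rows, of which one in every thousand is "APAC".
table = [Row(i, "APAC" if i % 1000 == 0 else "EMEA", float(i))
         for i in range(100_000)]


def is_apac(row):
    return row.region == "APAC"


plain_result, plain_sent = query_without_offload(table, is_apac)
offload_result, offload_sent = query_with_offload(table, is_apac)
```

Both calls return the same result set. What differs is how many rows have to travel to the database server, which is the gap Thomas described between moving kilobytes and moving gigabytes.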
Using the Oracle DataGuard feature of the Oracle database, databases stored on an Exadata can be replicated for high availability to another Exadata or to an Oracle SuperCluster, or even conventional hardware, he said.
“We’re using DAS, but in a different way,” said Sam Voukenas, storage product director, Oracle Australia and New Zealand.
Oracle started moving in this direction in the mid-2000s, realising that while CPUs, memory and internal buses were all getting faster, disks were not keeping pace and therefore storage was becoming the bottleneck in large transaction processing systems, said Thomas.
That said, “It’s all about the workload,” he observed. IT needs to provide users with what they need, so if shared storage provides the required price/performance ratio, it makes sense to use it. Voukenas pointed to Oracle’s ZS3 storage appliances that give world-record benchmark results for shared storage.
But if an application calls for DAS, that’s fine too, said Voukenas, adding that there’s nothing wrong with using a combination of direct-attached and shared storage within an organisation - in fact, using both can be beneficial.
McIsaac predicted that by the end of the decade “people will move back to captive storage in the servers”, driven by developments in software-defined storage.
In such highly virtualised environments, a conventional SAN has less value according to McIsaac. Instead, DAS in each server is managed by hypervisor-level software so the data is available from pools of storage available across the installation. “It does for storage what VMware does for compute and memory,” he said. This approach is especially relevant to SMEs, but also makes sense for enterprise-scale organisations.
Marin agreed, tipping “a massive resurgence in DAS” due to technologies such as VSAN and the Storage Spaces feature of recent versions of Windows.
VSAN and similar approaches have several benefits, McIsaac said. For example, they make it simple to use commodity hardware (vs specialised storage arrays) and applications get the maximum performance from the storage devices (especially high-performance yet relatively low-cost PCIe flash storage). To get that peak performance, policies can be applied to ensure that certain workloads always run on a server that holds a copy of the data they use - this is diametrically opposed to the SAN model, where the data is never resident on the server that’s doing the processing (unless some sort of caching is built into the overall system). This is essentially similar to the Hadoop model of moving processing to wherever the data resides, but for a wider variety of workloads, he said.
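The placement policy McIsaac described can be pictured as a simple rule: prefer a server that already holds a copy of the workload's data, and fall back to any server with spare capacity only if none is available. The Python sketch below is an illustration under assumed names. `place_workload`, the server names and the slot counts are hypothetical and do not reflect how VSAN or any other product actually schedules workloads.

```python
def place_workload(workload, replicas, free_slots):
    """Pick a server for `workload`.

    replicas:   workload name -> list of servers holding a copy of its data
    free_slots: server name -> number of free workload slots

    Returns (server, is_local), where is_local is True when the chosen
    server holds a copy of the data.
    """
    # First choice: servers that hold the data and still have capacity.
    local = [s for s in replicas.get(workload, []) if free_slots.get(s, 0) > 0]
    if local:
        return max(local, key=lambda s: free_slots[s]), True

    # Fallback: any server with capacity; data is then read over the network.
    candidates = [s for s, slots in free_slots.items() if slots > 0]
    if not candidates:
        raise RuntimeError("no server has free capacity")
    return max(candidates, key=lambda s: free_slots[s]), False


# Made-up cluster: each workload's data is held on three servers.
replicas = {
    "vdi-pool": ["srv1", "srv2", "srv3"],
    "reports": ["srv1", "srv4", "srv5"],
}
free_slots = {"srv1": 0, "srv2": 2, "srv3": 1, "srv4": 0, "srv5": 0, "srv6": 4}

vdi_choice = place_workload("vdi-pool", replicas, free_slots)
reports_choice = place_workload("reports", replicas, free_slots)
```

Here `vdi-pool` lands on `srv2`, which holds its data and has the most free slots. `reports` has no data-holding server with capacity, so it falls back to `srv6` and would read its data remotely, much as every workload does in the SAN model.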
Software-defined storage combines the advantages of DAS with those of SANs, said Holmes, and is particularly useful for storing very large amounts of unstructured data where performance and availability are important. It is also cheaper than SAN where large amounts of data must be retained for many years for compliance reasons, he said.
Marin went further, offering a personal opinion that in the 2017-2020 time frame, all active data will be held in solid-state storage connected directly to servers. He said the future of enterprise computing in this regard can be seen in the current high-performance computing and consumer IT markets.
But for mid-tier companies - those running, say, 50 to 100 production virtual machines - McIsaac said the real question is “why shouldn’t I do this in the cloud?” Organisations of this size worrying about DAS or SAN are “probably focused on the wrong problem” as cloud can be very cost-effective and also eliminates the need for decisions about hardware configuration. “Four years from now … it’s all going to be in the cloud,” he predicted.