The Future of Data Storage Solutions | @CloudExpo #SDS #Cloud #Storage

Businesses need access to billions of files, which often means moving to a new storage system. Let's review your options.

Bridging the divide between legacy storage and new data management platforms can constrain IT organizations and budgets, and it can prevent the use of cost-effective, scalable storage infrastructures. But businesses can avoid some of these constraints by evaluating their storage options objectively and asking themselves three important questions.

A decade ago, we were putting 250-gigabyte drives into servers. When people mentioned the cloud, they were talking about the weather, and a business was considered to be on the cutting edge if it needed to store a few million files.

Now, we have access to 10-terabyte drives, and grandparents are using the cloud to store pictures of their grandkids. It's now common for businesses to need access to billions of files, so companies need to move to newer systems to keep track of everything. With so many options available today, what's really the best solution for storage?

Common Data Storage Systems
To truly understand legacy storage systems, you need to know how storage has evolved beyond the hard drive. Here is a quick rundown of the most common solutions that have emerged over the years:

  • Storage Area Network (SAN): A SAN is a dedicated network that connects storage devices with servers, typically using Fibre Channel, InfiniBand, or Ethernet. SANs are commonly used for database servers and other applications that require a low-latency block-level storage interface. Advanced setups allow for clustering and failover capabilities among the servers.

    The downsides of SANs are that they often require exotic network hardware, proprietary software tools, and specialized staff to deploy and manage them. For these reasons, membership in the storage area network is normally limited to a small number of servers.

  • Network Attached Storage (NAS): The storage device in a NAS setup can be a purpose-built NAS appliance or a general-purpose server running Windows or Linux that delivers files to clients. While there have historically been many protocols that connect storage devices and clients, the market has settled on two: Network File System (NFS) and Server Message Block (SMB). A NAS appliance is an all-in-one bundle of integrated hardware and software built for the sole purpose of delivering files to clients. Almost any general-purpose server can also deliver files and act as a NAS with the appropriate administrative configuration.

    Unfortunately, there are inherent disadvantages of NAS systems. With limited potential to scale, they can quickly become costly, complex, and labor-intensive to manage.

  • Software-Defined Storage (SDS): SDS is still an evolving concept that can include file-based, object-based, block, cloud and storage management solutions. Software-defined storage essentially separates the data and services layers from the underlying hardware. Software-defined storage solutions typically involve storage virtualization, and they may provide features like search, organization, replication, distribution, thin provisioning, snapshots and backup to name a few.

  • Cloud-Based Storage (Public and Private): As a multi-tenant environment, a public cloud storage system requires you to purchase a portion of a cloud-based computing environment that is shared with many other tenants. Public cloud storage is offered in an on-demand arrangement with monthly payments that can be advantageous; however, capacity and access charges recur every month and won't go down until data is deleted. Because you pay for the bandwidth to use your data, you may resist running analytics and other operations that would incur additional monthly charges.

    Private cloud storage solutions let you deploy storage as a service within your data center. You need to make an upfront capital investment in hardware and have the data center space and electrical power to run the service. If security is a priority, you are storing large amounts of data for long periods of time, or you are performing a lot of reads (such as analytics) on your data, a private cloud is almost always the best option. There are also hybrid solutions that provide a combination of private and public services.

  • Object-Based Storage: One popular type of SDS is object-based storage, which is at the heart of many public and private cloud-based storage services. In this model, there is no hierarchical folder structure; however, object-based storage does provide a method for data organization using metadata (often defined as "data about data"). In object-based storage systems, the data is organized into self-contained entities (objects). This flat approach provides for greater scalability and can be less expensive than block or file-based storage systems. For businesses with a need to store and search through high volumes of data, this is often the ideal solution.
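
The flat, metadata-driven model described for object-based storage can be sketched in a few lines of Python. This is a toy in-memory illustration, not any vendor's API: the class names `StoredObject` and `FlatObjectStore`, the methods `put`, `get`, and `search`, and the sample metadata keys are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """A self-contained entity: the data plus the metadata that describes it."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy in-memory object store (hypothetical, for illustration only)."""

    def __init__(self):
        # One flat namespace: object ID -> object. No folders, no paths.
        self._objects = {}

    def put(self, data, **metadata):
        """Store the data with its metadata and return a generated object ID."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(data, dict(metadata))
        return object_id

    def get(self, object_id):
        """Fetch an object directly by ID."""
        return self._objects[object_id]

    def search(self, **criteria):
        """Return the IDs of objects whose metadata matches every criterion."""
        return [
            object_id
            for object_id, obj in self._objects.items()
            if all(obj.metadata.get(key) == value for key, value in criteria.items())
        ]


store = FlatObjectStore()
scan_id = store.put(b"...", content_type="image/png", department="radiology")
log_id = store.put(b"...", content_type="text/plain", department="it")
print(store.search(department="radiology") == [scan_id])
```

Because nothing depends on a folder path, the store can grow without reorganizing a hierarchy. A real system would index the metadata rather than scan every object as this sketch does.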

Building a cost-effective and scalable storage infrastructure is not a task to be taken lightly. Initiatives like this have the potential to impact IT resources and inflate budgets. So how do you bridge the divide between legacy storage systems and the new data management platforms?
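
To make the budget trade-off between public and private cloud storage concrete, the sketch below compares cumulative spend over time. It assumes a deliberately simple model: a flat monthly charge per terabyte stored and per terabyte read for public cloud, and an upfront investment plus a flat monthly running cost for private cloud. Every price and volume is a hypothetical placeholder, not a quote from any provider, and the function names are illustrative.

```python
def public_cloud_cost(months, stored_tb, read_tb_per_month,
                      storage_price_per_tb, egress_price_per_tb):
    """Cumulative spend: capacity and access charges recur every month."""
    monthly = (stored_tb * storage_price_per_tb
               + read_tb_per_month * egress_price_per_tb)
    return months * monthly


def private_cloud_cost(months, upfront_capex, monthly_opex):
    """Cumulative spend: one upfront investment plus monthly running costs."""
    return upfront_capex + months * monthly_opex


def break_even_month(max_months, public_kwargs, private_kwargs):
    """First month where private spend is no higher than public, else None."""
    for month in range(1, max_months + 1):
        private = private_cloud_cost(month, **private_kwargs)
        public = public_cloud_cost(month, **public_kwargs)
        if private <= public:
            return month
    return None


# Hypothetical inputs: substitute your own figures.
public = dict(stored_tb=500, read_tb_per_month=100,
              storage_price_per_tb=20, egress_price_per_tb=50)
private = dict(upfront_capex=300_000, monthly_opex=5_000)
print(break_even_month(60, public, private))
```

The model ignores data growth, hardware refresh cycles, and staffing, so treat it as a starting point for your own spreadsheet instead of a sizing tool. Raising `read_tb_per_month` shows how read-heavy workloads such as analytics pull the break-even point earlier.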

Planning for the Future
Bridging the gap between legacy storage and newer technologies sometimes requires ensuring compatibility through protocols such as S3, RESTful HTTP, NFS, and SMB. As a result, business and IT leaders should consider a few important matters before determining the best data management platform to use for taking their businesses into the future.

  1. What type of data is being stored, and how quickly is it growing? SAN and NAS are still your best options for structured data; however, structured information often makes up less than 10 percent of an organization's total data. Unstructured data often accounts for 90 percent or more of the total capacity need. If you track your unstructured data growth rate year over year, you'll most likely notice an acceleration.

    Some of this acceleration can be accounted for by factors such as improvements in resolution for videos and images as well as new sources of unstructured data, such as log files, metrics, and data created by devices. Create a formula based on these considerations, and use it in conjunction with your historic storage capacity compound annual growth rate (CAGR) to estimate your needs three to five years out. Using your forecasted capacity need, select a storage solution that can expand to accommodate your expected growth.

  2. What are your access patterns? When you think about access, consider what (device, application, etc.) and who needs access to the data and exactly how they will reach it (e.g., geographical location and interface or search mechanism). When you have billions of files, how will you find what you need? Almost as important, how will you determine what you don't need so you can confidently delete this data? When choosing your future storage platform, make sure your chosen solution supports your organization's access requirements.

  3. How long must the data be retained? Data retention periods vary by industry from a few seconds to indefinitely. When you think about retention, consider the cost of different protection methods versus the value of the data and ease of migration (e.g., how easy it is to continue to evolve the underlying hardware infrastructure). If you factor ease of migration into your decisions today, you will make your life simpler when you one day find yourself needing to migrate petabytes or possibly exabytes of data.

    Beyond how long you are required to retain data, consider how long that data may be valuable to you from both an information and a monetary perspective.
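
The capacity forecast suggested under the first question can be sketched as follows. The CAGR formula is the standard one; the flat `uplift` added to the growth rate is only one assumed way of allowing for new data sources, and the capacity figures are hypothetical.

```python
def cagr(start_capacity, end_capacity, years):
    """Compound annual growth rate between two capacity measurements."""
    return (end_capacity / start_capacity) ** (1 / years) - 1


def forecast(current_capacity, annual_growth, years_out, uplift=0.0):
    """Project capacity forward, compounding growth once per year.

    `uplift` is an assumed extra annual growth (e.g. 0.05 for 5 points)
    to allow for acceleration from new sources of unstructured data.
    """
    return current_capacity * (1 + annual_growth + uplift) ** years_out


# Hypothetical history: 200 TB three years ago, 450 TB today.
historic_rate = cagr(200, 450, 3)
for years_out in (3, 5):
    needed = forecast(450, historic_rate, years_out, uplift=0.05)
    print(f"{years_out} years out: about {needed:.0f} TB")
```

Running the forecast with and without the uplift gives a range, not a single number, which is usually more useful when comparing how far each candidate platform can expand.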

Keeping content accessible and instantly searchable, and preserving the relationships within your data, increases profit and agility, something every forward-thinking business leader understands. If you can keep your data online, organize it, and search it, you can continue to extract value from it.

In the information age, those who can leverage long-tail data will not only succeed, but they will also reap benefits orders of magnitude greater than those constrained by the limits of traditional technologies.

More Stories By Jonathan Ring

Jonathan Ring is co-founder and CEO of Caringo, a leading scale-out storage provider. Prior to Caringo, Jonathan was an active angel investor advising a broad range of companies, and he was a vice president of engineering at Siebel Systems, where he was a member of the executive team that grew Siebel from $4 million to $2 billion in sales. Jonathan’s passion and experience are shaping the future of Caringo.
