Welcome!

Blog Feed Post

AWS S3 Storage Gateway Revisited (Part I)

server storage I/O trends

AWS S3 Storage Gateway Revisited (how to avoid an install error)

This Amazon Web Service (AWS) Storage Gateway Revisited posts is a follow-up to the AWS Storage Gateway test drive and review I did a few years ago (thus why it's called revisited). As part of a two-part series, the first post looks at what AWS Storage Gateway is, how it has improved since my last review of AWS Storage Gateway along with deployment options. The second post in the series looks at a sample test drive deployment and use.

If you need an AWS primer and overview of various services such as Elastic Cloud Compute (EC2), Elastic Block Storage (EBS), Elastic File Service (EFS), Simple Storage Service (S3), Availability Zones (AZ), Regions and other items check this multi-part series (Cloud conversations: AWS EBS, Glacier and S3 overview (Part I) ).

AWS

As a quick refresher, S3 is the AWS bulk, high-capacity unstructured and object storage service along with its companion deep cold (e.g. inactive) Glacier. There are various S3 storage service classes including standard, reduced redundancy storage (RRS) along with infrequent access (IA) that have different availability durability, performance, service level and cost attributes.

Note that S3 IA is not Glacier as your data always remains on-line accessible while Glacier data can be off-line. AWS S3 can be accessed via its API, as well as via HTTP rest calls, AWS tools along with those from third-party's. Third party tools include NAS file access such as S3FS for Linux that I use for my Ubuntu systems to mount S3 buckets and use similar to other mount points. Other tools include Cloudberry, S3 Motion, S3 Browser as well as plug-ins available in most data protection (backup, snapshot, archive) software tools and storage systems today.

AWS S3 Storage Gateway and What's New

The Storage Gateway is the AWS tool that you can use for accessing S3 buckets and objects via your block volume, NAS file or tape based applications. The Storage Gateway is intended to give S3 bucket and object access to on-premise applications and data infrastructures functions including data protection (backup/restore, business continuance (BC), business resiliency (BR), disaster recovery (DR) and archiving), along with storage tiering to cloud.

Some of the things that have evolved with the S3 Storage Gateway include:

  • Easier, streamlined download, installation, deployment
  • Enhanced Virtual Tape Library (VTL) and Virtual Tape support
  • File serving and sharing (not to be confused with Elastic File Services (EFS))
  • Ability to define your own bucket and associated parameters
  • Bucket options including Infrequent Access (IA) or standard
  • Options for AWS EC2 hosted, or on-premise VMware as well as Hyper-V gateways (file only supports VMware and EC2)

AWS Storage Gateway Three Functions

AWS Storage Gateway can be deployed for three basic functions:

    AWS Storage Gateway File Architecture
    AWS Storage Gateway File Architecture via AWS.com

  • File Gateway (NFS NAS) - Files, folders, objects and other items are stored in AWS S3 with a local cache for low latency access to most recently used data. With this option, you can create folders and subdirectory similar to a regular file system or NAS device as well as configure various security, permissions, access control policies. Data is stored in S3 buckets that you specify policies such as standard or Infrequent Access (IA) among other options. AWS hosted via EC2 as well as VMware Virtual Machine (VM) for on-premise file gateway.

    Also, note that AWS cautions on multiple concurrent writers to S3 buckets with Storage Gateway so check the AWS FAQs which may have changed by the time you read this. Current file share limits (subject to change) include 1 file gateway share per S3 bucket (e.g. a one to one mapping between file share and a bucket). There can be 10 file shares per gateway (e.g. multiple shares each with its own bucket per gateway) and a maximum file size of 5TB (same as maximum S3 object size). Note that you might hear about object storage systems supporting unlimited size objects which some may do, however generally there are some constraints either on their API front-end, or what is currently tested. View current AWS Storage Gateway resource and specification limits here.

  • AWS Storage Non-Cached e.g. Stored Volume Gateway Architecture
    AWS Storage Gateway Non-Cached Volume Architecture via AWS.com

    AWS Storage Gateway cached volume Architecture
    AWS Storage Gateway Cached Volume Architecture via AWS.com

  • Volume Gateway (Block iSCSI) - Leverages S3 with a point in time backup as an AWS EBS snapshot. Two options exist including Cached volumes with low-latency access to most recently used data (e.g. data is stored in AWS, with a local cache copy on disk or SSD). The other option is Stored Volumes (e.g. non-cached) where primary copy is local and periodic snapshot backups are sent to AWS. AWS provides EC2 hosted, as well as VMs for VMware and various Hyper-V Windows Server based VMs.

    Current Storage Gateway volume limits (subject to change) include maximum size of a cached volume 32TB, maximum size of a stored volume 16TB. Note that snapshots of cached volumes larger than 16TB can only be restored to a storage gateway volume, they can not be restored as an EBS volume (via EC2). There are a maximum of 32 volumes for a gateway with total size of all volumes for a gateway (cached) of 1,024TB (e.g. 1PB). The total size of all volumes for a gateway (stored volume) is 512TB. View current AWS Storage Gateway resource and specification limits here.

  • AWS Storage Gateway VTL Architecture
    AWS Storage Gateway VTL Architecture via AWS.com

  • Virtual Tape Library Gateway (VTL) - Supports saving your data for backup/BC/DR/archiving into S3 and Glacier storage tiers. Being a Virtual Tape Library (e.g. VTL) you can specify emulation of tapes for compatibility with your existing backup, archiving and data protection software, management tools and processes.

    Storage Gateway limits for tape include minimum size of a virtual tape 100GB, maximum size of a virtual tape 2.5TB, maximum number of virtual tapes for a VTL is 1,500 and total size of all tapes in a VTL is 1PB. Note that the maximum number of virtual tapes in an archive is unlimited and total size of all tapes in an archive is also unlimited. View current AWS Storage Gateway resource and specification limits here.

    AWS

Where To Learn More

What This All Means

As to which gateway function and mode (cached or non-cached for Volumes) depends on what it is that you are trying to do. Likewise choosing between EC2 (cloud hosted) or on-premise Hyper-V and VMware VMs depends on what your data infrastructure support requirements are. Overall I like the progress that AWS has put into evolving the Storage Gateway, granted it might not be applicable for all usage cases. Continue reading more and view images from the AWS Storage Gateway Revisited test drive in part two located here.

Ok, nuff said (for now...).

Cheers
Gs

Greg Schulz - Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (and vSAN). Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Watch for the spring 2017 release of his new book "Software-Defined Data Infrastructure Essentials" (CRC Press).

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2017 Server StorageIO(R) and UnlimitedIO All Rights Reserved

Read the original blog entry...

More Stories By Greg Schulz

Greg Schulz is founder of the Server and StorageIO (StorageIO) Group, an IT industry analyst and consultancy firm. Greg has worked with various server operating systems along with storage and networking software tools, hardware and services. Greg has worked as a programmer, systems administrator, disaster recovery consultant, and storage and capacity planner for various IT organizations. He has worked for various vendors before joining an industry analyst firm and later forming StorageIO.

In addition to his analyst and consulting research duties, Schulz has published over a thousand articles, tips, reports and white papers and is a sought after popular speaker at events around the world. Greg is also author of the books Resilient Storage Network (Elsevier) and The Green and Virtual Data Center (CRC). His blog is at www.storageioblog.com and he can also be found on twitter @storageio.

Latest Stories
Automation is enabling enterprises to design, deploy, and manage more complex, hybrid cloud environments. Yet the people who manage these environments must be trained in and understanding these environments better than ever before. A new era of analytics and cognitive computing is adding intelligence, but also more complexity, to these cloud environments. How smart is your cloud? How smart should it be? In this power panel at 20th Cloud Expo, moderated by Conference Chair Roger Strukhoff, paneli...
In his session at @ThingsExpo, Eric Lachapelle, CEO of the Professional Evaluation and Certification Board (PECB), provided an overview of various initiatives to certify the security of connected devices and future trends in ensuring public trust of IoT. Eric Lachapelle is the Chief Executive Officer of the Professional Evaluation and Certification Board (PECB), an international certification body. His role is to help companies and individuals to achieve professional, accredited and worldwide re...
Today we can collect lots and lots of performance data. We build beautiful dashboards and even have fancy query languages to access and transform the data. Still performance data is a secret language only a couple of people understand. The more business becomes digital the more stakeholders are interested in this data including how it relates to business. Some of these people have never used a monitoring tool before. They have a question on their mind like “How is my application doing” but no id...
IoT solutions exploit operational data generated by Internet-connected smart “things” for the purpose of gaining operational insight and producing “better outcomes” (for example, create new business models, eliminate unscheduled maintenance, etc.). The explosive proliferation of IoT solutions will result in an exponential growth in the volume of IoT data, precipitating significant Information Governance issues: who owns the IoT data, what are the rights/duties of IoT solutions adopters towards t...
With the introduction of IoT and Smart Living in every aspect of our lives, one question has become relevant: What are the security implications? To answer this, first we have to look and explore the security models of the technologies that IoT is founded upon. In his session at @ThingsExpo, Nevi Kaja, a Research Engineer at Ford Motor Company, discussed some of the security challenges of the IoT infrastructure and related how these aspects impact Smart Living. The material was delivered interac...
The current age of digital transformation means that IT organizations must adapt their toolset to cover all digital experiences, beyond just the end users’. Today’s businesses can no longer focus solely on the digital interactions they manage with employees or customers; they must now contend with non-traditional factors. Whether it's the power of brand to make or break a company, the need to monitor across all locations 24/7, or the ability to proactively resolve issues, companies must adapt to...
Wooed by the promise of faster innovation, lower TCO, and greater agility, businesses of every shape and size have embraced the cloud at every layer of the IT stack – from apps to file sharing to infrastructure. The typical organization currently uses more than a dozen sanctioned cloud apps and will shift more than half of all workloads to the cloud by 2018. Such cloud investments have delivered measurable benefits. But they’ve also resulted in some unintended side-effects: complexity and risk. ...
It is ironic, but perhaps not unexpected, that many organizations who want the benefits of using an Agile approach to deliver software use a waterfall approach to adopting Agile practices: they form plans, they set milestones, and they measure progress by how many teams they have engaged. Old habits die hard, but like most waterfall software projects, most waterfall-style Agile adoption efforts fail to produce the results desired. The problem is that to get the results they want, they have to ch...
With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend 21st Cloud Expo October 31 - November 2, 2017, at the Santa Clara Convention Center, CA, and June 12-14, 2018, at the Javits Center in New York City, NY, and learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.
In 2014, Amazon announced a new form of compute called Lambda. We didn't know it at the time, but this represented a fundamental shift in what we expect from cloud computing. Now, all of the major cloud computing vendors want to take part in this disruptive technology. In his session at 20th Cloud Expo, Doug Vanderweide, an instructor at Linux Academy, discussed why major players like AWS, Microsoft Azure, IBM Bluemix, and Google Cloud Platform are all trying to sidestep VMs and containers wit...
The taxi industry never saw Uber coming. Startups are a threat to incumbents like never before, and a major enabler for startups is that they are instantly “cloud ready.” If innovation moves at the pace of IT, then your company is in trouble. Why? Because your data center will not keep up with frenetic pace AWS, Microsoft and Google are rolling out new capabilities. In his session at 20th Cloud Expo, Don Browning, VP of Cloud Architecture at Turner, posited that disruption is inevitable for comp...
While DevOps most critically and famously fosters collaboration, communication, and integration through cultural change, culture is more of an output than an input. In order to actively drive cultural evolution, organizations must make substantial organizational and process changes, and adopt new technologies, to encourage a DevOps culture. Moderated by Andi Mann, panelists discussed how to balance these three pillars of DevOps, where to focus attention (and resources), where organizations might...
No hype cycles or predictions of zillions of things here. IoT is big. You get it. You know your business and have great ideas for a business transformation strategy. What comes next? Time to make it happen. In his session at @ThingsExpo, Jay Mason, Associate Partner at M&S Consulting, presented a step-by-step plan to develop your technology implementation strategy. He discussed the evaluation of communication standards and IoT messaging protocols, data analytics considerations, edge-to-cloud tec...
When growing capacity and power in the data center, the architectural trade-offs between server scale-up vs. scale-out continue to be debated. Both approaches are valid: scale-out adds multiple, smaller servers running in a distributed computing model, while scale-up adds fewer, more powerful servers that are capable of running larger workloads. It’s worth noting that there are additional, unique advantages that scale-up architectures offer. One big advantage is large memory and compute capacity...
New competitors, disruptive technologies, and growing expectations are pushing every business to both adopt and deliver new digital services. This ‘Digital Transformation’ demands rapid delivery and continuous iteration of new competitive services via multiple channels, which in turn demands new service delivery techniques – including DevOps. In this power panel at @DevOpsSummit 20th Cloud Expo, moderated by DevOps Conference Co-Chair Andi Mann, panelists examined how DevOps helps to meet the de...