Blog Feed Post

AWS S3 Storage Gateway Revisited (Part I)

server storage I/O trends

AWS S3 Storage Gateway Revisited (how to avoid an install error)

This Amazon Web Service (AWS) Storage Gateway Revisited posts is a follow-up to the AWS Storage Gateway test drive and review I did a few years ago (thus why it's called revisited). As part of a two-part series, the first post looks at what AWS Storage Gateway is, how it has improved since my last review of AWS Storage Gateway along with deployment options. The second post in the series looks at a sample test drive deployment and use.

If you need an AWS primer and overview of various services such as Elastic Cloud Compute (EC2), Elastic Block Storage (EBS), Elastic File Service (EFS), Simple Storage Service (S3), Availability Zones (AZ), Regions and other items check this multi-part series (Cloud conversations: AWS EBS, Glacier and S3 overview (Part I) ).


As a quick refresher, S3 is the AWS bulk, high-capacity unstructured and object storage service along with its companion deep cold (e.g. inactive) Glacier. There are various S3 storage service classes including standard, reduced redundancy storage (RRS) along with infrequent access (IA) that have different availability durability, performance, service level and cost attributes.

Note that S3 IA is not Glacier as your data always remains on-line accessible while Glacier data can be off-line. AWS S3 can be accessed via its API, as well as via HTTP rest calls, AWS tools along with those from third-party's. Third party tools include NAS file access such as S3FS for Linux that I use for my Ubuntu systems to mount S3 buckets and use similar to other mount points. Other tools include Cloudberry, S3 Motion, S3 Browser as well as plug-ins available in most data protection (backup, snapshot, archive) software tools and storage systems today.

AWS S3 Storage Gateway and What's New

The Storage Gateway is the AWS tool that you can use for accessing S3 buckets and objects via your block volume, NAS file or tape based applications. The Storage Gateway is intended to give S3 bucket and object access to on-premise applications and data infrastructures functions including data protection (backup/restore, business continuance (BC), business resiliency (BR), disaster recovery (DR) and archiving), along with storage tiering to cloud.

Some of the things that have evolved with the S3 Storage Gateway include:

  • Easier, streamlined download, installation, deployment
  • Enhanced Virtual Tape Library (VTL) and Virtual Tape support
  • File serving and sharing (not to be confused with Elastic File Services (EFS))
  • Ability to define your own bucket and associated parameters
  • Bucket options including Infrequent Access (IA) or standard
  • Options for AWS EC2 hosted, or on-premise VMware as well as Hyper-V gateways (file only supports VMware and EC2)

AWS Storage Gateway Three Functions

AWS Storage Gateway can be deployed for three basic functions:

    AWS Storage Gateway File Architecture
    AWS Storage Gateway File Architecture via AWS.com

  • File Gateway (NFS NAS) - Files, folders, objects and other items are stored in AWS S3 with a local cache for low latency access to most recently used data. With this option, you can create folders and subdirectory similar to a regular file system or NAS device as well as configure various security, permissions, access control policies. Data is stored in S3 buckets that you specify policies such as standard or Infrequent Access (IA) among other options. AWS hosted via EC2 as well as VMware Virtual Machine (VM) for on-premise file gateway.

    Also, note that AWS cautions on multiple concurrent writers to S3 buckets with Storage Gateway so check the AWS FAQs which may have changed by the time you read this. Current file share limits (subject to change) include 1 file gateway share per S3 bucket (e.g. a one to one mapping between file share and a bucket). There can be 10 file shares per gateway (e.g. multiple shares each with its own bucket per gateway) and a maximum file size of 5TB (same as maximum S3 object size). Note that you might hear about object storage systems supporting unlimited size objects which some may do, however generally there are some constraints either on their API front-end, or what is currently tested. View current AWS Storage Gateway resource and specification limits here.

  • AWS Storage Non-Cached e.g. Stored Volume Gateway Architecture
    AWS Storage Gateway Non-Cached Volume Architecture via AWS.com

    AWS Storage Gateway cached volume Architecture
    AWS Storage Gateway Cached Volume Architecture via AWS.com

  • Volume Gateway (Block iSCSI) - Leverages S3 with a point in time backup as an AWS EBS snapshot. Two options exist including Cached volumes with low-latency access to most recently used data (e.g. data is stored in AWS, with a local cache copy on disk or SSD). The other option is Stored Volumes (e.g. non-cached) where primary copy is local and periodic snapshot backups are sent to AWS. AWS provides EC2 hosted, as well as VMs for VMware and various Hyper-V Windows Server based VMs.

    Current Storage Gateway volume limits (subject to change) include maximum size of a cached volume 32TB, maximum size of a stored volume 16TB. Note that snapshots of cached volumes larger than 16TB can only be restored to a storage gateway volume, they can not be restored as an EBS volume (via EC2). There are a maximum of 32 volumes for a gateway with total size of all volumes for a gateway (cached) of 1,024TB (e.g. 1PB). The total size of all volumes for a gateway (stored volume) is 512TB. View current AWS Storage Gateway resource and specification limits here.

  • AWS Storage Gateway VTL Architecture
    AWS Storage Gateway VTL Architecture via AWS.com

  • Virtual Tape Library Gateway (VTL) - Supports saving your data for backup/BC/DR/archiving into S3 and Glacier storage tiers. Being a Virtual Tape Library (e.g. VTL) you can specify emulation of tapes for compatibility with your existing backup, archiving and data protection software, management tools and processes.

    Storage Gateway limits for tape include minimum size of a virtual tape 100GB, maximum size of a virtual tape 2.5TB, maximum number of virtual tapes for a VTL is 1,500 and total size of all tapes in a VTL is 1PB. Note that the maximum number of virtual tapes in an archive is unlimited and total size of all tapes in an archive is also unlimited. View current AWS Storage Gateway resource and specification limits here.


Where To Learn More

What This All Means

As to which gateway function and mode (cached or non-cached for Volumes) depends on what it is that you are trying to do. Likewise choosing between EC2 (cloud hosted) or on-premise Hyper-V and VMware VMs depends on what your data infrastructure support requirements are. Overall I like the progress that AWS has put into evolving the Storage Gateway, granted it might not be applicable for all usage cases. Continue reading more and view images from the AWS Storage Gateway Revisited test drive in part two located here.

Ok, nuff said (for now...).


Greg Schulz - Microsoft MVP Cloud and Data Center Management, VMware vExpert 2010-2017 (and vSAN). Author Cloud and Virtual Data Storage Networking (CRC Press), The Green and Virtual Data Center (CRC Press), Resilient Storage Networks (Elsevier) and twitter @storageio. Watch for the spring 2017 release of his new book "Software-Defined Data Infrastructure Essentials" (CRC Press).

All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2017 Server StorageIO(R) and UnlimitedIO All Rights Reserved

Read the original blog entry...

More Stories By Greg Schulz

Greg Schulz is founder of the Server and StorageIO (StorageIO) Group, an IT industry analyst and consultancy firm. Greg has worked with various server operating systems along with storage and networking software tools, hardware and services. Greg has worked as a programmer, systems administrator, disaster recovery consultant, and storage and capacity planner for various IT organizations. He has worked for various vendors before joining an industry analyst firm and later forming StorageIO.

In addition to his analyst and consulting research duties, Schulz has published over a thousand articles, tips, reports and white papers and is a sought after popular speaker at events around the world. Greg is also author of the books Resilient Storage Network (Elsevier) and The Green and Virtual Data Center (CRC). His blog is at www.storageioblog.com and he can also be found on twitter @storageio.

Latest Stories
Mobile device usage has increased exponentially during the past several years, as consumers rely on handhelds for everything from news and weather to banking and purchases. What can we expect in the next few years? The way in which we interact with our devices will fundamentally change, as businesses leverage Artificial Intelligence. We already see this taking shape as businesses leverage AI for cost savings and customer responsiveness. This trend will continue, as AI is used for more sophistica...
Nordstrom is transforming the way that they do business and the cloud is the key to enabling speed and hyper personalized customer experiences. In his session at 21st Cloud Expo, Ken Schow, VP of Engineering at Nordstrom, discussed some of the key learnings and common pitfalls of large enterprises moving to the cloud. This includes strategies around choosing a cloud provider(s), architecture, and lessons learned. In addition, he covered some of the best practices for structured team migration an...
Most technology leaders, contemporary and from the hardware era, are reshaping their businesses to do software. They hope to capture value from emerging technologies such as IoT, SDN, and AI. Ultimately, irrespective of the vertical, it is about deriving value from independent software applications participating in an ecosystem as one comprehensive solution. In his session at @ThingsExpo, Kausik Sridhar, founder and CTO of Pulzze Systems, discussed how given the magnitude of today's application ...
Recently, REAN Cloud built a digital concierge for a North Carolina hospital that had observed that most patient call button questions were repetitive. In addition, the paper-based process used to measure patient health metrics was laborious, not in real-time and sometimes error-prone. In their session at 21st Cloud Expo, Sean Finnerty, Executive Director, Practice Lead, Health Care & Life Science at REAN Cloud, and Dr. S.P.T. Krishnan, Principal Architect at REAN Cloud, discussed how they built...
The “Digital Era” is forcing us to engage with new methods to build, operate and maintain applications. This transformation also implies an evolution to more and more intelligent applications to better engage with the customers, while creating significant market differentiators. In both cases, the cloud has become a key enabler to embrace this digital revolution. So, moving to the cloud is no longer the question; the new questions are HOW and WHEN. To make this equation even more complex, most ...
In his session at 21st Cloud Expo, Raju Shreewastava, founder of Big Data Trunk, provided a fun and simple way to introduce Machine Leaning to anyone and everyone. He solved a machine learning problem and demonstrated an easy way to be able to do machine learning without even coding. Raju Shreewastava is the founder of Big Data Trunk (www.BigDataTrunk.com), a Big Data Training and consulting firm with offices in the United States. He previously led the data warehouse/business intelligence and B...
As you move to the cloud, your network should be efficient, secure, and easy to manage. An enterprise adopting a hybrid or public cloud needs systems and tools that provide: Agility: ability to deliver applications and services faster, even in complex hybrid environments Easier manageability: enable reliable connectivity with complete oversight as the data center network evolves Greater efficiency: eliminate wasted effort while reducing errors and optimize asset utilization Security: imple...
In his Opening Keynote at 21st Cloud Expo, John Considine, General Manager of IBM Cloud Infrastructure, led attendees through the exciting evolution of the cloud. He looked at this major disruption from the perspective of technology, business models, and what this means for enterprises of all sizes. John Considine is General Manager of Cloud Infrastructure Services at IBM. In that role he is responsible for leading IBM’s public cloud infrastructure including strategy, development, and offering m...
With tough new regulations coming to Europe on data privacy in May 2018, Calligo will explain why in reality the effect is global and transforms how you consider critical data. EU GDPR fundamentally rewrites the rules for cloud, Big Data and IoT. In his session at 21st Cloud Expo, Adam Ryan, Vice President and General Manager EMEA at Calligo, examined the regulations and provided insight on how it affects technology, challenges the established rules and will usher in new levels of diligence arou...
The past few years have brought a sea change in the way applications are architected, developed, and consumed—increasing both the complexity of testing and the business impact of software failures. How can software testing professionals keep pace with modern application delivery, given the trends that impact both architectures (cloud, microservices, and APIs) and processes (DevOps, agile, and continuous delivery)? This is where continuous testing comes in. D
Modern software design has fundamentally changed how we manage applications, causing many to turn to containers as the new virtual machine for resource management. As container adoption grows beyond stateless applications to stateful workloads, the need for persistent storage is foundational - something customers routinely cite as a top pain point. In his session at @DevOpsSummit at 21st Cloud Expo, Bill Borsari, Head of Systems Engineering at Datera, explored how organizations can reap the bene...
Digital transformation is about embracing digital technologies into a company's culture to better connect with its customers, automate processes, create better tools, enter new markets, etc. Such a transformation requires continuous orchestration across teams and an environment based on open collaboration and daily experiments. In his session at 21st Cloud Expo, Alex Casalboni, Technical (Cloud) Evangelist at Cloud Academy, explored and discussed the most urgent unsolved challenges to achieve f...
The dynamic nature of the cloud means that change is a constant when it comes to modern cloud-based infrastructure. Delivering modern applications to end users, therefore, is a constantly shifting challenge. Delivery automation helps IT Ops teams ensure that apps are providing an optimal end user experience over hybrid-cloud and multi-cloud environments, no matter what the current state of the infrastructure is. To employ a delivery automation strategy that reflects your business rules, making r...
The 22nd International Cloud Expo | 1st DXWorld Expo has announced that its Call for Papers is open. Cloud Expo | DXWorld Expo, to be held June 5-7, 2018, at the Javits Center in New York, NY, brings together Cloud Computing, Digital Transformation, Big Data, Internet of Things, DevOps, Machine Learning and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding busin...
In a recent survey, Sumo Logic surveyed 1,500 customers who employ cloud services such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). According to the survey, a quarter of the respondents have already deployed Docker containers and nearly as many (23 percent) are employing the AWS Lambda serverless computing framework. It’s clear: serverless is here to stay. The adoption does come with some needed changes, within both application development and operations. Tha...