Welcome!

Related Topics: @DevOpsSummit, Linux Containers, Containers Expo Blog

@DevOpsSummit: Article

Ops, APIs and Compression | @DevOpsSummit #API #APM #DevOps

I’ve been reading up on APIs. In particular I really enjoyed reading Best Practices for Designing a Pragmatic RESTful API

I've been reading up on APIs cause, coolness. And in particular I really enjoyed reading Best Practices for Designing a Pragmatic RESTful API because it had a lot of really good information and advice.

And then I got to the part about compressing your APIs.

Before we go too far let me first say I'm not saying you shouldn't compress your API or app responses. You probably should. What I am saying is that where you compress data and when are important considerations.

That's because generally speaking no one has put their web server (which is ultimately what tends to serve up responses, whether they're APIs or objects, XML or JSON) at the edge of the Internet. You know, where it's completely vulnerable. It's usually several devices back in the networking gauntlet that has be run before data gets from the edge of your network to the server.

wrong architecture

This is because there are myriad bad actors out salivating at the prospect of a return to an early aughts data center architecture in which firewalls, DDoS protection, and other app security services were not physically and logically located upstream from the apps they protect today.

Cause if you don't have to navigate the network, it's way easier to launch an attack on an app.

Today, we employ an average of 11 different services in the network, upstream from the app, to provide security, scale, and performance-enhancing services. Like compression.

better architecture

Now, you can enable compression on the web server. It's a standard thing in HTTP and it's little more than a bit to flip in the configuration. Easy peasy performance-enhancing change, right?

Except that today that's not always true.

The primary reason compression improves performance is because when it reduces the size of data it reduces the number of packets that must be transmitted. That reduces the potential for congestion that causes a Catch-22 where TCP retransmits increase congestion that increases packet loss that increases... well, you get the picture. This is particularly true when mobile clients are connecting via cellular networks, because latency is a real issue for them and the more round trips it takes, the worse the application experience.

Suffice to say that the primary reason compression improves performance is that it reduces the amount of data needing to be transmitted which means "faster" delivery to the client. Fewer packets = less time = happier users.

That's a good thing. Except when compression gets in the way or doesn't provide any real reduction that would improve performance.

What? How can that be, you ask.

Remember that we're looking for compression to reduce the number of packets transmitted, especially when it has to traverse a higher latency, lower capacity link between the data center and the client.

It turns out that sometimes compression doesn't really help with that.

Consider the aforementioned article and its section on compressing. The author ran some tests, and concluded that compression of text-based data produces some really awesome results:

Let's look at this with a real world example. I've pulled some data from GitHub's API, which uses pretty print by default. I'll also be doing some gzip comparisons:

$ curl https://api.github.com/users/veesahni > with-whitespace.txt $ ruby -r json -e 'puts JSON JSON.parse(STDIN.read)' < with-whitespace.txt > without-whitespace.txt
$ gzip -c with-whitespace.txt > with-whitespace.txt.gz
$ gzip -c without-whitespace.txt > without-whitespace.txt.gz

The output files have the following sizes:

  • without-whitespace.txt - 1252 bytes
  • with-whitespace.txt - 1369 bytes
  • without-whitespace.txt.gz - 496 bytes
  • with-whitespace.txt.gz - 509 bytes

In this example, the whitespace increased the output size by 8.5% when gzip is not in play and 2.6% when gzip is in play. On the other hand, the act of gzipping in itself provided over 60% in bandwidth savings. Since the cost of pretty printing is relatively small, it's best to pretty print by default and ensure gzip compression is supported!

To further hammer in this point, Twitter found that there was an 80% savings (in some cases)when enabling gzip compression on their Streaming API. Stack Exchange went as far as to never return a response that's not compressed!

Wow! I mean, from a purely mathematical perspective, that's some awesome results. And the author is correct in saying it will provide bandwidth savings.

What those results won't necessarily do is improve performance because the original size of the file was already less than the MSS for a single packet. Which means compressed or not, that data takes exactly one packet to transmit. That's it. I won't bore you with the mathematics, but the speed of light and networking says one packet takes the same amount of time to transit whether it's got 496 bytes of payload or 1396 bytes of payload. The typical MSS for Ethernet packets is 1460 bytes, which means compressing something smaller than that effectively nets you nothing in terms of performance. It's like a plane. It takes as long to fly from point A to point B whether there are 14 passengers or 140. Fuel efficiency (bandwidth) is impacted, but that doesn't really change performance, just the cost.

Furthermore, compressing the payload at the web server means that web app security services upstream have to decompress if they want to do their job, which is to say scan responses for sensitive or excessive data indicative of a breach of security policies. This is a big deal, kids. 42% of respondents in our annual security strategy to prevent data leaks. Which means they have to spend extra time to decompress the data to evaluate it and then recompress it, or perhaps they can't inspect it at all.

State of Application Delivery survey always scan responses as part of their overall attack-surfaces-soad-2016

Now, that said, bandwidth savings are a good thing. It's part of any comprehensive scaling strategy to consider the impact of increasing use of an app on bandwidth. And a clogged up network can impact performance negatively so compression is a good idea. But not necessarily at the web server. This is akin to carefully considering where you enforce SSL/TLS security measures, as there are similar impacts on security services upstream from the app / web server.

That's why the right place for compression and SSL/TLS is generally upstream, in the network, after security has checked out the response and it's actually ready to be delivered to the client. That's usually the load balancing service or the ADC, where compression can not only be applied most efficiently and without interfering with security services and offsetting the potential gains by forcing extra processing upstream.

As with rate limiting APIs, it's not always a matter of whether or not you should, it's a matter of where you should.

Architecture, not algorithms, are the key to scale and performance of modern applications.

Read the original blog entry...

More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


Latest Stories
Coca-Cola’s Google powered digital signage system lays the groundwork for a more valuable connection between Coke and its customers. Digital signs pair software with high-resolution displays so that a message can be changed instantly based on what the operator wants to communicate or sell. In their Day 3 Keynote at 21st Cloud Expo, Greg Chambers, Global Group Director, Digital Innovation, Coca-Cola, and Vidya Nagarajan, a Senior Product Manager at Google, will discuss how from store operations...
Though cloud is the future of enterprise computing, a smooth transition of legacy applications and systems is critical for seamless business operations. IT professionals are eager to start leveraging the cost, scale and other benefits of cloud, but with massive investments already in place in existing infrastructure and a number of compliance and resource hurdles, it can be challenging to move to a cloud-based infrastructure.
In his general session at 21st Cloud Expo, Greg Dumas, Calligo’s Vice President and G.M. of US operations, will go over the new Global Data Protection Regulation and how Calligo can help business stay compliant in digitally globalized world. Greg Dumas is Calligo's Vice President and G.M. of US operations. Calligo is an established service provider that provides an innovative platform for trusted cloud solutions. Calligo’s customers are typically most concerned about GDPR compliance, applicatio...
Widespread fragmentation is stalling the growth of the IIoT and making it difficult for partners to work together. The number of software platforms, apps, hardware and connectivity standards is creating paralysis among businesses that are afraid of being locked into a solution. EdgeX Foundry is unifying the community around a common IoT edge framework and an ecosystem of interoperable components.
We all know that end users experience the Internet primarily with mobile devices. From an app development perspective, we know that successfully responding to the needs of mobile customers depends on rapid DevOps – failing fast, in short, until the right solution evolves in your customers' relationship to your business. Whether you’re decomposing an SOA monolith, or developing a new application cloud natively, it’s not a question of using microservices – not doing so will be a path to eventual b...
In a recent survey, Sumo Logic surveyed 1,500 customers who employ cloud services such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). According to the survey, a quarter of the respondents have already deployed Docker containers and nearly as many (23 percent) are employing the AWS Lambda serverless computing framework. It’s clear: serverless is here to stay. The adoption does come with some needed changes, within both application development and operations. Tha...
SYS-CON Events announced today that IBM has been named “Diamond Sponsor” of SYS-CON's 21st Cloud Expo, which will take place on October 31 through November 2nd 2017 at the Santa Clara Convention Center in Santa Clara, California.
In his Opening Keynote at 21st Cloud Expo, John Considine, General Manager of IBM Cloud Infrastructure, will lead you through the exciting evolution of the cloud. He'll look at this major disruption from the perspective of technology, business models, and what this means for enterprises of all sizes. John Considine is General Manager of Cloud Infrastructure Services at IBM. In that role he is responsible for leading IBM’s public cloud infrastructure including strategy, development, and offering ...
Infoblox delivers Actionable Network Intelligence to enterprise, government, and service provider customers around the world. They are the industry leader in DNS, DHCP, and IP address management, the category known as DDI. We empower thousands of organizations to control and secure their networks from the core-enabling them to increase efficiency and visibility, improve customer service, and meet compliance requirements.
In his session at 21st Cloud Expo, Michael Burley, a Senior Business Development Executive in IT Services at NetApp, will describe how NetApp designed a three-year program of work to migrate 25PB of a major telco's enterprise data to a new STaaS platform, and then secured a long-term contract to manage and operate the platform. This significant program blended the best of NetApp’s solutions and services capabilities to enable this telco’s successful adoption of private cloud storage and launchi...
Join IBM November 1 at 21st Cloud Expo at the Santa Clara Convention Center in Santa Clara, CA, and learn how IBM Watson can bring cognitive services and AI to intelligent, unmanned systems. Cognitive analysis impacts today’s systems with unparalleled ability that were previously available only to manned, back-end operations. Thanks to cloud processing, IBM Watson can bring cognitive services and AI to intelligent, unmanned systems. Imagine a robot vacuum that becomes your personal assistant tha...
Transforming cloud-based data into a reportable format can be a very expensive, time-intensive and complex operation. As a SaaS platform with more than 30 million global users, Cornerstone OnDemand’s challenge was to create a scalable solution that would improve the time it took customers to access their user data. Our Real-Time Data Warehouse (RTDW) process vastly reduced data time-to-availability from 24 hours to just 10 minutes. In his session at 21st Cloud Expo, Mark Goldin, Chief Technolo...
The next XaaS is CICDaaS. Why? Because CICD saves developers a huge amount of time. CD is an especially great option for projects that require multiple and frequent contributions to be integrated. But… securing CICD best practices is an emerging, essential, yet little understood practice for DevOps teams and their Cloud Service Providers. The only way to get CICD to work in a highly secure environment takes collaboration, patience and persistence. Building CICD in the cloud requires rigorous ar...
Cloud Expo, Inc. has announced today that Andi Mann and Aruna Ravichandran have been named Co-Chairs of @DevOpsSummit at Cloud Expo Silicon Valley which will take place Oct. 31-Nov. 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. "DevOps is at the intersection of technology and business-optimizing tools, organizations and processes to bring measurable improvements in productivity and profitability," said Aruna Ravichandran, vice president, DevOps product and solutions marketing...
Gemini is Yahoo’s native and search advertising platform. To ensure the quality of a complex distributed system that spans multiple products and components and across various desktop websites and mobile app and web experiences – both Yahoo owned and operated and third-party syndication (supply), with complex interaction with more than a billion users and numerous advertisers globally (demand) – it becomes imperative to automate a set of end-to-end tests 24x7 to detect bugs and regression. In th...