Welcome!

Blog Feed Post

Caching for Faster APIs

Pop quiz: Do you know what the number one driver cited in 2016 for networking investments was? No peeking! top drivers nw investments 2016

 

If you guessed speed and performance, you guessed right. If you guessed security don’t feel bad, it came in at number two, just ahead of availability. 

Still, it’s telling that the same things that have always driven network upgrades, improvements, and architectures continue to do so. We want fast, secure, and reliable networks that deliver fast, secure, and reliable applications.

Go figure.

The problem is that a blazing fast, secure, and reliable network does not automatically translate into a fast, secure, and reliable application.

But it can provide a much needed boost. And I’m here to tell you how (and it won’t even cost you shipping and handling fees).

The thing is that there have long been web app server (from which apps and APIs are generally delivered) options for both caching and compression. The other thing is that they’re often not enabled. Caching headers are part of the HTTP specification. They’re built in, but that means they’re packaged up with each request and response. So if a developer doesn’t add them, they aren’t there.

Except when you’ve got an upstream, programmable proxy with which you can insert them. Cause when we say software-defined, we really mean software-defined. As in “Automation is cool and stuff, but interacting with requests/responses in real-time is even cooler.”

So, to get on with it, there are several mechanisms for managing caching within HTTP, two of which are: ETag and Last-Modified.

  1. ETag The HTTP header “ETag” contains a hash or checksum that can be used to compare whether or not content has changed. It’s like the MD5 signature on compressed files or RPMs.While MD5 signatures are usually associated with security, they can also be used to determine whether or not content has changed. In the case of browser caching, the browser can make a request that says “hey, only give me new content if it’s changed”. The server-side uses the ETag to determine if it has and if not, sends back an empty HTTP 304 response. The browser says “Cool” and pulls the content from its own local cache. This saves on transfer times (by reducing bandwidth and round trips if the content is large) and thus improves performance.
  2. Last-Modified.  This is really the same thing as an ETag but with timestamps, instead. Browsers ask to be served new content if it has been modified since a specific date. This, too, saves on bandwidth and transfer times, and can improve performance.

Now, these mechanisms were put into place primarily to help with web-based content. Caching images and other infrequently changing presentation components (think style-sheets, a la CSS) can have a significant impact on performance and scalability of an application. But we’re talking about APIs, and as we recall, APIs are not web pages. So how does HTTP’s caching options help with APIs?

Well, very much the same way, especially given that most APIs today are RESTful, which means they use HTTP.

If I’ve got an app (and I’ve got lots of them) that depends on an API there are still going to be a lot of content types that are similar, like images. Those images can (and should) certainly be cached when possible, especially if the app is a mobile one. Data, too, for frequently retrieved content can be cached, even if it is just a big blob of JSON. Consider the situation in which I have an app and every day the “new arrivals” are highlighted. But they’re only updated once a day, or on a well-known schedule. The first time I open the menu item to see the “new arrivals”, the app should certainly go get the new content, because it’s new. But after that, there’s virtually no reason for the app to go requesting that data. I already paid the performance price to get it, and it hasn’t changed – neither the JSON objects representing the individual items nor the thumbnails depicting them. Using HTTP caching headers and semantics, I can ask “have you changed this yet?” and the server can quickly respond “Not yet.” That saves subsequent trips back and forth to download data while I click on fourteen different pairs of shoes* off the “new arrivals” list and then come back to browse for more.

If the API developer hasn’t added the appropriate HTTP cues in the headers, however, you’re stuck grabbing and regrabbing the same content and wasting bandwidth as well as valuable client and server-side resources. An upstream programmable proxy can be used to insert them, however, and provide both a performance boost (for the client) and greater scalability (for the server).

Basically, you can insert anything you want into the request/response using a programmable proxy, but we’ll focus on just HTTP headers right now. The basic pattern is:

  1: when HTTP_REQUEST {
  2:   HTTP::header insert "ETag" "my-computed-value"
  3: }

Really, that’s all there is to it. Now, you probably want some logic in there to not override an existing header because if the developer put it in, there’s a good reason. This is where I mini-lecture you on the cultural side of DevOps and remind you that communication is as critical as code when it comes to improving the deployment and delivery of applications. And there’s certainly going to be some other stuffs that go along with it, but the general premise is that the insertion of caching-related HTTP headers is pretty simple to achieve.

For example, we could insert a Last-Modified header for any JPG image:

  1: when HTTP_RESPONSE {
  2:   if { [HTTP::header "Content-Type" ] equals "Image/jpeg" } {
  3:     HTTP::header insert "Last-Modified" "timestamp value"
  4:    }
  5:   }
  6: } 

We could do the same for CSS, or JS, as well. And we could get more complex and make decisions based on a hundred other variables and conditions. Cause, software-defined delivery kinda means you can do whatever you need to do.

Another reason a programmable proxy is an excellent option in this case is because it further allows you to extend HTTP unofficial functionality when servers do not. For example, there’s an unofficial “PURGE” method that’s used by Varnish for invalidating cache entries. Because it’s unofficial, it’s not universally supported by the web servers on which APIs are implemented. But a programmable proxy could be used to implement that functionality on behalf of the web server (cause that’s what proxies do) and relieve pressure on web servers to do so themselves. That’s important when external caches like memcached and varnish enter the picture. Because sometimes it’s not just about caching on the client, but in the infrastructure.

In any case, HTTP caching mechanisms can improve performance of APIs, particularly when they are returning infrequently changing content like images or static text. Not taking advantage of them is a lost opportunity.

 

* you shop for what you want, I’ll shop for shoes.

Read the original blog entry...

More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.

Latest Stories
Leading companies, from the Global Fortune 500 to the smallest companies, are adopting hybrid cloud as the path to business advantage. Hybrid cloud depends on cloud services and on-premises infrastructure working in unison. Successful implementations require new levels of data mobility, enabled by an automated and seamless flow across on-premises and cloud resources. In his general session at 21st Cloud Expo, Greg Tevis, an IBM Storage Software Technical Strategist and Customer Solution Architec...
Amazon started as an online bookseller 20 years ago. Since then, it has evolved into a technology juggernaut that has disrupted multiple markets and industries and touches many aspects of our lives. It is a relentless technology and business model innovator driving disruption throughout numerous ecosystems. Amazon’s AWS revenues alone are approaching $16B a year making it one of the largest IT companies in the world. With dominant offerings in Cloud, IoT, eCommerce, Big Data, AI, Digital Assista...
In his session at Cloud Expo, Alan Winters, U.S. Head of Business Development at MobiDev, presented a success story of an entrepreneur who has both suffered through and benefited from offshore development across multiple businesses: The smart choice, or how to select the right offshore development partner Warning signs, or how to minimize chances of making the wrong choice Collaboration, or how to establish the most effective work processes Budget control, or how to maximize project result...
The Founder of NostaLab and a member of the Google Health Advisory Board, John is a unique combination of strategic thinker, marketer and entrepreneur. His career was built on the "science of advertising" combining strategy, creativity and marketing for industry-leading results. Combined with his ability to communicate complicated scientific concepts in a way that consumers and scientists alike can appreciate, John is a sought-after speaker for conferences on the forefront of healthcare science,...
"We work around really protecting the confidentiality of information, and by doing so we've developed implementations of encryption through a patented process that is known as superencipherment," explained Richard Blech, CEO of Secure Channels Inc., in this SYS-CON.tv interview at 21st Cloud Expo, held Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA.
In his keynote at 19th Cloud Expo, Sheng Liang, co-founder and CEO of Rancher Labs, discussed the technological advances and new business opportunities created by the rapid adoption of containers. With the success of Amazon Web Services (AWS) and various open source technologies used to build private clouds, cloud computing has become an essential component of IT strategy. However, users continue to face challenges in implementing clouds, as older technologies evolve and newer ones like Docker c...
Data is the fuel that drives the machine learning algorithmic engines and ultimately provides the business value. In his session at Cloud Expo, Ed Featherston, a director and senior enterprise architect at Collaborative Consulting, discussed the key considerations around quality, volume, timeliness, and pedigree that must be dealt with in order to properly fuel that engine.
In IT, we sometimes coin terms for things before we know exactly what they are and how they’ll be used. The resulting terms may capture a common set of aspirations and goals – as “cloud” did broadly for on-demand, self-service, and flexible computing. But such a term can also lump together diverse and even competing practices, technologies, and priorities to the point where important distinctions are glossed over and lost.
When shopping for a new data processing platform for IoT solutions, many development teams want to be able to test-drive options before making a choice. Yet when evaluating an IoT solution, it’s simply not feasible to do so at scale with physical devices. Building a sensor simulator is the next best choice; however, generating a realistic simulation at very high TPS with ease of configurability is a formidable challenge. When dealing with multiple application or transport protocols, you would be...
As organizations shift towards IT-as-a-service models, the need for managing and protecting data residing across physical, virtual, and now cloud environments grows with it. Commvault can ensure protection, access and E-Discovery of your data – whether in a private cloud, a Service Provider delivered public cloud, or a hybrid cloud environment – across the heterogeneous enterprise. In his general session at 18th Cloud Expo, Randy De Meno, Chief Technologist - Windows Products and Microsoft Part...
Personalization has long been the holy grail of marketing. Simply stated, communicate the most relevant offer to the right person and you will increase sales. To achieve this, you must understand the individual. Consequently, digital marketers developed many ways to gather and leverage customer information to deliver targeted experiences. In his session at @ThingsExpo, Lou Casal, Founder and Principal Consultant at Practicala, discussed how the Internet of Things (IoT) has accelerated our abilit...
Detecting internal user threats in the Big Data eco-system is challenging and cumbersome. Many organizations monitor internal usage of the Big Data eco-system using a set of alerts. This is not a scalable process given the increase in the number of alerts with the accelerating growth in data volume and user base. Organizations are increasingly leveraging machine learning to monitor only those data elements that are sensitive and critical, autonomously establish monitoring policies, and to detect...
Two weeks ago (November 3-5), I attended the Cloud Expo Silicon Valley as a speaker, where I presented on the security and privacy due diligence requirements for cloud solutions. Cloud security is a topical issue for every CIO, CISO, and technology buyer. Decision-makers are always looking for insights on how to mitigate the security risks of implementing and using cloud solutions. Based on the presentation topics covered at the conference, as well as the general discussions heard between sessio...
Dion Hinchcliffe is an internationally recognized digital expert, bestselling book author, frequent keynote speaker, analyst, futurist, and transformation expert based in Washington, DC. He is currently Chief Strategy Officer at the industry-leading digital strategy and online community solutions firm, 7Summits.
In his session at 20th Cloud Expo, Scott Davis, CTO of Embotics, discussed how automation can provide the dynamic management required to cost-effectively deliver microservices and container solutions at scale. He also discussed how flexible automation is the key to effectively bridging and seamlessly coordinating both IT and developer needs for component orchestration across disparate clouds – an increasingly important requirement at today’s multi-cloud enterprise.