Blog Feed Post

A Tale of Two Load Balancers

It was the best of load balancers, it was the worst of load balancers, it was the age of happy users, it was the age of frustrated users.

I get to see a variety of interesting network problems; sometimes these are first-hand, but more frequently now these are through our partner organization. Some are old hat; TCP window constraints on high latency networks remain at the top of that list. Others represent new twists on stupid network tricks, often resulting from external manipulation of TCP parameters for managing throughput (or shaping traffic). And occasionally – as in this example – there’s a bit of both.

Many thanks to Stefan Deml, co-founder and board member at amasol AG, Dynatrace’s Platinum Partner headquartered in Munich, Germany. Stefan and his team worked diligently and expertly with their customer to uncover – and fix – the elusive root cause of an ongoing performance complaint.

Problem brief

Users in North America connect to an application hosted in Germany. The app uses the SOAP protocol to request and deliver information. Users connect through a firewall and one of two Cisco ACE 30 load balancers to the first-tier WebLogic app servers.

When users connect through LB1, performance is good. When they connect through LB2, however, performance is quite poor. While the definition of “poor performance” varied depending on the type of transaction, the customer identified a 1.5MB test transaction that helped quantify the problem quite well: fast is 10 seconds, while slow is 60 seconds – or even longer.

EUE monitoring

Dynatrace DC RUM is used to monitor this customer’s application performance and user experience, alerting the IT team to the problem and quantifying the severity of user complaints. (When users complain that response time is measured in minutes rather than seconds, it’s helpful to have a solution that validates those claims with measured transaction response times.) DC RUM automatically isolated the problem to a network-related bottleneck, while proving that the network itself – as qualified by packet loss and congestion delay – was not to blame.

Time to dig a little deeper

I’ll use Dynatrace Network Analyzer – DNA, my protocol analyzer of choice – to examine the underlying behavior and identify the root cause of the problem, taking advantage of the luxury of having traces of both good and poor performing transactions.  I’ll skip DNA’s top-down analysis (I’m assuming you don’t care to see yet another Client/Network/Server pie chart), and dive directly into annotated packet-level Bounce Diagrams to illustrate the problem.

(DNA’s Bounce Diagram is simply a graphic of a trace file; each packet is represented by an arrow color-coded according to packet size.)

First, the fast transaction instance:

Bounce Diagram illustrating a fast instance of the test transaction through LB1; total elapsed time about 10 seconds.

For the fast transaction, most of the 10-second delay is allocated to server processing; the response download of 1.5MB takes about 1.7 seconds – about 7Mbps.

Here’s the same view of the slow transaction instance:

Bounce Diagram illustrating a slow instance of the test transaction through LB2; total elapsed time about 70 seconds.

There are two distinct performance differences between the fast transaction – the baseline – and this slow transaction. First, a dramatic increase in client request time (from 175 msec. to 52 seconds!); second, a smaller but still significant increase in response download time, from 1.7 seconds to 7.7 seconds.

The MSB (most significant bottleneck)

Let’s first examine the most significant bottleneck in the slow transaction. The client SOAP request – only 3KB – takes 54 seconds to transmit to the server, in 13 packets.

The packet trace shows the client sending very small packets, with gaps of about 5 seconds between. Examining the ACKs from LB2, we see that the TCP receive window size is unusually small; 254 bytes.

Packet trace excerpt showing LB2 advertising a window size of 256 bytes.

Such an unusually small window advertisement is generally a reliable indicator that TCP Window Scaling is active; without the SYN/SYN/ACK handshake, a protocol analyzer doesn’t know whether scaling is active, and is therefore unable to apply a scale factor to accurately interpret the window size field.

The customer did provide another trace that included the handshake, showing that the LB response to the client’s SYN does in fact include the Window Scaling option – with a scale factor of 0.

The SYN packet from LB2; window scaling will be supported, but LB2 will not scale it’s receive window.

Odd? Not really; this simply means that LB2 will allow the client to scale its receive window, but doesn’t intend to scale its own. The initial (non-scaled) receive window advertised by the LB is 32768. (It’s interesting to note that given a scale factor of 7, a receive window value of 256 would equal 32768.)

Once a few packets have been exchanged on the connection, however, LB2 abruptly reduces its receive window from 32768 to 254 – even though the client has only sent only a few hundred bytes. This is clearly not a result of the TCP socket’s buffer space filling up. Instead, it’s as if LB2 suddenly shifts to a non-zero scale factor (perhaps that factor of 7 I just suggested), even though it has already established a scale factor of zero.

Pop quiz: What to do with tiny windows?

Question: what should a TCP sender do when the peer TCP receive window falls below the MSS?

Answer: The sender should wait until the receiver’s window increases to a value greater than the MSS.

In practice, this means the sender waits for the receiver to empty its buffer. Given a receiver that is slow to read data from its buffer – and therefore advertises a small window of less than the MSS – it would be silly for the sender to send tiny packets just to fill the remaining space. In fact, this undesirable behavior is called the silly window syndrome, avoided through algorithms built into TCP.

For this reason, protocol analyzers and network probes should treat the occurrence of small (<MSS) window advertisements the same as zero window events, as they have the same performance impact.

When a receiver’s window is at zero for an extended period, a sender will typically send a window probe packet attempting to “wake up” the receiver. Of course, since the window is zero, no usable payload accompanies this window probe packet. In our example, the window is not zero, but the sender behavior is similar; the LB waits five seconds, then sends a small packet with just enough data (254 bytes) to fill the buffer. The ACK is immediate (the LB’s ACK frequency is 1), but the advertised window remains abnormally small. We can conclude that the LB believes it is advertising a full 32KB buffer, although it telling the client something much different.

After about 52 seconds, the 3K request reaches LB2, after which application processing occurs normally. It’s a good thing the request size wasn’t 30K!

The NSB (next significant bottleneck)

As is quite common, there’s another tuning opportunity – the NSB. This is highlighted by DC RUM’s metric called Server Realized Bandwidth, or download rate. The fast transaction transfers 1.5MB in about 1.6 seconds (7.5Mbps), while the slow transaction takes about 8 seconds for the same payload (1.5Mbps).

Could this be receiver flow control, or a small configured receive TCP window? These would seem reasonable theories – except that we’re using the same client for the tests. A quick look at the receiver’s TCP window proves this is not the case, as it remains at 131,072 (512 with a scaling factor of 9).

DNA’s Timeplot can graph a sender’s TCP Payload in Transit; comparing this with the receiver’s advertised TCP window can quickly prove – or disprove – a TCP window constraint theory.

Time plot showing LB2’s TCP payload in transit (bytes in flight) along with the client’s receive window size.

The maximum payload in transit for the slow transaction is about 32KB; given that the client’s receive window is much larger, we know that the client is not limiting throughput.

Let’s compare this with the fast transaction as it ramps up exponentially through TCP slow start:

Time plot showing LB1’s payload in transit as it ramps up through slow start.

It becomes clear that LB1 does not limit send throughput – bytes in flight – to 32KB, instead allowing the transfer to make more efficient use of the available bandwidth. We can conclude that some characteristic of LB2 is artificially limiting throughput.

Fixing the problems

For the MSB (most significant bottleneck), Cisco has identified a workaround (even if they might have slightly misstated the actual problem):

CSCud71628—HTTP performance across ACE is very bad. Packet captures show that ACE drops the TCP Window Size it advertises to the client to a very low value early in the connection and never recovers from this. Workaround: Disable the “tcp-options window-scale allow”.

For the NSB (next significant bottleneck), the LB configuration defaults to a TCP send buffer value of 32768K. Modifying the parameter set tcp buffer-share from the default 32768 to 262143 (the maximum permitted value) allowed for LB2 throughput to match that of LB1.

Wait; do you see the contradiction here? If we disable TCP window scaling, that would limit the effective TCP buffer to 65535, limiting the download transfer rate to under 4Mbps (given the existing link’s 130ms round-trip delay).

But this was the spring of hope; it seems that changing the tcp buffer-share parameter also solved the window scaling problem, without having to disable that option. This suggests a less-than obvious interaction between these parameters – but with happy users, we’ll take that bit of luck.

Is there more?

There are always additional NSBs; this is a tenet of performance tuning. We stop when the next bottleneck becomes insignificant (or when we have other problems to attend to). For this test transaction, the SOAP payload is rather large (1.5MB); while the payload is encrypted, it could still be compressed to reduce download time; a quick test using WinZip shows the potential for at least a 50% reduction.

While some of you will be quick to note that ACE has been discontinued, Cisco support for ACE will continue through January 2019.

The post A Tale of Two Load Balancers appeared first on Dynatrace blog – monitoring redefined.

Read the original blog entry...

More Stories By Dynatrace Blog

Building a revolutionary approach to software performance monitoring takes an extraordinary team. With decades of combined experience and an impressive history of disruptive innovation, that’s exactly what we ruxit has.

Get to know ruxit, and get to know the future of data analytics.

Latest Stories
"I think DevOps is now a rambunctious teenager – it’s starting to get a mind of its own, wanting to get its own things but it still needs some adult supervision," explained Thomas Hooker, VP of marketing at CollabNet, in this SYS-CON.tv interview at DevOps Summit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We are still a relatively small software house and we are focusing on certain industries like FinTech, med tech, energy and utilities. We help our customers with their digital transformation," noted Piotr Stawinski, Founder and CEO of EARP Integration, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We've been engaging with a lot of customers including Panasonic, we've been involved with Cisco and now we're working with the U.S. government - the Department of Homeland Security," explained Peter Jung, Chief Product Officer at Pulzze Systems, in this SYS-CON.tv interview at @ThingsExpo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We're here to tell the world about our cloud-scale infrastructure that we have at Juniper combined with the world-class security that we put into the cloud," explained Lisa Guess, VP of Systems Engineering at Juniper Networks, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"With Digital Experience Monitoring what used to be a simple visit to a web page has exploded into app on phones, data from social media feeds, competitive benchmarking - these are all components that are only available because of some type of digital asset," explained Leo Vasiliou, Director of Web Performance Engineering at Catchpoint Systems, in this SYS-CON.tv interview at DevOps Summit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
Your homes and cars can be automated and self-serviced. Why can't your storage? From simply asking questions to analyze and troubleshoot your infrastructure, to provisioning storage with snapshots, recovery and replication, your wildest sci-fi dream has come true. In his session at @DevOpsSummit at 20th Cloud Expo, Dan Florea, Director of Product Management at Tintri, provided a ChatOps demo where you can talk to your storage and manage it from anywhere, through Slack and similar services with...
"Peak 10 is a hybrid infrastructure provider across the nation. We are in the thick of things when it comes to hybrid IT," explained Michael Fuhrman, Chief Technology Officer at Peak 10, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
As enterprise cloud becomes the norm, businesses and government programs must address compounded regulatory compliance related to data privacy and information protection. The most recent, Controlled Unclassified Information and the EU’s GDPR have board level implications and companies still struggle with demonstrating due diligence. Developers and DevOps leaders, as part of the pre-planning process and the associated supply chain, could benefit from updating their code libraries and design by in...
SYS-CON Events announced today that Calligo, an innovative cloud service provider offering mid-sized companies the highest levels of data privacy and security, has been named "Bronze Sponsor" of SYS-CON's 21st International Cloud Expo ®, which will take place on Oct 31 - Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Calligo offers unparalleled application performance guarantees, commercial flexibility and a personalised support service from its globally located cloud plat...
"We are an IT services solution provider and we sell software to support those solutions. Our focus and key areas are around security, enterprise monitoring, and continuous delivery optimization," noted John Balsavage, President of A&I Solutions, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We were founded in 2003 and the way we were founded was about good backup and good disaster recovery for our clients, and for the last 20 years we've been pretty consistent with that," noted Marc Malafronte, Territory Manager at StorageCraft, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
There is a huge demand for responsive, real-time mobile and web experiences, but current architectural patterns do not easily accommodate applications that respond to events in real time. Common solutions using message queues or HTTP long-polling quickly lead to resiliency, scalability and development velocity challenges. In his session at 21st Cloud Expo, Ryland Degnan, a Senior Software Engineer on the Netflix Edge Platform team, will discuss how by leveraging a reactive stream-based protocol,...
"We are focused on SAP running in the clouds, to make this super easy because we believe in the tremendous value of those powerful worlds - SAP and the cloud," explained Frank Stienhans, CTO of Ocean9, Inc., in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"DivvyCloud as a company set out to help customers automate solutions to the most common cloud problems," noted Jeremy Snyder, VP of Business Development at DivvyCloud, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"At the keynote this morning we spoke about the value proposition of Nutanix, of having a DevOps culture and a mindset, and the business outcomes of achieving agility and scale, which everybody here is trying to accomplish," noted Mark Lavi, DevOps Solution Architect at Nutanix, in this SYS-CON.tv interview at @DevOpsSummit at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.