More Users, More Problems: How LinkedIn Manages their Cloud Infrastructure

One of the most basic truths of the IT industry is that as the number of users grows, so does the number of infrastructure components in a company’s architecture, and with it the number of things that can go wrong.

This challenge is nothing new to the IT giants of the world. Companies like LinkedIn, which serves over 500 million users in more than 200 different countries and territories around the world, have deployed a wide range of infrastructure components, such as multiple DNS resolvers and content delivery networks (CDNs), to meet the performance needs of their users. These services are necessary for LinkedIn to reach their customers at the edge, but with only one CDN under their direct control, they must also have a reliable monitoring solution in place to ensure that their third-party vendors are performing up to expectations. 

Putting this monitoring solution in place, however, is a challenge in and of itself. With so many third parties and so many end-user locations, it becomes even more important to monitor the digital experience from a point of presence as close to the end user as possible. Many performance issues are isolated to specific regions because they stem from problems on localized networks; for example, if you’re running your synthetic tests only from Tokyo, you might miss a micro-outage that is being felt only by users in Shanghai.
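To make the Tokyo and Shanghai example concrete, here is a minimal sketch of grouping synthetic test results by probe location. The sample results, the `failing_regions` helper, and the 25% failure threshold are all assumptions made for illustration; they are not taken from LinkedIn's or Catchpoint's tooling.

```python
from collections import defaultdict

# Hypothetical synthetic-test results: (probe_region, test_succeeded)
results = [
    ("Tokyo", True), ("Tokyo", True), ("Tokyo", True), ("Tokyo", True),
    ("Shanghai", False), ("Shanghai", False), ("Shanghai", True), ("Shanghai", False),
]


def failing_regions(results, max_failure_rate=0.25):
    """Return the regions whose share of failed tests exceeds max_failure_rate."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for region, ok in results:
        totals[region] += 1
        if not ok:
            failures[region] += 1
    return sorted(r for r in totals if failures[r] / totals[r] > max_failure_rate)


# Probing from Tokyo alone reports nothing wrong.
tokyo_only = [r for r in results if r[0] == "Tokyo"]
print(failing_regions(tokyo_only))  # []

# Adding a probe near the affected users exposes the regional outage.
print(failing_regions(results))     # ['Shanghai']
```

The same data looks healthy or broken depending on where the probes sit, which is the argument for placing them close to each user population.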

Another crucial aspect of this strategy is that it mitigates the risk of outsourcing much of your infrastructure to third parties. No company has the resources to put first-party infrastructure in place to reach 500 million global users; it would simply be cost-prohibitive, regardless of how large or successful the company might be. It makes much more sense to rely on third parties dedicated to that specific purpose, but that also requires a certain level of trust that those vendors won’t hamper your customer experience and thus negatively impact your brand.

Service Level Agreements (SLAs) exist for this express purpose, ensuring that a vendor is tied to certain performance thresholds and must make financial restitution if they fall below them. However, even this can be a challenge, because there must be requisite monitoring capabilities in place to evaluate how the vendor’s service is truly performing. 
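As a rough illustration of what evaluating a vendor against an SLA involves, the sketch below turns a series of probe outcomes into an availability figure and compares it with a target. The `sla_report` function, the 99.9% target, and the sample data are hypothetical; real SLAs define their own metrics and measurement windows.

```python
def sla_report(checks, availability_target=99.9):
    """Summarize probe outcomes against an availability target.

    checks: list of booleans, True when a probe of the vendor succeeded.
    availability_target: required availability in percent (assumed value).
    """
    if not checks:
        raise ValueError("no checks recorded")
    availability = 100.0 * sum(checks) / len(checks)
    return {
        "availability_pct": round(availability, 3),
        "breached": availability < availability_target,
    }


# Hypothetical month of monitoring: 10,000 probes, 25 of which failed.
checks = [True] * 9975 + [False] * 25
report = sla_report(checks)
print(report)
```

Without independent measurements like these, a customer has only the vendor's own numbers to rely on when a dispute arises.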

Of course, an SLA payment is only made after the damage has been done; the first priority must be to prevent the end-user experience from suffering in the first place. Here, too, LinkedIn’s SRE team and monitoring strategy play a key role.

The SREs are tasked with staying ahead of performance issues and minimizing the impact whenever one occurs, which requires them to catch problems in real time and work out a solution as quickly as possible. When something such as a spike in network latency occurs, the SRE team needs to be alerted immediately. If the issue can’t be solved right away, they need to start handing those users off to a different CDN until the vendor can correct the problem.
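The alert-then-reroute logic described here can be sketched in a few lines. This is an assumption-laden illustration, not LinkedIn's actual mechanism: the CDN names, the latency samples, and the rule "latest sample above twice the recent median" are all made up for the example.

```python
from statistics import median


def is_latency_spike(history_ms, latest_ms, factor=2.0):
    """Flag latest_ms when it exceeds factor times the median of recent samples."""
    return latest_ms > factor * median(history_ms)


def choose_cdn(current, latest_by_cdn, history_by_cdn, factor=2.0):
    """Keep the current CDN unless it is spiking; otherwise pick the fastest healthy one."""
    if not is_latency_spike(history_by_cdn[current], latest_by_cdn[current], factor):
        return current
    healthy = [
        name for name in latest_by_cdn
        if name != current
        and not is_latency_spike(history_by_cdn[name], latest_by_cdn[name], factor)
    ]
    if not healthy:
        return current  # no healthy alternative, so stay put
    return min(healthy, key=lambda name: latest_by_cdn[name])


# Hypothetical latency samples in milliseconds for two CDN vendors.
history = {"cdn-a": [80, 85, 90, 82], "cdn-b": [95, 100, 98, 102]}
latest = {"cdn-a": 400, "cdn-b": 101}

print(choose_cdn("cdn-a", latest, history))  # cdn-b
```

Comparing against a median of recent samples, rather than a fixed number, keeps a vendor that is normally slower from being flagged just for behaving as usual.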

This means that in certain cases, LinkedIn could be aware of an issue even before the vendor(s) if their monitoring solution is faster and more accurate. In situations such as this, they can help the vendor identify the root cause of the issue thanks to in-depth reporting and analytical capabilities. By being able to do things like capture headers for every single object on the page, or collect and analyze the data in a short amount of time, they can then share the results with the vendor and thus generate a faster resolution to the problem. 

When you’re trying to maintain digital performance for 500 million users, the speed, accuracy, and reliability of the data make all the difference in the world.

The post More Users, More Problems: How LinkedIn Manages their Cloud Infrastructure appeared first on Catchpoint's Blog - Web Performance Monitoring.

More Stories By Mehdi Daoudi

Catchpoint radically transforms the way businesses manage, monitor, and test the performance of online applications. Truly understand and improve user experience with clear visibility into complex, distributed online systems.

Founded in 2008 by four DoubleClick / Google executives with a passion for speed, reliability and overall better online experiences, Catchpoint has now become the most innovative provider of web performance testing and monitoring solutions. We are a team with expertise in designing, building, operating, scaling and monitoring highly transactional Internet services used by thousands of companies and impacting the experience of millions of users. Catchpoint is funded by the top-tier venture capital firm Battery Ventures, which has invested in category leaders such as Akamai, Omniture (Adobe Systems), Optimizely, Tealium, BazaarVoice, Marketo and many more.
