Employment Network Uses Big Data to Chart System Performance

How Big Data technologies Hadoop and Vertica drive business results at Snagajob

By Dana Gardner

The next BriefingsDirect analytics innovation case study interview explores how Snagajob in Richmond, Virginia – one of the largest hourly employment networks for job seekers and employers – uses big data to finally understand their systems' performance in action. The result is a vast improvement in how they provide rapid and richer services to their customers.

Snagajob recently delivered 4 million new job applications in a single month through their systems. To learn how they're managing such impressive scale, BriefingsDirect sat down with Robert Fehrmann, Data Architect at Snagajob in Richmond, Virginia. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us about your job-matching organization. You’ve been doing this successfully since 2000. Let's understand the role you play in the employment market.

Fehrmann: Snagajob, as you mentioned, is America's largest hourly network for employees and employers. The hourly market means we have, relatively speaking, high turnover.

Another aspect, in comparison to some of our competitors, is that we provide an inexpensive service. So our subscriptions are on the low end, compared to our competitors.

Gardner: Tell us how you use big data to improve your operations. I believe that among the first ways that you’ve done that is to try to better analyze your performance metrics. What were you facing as a problem when it came to performance? [Register for the upcoming HP Big Data Conference in Boston on Aug. 10-13.]

Signs of stress

Fehrmann: A couple of years ago, we started looking at our environment, and it became obvious that our traditional technology was showing some signs of stress. As you mentioned, we really have data at scale here. We have 20,000 to 25,000 postings per day, and we have about 700,000 unique visitors on a daily basis. So data is coming in very, very quickly.

We also realized that we're sitting on a gold mine, and we were able to ingest data pretty well. But we had a problem getting information and innovation out of our big data lake.

Gardner: And of course, near real time is important. You want to catch degradation in any fashion from your systems right away. How do you then go about getting this in real time? How do you do the analysis?

Fehrmann: We started using Hadoop. I'll use a lot of technical terms here. From our website, we're getting events. Events are routed via Flume directly into Hadoop. We're collecting about 600 million key-value pairs on a daily basis. It's a massive amount of data, 25 gigabytes on a daily basis.

The second piece in this journey to big data was analyzing these events, and that’s where we're using HP Vertica. Our original use case was to analyze a funnel. A funnel is where people come to our site. They're searching for jobs, maybe by keyword, maybe by zip code. A subset of that is an interest in a job, and they click on a posting. A subset of that is applying for the job via an application. A subset is interest in an employer, and so on. We had never been able to analyze this funnel.

The dataset is about 300 to 400 million rows, and 30 to 40 gigabytes. We wanted to make this data available, not just to our internal users, but all external users. Therefore, we set ourselves a goal of a five-second response time. No query on this dataset should run for more than five seconds -- and Vertica and Hadoop gave us a solution for this.
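
To make the funnel idea concrete, here is a minimal Python sketch of counting how many visitors survive each stage. It is an illustration only, not Snagajob's implementation (which, per the interview, runs on Hadoop and Vertica). The stage names, the (visitor_id, stage) event shape, and the sample data are all hypothetical.

```python
from collections import defaultdict

# Hypothetical stage names, ordered from the top of the funnel to the bottom.
STAGES = ["search", "posting_click", "application"]


def funnel_counts(events):
    """Count unique visitors at each stage who also reached every earlier stage.

    `events` is an iterable of (visitor_id, stage) pairs (assumed shape).
    """
    visitors_by_stage = defaultdict(set)
    for visitor_id, stage in events:
        visitors_by_stage[stage].add(visitor_id)

    counts = []
    survivors = None  # visitors who have passed every stage so far
    for stage in STAGES:
        reached = visitors_by_stage.get(stage, set())
        survivors = reached if survivors is None else survivors & reached
        counts.append((stage, len(survivors)))
    return counts


# Made-up sample events; v4 applied without a recorded search, so v4 is
# not counted at any stage of this strict funnel.
events = [
    ("v1", "search"), ("v2", "search"), ("v3", "search"),
    ("v1", "posting_click"), ("v2", "posting_click"),
    ("v1", "application"),
    ("v4", "application"),
]

for stage, count in funnel_counts(events):
    print(stage, count)
```

Each stage is intersected with the previous one, which mirrors the "a subset of that" wording in the description: a visitor only counts at a stage after passing through the stages above it.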

Gardner: How have you been able to increase your performance and reach your key performance indicators (KPIs) and service-level agreements (SLAs)? How has this benefited you?

Fehrmann: Another application that we were able to implement is a recommendation engine. The idea is that jobseekers who apply for a specific job may not know about all the other jobs that are very similar to it, or that other people have also applied to.

We started analyzing the search results that we were getting and implemented a recommendation engine. Sometimes it’s very difficult to have real comparison between before and after. Here, we were able to see that we got an 11 percent increase in application flow. Application flow is how many applications a customer is getting from us. By implementing this recommendation engine, we saw an immediate 11 percent increase in application flow, one of our key metrics.
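
The interview does not say how the recommendation engine works. One simple technique that fits the "jobs that other people have applied to" description is co-occurrence counting, sketched below in Python. Treat it as an assumption, not the method Snagajob used; the function names, the (seeker_id, job_id) input shape, and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_co_applications(applications):
    """Map each job to a Counter of jobs applied to by the same seekers.

    `applications` is an iterable of (seeker_id, job_id) pairs (assumed shape).
    """
    jobs_by_seeker = defaultdict(set)
    for seeker_id, job_id in applications:
        jobs_by_seeker[seeker_id].add(job_id)

    co_applied = defaultdict(Counter)
    for jobs in jobs_by_seeker.values():
        # Every ordered pair of distinct jobs from one seeker counts once.
        for job_a, job_b in permutations(sorted(jobs), 2):
            co_applied[job_a][job_b] += 1
    return co_applied


def recommend(co_applied, job_id, k=3):
    """Return up to k jobs most often applied to alongside job_id."""
    neighbors = co_applied.get(job_id, Counter())
    # Sort by descending count, then by job id so ties are deterministic.
    ranked = sorted(neighbors.items(), key=lambda item: (-item[1], item[0]))
    return [job for job, _ in ranked[:k]]


# Made-up sample data.
applications = [
    ("s1", "cashier"), ("s1", "barista"),
    ("s2", "cashier"), ("s2", "barista"), ("s2", "stocker"),
    ("s3", "cashier"), ("s3", "barista"),
]

co_applied = build_co_applications(applications)
print(recommend(co_applied, "cashier"))
```

A seeker applying for the hypothetical "cashier" job would be shown "barista" first, because three seekers applied to both, and "stocker" second.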

Gardner: So you took the success from your big-data implementation and analysis capabilities from this performance task to some other areas. Are there other business areas, search yield, for example, where you can apply this to get other benefits?

Brand-new applications

Fehrmann: When we started, we had the idea that we were looking for a solution for migrating our existing environment to a better-performing new environment. But what we've seen is that most of the applications we've developed so far are brand-new applications that we hadn't been able to do before.

You mentioned search yield. Search yield is a very interesting aspect. It’s a massive dataset. It's about 2.5 billion rows and about 100 gigabytes of data as of right now and it's continuously increasing. So for all of the applications, as well as all of the search requests that we have collected since we have started this environment, we're able to analyze the search yield.

For example, that's how many applications we get for a specific search keyword in real time. By real time, I mean that somebody can run a query against this massive dataset and get results in a couple of seconds. We can analyze specific jobs in specific areas, specific keywords that are searched in a specific time period or in a specific location of the country.
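
As a rough illustration of search yield, the Python sketch below computes applications per search for each keyword, optionally restricted to one location. The record shape (keyword, state, applications resulting from that search), the choice of a per-search ratio, and the sample data are assumptions made for the example; the interview only says that yield relates applications to a search keyword.

```python
from collections import defaultdict


def search_yield(searches, state=None):
    """Return {keyword: applications per search}, optionally for one state.

    `searches` is an iterable of (keyword, state, applications) records,
    one per search (assumed shape).
    """
    totals = defaultdict(lambda: [0, 0])  # keyword -> [searches, applications]
    for keyword, search_state, applications in searches:
        if state is not None and search_state != state:
            continue
        totals[keyword][0] += 1
        totals[keyword][1] += applications
    return {kw: apps / count for kw, (count, apps) in totals.items()}


# Made-up sample records.
searches = [
    ("cashier", "VA", 3),
    ("cashier", "VA", 1),
    ("cashier", "NC", 0),
    ("barista", "VA", 1),
]

print(search_yield(searches, state="VA"))
```

Filtering by state stands in for the "specific location of the country" slicing mentioned above; a time-period filter could be added the same way.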

Gardner: And once again, now that you've been able to do something you couldn't do before, what have been the results? How has that changed your business?

Fehrmann: It really allows our salespeople to provide great information during the prospecting phase. If we're prospecting with a new client, we can tell them very specifically that if they're in this industry, in this area, they can expect an application flow, depending on how big the company is, of, let’s say, a hundred applications per day.

Gardner: How has this been a benefit to your end users, those people seeking jobs and those people seeking to fill jobs?

Fehrmann: There are certainly some jobs that people are more interested in than others. On the flip side, if a particular job gets 100 or 500 applications, it's just a fact that only a small number of applicants are going to get that particular job. Now if you apply for a job that isn't as interesting, you have a much, much higher probability of getting the job.
