Welcome!

Related Topics: @DXWorldExpo, Java IoT, @CloudExpo

@DXWorldExpo: Blog Post

Employment Network Uses Big Data By @Dana_Gardner | @BigDataExpo #BigData

How Big Data technologies Hadoop and Vertica drive business results at Snagajob

Employment Network Uses Big Data to Chart System Performance

The next BriefingsDirect analytics innovation case study interview explores how Snagajob in Richmond, Virginia – one of the largest hourly employment networks for job seekers and employers – uses big data to finally understand their systems' performance in action. The result is vast improvement in how they provide rapid and richer services to their customers.

Snagajob recently delivered 4 million new jobs applications in a single month through their systems. To learn how they're managing such impressive scale, BriefingsDirect sat down with Robert Fehrmann, Data Architect at Snagajob in Richmond, Virginia. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us about your jobs matching organization. You’ve been doing this successfully since 2000. Let's understand the role you play in the employment market.

Fehrmann: Snagajob, as you mentioned, is America's largest hourly network for employees and employers. The hourly market means we have, relatively speaking, high turnover.

Another aspect, in comparison to some of our competitors, is that we provide an inexpensive service. So our subscriptions are on the low end, compared to our competitors.

Gardner: Tell us how you use big data to improve your operations. I believe that among the first ways that you’ve done that is to try to better analyze your performance metrics. What were you facing as a problem when it came to performance? [Register for the upcoming HP Big Data Conference in Boston on Aug. 10-13.]

Signs of stress

Fehrmann: A couple of years ago, we started looking at our environment, and it became obvious that our traditional technology was showing some signs of stress. As you mentioned, we really have data at scale here. We have 20,000 to 25,000 postings per day, and we have about 700,000 unique visitors on a daily basis. So data is coming in very, very quickly.

Fehrmann

We also realized that we're sitting on a gold mine and we were able to ingest data pretty well. But we had problem getting information and innovation out of our big data lake.

Gardner: And of course, near real time is important. You want to catch degradation in any fashion from your systems right away. How do you then go about getting this in real time? How do you do the analysis?

Fehrmann: We started using Hadoop. I'll use a lot of technical terms here. From our website, we're getting events. Events are routed via Flume directly into Hadoop. We're collecting about 600 million key-value pairs on a daily basis. It's a massive amount of data, 25 gigabytes on a daily basis.

The second piece in this journey to big data was analyzing these events, and that’s where we're using HP Vertica. Second, our original use case was to analyze a funnel. A funnel is where people come to our site. They're searching for jobs, maybe by keyword, maybe by zip code. A subset of that is an interest in a job, and they click on a posting. A subset of that is applying for the job via an application. A subset is interest in an employer, and so on. We had never been able to analyze this funnel.

The dataset is about 300 to 400 million rows, and 30 to 40 gigabytes. We wanted to make this data available, not just to our internal users, but all external users. Therefore, we set ourselves a goal of a five-second response time. No query on this dataset should run for more than five seconds -- and Vertica and Hadoop gave us a solution for this.

Gardner: How have you been able to increase your performance reach your key performance indicators (KPIs) and service-level agreements (SLAs)? How has this benefited you?

Fehrmann: Another application that we were able to implement is a recommendation engine. A recommendation engine is that use where our jobseekers who apply for a specific job may not know about all the other jobs that are very similar to this job or that other people have applied to.

We started analyzing the search results that we were getting and implemented a recommendation engine. Sometimes it’s very difficult to have real comparison between before and after. Here, we were able to see that we got an 11 percent increase in application flow. Application flow is how many applications a customer is getting from us. By implementing this recommendation engine, we saw an immediate 11 percent increase in application flow, one of our key metrics.

Gardner: So you took the success from your big-data implementation and analysis capabilities from this performance task to some other areas. Are there other business areas, search yield, for example, where you can apply this to get other benefits?

Brand-new applications

Fehrmann: When we started, we had the idea that we were looking for a solution for migrating our existing environment, to a better-performing new environment. But what we've seen is that most of the applications we've developed so far are brand-new applications that we hadn't been able to do before.

You mentioned search yield. Search yield is a very interesting aspect. It’s a massive dataset. It's about 2.5 billion rows and about 100 gigabytes of data as of right now and it's continuously increasing. So for all of the applications, as well as all of the search requests that we have collected since we have started this environment, we're able to analyze the search yield.

Most of the applications we've developed so far are brand-new applications that we hadn't been able to do before.

For example, that's how many applications we get for a specific search keyword in real time. By real time, I mean that somebody can run a query against this massive dataset and gets result in a couple of seconds. We can analyze specific jobs in specific areas, specific keywords that are searched in a specific time period or in a specific location of the country.

Gardner: And once again, now that you've been able to do something you couldn't do before, what have been the results? How has that impacted change your business? [Register for the upcoming HP Big Data Conference in Boston on Aug. 10-13.]

Fehrmann: It really allows our salespeople to provide great information during the prospecting phase. If we're prospecting with a new client, we can tell him very specifically that if they're in this industry, in this area, they can expect an application flow, depending on how big the company is, of let’s say in a hundred applications per day.

Gardner: How has this been a benefit to your end users, those people seeking jobs and those people seeking to fill jobs?

Fehrmann: There are certainly some jobs that people are more interested in than others. On the flip side, if a particular job gets a 100 or 500 applications, it's just a fact that only a small number going to get that particular job. Now if you apply for a job that isn't as interesting, you have much, much higher probability of getting the job.

You may also be interested in:

More Stories By Dana Gardner

At Interarbor Solutions, we create the analysis and in-depth podcasts on enterprise software and cloud trends that help fuel the social media revolution. As a veteran IT analyst, Dana Gardner moderates discussions and interviews get to the meat of the hottest technology topics. We define and forecast the business productivity effects of enterprise infrastructure, SOA and cloud advances. Our social media vehicles become conversational platforms, powerfully distributed via the BriefingsDirect Network of online media partners like ZDNet and IT-Director.com. As founder and principal analyst at Interarbor Solutions, Dana Gardner created BriefingsDirect to give online readers and listeners in-depth and direct access to the brightest thought leaders on IT. Our twice-monthly BriefingsDirect Analyst Insights Edition podcasts examine the latest IT news with a panel of analysts and guests. Our sponsored discussions provide a unique, deep-dive focus on specific industry problems and the latest solutions. This podcast equivalent of an analyst briefing session -- made available as a podcast/transcript/blog to any interested viewer and search engine seeker -- breaks the mold on closed knowledge. These informational podcasts jump-start conversational evangelism, drive traffic to lead generation campaigns, and produce strong SEO returns. Interarbor Solutions provides fresh and creative thinking on IT, SOA, cloud and social media strategies based on the power of thoughtful content, made freely and easily available to proactive seekers of insights and information. As a result, marketers and branding professionals can communicate inexpensively with self-qualifiying readers/listeners in discreet market segments. BriefingsDirect podcasts hosted by Dana Gardner: Full turnkey planning, moderatiing, producing, hosting, and distribution via blogs and IT media partners of essential IT knowledge and understanding.

Latest Stories
There's no doubt that blockchain technology is a powerful tool for the enterprise, but bringing it mainstream has not been without challenges. As VP of Technology at 8base, Andrei is working to make developing a blockchain application accessible to anyone. With better tools, entrepreneurs and developers can work together to quickly and effectively launch applications that integrate smart contracts and blockchain technology. This will ultimately accelerate blockchain adoption on a global scale.
Enterprises are universally struggling to understand where the new tools and methodologies of DevOps fit into their organizations, and are universally making the same mistakes. These mistakes are not unavoidable, and in fact, avoiding them gifts an organization with sustained competitive advantage, just like it did for Japanese Manufacturing Post WWII.
It cannot be overseen or regulated by any one administrator, like a government or bank. Currently, there is no government regulation on them which also means there is no government safeguards over them. Although many are looking at Bitcoin to put money into, it would be wise to proceed with caution. Regular central banks are watching it and deciding whether or not to make them illegal (Criminalize them) and therefore make them worthless and eliminate them as competition. ICOs (Initial Coin Offer...
DXWorldEXPO LLC announced today that the upcoming DXWorldEXPO | DevOpsSUMMIT | CloudEXPO New York will feature 10 companies from Poland to participate at the "Poland Digital Transformation Pavilion" on November 12-13, 2018. Polish Digital Transformation companies which will exhibit at CloudEXPO | DevOpsSUMMIT | DXWorldEXPO include All in Mobile, dhosting, Cryptomage, Perfect Gym, Polcom, Apius Technologies, Aplisens, ELZAB SA, TELDAT, and Rebug.io.
DevOpsSUMMIT at CloudEXPO, to be held June 25-26, 2019 at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real results. Am...
We are in a digital age however when one looks for their dream home, the mortgage process can take as long as 60 days to complete. Not what we expect in a time where processes are known to take place swiftly and seamlessly. Mortgages businesses are facing the heat and are in immediate need of upgrading their operating model to reduce costs, decrease the processing time and enhance the customer experience. Therefore, providers are exploring multiple ways of tapping emerging technologies to solve ...
If a machine can invent, does this mean the end of the patent system as we know it? The patent system, both in the US and Europe, allows companies to protect their inventions and helps foster innovation. However, Artificial Intelligence (AI) could be set to disrupt the patent system as we know it. This talk will examine how AI may change the patent landscape in the years to come. Furthermore, ways in which companies can best protect their AI related inventions will be examined from both a US and...
Cloud is the motor for innovation and digital transformation. CIOs will run 25% of total application workloads in the cloud by the end of 2018, based on recent Morgan Stanley report. Having the right enterprise cloud strategy in place, often in a multi cloud environment, also helps companies become a more intelligent business. Companies that master this path have something in common: they create a culture of continuous innovation. In his presentation, Dilipkumar will outline the latest resear...
This session describes how Professional Services organisations can deliver within Technology-as-a-Service (IaaS) constructs, in private and public enterprise cloud scenarios. See how professional services can be packaged and funded by IaaS cash flows, based upon consumption of technology services. Learn how significant, IT infrastructure transformations can be funded through OPEX spending models with multi-year As-a-Services based contracts. Understand how the automation of repeatable services c...
This is going to be a live demo on a production ready CICD pipeline which automate the deployment of application onto AWS ECS and Fargate. The same pipeline will automate deployment into various environment such as Test, UAT, and Prod. The pipeline will go through various stages such as source, build, test, approval, UAT stage, Prod stage. The demo will utilize only AWS services including AWS CodeCommit, Codebuild, code pipeline, Elastic container service (ECS), ECR, and Fargate.
Cloud-enabled transformation has evolved from cost saving measure to business innovation strategy -- one that combines the cloud with cognitive capabilities to drive market disruption. Learn how you can achieve the insight and agility you need to gain a competitive advantage. Industry-acclaimed CTO and cloud expert, Shankar Kalyana presents. Only the most exceptional IBMers are appointed with the rare distinction of IBM Fellow, the highest technical honor in the company. Shankar has also receive...
Cloud Storage 2.0 has brought many innovations, including the availability of cloud storage services that are less expensive and much faster than previous generations of cloud storage. Cloud Storage 2.0 has also delivered new and faster methods for migrating your premises storage environment to the cloud and the concept of multi-cloud. This session will provide technical details on Cloud Storage 2.0 and the methods used to efficiently migrate from premises-to-cloud storage. This session will als...
Doug was appointed CEO of Big Switch in 2013 to lead the company on its mission to provide modern cloud and data center networking solutions capable of disrupting the stronghold by legacy vendors. Under his guidance, Big Switch has experienced 30+% average QoQ growth for the last 16 quarters; more than quadrupled headcount; successfully shifted to a software-only and subscription-based recurring revenue model; solidified key partnerships with Accton/Edgecore, Dell EMC, HPE, Nutanix, RedHat and V...
PCCW Global is a leading telecommunications provider, offering the latest voice and data solutions to multi-national enterprises and communication service providers. Our truly global coverage combined with local, on the ground knowledge has helped us build best in class connections across the globe; and especially in some of the remotest, hard-to-reach areas in exciting growth markets across Asia, Africa, Latin America and the Middle East.
Nutanix has been named "Platinum Sponsor" of CloudEXPO | DevOpsSUMMIT | DXWorldEXPO New York, which will take place November 12-13, 2018 in New York City. Nutanix makes infrastructure invisible, elevating IT to focus on the applications and services that power their business. The Nutanix Enterprise Cloud Platform blends web-scale engineering and consumer-grade design to natively converge server, storage, virtualization and networking into a resilient, software-defined solution with rich machine ...