Big Data Analysis Aids Advertising By @Dana_Gardner | @BigDataExpo [#BigData]

Rapid matching of consumer inferences to ads serves up a big data success story

Big Data Analysis Aids Advertising Capabilities

The next BriefingsDirect big data innovation success story uncovers how New York-based adMarketplace, a search syndication advertising network, uses big data to improve its search advertising capabilities.

In part two of our series on adMarketplace, we'll explore how they instantly capture and analyze massive data to allow for efficient real-time bidding for traffic sources for online advertising. And we'll hear how the data-analysis infrastructure also delivers rapid cost-per-click insights to advertisers.

To learn how, BriefingsDirect sat down with Raj Yakkali, Director of Data Infrastructure at adMarketplace, at the recent HP Big Data 2014 Conference in Boston. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us about adMarketplace. What do you do, and why is big data such a big part of that?

Yakkali: adMarketplace is a leading advertising platform for search intent. We provide the advertisers with the consumer space where they can project their ads. The benefit of the adMarketplace comes into play where we provide a data platform that can match those ads with the right user intent.

When a user searches for a certain keyword, they're directly telling us what they want to see, and we match it closely with our ads. Our relationship with our advertisers is that we match their ads well and make them accessible in line with exactly what the user is thinking. We do some predictive analytics on top of what the user is saying. We add that dimension to the user search and serve ads accordingly.

Gardner: I'm all for getting better ads based on a lot of things I already get. Do you have more than just keywords to draw inferences from, and what scale of data are we talking about when it comes to all that inference information about the consumer's intent?

15 dimensions

Yakkali: Keyword search is one side or one dimension of the user search. There are also category campaigns that the advertisers are running. At the same time, there's a geospatial analysis to it as well. There are 15 dimensions that we go through to provide an ad that is perfectly fit for the advertiser and for the consumer to see and take advantage of to meet their needs. With some of the ads, we are trying to serve the user’s requirements and needs.

Gardner: With all these variables, this sounds like you're going to be gathering an awful lot of information. You also need to reply back with your results very fast or you lose the opportunity for that consumer to get the ad and then even click through and make a decision. Tell me about scale and speed.

Yakkali: You're right on with that question. In this business, latency is your enemy. If you look at certain metrics, we receive almost half a billion requests every day, and we have to match all of them with ads at sub-second performance. We have internal proprietary datasets, which we take care of before matching these ads. And there are two platforms that we've built internally.

One is called Bid Smart. That performs the analysis between the user intent and the traffic sources that the user search is coming from. At the same time, the price of that ad goes to the publisher. There are the pricing strategies, the traffic sources, and the user intent of the search. All of these things are put together. That predictive analytics system gathers all this information and emits the right ad towards the consumer.

On top of that, if you look at the amount of data, those half a billion requests coming into our system generate around two terabytes per hour. At certain times, we can't store all of it for analytics. There is a lot of data that's not inside the database. Now, with the partnership with Vertica, we’re able to take the dataset, derive analytics about it, and provide our marketers with all that information. Bid Smart is the one that does the pricing and matching.
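To put those figures in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it takes the two numbers quoted above at face value and assumes decimal terabytes and traffic spread evenly across the day, neither of which is stated in the interview.

```python
# Figures quoted in the interview
REQUESTS_PER_DAY = 500_000_000   # "almost half a billion requests" per day
TB_PER_HOUR = 2                  # "around two terabytes per hour"

# Assumptions made for this sketch (not from the interview)
BYTES_PER_TB = 10**12            # decimal terabytes
SECONDS_PER_DAY = 24 * 60 * 60   # traffic treated as evenly spread over the day

requests_per_second = REQUESTS_PER_DAY / SECONDS_PER_DAY
tb_per_day = TB_PER_HOUR * 24
bytes_per_request = tb_per_day * BYTES_PER_TB / REQUESTS_PER_DAY

print(f"{requests_per_second:,.0f} requests/second on average")
print(f"{tb_per_day} TB generated per day")
print(f"{bytes_per_request / 1000:,.0f} KB generated per request")
```

Under those assumptions, the system averages several thousand requests per second, and each request accounts for tens of kilobytes of generated data, which helps explain why not all of it can be kept for analytics.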

The other platform is Advertiser 3D, which provides detailed analytics on the metrics across all these dimensions. That provides very good insight. Now, when it comes to the competition or the opportunity to deliver the right ad at the right time, that's where data workflows make a difference.

We utilize Vertica to directly stream all this click data into it, rather than going into certain other locations and then doing it in a batch format. We directly live-stream that data into Vertica, so that it is readily available for analytics. Our Bid Smart System makes use of that dataset. That's where we get the opportunity to deliver much better ads, with price tags, and the right user intent matched.
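The difference between batch loading and live streaming described here can be sketched with a toy example. Everything in it is hypothetical: `ClickStore` is an in-memory stand-in for the analytics database, not Vertica's API, and the loader classes and event fields are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ClickStore:
    """Hypothetical in-memory stand-in for the analytics database."""
    rows: list = field(default_factory=list)

    def insert(self, rows):
        self.rows.extend(rows)

    def clicks_for(self, keyword):
        # Count stored click events for one keyword
        return sum(1 for r in self.rows if r["keyword"] == keyword)


class BatchLoader:
    """Stages events elsewhere and loads them only when flush() runs."""

    def __init__(self, store):
        self.store = store
        self.staged = []

    def on_click(self, event):
        self.staged.append(event)

    def flush(self):
        self.store.insert(self.staged)
        self.staged = []


class StreamLoader:
    """Writes each event to the store as soon as it arrives."""

    def __init__(self, store):
        self.store = store

    def on_click(self, event):
        self.store.insert([event])


events = [
    {"keyword": "running shoes"},
    {"keyword": "running shoes"},
    {"keyword": "laptops"},
]

batch_store, stream_store = ClickStore(), ClickStore()
batch, stream = BatchLoader(batch_store), StreamLoader(stream_store)
for e in events:
    batch.on_click(e)
    stream.on_click(e)

# Query both stores before the batch job has run
visible_before_flush = batch_store.clicks_for("running shoes")
visible_streaming = stream_store.clicks_for("running shoes")

batch.flush()
visible_after_flush = batch_store.clicks_for("running shoes")

print(visible_before_flush, visible_streaming, visible_after_flush)
```

In the batch path, the clicks are invisible to queries until the load job runs. In the streaming path, they can be queried as soon as they arrive, which is the property a real-time bidding system depends on.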

Gardner: It sounds very complex. There's an awful lot going on for just serving up an ad. I suppose people don’t appreciate that, but the economics here are very compelling. The more refined and appropriate an ad can be, the more likely the consumer is to buy, and a lot of resources don't get wasted in the meantime. Do you have any sense of what the payoff is, in business, financial, or technical terms, when you can really accomplish this goal of targeted advertising?

Conversion rate

Yakkali: Our conversion rate is a major key performance indicator (KPI) when it comes to understanding how well we are doing. When we see higher conversion rates, that gives us the sense that we've done the best job and the user is happy with what they are searching for and what they are getting.
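The interview does not say how adMarketplace computes this KPI. As a minimal sketch, assuming the common definition of conversion rate as conversions divided by clicks, and using made-up campaign numbers, it could be tracked like this:

```python
def conversion_rate(conversions: int, clicks: int) -> float:
    """Return conversions per click, or 0.0 when there were no clicks."""
    if clicks == 0:
        return 0.0
    return conversions / clicks


# Hypothetical campaign numbers, for illustration only
campaigns = {
    "campaign_a": {"clicks": 1000, "conversions": 40},
    "campaign_b": {"clicks": 500, "conversions": 30},
}

rates = {
    name: conversion_rate(c["conversions"], c["clicks"])
    for name, c in campaigns.items()
}
best = max(rates, key=rates.get)

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")
print("best:", best)
```

Note that the campaign with fewer clicks has the higher rate here, which is why the rate, rather than raw click volume, is the more useful signal of how well ads match intent.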

At the same time, the publishers, as well as the advertisers, are happy, because the user is coming to us again and again to get that similar, beautiful experience. The advertisers are able to sell more products that meet the needs of the user. And the users are able to get the product that really caters to their needs. We're in the middle of all these things, trying to facilitate things for the advertisers, as well as the users and the publishers.

Gardner: I daresay this is the very future of advertising. Now for you to accomplish these goals and create those positive KPIs, are you housing Vertica in your own data center, do you use cloud, hybrid cloud? Given that you have different platforms, different datasets, how do you manage this technically?

Yakkali: On that end, we started with testing cloud two or three years ago, but again, it turned out that because of so many unknowns and troubleshooting, we had to go with our data centers. Now, we host all our systems in our own data centers and we manage it.

We have our own hardware to deal with. Our system runs 24/7, and we have to be able to deliver sub-second latency. With your own infrastructure, you have a controlled environment where you can tweak and tune your system to get the best performance out of it.

Considering that it is 24/7, there are fewer excuses that you can get away with for not delivering. For that, we innovate in terms of the data flows and the process of how we ingest the data, how we process the data, how we emit the data, and how we clean up the data when we don’t really need it.

All these things have to come together, and it really helps us having that control on all of our infrastructure and all elements in the data pipeline, starting from the user intent and user search, until we provide the data and the results.

Gardner: How long have you been using Vertica in this regard and how did you go about making that decision?

Yakkali: We've been using Vertica for four to five years. Our data pipeline was not on Vertica to start with, but as Vertica came into the picture, we saw the great beauty and the powerful features that it brings to capitalize on our abilities.

That really helped us. With Vertica in place, we have been migrating our mechanics slowly to use it for the real-time analysis and real-time bidding and all those beautiful features that make us do what we can do better. So it’s been a great partnership with Vertica and we see many more features coming in with the new version. Our Bid Smart mechanism is also improving, and with that, algorithmic capabilities are increasing. So it’s progressing.

Feedback loop

Gardner: Tell us a little bit about where your business is heading. In addition to speed, complexity, and scale, where do you see the ability to create this feedback loop? It’s a very rapid feedback loop between a lot of incoming data and an action like serving up an ad. It seems like this could be applied to other marketing or advertising chores or perhaps even have an ancillary business-development direction. You’ve got this platform and these data centers. Is there something else that you're gearing up for?

Yakkali: At this point, we're in the business of connecting the advertisers, the publishers, and the users. But that business is still largely untapped compared with what it can accomplish. The market has only started on its path toward that peak. If we take a step back and try to understand it, initially, when search started, there was no Google or anything. It was more about curated search.

So the publishers put all this content together and then projected it out to the user. They didn't know what the user wanted. At the same time, when the user looked at this content, they didn't know whether they wanted it or whether it catered to their needs.

Then, Google came along and user search started. What that directly told us was "I want this piece of information. I want to use this piece of information. And I want to see this ad that is relevant to my needs." That’s a very powerful thing. When you hear that, you're able to analyze it and match it properly with the advertisers. But then again, it started to fragment.

Now, it’s not only Google. There is Yahoo, Bing, there is mobile, and there are certain apps. There are many apps in the mobile space and each one has its own search. So not all the searches are going to Google, Yahoo, or Bing. Search is already fragmented.

We tap all those pieces. The market beyond Google, Yahoo, and Bing is getting stronger and it is growing. So there is a lot of market that needs to be tapped into. We come in by connecting the advertisers so that they can tap that untapped marketplace.

We've been improving our internal Bid Smart algorithm that came out in the last year. Then, we also launched Advertiser 3D last year as well. Those two products have been providing tremendous growth in our revenue, and the retention rates have been stellar.

The top 60 percent of Google’s top spenders are working with us to complement their business. At the same time, we're also able to provide a 50 percent increase in year-over-year revenues. It's additional revenue for them, and even our revenues are increasing based on that fact.

Gardner: It seems like you have an awful lot of runway ahead of you in terms of where search could be applied, and analytics can be drawn from that to augment these services and explode that market.

Is Vertica being used just for the intercept between the incoming data and the outgoing ad, or are you also analyzing what goes on within these marketplaces so that you can better appreciate them and offer reports, audit trails, and that sort of thing? Is this an inclusive platform, or do you use different analytics platforms for different aspects of what you are doing?

End to end

Yakkali: We do almost everything. It is an end-to-end platform. As part of the business we look into the operational metrics of the whole thing, starting from the user search until the ad is delivered. Then, from that end, there is always that analytics piece that comes into play, which provides insights to the marketers.

Our market base is filled with very data-savvy marketers, and they look into each and every data dimension to understand their return on investment (ROI). We give them transparency through our Advertiser 3D system and, using that, they're able to navigate through the space and tune their campaigns to get the best out of them and to deliver the best to the customer.

Gardner: Any thoughts about other organizations that are also facing significant challenges around speed, scale, also perhaps with a big runway, in terms of knowing that more and more business could be coming their way therefore more data? What would you advise them in terms of the data architecture or the planning in order to accomplish the goals?

Yakkali: When we look at the industries and the market, the ad industry still is untapped. The healthcare industry is just getting into the business of doing much more with analytics. It’s all about the speed and the latency, and the insights as well: one at the operational level, and the other at the insight level, to do more innovation on top of it.

The ability to listen to the customer depends on how fast you can capture all that feedback, and you tighten that loop of feedback so that you're able to do something with it and make a better product out of it.

So it’s all about taking a look at the datasets very closely as to what they mean, what the user is asking us, what do they want to see, and how you are listening to the customer. Those two aspects really make the difference.

You want to listen to the customer and to what they really want. Are you providing it, and are you able to guess what they will want tomorrow? That is the predictive phase. The prescriptive analytics phase comes later on, where you're telling them what they need to do even before they tell you.

That's the stage that the market is going towards. We're not even scratching the surface of prescriptive analytics. The wave has not yet started towards that route. We're still at the predictive analytics phase, and there is still a lot more to go within that space. Getting the foundation stronger, driving towards prescriptive analytics, and listening to your customer are the three aspects that would make any industry. Those three would be the key foundational pieces for making innovation.
