By Dana Gardner
October 27, 2013 03:00 PM EDT
The next edition of the HP Discover Performance Podcast Series explores how network testing, monitoring, and analytics provider Empirix developed unique and powerful data processing capabilities.
Empirix uses an advanced analytics engine to continuously and proactively evaluate carrier network performance and customer experience metrics -- amid massive data flows -- to automatically identify issues as they emerge.
To learn more about how a combination of large-scale, real-time performance and pervasive data access made the HP Vertica analytics platform stand out to support such demands for Empirix, join Navdeep Alam, Director of Engineering, Analytics and Prediction at Empirix, based in Billerica, Mass.
The discussion, which took place at the recent HP Vertica Big Data Conference in Boston, is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]
Here are some excerpts:
Gardner: Why do you have such demanding requirements for data processing and analysis?
Alam: What we do is actively and passively monitor networks. When you're in a network as a service provider, you have the opportunity to see the packets within that network, both on the control plane and on the user plane. That just means you're looking at signaling data and also user plane data -- what's going on with the behavior; what's going on at the data layer. That’s a vast amount of data, especially with mobile, and most people doing stuff on their devices with data.
When you're in that network and you're tapping that data, there is a tremendous amount of data -- and there's a tremendous amount of insights about not only what's going on in the network, but what's going on with the subscribers and users of that network.
Empirix is able to collect this data from our probes in the network, as well as being able to look at other data points that might help augment the analysis. Through our analytics platform we're able to analyze that data, correlate it, mediate it, and drive metrics out of that data.
That’s a service for our customers, increasing value from that data, so that they can turn around a return on investment (ROI) and understand how they can leverage their networks better to increase operations and so forth. They can understand their customers better and begin to analyze, slice and dice, and visualize data of this complex network.
They can use our platform as well to do proactive and predictive analysis, so that we can create even better ROI for our customers by telling them what potentially might go wrong and what might be the solution to get around that to avoid a catastrophe.
Gardner: It’s interesting that not only is this data being used for understanding the performance on the network itself, but it's giving people business development and marketing information about how people are using it and where the new opportunities might be.
Is that something fairly new? Were you able to do that with data before, or is it the scale and ability to get in there and create analysis in near-real-time that’s allowed for such a broad-based multilevel approach to data and analysis?
Alam: This is something we've gotten into. We definitely tried to do it before with success, but we knew that in order to really tackle mobile and the increasing demands of data, we really had to up the ante.
Our investment with HP Vertica and how we've introduced that in our new analytics platform, Empirix IntelliSight 1.0, which recently came out, is about leveraging that platform -- not only for scalability and our ability to ingest and process data, but to look at data in its more natural format, both as discrete data and as aggregate data. We allow our customers to view that data ad hoc and analyze that data.
It positioned us very well. Now that we have a central point from which all this data is being processed and analyzed, we now run analytics directly at this data, increasing our data locality and decreasing the data latency. This definitely ups our ante to do things much faster, in near real time.
Gardner: Obviously, the sensors, probes, agents, and the ability to pull in the information from the network needs to reside or be at close proximity to the network, but how are you actually deployed? Where does the infrastructure for doing the data analysis reside? Is it in the networks themselves, or is there a remote site? Maybe you could just lay out the architecture of how this is set up.
Alam: We get installed on site. Obviously, the future could change, but right now we're an on-premise solution. We're right where the data is being generated, where it’s flowing, and because of that we're able to gain access to the data in real-time.
One of the things we learned is that this is a tremendous amount of data. It doesn't make sense for us to just hold it and assume that we will do something interesting with it afterward.
The way we've approached our customers is to say, "What kind of value do you see in this data? What kind of metrics or key performance indicators (KPIs), or what do you think is valuable in this data?" We then build a framework that defines the value that they can gain from data -- what are the metrics and what kind of structure they want to apply to this data. We're not just calculating metrics, but we're also applying some sort of model that gives this data some structure.
As they go through what we call the Empirix Intelligent Data Mediation and Correlation (IDMC) system, it's really an analytics calculator. It's putting our data into the Vertica system, so that at that point we have meaningful, actionable data that can be used to trigger alarms, to showcase thresholds, to give customers great insight to what's going on in their network.
Growing the business
From that, they can do various things, such as solve problems proactively, reach out to the customers to deal with those issues, or to make better investments with their technology in order to grow their business.
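As an illustration of the mediation step Alam describes, in which discrete records are turned into KPIs and then compared against thresholds to trigger alarms, here is a minimal Python sketch. It is not Empirix's IDMC implementation or any Vertica API. The record fields, KPI names, and threshold values are all hypothetical and chosen only for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    """One discrete record from a probe (hypothetical fields)."""
    cell_id: str
    setup_ok: bool
    latency_ms: float


def compute_kpis(records):
    """Aggregate discrete records into per-cell KPIs."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.cell_id].append(record)

    kpis = {}
    for cell_id, recs in grouped.items():
        count = len(recs)
        kpis[cell_id] = {
            "setup_success_rate": sum(r.setup_ok for r in recs) / count,
            "avg_latency_ms": sum(r.latency_ms for r in recs) / count,
        }
    return kpis


# Hypothetical thresholds: "min" means the KPI must not fall below the limit,
# "max" means it must not rise above it.
THRESHOLDS = {
    "setup_success_rate": ("min", 0.95),
    "avg_latency_ms": ("max", 200.0),
}


def find_alarms(kpis, thresholds=THRESHOLDS):
    """Return (cell_id, kpi_name, value) for every breached threshold."""
    alarms = []
    for cell_id, metrics in sorted(kpis.items()):
        for name, (kind, limit) in thresholds.items():
            value = metrics[name]
            breached = value < limit if kind == "min" else value > limit
            if breached:
                alarms.append((cell_id, name, value))
    return alarms


records = [
    CallRecord("cell-A", True, 120.0),
    CallRecord("cell-A", True, 140.0),
    CallRecord("cell-B", False, 300.0),
    CallRecord("cell-B", True, 260.0),
]
alarms = find_alarms(compute_kpis(records))
for cell_id, name, value in alarms:
    print(f"ALARM {cell_id}: {name} = {value}")
```

In this sample, only cell-B breaches its thresholds. In a deployment like the one described, the aggregation would more plausibly run inside the analytics database rather than in application code; the sketch only shows the shape of the calculation.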
Gardner: How long have you been using Vertica and how did that come to be the choice that you made? Perhaps you could also tell us a little bit about where you see things going in terms of other capabilities that you might need or a roadmap for you?
Alam: We've been using Vertica for a few years, at least three or four, even before I came on-board. And we're using Vertica primarily for its ability to input and read data very quickly. We knew that, given our solutions, we needed to load a lot of data into the system and then read a lot of data out of it fast and to do it at the same time.
At that time, the database systems we used just couldn't meet the demands for the ever-growing data. So we leveraged Vertica there, and it was used more as an operational data store. When I came on board about a year-and-a-half ago, we wanted to evolve our use of Vertica to be not just for data warehousing, but a hybrid, because we knew that in supporting a lot of different types of data, it was very hard for us to structure all of those types of data.
We wanted to create a framework from which we can define measures and metrics and KPIs and store it in a more flat system from which we can apply various models to make sense of that data.
That really presented us a lot of challenges, not only in scalability, but our ability to work and play with data in various ways. Ultimately, we wanted to allow customers to play with this data at will and to get response in seconds, not hours or minutes.
It required us to look at how we could leverage Vertica as an intelligent data-storage system from which we could process data, store it, and then get answers out of that data very, very quickly. Again, we were looking for responses in a second or so.
Now that we've put all of our data in the data basket, so to speak, with Vertica, we wanted to take it to the next level. We have all this data, both looking at the whole data value chain from discrete data to aggregate data all in one place, with conforming dimensions, where the one truth of that data exists in one system.
We want to take it to the next step. Can we increase our analytical capabilities with the data? Can we find that signal from the noise now that we have all this data? Can we proactively find the patterns in the data, what's contributing to that problem, surface that to our customers, and reduce the noise that they are presented with?
Instead of showing them that 50 things are wrong, can I show them that 50 things are wrong, but that these one or two issues are actually impacting your network or your subscribers the most? Can we proactively tell them what might be the cause or the reason toward that and how to solve it?
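The idea of surfacing the one or two issues that matter most out of many can be sketched as a simple impact ranking. This is only an illustration under stated assumptions, not the method Empirix uses. The `Issue` fields, the impact score (affected subscribers times a severity weight), and the 80 percent coverage cutoff are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """A detected problem (hypothetical fields)."""
    name: str
    affected_subscribers: int
    severity: int  # hypothetical weight, 1 (minor) to 5 (critical)


def impact(issue):
    """Hypothetical impact score: subscribers affected times severity."""
    return issue.affected_subscribers * issue.severity


def top_issues(issues, share=0.8):
    """Return the highest-impact issues that together cover `share` of total impact."""
    ranked = sorted(issues, key=impact, reverse=True)
    total = sum(impact(i) for i in ranked)
    picked, covered = [], 0
    for issue in ranked:
        # Stop once the issues picked so far explain enough of the total impact.
        if total == 0 or covered >= share * total:
            break
        picked.append(issue)
        covered += impact(issue)
    return picked


issues = [
    Issue("billing-sync-lag", 200, 1),
    Issue("core-router-congestion", 40000, 4),
    Issue("cell-17-handover", 800, 2),
    Issue("dns-timeouts", 15000, 3),
]
picked = top_issues(issues)
for issue in picked:
    print(issue.name, impact(issue))
```

All four issues remain available for drill-down, but only the two that account for most of the impact are surfaced first.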
The faster we can load this data, the faster we can retrieve the value out of this data and find that needle in the haystack. That’s where the future resides for us.
Gardner: Clearly, you're creating value and selling insight to the network to your customers, but I know other organizations have also looked at data as a source of revenue in itself. The analysis could be something that you could market. Is there an opportunity with the insight you have in various networks -- maybe in some aggregate fashion -- to create analysis of behavior, network use, or patterns that would then become a revenue source for you, something that people would subscribe to perhaps?
Alam: That's a possibility. Right now, our business has been all about empowering our customers and giving them the ability to leverage that data for their end use. You can imagine, as a service provider, having great insight into their customers and the over-the-top applications that are being leveraged on their network.
Could they then use our analytics and the metadata that we're generating about their network to empower their business systems and their operations to make smarter decisions? Can they change their marketing strategy or even their APIs about how they service customers on their network to take advantage of the data that we are providing them?
The opportunity to grow other business opportunities from this data is tremendous, and it's going to be exciting to see what our customers end up doing with their data.
Gardner: Are there any metrics of success that are particularly important for you? You've mentioned, of course, scale and volume, but things like concurrency, the ability to do queries from different places by different people at the same time, are important. Help me understand what some of the other important elements of a good, strong data-analysis platform would be for you.
Alam: Concurrency is definitely important. For us it's about predictability or linear scalability. We know that when we do reach those types of scenarios to support, let’s say, 10 concurrent users or 100 concurrent users, or to support a greater segmentation of data, because we have gone from 10 terabytes to 30 terabytes, we don't have to change a line of code. We don't have to change how or what we are doing with our data. Linear scalability, especially on commodity hardware, gives us the ability to take our solution and expand it at will, in order to deal with any type of bottlenecks.
Obviously, over time, we'll tune it so that we get better performance out of the hardware or virtual hardware that we use. But we know that when we do hit these bottlenecks, and we will, there is a way around that and it doesn't require us to recompile or rebuild something. We just have to add more nodes, whether it’s virtual or hardware.
You may also be interested in:
- With big data, the DNC turns politics into political science
- Need for quality and speed powers Sentara's applications modernization journey
- Big data changes the customer analysis game for Yammer, Spil Games, Jobrapido
- HP's global CISO Brett Wahlin on the future of security and risk
- Advanced IT monitoring Delivers Predictive Diagnostics Focus to United Airlines
- HP Vertica Architecture Gives Massive Performance Boost to Toughest BI Queries for Infinity Insurance
- Unum Group architect charts a DevOps course to a hybrid cloud future
- Application development efficiencies drive Agile payoffs for healthcare tech provider TriZetto
- How MZI HealthCare identifies big data patient productivity gems using HP Vertica
- Podcast recap: HP Experts analyze and explain the HAVEn big data news from HP Discover
- HP's Project HAVEn rationalizes HP's portfolio while giving businesses a path to total data analysis