Do Your Big Data Analytics Measure Or Predict?

Organizations have key business processes that they are constantly trying to re-engineer. These key business processes – loan approvals, college applications, mortgage underwriting, product and component testing, credit applications, medical reviews, employee hiring, environmental testing, requests for proposals, contract bidding, etc. – go through multiple steps, usually involving multiple people with different skill sets, with a business outcome at the end (accept/reject, bid/no bid, pass/fail, retest, reapply, etc.). And while these processes typically include “analytics” that report on how well the processes worked (process effectiveness), the analytics only provide an “after the fact” view on what happened.

Instead of using analytics to measure how well the process worked, how about using predictive and prescriptive analytics to actually direct the process at the beginning?  Instead of analytics that tell you what happened, how about creating analytics at the front of the process that predict what steps in the process are necessary and in what order?  Sometimes the most effective process is the process that you don’t need to execute, or only have to execute in part.

High-Tech Manufacturing Testing Example
We had an engagement with a high-tech manufacturer to help them to leverage analytics and the Internet of Things to optimize their 22-step product and component testing process.  Not only was there a significant amount of capital tied up with their in-process inventory, but the lengthy testing processes also created concerns about excessive and obsolete inventory in an industry where product changes happened constantly.

The manufacturer had lots of data coming off of the testing processes, but the data was being used after the fact to tell them whether the testing was successful or not.  Instead, our approach was to use that data to predict what tests needed to be run for which components coming from which of the suppliers’ manufacturing facilities.  Instead of measuring what happened and identifying waste and inefficiencies after the fact, the manufacturer wanted to predict the likely quality of the component (given the extensive amount of data that they could be capturing but that today was just hitting the manufacturing and testing floors) and identify what tests were needed given that particular situation.  Think dynamic or even smart testing.

We worked with the client to identify all the data that was coming out of all the different testing processes.  We discovered that nearly 90% of the potential data was just “hitting the floor” because the organization did not have a method for capturing and subsequently analyzing this data (most of which was either very detailed log files, or comments and notes being generated by the testers, engineers and technicians during the testing processes).  At a conceptual level, their processing looked like Figure 1, with traditional dashboards and reports that answered the basic operational questions about how the process was working.

Figure 1: Analytics That Tell You How Each Step Performed

However, we wanted to transform the role of analytics from just reporting how the process was working to directing it: employ predictive analytics to create a score for each component, and then use prescriptive analytics to recommend what tests had to be run, and in what order, given the predictive score.

We used a technique called the “By Analysis” to brainstorm the “variables and metrics that might be better predictors of testing performance”.  For example, when examining components with a high failure rate, we started the brainstorming process with the following question:

“Show me the percentage of component failures by…”

We asked the workshop participants to brainstorm the “by” variables and metrics that we might want to test.  Here are some of the results:

  • Show me the percentage of component failures by Tester/Technician (age, gender, years of experience, historical performance ratings, most recent training date, most recent testing certifications, months of experience by test machine, level of job satisfaction, years until retirement, etc.)
  • Show me the percentage of component failures by Test Machine (manufacturer, model, install date, last service date, ideal capacity, maximum capacity, test machine operational complexity, etc.)
  • Show me the percentage of component failures by Component (component type, supplier, supplier manufacturing facility, supplier manufacturing machine, supplier component testing results, etc.)
  • Show me the percentage of component failures by Manufacturer (years of service, historical performance, location, distance from distribution center, manufacturing location, lot number, storage location, etc.)
  • Show me the percentage of component failures by Distribution Center (location, build date, last remodel date, local temperature, local humidity, local economic conditions, etc.)
  • Show me the percentage of component failures by Time (Time of day/Day of Week, Time of year, local holidays, seasonality, etc.)
  • Show me the percentage of component failures by Weather (precipitation, temperature, humidity, airborne particles, pollution, severe storms, etc.)
  • Show me the percentage of component failures by Labor unrest (strikes, number of plant injuries, safety issues, etc.)
  • Show me the percentage of component failures by Local economic conditions (average hourly wages, paid overtime, union representation, average hours worked per week, etc.)
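Each of the “by” questions above boils down to a grouped failure rate.  Here is a minimal sketch in Python, assuming each test result is available as a record with a pass/fail outcome.  The records and field names (`supplier`, `machine`, `shift`, `failed`) are hypothetical, invented for illustration; they are not the manufacturer's data.

```python
from collections import defaultdict

# Hypothetical test records; the field names and values are illustrative only.
records = [
    {"supplier": "A", "machine": "M1", "shift": "day", "failed": False},
    {"supplier": "A", "machine": "M1", "shift": "night", "failed": True},
    {"supplier": "A", "machine": "M2", "shift": "day", "failed": False},
    {"supplier": "A", "machine": "M2", "shift": "day", "failed": False},
    {"supplier": "B", "machine": "M1", "shift": "night", "failed": True},
    {"supplier": "B", "machine": "M1", "shift": "night", "failed": True},
    {"supplier": "B", "machine": "M2", "shift": "day", "failed": False},
    {"supplier": "B", "machine": "M2", "shift": "night", "failed": True},
]


def failure_rate_by(records, key):
    """Return {value: percentage of failed components} for one "by" variable."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for rec in records:
        totals[rec[key]] += 1
        if rec["failed"]:
            failures[rec[key]] += 1
    return {value: 100.0 * failures[value] / totals[value] for value in totals}


# "Show me the percentage of component failures by..."
for key in ("supplier", "machine", "shift"):
    print(key, failure_rate_by(records, key))
```

In this toy data, supplier B fails three times as often as supplier A, which is exactly the kind of signal the brainstorming is meant to surface for the data science team to validate.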

The data science team started building a score that could be used to predict the quality and reliability of a component from a particular supplier created from a particular manufacturing machine at a particular time of day/week/year under particular weather conditions tested by a particular technician, etc.  Yes, I think you can quickly see how the more detailed the data that you have, the more accurate the score can be.
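To make the idea of a score concrete, here is a small sketch that fits a pass probability from categorical “by” variables.  It does not reflect the technique the data science team actually used, which is not described here; logistic regression is simply one common choice, and the training rows, attribute names and hyperparameters are all assumptions made for illustration.

```python
import math

# Hypothetical history: attributes of a tested component and whether it passed (1) or failed (0).
rows = [
    ({"supplier": "A", "shift": "day"}, 1),
    ({"supplier": "A", "shift": "day"}, 1),
    ({"supplier": "A", "shift": "day"}, 0),
    ({"supplier": "A", "shift": "night"}, 1),
    ({"supplier": "A", "shift": "night"}, 0),
    ({"supplier": "B", "shift": "day"}, 1),
    ({"supplier": "B", "shift": "day"}, 0),
    ({"supplier": "B", "shift": "night"}, 1),
    ({"supplier": "B", "shift": "night"}, 0),
    ({"supplier": "B", "shift": "night"}, 0),
]


def features(attrs):
    """One-hot encode the attributes as 'name=value' keys, plus a bias term."""
    return ["bias"] + [f"{name}={value}" for name, value in sorted(attrs.items())]


def score(weights, attrs):
    """Predicted probability (0 to 1) that a component with these attributes passes."""
    z = sum(weights.get(key, 0.0) for key in features(attrs))
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=1000, lr=0.1):
    """Fit logistic-regression weights with plain batch gradient ascent."""
    weights = {}
    for _ in range(epochs):
        gradient = {}
        for attrs, passed in rows:
            error = passed - score(weights, attrs)
            for key in features(attrs):
                gradient[key] = gradient.get(key, 0.0) + error
        for key, g in gradient.items():
            weights[key] = weights.get(key, 0.0) + lr * g
    return weights


weights = train(rows)
for attrs in ({"supplier": "A", "shift": "day"}, {"supplier": "B", "shift": "night"}):
    print(attrs, round(score(weights, attrs), 2))
```

A real score would draw on far more variables (technician, test machine, weather and so on) and would be validated against held-out data before anyone trusted it to skip a test.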

We were able to create this “Component Quality & Reliability” score that we could use prior to the testing process to tell us what tests we needed to conduct and in what order with a reasonable level of risk (see Figure 2).

Figure 2:  Analytics That Prescribe What Tests to Run

By using the Component Quality & Reliability score, we could determine or predict ahead of time what tests we thought we would need to run and in what order.  The result was a dramatic improvement in the speed and cost of testing, with a minor but manageable increase in component performance risk.
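The prescriptive step can be sketched as a simple rule layer on top of the score.  In this sketch, every test name, cost, catch rate and skip threshold is a made-up placeholder, and the ordering rule (cheapest, highest-yield tests first) is just one plausible policy, not the one used in the engagement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentTest:
    name: str
    cost_minutes: float
    catch_rate: float  # hypothetical share of defects this test has caught historically
    skip_above: float  # skip when score >= this value; above 1.0 means always run


# Hypothetical catalog standing in for the 22-step testing process.
CATALOG = [
    ComponentTest("visual_inspection", 2, 0.10, 1.1),
    ComponentTest("thermal_cycle", 45, 0.30, 0.90),
    ComponentTest("vibration", 30, 0.25, 0.80),
    ComponentTest("burn_in", 120, 0.20, 0.60),
]


def plan_tests(score, catalog=CATALOG):
    """Pick the tests a component still needs, ordered by defects caught per minute."""
    selected = [test for test in catalog if score < test.skip_above]
    return sorted(selected, key=lambda t: t.catch_rate / t.cost_minutes, reverse=True)


for score in (0.95, 0.85, 0.50):
    print(score, [test.name for test in plan_tests(score)])
```

A high-scoring component gets only the mandatory inspection, while a low-scoring one gets the full battery; the skip thresholds are where the “manageable increase in risk” gets decided.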

Baseball Analogy
I love sports, particularly baseball.  Baseball has always been a game played with statistics, averages and probabilities.  There are lots of analytics best practices that we can learn from the game of baseball.  And one of the ways that analytics is used in baseball is to determine the likelihood of your opponent doing something.

For example, the best baseball fielders understand each individual batter’s tendencies, propensities, averages, statistics and preferences (e.g., where the batter is likely to hit the ball, what pitches he prefers to hit) and use that information to position themselves on the field where the ball is most likely to be hit.

Then the fielder will make in-game, pitch-by-pitch adjustments based upon:

  • Field dimensions (distance to deep center, distance down the lines, height of the grass, grass or artificial turf, how best to play the Green Monster in Fenway or the vines in Wrigley, etc.)
  • Environment conditions (humidity, temperature, precipitation, wind, position of the sun or lights, etc.)
  • Game situation (number of outs, the inning, score, runners in scoring position, time of day/night, etc.)
  • Pitcher preferences and propensities and in-game effectiveness (getting ahead in the count, number of pitches thrown, current velocity, effectiveness of off speed pitches, etc.)
  • And even more…

The same approach – predicting ahead of time what is likely to happen – works for many business processes, such as underwriting a loan or mortgage.  We would want to learn as much as possible about the players involved in the underwriting process – borrower, property, lenders, appraisers, underwriters – so that we could build a “score” that predicts which steps in the process are important and necessary, and which ones could be skipped without significantly increasing the risk.  For example:

  • Which borrowers have a history of on-time payments, have a sufficient base of assets, have a solid job and salary outlook, don’t pose any retirement cash flow risks, have a reasonable number of dependents (children and potentially parents), etc.
  • Which properties are over-valued given the value of similar properties or properties within the same area, or which properties reside in high storm or weather risk areas (probability or likelihood of hurricanes, tornadoes, floods, forest fires, earthquakes, zodiac killers, etc.).
  • Which lenders have a history of good loans, have a solid financial foundation, have a solid management team, aren’t in the news for any management shenanigans, don’t have questionable investments, etc.
  • Which appraisers are most effective with what types and locations of properties for what types of loans in what economic situations, have a significant track record of success, have been with the same firm for a reasonable amount of time, have a solid educational background, etc.
  • Which underwriters are most effective with which types of loans for which types of properties, are happy with their job and family situation, have been on the job a reasonable amount of time, have solid performance ratings, don’t have any HR issues, etc.

With this information in hand, we’d be better prepared to know which types of credit applications need what level of scrutiny and what level of risks we would be willing to accept for what level of underwriting return.  Just like the best baseball shortstops and center fielders!

A few things to remember about using analytics to predict what process steps need to be executed and in what order, and which steps can be reduced or skipped given a reasonable increase in risk:

  • The “By Analysis” will fuel creative thinking about additional data sources that are collected (somewhere) but were previously considered unusable, such as technician comments, technician notes, work orders, product specifications, etc. It is important to remember that all ideas are worthy of consideration.  Let the data science team determine which variables and metrics are actually worthy of inclusion in the score or model.
  • Through the use of instrumentation or tagging of each step, click and action taken by someone in the process, organizations can start to capture even more detailed and granular data about the testing or application processes. Remember:  more detailed data cannot hurt!
  • Another challenge is the coarse, categorical nature of the results – pass/fail/retest. Instead of just three states, start contemplating capturing the results along a continuum (“the test was 96.5% successful” versus “the test was successful”).  That additional granularity in the results could prove invaluable in building and fine-tuning your scores and analytic models.
  • Be sure to consider Type I and Type II errors in the planning process to determine the criticality and importance of each component or application. Not all components deserve or require the same level of testing.  For example, the components that keep the cushion of the seat of an airplane in place are not nearly as important as the components that keep the engine on the airplane wing in place (see blog “Understanding Type I and Type II Errors” for more on Type I and Type II Errors).
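The Type I / Type II trade-off in the last point can be sketched as picking a different “safe to skip” score threshold for each component class.  This sketch assumes the null hypothesis is “the component is good”, so a Type I error is testing a good component unnecessarily and a Type II error is skipping tests on a defective one.  The history and the cost figures are hypothetical.

```python
# Hypothetical history: (quality score, was the component actually defective?)
history = [
    (0.95, False), (0.90, False), (0.85, False), (0.80, True),
    (0.75, False), (0.70, False), (0.60, True), (0.50, True),
]


def expected_cost(threshold, history, cost_unneeded_test, cost_missed_defect):
    """Total cost if every component scoring >= threshold skips testing."""
    cost = 0.0
    for score, defective in history:
        skipped = score >= threshold
        if skipped and defective:
            cost += cost_missed_defect  # Type II: a defect slips through untested
        elif not skipped and not defective:
            cost += cost_unneeded_test  # Type I: a good part is tested anyway
    return cost


def best_threshold(history, cost_unneeded_test, cost_missed_defect):
    """Try each observed score (plus 'never skip') and keep the cheapest threshold."""
    candidates = sorted({score for score, _ in history}) + [1.01]
    return min(
        candidates,
        key=lambda t: expected_cost(t, history, cost_unneeded_test, cost_missed_defect),
    )


# Seat-cushion fastener: a missed defect costs little more than a test.
print(best_threshold(history, cost_unneeded_test=10, cost_missed_defect=15))
# Engine mount: a missed defect is catastrophic, so far fewer components may skip.
print(best_threshold(history, cost_unneeded_test=10, cost_missed_defect=10000))
```

The same score therefore leads to different testing decisions depending on how expensive a miss is, which is the point of weighing criticality up front.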

Using analytics to predict what components or applications need to be tested, versus using analytics to measure process effectiveness, can provide an order-of-magnitude improvement in your key business processes.  In the long-term, it’s the analytics emitted from your key business processes (yielding superior customer, product, operational and market insights) that will differentiate your business.

The post Do Your Big Data Analytics Measure Or Predict? appeared first on InFocus.

More Stories By William Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business”, is responsible for setting the strategy and defining the Big Data service line offerings and capabilities for the EMC Global Services organization. As part of Bill’s CTO charter, he is responsible for working with organizations to help them identify where and how to start their big data journeys. He has written several white papers, is an avid blogger, and is a frequent speaker on the use of Big Data and advanced analytics to power organizations’ key business initiatives. He also teaches the “Big Data MBA” at the University of San Francisco School of Management.

Bill has nearly three decades of experience in data warehousing, BI and analytics. Bill authored EMC’s Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements, and co-authored with Ralph Kimball a series of articles on analytic applications. Bill has served on The Data Warehouse Institute’s faculty as the head of the analytic applications curriculum.

Previously, Bill was the Vice President of Advertiser Analytics at Yahoo and the Vice President of Analytic Applications at Business Objects.
