
Do Your Big Data Analytics Measure Or Predict?


Organizations have key business processes that they are constantly trying to re-engineer. These key business processes – loan approvals, college applications, mortgage underwriting, product and component testing, credit applications, medical reviews, employee hiring, environmental testing, requests for proposals, contract bidding, etc. – go through multiple steps, usually involving multiple people with different skill sets, with a business outcome at the end (accept/reject, bid/no bid, pass/fail, retest, reapply, etc.). And while these processes typically include “analytics” that report on how well the process worked (process effectiveness), those analytics only provide an “after the fact” view of what happened.

Instead of using analytics to measure how well the process worked, how about using predictive and prescriptive analytics to actually direct the process at the beginning?  Instead of analytics that tell you what happened, how about creating analytics at the front of the process that predict what steps in the process are necessary and in what order?  Sometimes the most effective process is the process that you don’t need to execute, or only have to execute in part.

High-Tech Manufacturing Testing Example
We had an engagement with a high-tech manufacturer to help them leverage analytics and the Internet of Things to optimize their 22-step product and component testing process.  Not only was a significant amount of capital tied up in their in-process inventory, but the lengthy testing process also created concerns about excess and obsolete inventory in an industry where product changes happen constantly.

The manufacturer had lots of data coming off of the testing processes, but that data was being used after the fact to tell them where the testing had or had not been successful.  Instead, our approach was to use that data to predict which tests needed to be run for which components coming from which of the suppliers’ manufacturing facilities.  Instead of measuring what happened and identifying waste and inefficiencies after the fact, the manufacturer wanted to predict the likely quality of each component (given the extensive amount of data that they could be capturing but that was, at the time, hitting the manufacturing and testing floors) and identify which tests were needed in that particular situation.  Think dynamic, or even smart, testing.

We worked with the client to identify all the data coming out of the different testing processes.  We discovered that nearly 90% of the potential data was simply “hitting the floor” because the organization had no method for capturing and subsequently analyzing it (most of it was either very detailed log files, or comments and notes generated by the testers, engineers and technicians during the testing processes).  At a conceptual level, their processing looked like Figure 1: traditional dashboards and reports that answered the basic operational questions about how the process was working.

Figure 1: Analytics That Tell You How Each Step Performed
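
As a concrete illustration of what capturing the data that was “hitting the floor” might look like, here is a minimal, hypothetical sketch of turning free-text tester notes into structured features that can be joined to the test-result records. The column names and keywords are invented for illustration; this is not the manufacturer’s actual pipeline.

```python
# A minimal, hypothetical sketch: turn free-text tester notes (data that would
# otherwise "hit the floor") into structured features. Column names and
# keywords are invented for illustration.
import pandas as pd

notes = pd.DataFrame({
    "component_id": ["C-001", "C-002", "C-003"],
    "tester_note": [
        "slight vibration at step 7, re-seated fixture",
        "passed clean, no anomalies",
        "intermittent contact failure; retest after recalibration",
    ],
})

# Crude first-pass features: keyword flags plus note length. A real project
# would use proper text mining, but even crude features beat discarding the notes.
keywords = ["vibration", "intermittent", "recalibration", "re-seated"]
for kw in keywords:
    col = "note_has_" + kw.replace("-", "_")
    notes[col] = notes["tester_note"].str.contains(kw, case=False).astype(int)
notes["note_length"] = notes["tester_note"].str.len()

print(notes.drop(columns="tester_note"))
```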

However, we wanted to transform the role of analytics beyond just reporting how the process was working: we wanted to employ predictive analytics to create a score for each component, and then use prescriptive analytics to recommend which tests had to be run, and in what order, given that predictive score.

We used a technique called the “By Analysis” to brainstorm the “variables and metrics that might be better predictors of testing performance”.  For example, when examining components with a high failure rate, we started the brainstorming process with the following question:

“Show me the percentage of component failures by…”

We asked the workshop participants to brainstorm the “by” variables and metrics that we might want to test.  Here are some of the results:

  • Show me the percentage of component failures by Tester/Technician (age, gender, years of experience, historical performance ratings, most recent training date, most recent testing certifications, months of experience by test machine, level of job satisfaction, years until retirement, etc.)
  • Show me the percentage of component failures by Test Machine (manufacturer, model, install date, last service date, ideal capacity, maximum capacity, test machine operational complexity, etc.)
  • Show me the percentage of component failures by Component (component type, supplier, supplier manufacturing facility, supplier manufacturing machine, supplier component testing results, etc.)
  • Show me the percentage of component failures by Manufacturer (years of service, historical performance, location, distance from distribution center, manufacturing location, lot number, storage location, etc.)
  • Show me the percentage of component failures by Distribution Center (location, build date, last remodel date, local temperature, local humidity, local economic conditions, etc.)
  • Show me the percentage of component failures by Time (Time of day/Day of Week, Time of year, local holidays, seasonality, etc.)
  • Show me the percentage of component failures by Weather (precipitation, temperature, humidity, airborne particles, pollution, severe storms, etc.)
  • Show me the percentage of component failures by Labor unrest (strikes, number of plant injuries, safety issues, etc.)
  • Show me the percentage of component failures by Local economic conditions (average hourly wages, paid overtime, union representation, average hours worked per week, etc.)
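To make the “By Analysis” concrete, here is a minimal sketch of the screening step, assuming a flat table of historical test results with one row per component test. The column names are hypothetical; the point is simply to compute “percentage of component failures by X” for each candidate variable and see which ones actually separate failures from passes.

```python
# A minimal sketch of "By Analysis" screening: percentage of component failures
# by each candidate variable. Column names and data are hypothetical.
import pandas as pd

tests = pd.DataFrame({
    "failed":           [1, 0, 0, 1, 0, 1, 0, 0],
    "supplier":         ["A", "A", "B", "B", "B", "C", "C", "A"],
    "test_machine":     ["M1", "M1", "M2", "M2", "M1", "M2", "M1", "M2"],
    "technician_shift": ["night", "day", "day", "night", "day", "night", "day", "day"],
})

candidate_bys = ["supplier", "test_machine", "technician_shift"]
for by in candidate_bys:
    failure_pct = tests.groupby(by)["failed"].mean().mul(100).round(1)
    print(f"\nPercentage of component failures by {by}:")
    print(failure_pct)
```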

The data science team started building a score that could be used to predict the quality and reliability of a component from a particular supplier, produced on a particular manufacturing machine, at a particular time of day/week/year, under particular weather conditions, tested by a particular technician, and so on.  Yeah, I think you can quickly see how the more detailed data you have, the more accurate the score.
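
For illustration only, a score like this might be built as a predicted probability from historical test outcomes. The minimal sketch below assumes a table of component attributes plus the eventual pass/fail result; the features, data and model choice are hypothetical, not the team’s actual model.

```python
# A minimal, hypothetical sketch of building a quality/reliability score as the
# predicted probability that a component passes all tests. Features, data and
# model choice are illustrative only.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

history = pd.DataFrame({
    "supplier":        ["A", "B", "A", "C", "B", "C", "A", "B"] * 25,
    "machine_age_yrs": [1, 7, 2, 5, 6, 4, 1, 8] * 25,
    "humidity_pct":    [35, 60, 40, 55, 65, 50, 30, 70] * 25,
    "passed_all":      [1, 0, 1, 1, 0, 1, 1, 0] * 25,   # historical outcome
})

X = pd.get_dummies(history.drop(columns="passed_all"), columns=["supplier"])
y = history["passed_all"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# The "score" for a new component is its predicted probability of passing.
quality_score = model.predict_proba(X_test)[:, 1]
print(quality_score[:5].round(3))
```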

We were able to create this “Component Quality & Reliability” score that we could use prior to the testing process to tell us what tests we needed to conduct and in what order with a reasonable level of risk (see Figure 2).

Figure 2:  Analytics That Prescribe What Tests to Run

By using the Component Quality & Reliability score, we could predict ahead of time which tests we thought we would need to run, and in what order.  The result was a dramatic improvement in the speed and cost of testing, with a minor but manageable increase in component performance risk.
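
A minimal sketch of the prescriptive half might look like the following: translate the score into a recommended test plan. The thresholds and test lists below are invented for illustration; in practice they would be tuned against the acceptable increase in component-performance risk.

```python
# A minimal, hypothetical sketch: map a Component Quality & Reliability score to
# an ordered test plan. Thresholds and test lists are invented for illustration.
def recommend_test_plan(quality_score):
    """Return the ordered list of tests to run for one component."""
    full_suite = [f"test_{i:02d}" for i in range(1, 23)]   # the full 22-step suite
    if quality_score >= 0.95:
        return ["test_01", "test_07", "test_22"]           # spot-check only
    if quality_score >= 0.80:
        return full_suite[:10]                             # abbreviated sequence
    return full_suite                                      # run everything

for score in (0.97, 0.85, 0.42):
    plan = recommend_test_plan(score)
    print(f"score={score:.2f} -> run {len(plan)} tests, starting with {plan[:3]}")
```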

Baseball Analogy
I love sports, particularly baseball.  Baseball has always been a game played with statistics, averages and probabilities.  There are lots of analytics best practices that we can learn from the game of baseball, and one of the ways analytics is used in baseball is to determine the likelihood of your opponent doing something.

For example, the best baseball fielders understand each individual batter’s tendencies, propensities, averages, statistics and preferences (e.g., where the batter is likely to hit the ball, which pitches he prefers to hit) and use that information to position themselves on the field where the ball is most likely to be hit.

Then the fielder will make in-game, pitch-by-pitch adjustments based upon:

  • Field dimensions (distance to deep center, distance down the lines, height of the grass, grass or artificial turf, how best to play the Green Monster in Fenway or the vines in Wrigley, etc.)
  • Environment conditions (humidity, temperature, precipitation, wind, position of the sun or lights, etc.)
  • Game situation (number of outs, the inning, score, runners in scoring position, time of day/night, etc.)
  • Pitcher preferences, propensities and in-game effectiveness (getting ahead in the count, number of pitches thrown, current velocity, effectiveness of off-speed pitches, etc.)
  • And even more…

The same approach – predicting ahead of time what is likely to happen – works for many business processes, such as underwriting a loan or mortgage.  We would want to learn as much as possible about the players involved in the underwriting process – borrower, property, lenders, appraisers, underwriters – so that we could build a “score” that predicts which steps in the process are important and necessary, and which ones could be skipped without significantly increasing the risk.  For example:

  • Which borrowers have a history of on-time payments, have a sufficient base of assets, have a solid job and salary outlook, don’t pose any retirement cash flow risks, have a reasonable number of dependents (children and potentially parents), etc.
  • Which properties are over-valued relative to similar properties or properties within the same area, or which properties reside in high storm or weather risk areas (probability or likelihood of hurricanes, tornadoes, floods, forest fires, earthquakes, zodiac killers, etc.).
  • Which lenders have a history of good loans, have a solid financial foundation, have a solid management team, aren’t in the news for any management shenanigans, don’t have questionable investments, etc.
  • Which appraisers are most effective with what types and locations of properties for what types of loans in what economic situations, have a significant track record of success, have been with the same firm for a reasonable amount of time, have a solid educational background, etc.
  • Which underwriters are most effective with which types of loans for which types of properties, are happy with their job and family situation, have been on the job a reasonable amount of time, have solid performance ratings, don’t have any HR issues, etc.

With this information in hand, we’d be better prepared to know which types of credit applications need what level of scrutiny and what level of risks we would be willing to accept for what level of underwriting return.  Just like the best baseball shortstops and center fielders!
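
As a rough illustration of how those borrower, property, lender, appraiser and underwriter signals might roll up into a single underwriting score that decides which steps to run, here is a minimal sketch. The weights, sub-scores, thresholds and step names are purely hypothetical.

```python
# A minimal, hypothetical sketch: roll sub-scores into one underwriting score and
# use it to decide which review steps can be skipped. Weights, names and
# thresholds are invented for illustration.
WEIGHTS = {
    "borrower_strength":      0.35,
    "property_risk_inverse":  0.25,   # higher = lower property risk
    "lender_track_record":    0.15,
    "appraiser_track_record": 0.15,
    "underwriter_fit":        0.10,
}

def underwriting_score(signals):
    """Weighted sum of 0-1 sub-scores; 1.0 = lowest-risk application."""
    return sum(WEIGHTS[k] * signals[k] for k in WEIGHTS)

def required_steps(score):
    if score >= 0.90:
        return ["automated_checks"]                        # fast-track
    if score >= 0.70:
        return ["automated_checks", "desk_review"]
    return ["automated_checks", "desk_review", "full_manual_underwrite", "second_appraisal"]

application = {
    "borrower_strength": 0.95, "property_risk_inverse": 0.80,
    "lender_track_record": 0.90, "appraiser_track_record": 0.85, "underwriter_fit": 0.75,
}
score = underwriting_score(application)
print(round(score, 3), required_steps(score))
```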

A few things to remember about using analytics to predict what process steps need to be executed and in what order, and which steps can be reduced or skipped given a reasonable increase in risk:

  • The “By Analysis” will fuel creative thinking about additional data sources that are collected (somewhere) but were previously considered unusable, such as technician comments, technician notes, work orders, product specifications, etc. It is important to remember that all ideas are worthy of consideration.  Let the data science team determine which variables and metrics are actually worthy of inclusion in the score or model.
  • Through the use of instrumentation or tagging of each step, click and action taken by someone in the process, organizations can start to capture even more detailed and granular data about the testing or application processes. Remember:  more detailed data cannot hurt!
  • Another challenge is the discrete nature of the results – pass/fail/retest. Instead of just three states, start contemplating capturing the results along a continuum (“the test was 96.5% successful” versus “the test was successful”).  That additional granularity in the results could prove invaluable in building and fine-tuning your scores and analytic models.
  • Be sure to consider Type I and Type II errors in the planning process when determining the criticality and importance of each component or application. Not all components deserve or require the same level of testing.  For example, the components that keep the cushion of the seat of an airplane in place are not nearly as important as the components that keep the engine on the airplane wing in place (see the blog “Understanding Type I and Type II Errors” for more on Type I and Type II errors). A small cost-based sketch of this idea follows below.
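
Here is that cost-based sketch. The costs and probabilities are invented; the point is that the same predicted defect probability can justify skipping tests for a low-criticality part while still demanding the full suite for a high-criticality one.

```python
# A minimal, hypothetical sketch: weigh the Type I cost (testing/rejecting a good
# part) against the Type II cost (shipping a bad part) by component criticality.
# All costs and probabilities are invented for illustration.
COST_OF_TESTING = 40                   # Type I side: cost of running the full suite
FALSE_ACCEPT_COST = {                  # Type II side: cost of shipping a defect
    "seat_cushion_clip": 500,
    "engine_mount": 5_000_000,
}

def should_test(component_class, predicted_defect_prob):
    """Test whenever the expected cost of a missed defect exceeds the cost of testing."""
    return predicted_defect_prob * FALSE_ACCEPT_COST[component_class] > COST_OF_TESTING

for part in FALSE_ACCEPT_COST:
    for p in (0.001, 0.01):
        decision = "test" if should_test(part, p) else "skip"
        print(f"{part}, defect probability {p:.3f} -> {decision}")
```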

Using analytics to predict which components or applications need to be tested, versus using analytics to merely measure process effectiveness, can provide an order-of-magnitude improvement in your key business processes.  In the long term, it’s the analytics emitted from your key business processes (yielding superior customer, product, operational and market insights) that will differentiate your business.

The post Do Your Big Data Analytics Measure Or Predict? appeared first on InFocus.

More Stories By William Schmarzo

Bill Schmarzo, author of “Big Data: Understanding How Data Powers Big Business”, is responsible for setting the strategy and defining the Big Data service line offerings and capabilities for the EMC Global Services organization. As part of Bill’s CTO charter, he is responsible for working with organizations to help them identify where and how to start their big data journeys. He has written several white papers, is an avid blogger, and is a frequent speaker on the use of Big Data and advanced analytics to power organizations’ key business initiatives. He also teaches the “Big Data MBA” at the University of San Francisco School of Management.

Bill has nearly three decades of experience in data warehousing, BI and analytics. Bill authored EMC’s Vision Workshop methodology that links an organization’s strategic business initiatives with their supporting data and analytic requirements, and co-authored with Ralph Kimball a series of articles on analytic applications. Bill has served on The Data Warehouse Institute’s faculty as the head of the analytic applications curriculum.

Previously, Bill was the Vice President of Advertiser Analytics at Yahoo and the Vice President of Analytic Applications at Business Objects.
