
Excel for #BigData Analysis | @CloudExpo #BI #Analytics #DigitalTransformation

How propelling instant results to the Excel edge democratizes advanced analytics


The next BriefingsDirect Voice of the Customer digital transformation case study explores how powerful and diverse financial information is newly and uniquely delivered to the ubiquitous Excel spreadsheet edge.

We'll explore how HTI Labs in London provides the means and governance with its Schematiq tool to bring critical data services to the interface users want. By leveraging the best of instant cloud-delivered data with spreadsheets, Schematiq democratizes end-user empowerment while providing powerful new ways to harness and access complex information.

To learn how complex cloud core-to-edge processes and benefits can be better managed and exploited we're joined by Darren Harris, CEO and Co-Founder of HTI Labs, and Jonathan Glass, CTO and Co-Founder of HTI Labs, based in London. The discussion is moderated by BriefingsDirect's Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Let's put some context around this first. What major trends in the financial sector led you to create HTI Labs, and what are the problems you're seeking to solve?


Harris: Obviously, in finance, spreadsheets are widespread and are used for a wide variety of problems. The real issue started a number of years ago, when spreadsheets got out of control. People were using them everywhere, creating a lot of operational risk in processes. Organizations wanted to get their arms around that for governance, and there were loads of Excel-type issues that needed to be eradicated.

That led to the creation of centralized teams that locked down rigid processes and effectively took away a lot of the innovation and discovery process that traders use to spot opportunities and explore data.

Through this process, we're trying to help with governance to understand the tools to explore, and [deliver] the ability to put the data in the hands of people ... [with] the right balance.

So by taking the best of regulatory scrutiny around what a person needs, and some innovation that we put into Schematiq, we see an opportunity to take Excel to another level -- but not sacrifice the control that’s needed.

Gardner: Jonathan, are there technology trends that allowed you to be able to do this, whereas it may not have been feasible economically or technically before?

Upstream capabilities

Glass: There are a lot of really great back-end technologies available now, along with the ability to scale compute resources either internally or externally. Essentially, the desktop remains quite similar. Excel has stayed much the same, but the upstream capabilities have really grown.


So there's a challenge. Data that people feel they should have access to is getting bigger, more complex, and less structured. Excel, which is a great front-end for coming to grips with data, is becoming a bit of a bottleneck in keeping up with the data out there that people want.

Gardner: So, we're going to keep Excel. We're not going to throw the baby out with the bathwater, so to speak, but we are going to do something a little bit different and interesting. What is it that we're now putting into Excel and how is that different from what was available in the past?

Harris: Schematiq extends Excel and allows it to access unstructured data. It also reduces the complexity and technical limitations that Excel has as an out-of-the-box product.

We have the notion of a data link that sits in a single cell and allows you to reference data that’s held externally in a back-end system. So, where people used to ingest data from another system directly into Excel, and effectively divorce it from the source, we can leave that data where it is.

It's a paradigm of taking the question to the data, rather than pulling the data to the question. That means we can leverage the power of big-data platforms and analytic databases on the back-end, while effectively using Excel as the front screen. You ask questions from Excel, but push the query to the back-end. That's very different from the model most people are used to working with in Excel.
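Schematiq's internals aren't public, but the "take the question to the data" idea can be sketched in a few lines. In this illustrative example, SQLite stands in for the back-end analytic database, and the table and function names are invented for the sketch:

```python
import sqlite3

# An in-memory stand-in for a back-end analytic database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (desk TEXT, notional REAL)")
conn.executemany("INSERT INTO trades VALUES (?, ?)",
                 [("rates", 10.0), ("rates", 5.0), ("fx", 2.5)])

def pull_then_aggregate(conn, desk):
    # The classic Excel pattern: ingest every row, then sum locally.
    rows = conn.execute(
        "SELECT notional FROM trades WHERE desk = ?", (desk,)).fetchall()
    return sum(r[0] for r in rows)

def push_question_to_data(conn, desk):
    # Push-down pattern: the back-end answers; one number crosses the wire.
    (total,) = conn.execute(
        "SELECT SUM(notional) FROM trades WHERE desk = ?", (desk,)).fetchone()
    return total

# Same answer either way; only where the work happens differs.
assert pull_then_aggregate(conn, "rates") == push_question_to_data(conn, "rates") == 15.0
```

The second function is the shape of the idea: the spreadsheet cell holds only the question and the answer, while the heavy data stays on the server.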

Gardner: This is a two-way street. It's a bit different. And you're also looking at the quality, compliance, and regulatory concerns over that data.

Harris: Absolutely. An end-user is able to break down or decompose any workflow process with data and debug it the same way they can in a spreadsheet. The transparency that we add on top of Excel’s use with Schematiq allows us to monitor what everybody is doing and the function they're using. So, you can give them agility, but still maintain the governance and the control.

In organizations, lots of teams have become disengaged. IT has tried to create some central core platform that’s quite restrictive, and it's not really serving the users. They have become disengaged and have created what Gartner refers to as shadow BI teams, with databases under their desks, and so on.

By bringing in Schematiq we add that transparency back, and we allow IT and the users to have an informed discussion -- a very analytic conversation -- around what they're using, how they're using it, and where the bottlenecks are. Then, they can work out where the best value is. It's all about agility and control. You can't just give self-service tools to an organization and not have the transparency for any oversight or governance.

To the edge

Gardner: So we have, in a sense, brought this core to the edge. We've managed it in terms of compliance and security. Now, we can start to think about how creative we can get with what's on that back-end that we deliver. Tell us a little bit about what you go after, what your users want to experiment with, and then how you enable that.

Glass: We try to be as agnostic to that as we can, because it's the creativity of the end-user that really drives value.

We have a variety of different data sources: traditional relational databases, object stores, OLAP cubes, APIs, web queries, and flat files. People want to bring that stuff together. They want some way to pull this material in from different sources and create something unique. This concept of putting together data that hasn't been put together before is where the sparks start to fly and where the value really comes from.
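As a loose illustration of that "putting together data that hasn't been put together before," here is a minimal sketch, with made-up instrument data, of enriching rows that arrived from one source (say, a web query) with a flat-file extract from another:

```python
import csv
import io

# Source 1: a flat-file extract mapping instruments to desks.
flat_file = io.StringIO("instrument,desk\nBUND,rates\nCABLE,fx\n")
desks = {row["instrument"]: row["desk"] for row in csv.DictReader(flat_file)}

# Source 2: rows as they might come back from an API or web query.
api_rows = [{"instrument": "BUND", "pnl": 1.2},
            {"instrument": "CABLE", "pnl": -0.4}]

# The combination step: enrich one source with the other, keyed on instrument.
combined = [{**row, "desk": desks[row["instrument"]]} for row in api_rows]

assert combined[0] == {"instrument": "BUND", "pnl": 1.2, "desk": "rates"}
```

The join itself is trivial; the point is that neither source alone carries the combined view, which is where the new value appears.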

Gardner: And with Schematiq you're enabling that aggregation and cleansing ability to combine, as well as delivering it. Is that right?


Harris: Absolutely. It's that discovery process. It may be very early on in a long chain. This thing may progress to be something more classic, operational, and structured business intelligence (BI), but allowing end-users the ability to cleanse, explore data, and then hand over an artifact that someone in the core team can work with or use as an asset. The iteration curve is so much tighter and the cost of doing that is so much less. Users are able to innovate and put together the scenario of the business case for why this is a good idea.

The only thing I would add to the sources that Jon has just mentioned is with HPE Haven OnDemand, [you gain access to] the unstructured analytics, giving the users the ability to access and leverage all of the HPE IDOL capabilities. That capability is a really powerful and transformational thing for businesses.

They have such a set of unstructured data [services] available in voice and text, and when you allow business users access to that data, the things they come up with, their ideas, are just quite amazing.

Technologists always try to put themselves in the minds of the users, and we've all historically done a bad job of making the data more accessible for them. When you allow them the ability to analyze PDFs without structure, to share that, to analyze sentiment, to include concepts and entities, or even enrich a core proposition, you're really starting to create innovation. You've raised the awareness of all of these analytics that exist in the world today in the back-end, shown end-users what they can do, and then put their brains to work discovering and inventing.

Gardner: Many of these financial organizations are well-established, many of them for hundreds of years perhaps. All are thinking about digital transformation, the journey, and are looking to become more data-driven and to empower more people to take advantage of that. So, it seems to me you're almost an agent of digital transformation, even in a very technical and sophisticated sector like finance.

Making data accessible

Glass: There are a lot of stereotypes about who the business analysts are and who the people are that come up with ideas and inventions. The true power of democratization is making data more accessible, lowering the technical barrier, and allowing people to explore and innovate. Things always come from where you least expect them.

Gardner: I imagine that Microsoft is pleased with this, because there are some people who are a bit down on Excel. They think that it's manual, that it's by rote, and that it's not the way to go. So, you, in a sense, are helping Excel get a new lease on life.

Glass: I don’t think we're the whole story in that space, but I love Excel. I've used it for years and years at work. I've seen the power of what it can do and what it can deliver, and I have a bit of an understanding of why that is. It’s the live nature of it, the fact that people can look at data in a spreadsheet, see where it’s come from, see where it’s going, they can trust it, and they can believe in it.

That’s why what we're trying to do is create these live connections to the upstream data sources. The manual steps -- download, copy/paste, move around the sheet -- are where errors creep in. They're where the bloat, the slowness, and the unreliability happen. By changing that into a live connection to the data source, it becomes instant, and it goes back to being trusted, reliable, and actionable.

Harris: There's something in the DNA, as well, of how people interact with data, in that we can effectively lay out the algorithm or the process behind a calculation or a data flow. That’s why you see a lot of other systems that are more web-based or web-centric replicate an Excel-type experience.

The user starts to use it and thinks, "Wow, it’s just like Excel" -- and it isn’t. They hit a barrier, they hit a wall, and then they hit the "export" button. Then, they put it back [into Excel] and create their own way to work with it. So, there's something in the DNA of Excel and the way people lay things out. I think of [Excel] almost like a programming environment for non-programmers. Some people describe it as a functional language, very much like Haskell, with the Excel functions people write effectively working through and navigating the data.

Gardner: No need to worry that if you build it, will they come; they're already there.

Harris: Absolutely.

Gardner: Tell us a bit about HTI Labs and how your company came about, and where you are on your evolution.

Cutting edge

Harris: HTI Labs was founded in 2012. The core backbone of the team worked for the same tier-1 investment bank, where we were building risk and trading systems for front-office teams. We were really, I suppose, at the cutting edge of all the big-data technologies being used at the time -- real-time, distributed graphs and cubes, and everything.

As a core team, it was about taking that expertise and bringing it to other industries. Monte Carlo farms for risk calculations, the ability to explore data at speed, real-time risk: these things were becoming more central to other organizations, which was an opportunity.

At the moment, we're focusing predominantly on energy trading. Our software is being used across a number of other sectors, and our largest client has installed Schematiq on 120 desktops, which is great validation of what we're doing. We're also a member of the London Stock Exchange Elite Program for high-growth companies, based in London.

Glass: Darren and I met when we were working for the same company. I started out as a quant, doing the modeling and the math behind pricing, but I found that my interest lay more in the engineering. Rather than doing it once, can I do it a million times? Can I do these things reliably and scale them?


Because I started in a front-office environment, it was very spreadsheet-dominated and very VBA-dominated. There's good and bad in that, and a lot of lessons came out of it. Darren and I met up and crossed the divide together, from the top-down, big IT systems to the bottom-up, end-user-developed spreadsheets. We found a middle ground together, which we feel is quite a powerful combination.

Gardner: Back to where this leads. We're seeing more and more companies using data services like Haven OnDemand and starting to employ machine learning, artificial intelligence (AI), and bots to augment what humans do so well. Is there an opportunity for that to play here, or maybe it already is? The question basically is, how does AI come to bear on what you can deliver out to the Excel edge?

Harris: I think what you see is that, out of the box, you have a base unit of capability. The algorithms are built, but the key to improving them is the feedback loop with your domain users, your business users, and how they can effectively enrich and train these algorithms.

So, we see a future where the self-service BI tools people use to interact with and explore data become the same mechanism through which they see the results from the algorithms and send feedback to the underlying algorithm.

Gardner: And Jonathan, where do you see the use of bots, particularly perhaps with an API model like Haven OnDemand?

The role of bots

Glass: The concept behind bots is replicating an insight or a process that somebody might already be doing manually. People create these data flows and analyses that they maybe run once, because they're quite time-consuming to run. The really exciting possibility is making these things run 24×7, so that you start receiving notifications rather than having to pull from the data source -- notifications from a mailbox you have created yourself. You look at those, decide whether an insight is good or bad, and you can then start to train and refine it.

The training and refining is a loop that potentially goes back to IT and through a development loop, and it's about closing and tightening that loop. That's what really adds value to those opportunities.
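The bot pattern described above can be sketched roughly as follows. All the names here are hypothetical: a once-manual check is wrapped so a scheduler can run it continuously and push a notification, instead of waiting for someone to pull the data:

```python
# Hypothetical sketch of a "bot": a manual check turned into something a
# scheduler (cron, a server task) can run 24x7 and push results from.

def check_and_notify(fetch_price, threshold, notify):
    """Fetch one value; if it breaches the threshold, push a notification.

    fetch_price and notify are injected so the same check works against
    any data source and any notification channel (email, chat, mailbox).
    """
    price = fetch_price()
    if price > threshold:
        notify(f"price {price} breached threshold {threshold}")
        return True
    return False

# Simulate one scheduled run with a stand-in data source and a list
# standing in for the user's notification mailbox.
alerts = []
fired = check_and_notify(lambda: 105.0, 100.0, alerts.append)
assert fired and alerts == ["price 105.0 breached threshold 100.0"]
```

The feedback loop in the text would then amount to the user adjusting `threshold` (or the check itself) after reviewing which notifications turned out to be good insights.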

Gardner: Perhaps we should unpack Schematiq a bit to understand how one might go back and do that within the context of your tool. Are there several components of the tool, one of which might lend itself to going back and automating?

Glass: Absolutely. You can imagine the spreadsheet has some inputs and some outputs. One of the components within the Schematiq architecture is the ability to take a spreadsheet -- the logic and the process embedded in it -- and turn it into an executable module of code, which you can host on a server, schedule, run as often as you like, and trigger based on events.


It’s a way of emitting code from a spreadsheet. Without a business-analysis loop and a development loop, you take the exact thing that the user, the analyst, has programmed and make it into something you can run, commoditize, and scale. That’s quite an important way in which we shorten that development loop. We create a cycle that’s tight and rapid.
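The discussion doesn't detail how spreadsheet logic becomes an executable module, but the general idea can be shown with a toy sketch. The cell names and formulas below are invented; the point is that a sheet's dependency graph compiles into an ordinary callable that a server could schedule:

```python
def compile_sheet(formulas):
    """Compile a tiny spreadsheet model into a callable.

    formulas maps a cell name to a function of a cell-lookup, e.g.
    {"C1": lambda get: get("A1") + get("B1")}. Input cells are supplied
    as plain values when the compiled sheet is run.
    """
    def run(inputs):
        cache = dict(inputs)
        def get(name):
            if name not in cache:
                # Resolve dependencies recursively, memoizing each cell.
                cache[name] = formulas[name](get)
            return cache[name]
        return {name: get(name) for name in formulas}
    return run

# A sheet that computes profit and margin from price and cost inputs.
sheet = compile_sheet({
    "C1": lambda get: get("A1") - get("B1"),   # profit = price - cost
    "D1": lambda get: get("C1") / get("A1"),   # margin = profit / price
})
result = sheet({"A1": 100.0, "B1": 60.0})
assert result["D1"] == 0.4
```

Once the logic is a plain function like `sheet`, hosting it, scheduling it, or triggering it on events is ordinary server-side plumbing rather than a manual re-run of a workbook.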

Gardner: Darren, would you like to explain the other components that make-up Schematiq?

Harris: There are four components to the Schematiq architecture. There's the workbench, which extends Excel and enables large-scale data analytics. We have the asset manager, which is really all about governance; you can think of it as source control for Excel, but with a lot more around metadata control, transparency, and analytics on what people are using and how they are using it.

There's a server component that allows you to off-load and scale analytics horizontally and build repeatable or overnight processes. The last part is the portal, which is really about allowing end-users to instantly share their insights with other people -- picking up on Jon’s point about the compound executable defined in Schematiq. That can be off-loaded to a server and exposed as another API to a computer, a mobile device, or even a function.

So, it’s very much all about empowering the end-user to connect, create, govern, and share instantly, and then allow consumption by anybody on any device.

Market for data services

Gardner: I imagine, given the sensitive nature of the financial markets and activities, that you have some boundaries that you can’t cross when it comes to examining what’s going on in between the core and the edge.

Tell me about how you, as an organization, can look at what’s going on with the Schematiq and the democratization, and whether that creates another market for data services when you see what the demand entails.

Harris: It’s definitely the case that people have internal datasets they create and that they look after. People are very precious about them because they are hugely valuable, and one of the things that we strive to help people do is to share those things.

Across the trading floor, you might effectively have a dozen or more different IT infrastructures, if you think of what exists on each desk as a miniature infrastructure. So, it's about making it easy for people to share these things, to create master datasets they gain value from, and to see that they gain mutual value from that, rather than feeling closed in and not wanting to share with their neighbors.

If we work together and if we have the tools that enable us to collaborate effectively, then we can all get more done and we can all add more value.


Gardner: It's interesting to me that the more we look at the use of data, the more it opens up new markets and innovation capabilities that we hadn’t even considered before. And, as an analyst, I expect to see more of a marketplace of data services. You strike me as an accelerant to that.

Harris: Absolutely. As the analytics come online and are exposed by APIs, the underlying store becomes almost irrelevant. If you look at what the analytics can do for you, that's how you consume the insight, and you can connect to other sources. You can connect to Twitter, to Facebook, to PDFs; whether it’s NoSQL, structured, columnar, or row-based, it doesn’t really matter. You don’t see that complexity. The fact that you can just create an API key, access it as a consumer, and start to work with it is really powerful.

There was the recent example in the UK of a report on the Iraq War. It’s 2.2 million words, it took seven years to write, and it’s available online, but there's no way any normal person could consume or analyze that. That’s three times the complete works of Shakespeare.

Using these APIs, you can start to pull out mentions, countries, and locations, and really get into the data, giving anybody with Excel at home, in our case, or any other tool, the ability to analyze it and share those insights. We're very used to media where we get just the headline, and spin comes into play. People turn things on their head, and you never really get to delve into the underlying detail.

What’s really interesting is that when democratization and the sharing of insights and collaboration come, we can all be informed. We can all dig deep, and all the great analysts out there can start to collaborate, delve, make new discoveries, and share that insight.

Gardner: All right, a little light bulb just went off in my head. Whereas before we would go to a headline and a news story and might have a hyperlink to a source, now I could get a headline and a news story, open up my Excel spreadsheet, get to the actual data source behind the entire story, and then probe, plumb, and analyze it any way I wanted to.

Harris: Yes, exactly. I think the savvy consumer now, the analyst, is starting to demand that transparency. We've seen it in the UK with election messages, quotes, and even financial stats, where people just don’t believe the headlines. They're demanding transparency in the process, and so governance can only be a good thing.


