@BigDataExpo: Blog Feed Post

Real-time Big Data or Small Data?

Have you heard of products like IBM’s InfoSphere Streams, Tibco’s Event Processing product, or Oracle’s CEP product? All are good examples of commercially available stream processing technologies that help you process events in real time.
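To make "processing events in real time" a little more concrete, here is a minimal sketch in plain Python. It is not the API of any of the products above; the class name, the window length, and the sample events are all made up for illustration. It keeps a running count per key over a sliding time window and updates that count as each event arrives, rather than storing everything first and querying later.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Counts events per key over the most recent `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, key) pairs, oldest first
        self.counts = Counter()  # live count per key inside the window

    def _evict(self, now):
        # Drop events that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_key = self.events.popleft()
            self.counts[old_key] -= 1
            if self.counts[old_key] == 0:
                del self.counts[old_key]

    def add(self, timestamp, key):
        """Process one event as it arrives (timestamps must not decrease)."""
        self._evict(timestamp)
        self.events.append((timestamp, key))
        self.counts[key] += 1

    def top(self, n, now):
        """Return the n most frequent keys in the window ending at `now`."""
        self._evict(now)
        return self.counts.most_common(n)


# Hypothetical events: (seconds since start, symbol). Not real data.
trades = [(0, "AAA"), (1, "BBB"), (2, "AAA"), (8, "CCC"), (11, "AAA")]

window = SlidingWindowCounter(window_seconds=10)
for ts, symbol in trades:
    window.add(ts, symbol)

print(window.top(2, now=11))  # [('AAA', 2), ('CCC', 1)]
```

By the time the last event arrives at t=11, the events at t=0 and t=1 have aged out of the 10-second window, so only the two most recent "AAA" events and the single "CCC" event are counted. A real stream processor would distribute this state across machines and handle out-of-order events, which this sketch does not attempt.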

I’ve been asked what I consider as “Big Data” versus “Small Data” in this domain. Here’s my view.


Real-Time Analytics: Small Data vs. Big Data

Data Volume
Small Data: None
Big Data: None

Data Velocity
Small Data: 100K events / day (<<1K events / second)
Big Data: Billion+ events / day (>>1K events / second)

Data Variety
Small Data: 1-6 unstructured sources AND a single destination (an output file, a SQL database, a BI tool)
Big Data: 6+ structured and 6+ unstructured sources AND many destinations (a custom application, a BI tool, several SQL databases, NoSQL databases, Hadoop)

Data Models
Small Data: Used mainly for “transport.” Little to no ETL, in-stream analytics, or complex event processing is performed.
Big Data: Transport is the foundation. However, distributed ETL and linearly scalable in-memory and in-stream analytics are applied, and complex event processing is the norm.

Business Functions
Small Data: One line of business (e.g. financial trading)
Big Data: Several lines of business, up to a 360-degree view

Business Intelligence
Small Data: No queries are performed against the data in motion. This is simply a mechanism for transporting transactions or events from the source to a database. Transport times are < 1 second. Example: connect to desktop trading applications and transport trade events to an Oracle database.
Big Data: ETL, sophisticated algorithms, complex business logic, and even queries can be applied to the stream of events while they are in motion. Analytics span all data sources and, thus, all business functions. Transport and analytics occur in < 1 second. Example: connect to desktop trading applications, market data feeds, and social media, and provide instantaneous trending reports. Allow traders to subscribe to information pertinent to their trades and have analytics applied in real time for personalized reporting.
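The velocity figures above are easier to compare once they are in the same units. The short sketch below does the arithmetic, assuming events are spread evenly across a day (real feeds are bursty, so peak rates can be far higher than the average). The function names and the hard 1K events/second cutoff are my own simplification for illustration; the comparison itself only says "much less than" and "much greater than" that rate.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def events_per_second(events_per_day):
    """Average event rate, assuming events are spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


def classify(events_per_day, threshold_per_second=1_000):
    """Label a daily volume against an assumed 1K events/second cutoff."""
    rate = events_per_second(events_per_day)
    return "Big Data" if rate > threshold_per_second else "Small Data"


for per_day in (100_000, 1_000_000_000):
    rate = events_per_second(per_day)
    print(f"{per_day:>13,} events/day ~ {rate:,.1f} events/s -> {classify(per_day)}")
```

On average, 100K events per day works out to a little over one event per second, while a billion events per day is roughly 11,600 events per second, so the two ends of the scale sit about four orders of magnitude apart.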

Want to see my view of Batch Analytics? Go Here.

Want to see my view of Ad Hoc Analytics? Go Here.

Here are a few other products in this space:

More Stories By Jim Kaskade

Jim Kaskade is Vice President and General Manager, Big Data & Analytics, at CSC. Prior to that he was CEO of Infochimps. Before that he served as SVP and General Manager at SIOS Technology, a publicly traded firm in Japan, where he led a business unit focused on developing a private cloud Platform as a Service targeted at Fortune 500 enterprises. He has been heavily involved in all aspects of cloud, meeting with prominent CIOs, CISOs, and datacenter architects of Fortune 100 companies to better understand their cloud computing needs. He also has hands-on cloud domain knowledge from his experience as founder and CEO of a SaaS company, which secured the digital media assets of over 10,000 businesses, including Fortune 100 customers such as Lucasfilm, the NBA, Sony BMG, News Corp, Viacom, and IAC. Kaskade is also one of the Top 100 bloggers on Cloud Computing selected by the Cloud Computing Journal.
