Related Topics: @BigDataExpo, Java IoT, Linux Containers, Containers Expo Blog, @CloudExpo, SDN Journal

@BigDataExpo: Article

Big Data Needs a Thought Collective

Big Data should follow the lead of the scientific method to put greater emphasis on sharing and reusing data

Sharing data is a cornerstone of the scientific method because it makes it possible to replicate work. That foundation is mostly absent from data science, which makes obtaining and reusing knowledge more difficult than it should be.

Job postings for data scientists increased 15,000 percent between 2011 and 2012, and Gartner predicted that 63% of organizations would invest in Big Data this year. The communications, consumer, education, financial, healthcare, government, manufacturing, and retail sectors are all adopting business practices that are using data science to inform their activities and improve operations.

There are a number of companies creating solutions to visualize and uncover insights from large volumes of data with robust platforms in operation worldwide. Vast volumes of data from applications logs to the network and business activities are well served by today's analytics technologies - computation isn't the issue. The ability to model data into experiments that act on data with data sources and conclusions is what's missing, and it's an emerging problem for businesses.

Gartner has observed that those same organizations are now "struggling" with deriving value from and managing Big Data (depending on organizational maturity). That could be due to what famed microbiologist Ludwik Fleck deemed an "empty mind" as he explored the sociology of science during the 1930s. What is that exactly? Fleck postulated that a mind must be filled with initial knowledge before it can perceive or think. This logic applies to organizations too.

Fleck's theory was that participating in a "thought collective" of institutional knowledge would fill minds. His works concluded that cognition is a collaborative activity because a body of knowledge is acquired from a group. It could be argued that making it possible to reuse data experiments would have the same effect. Organizations that can't find value in data have an empty mind.

Big Data should follow the lead of the scientific method (which was influenced by Fleck's ideas) to put greater emphasis on sharing and reusing data. Why is that important for businesses? Scientific data is easy to share among different organizations. Having the ability to do the same with data science could solve what's emerging as a major pain point. Employees change roles and organizations, but what happens to the knowledge, experiments, and patterns?

Whether the academic model would also function in the enterprise is a fascinating question for data scientists, operations professionals and industries. The next great "open source" horizon could be the exchange of knowledge.

It would be interesting to see companies take on the challenge of building systems that organize and share experiments more liberally to put an end to the empty brain problem. After all, data science is still science. Why should it be treated differently?

More Stories By Haim Koshchitzky

Haim Koshchitzky is the Founder and CEO of XpoLog and has over 20 years of experience in complex technology development and software architecture. Prior to XpoLog, he spent several years as the tech lead for Mercury Interactive (acquired by HP) and other startups. He has a passion for data analytics and technology, and is also an avid marathon runner and Judo black belt.

Comments (1)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.

Latest Stories
DevOps theory promotes a culture of continuous improvement built on collaboration, empowerment, systems thinking, and feedback loops. But how do you collaborate effectively across the traditional silos? How can you make decisions without system-wide visibility? How can you see the whole system when it is spread across teams and locations? How do you close feedback loops across teams and activities delivering complex multi-tier, cloud, container, serverless, and/or API-based services?
SYS-CON Media announced today that @WebRTCSummit Blog, the largest WebRTC resource in the world, has been launched. @WebRTCSummit Blog offers top articles, news stories, and blog posts from the world's well-known experts and guarantees better exposure for its authors than any other publication. @WebRTCSummit Blog can be bookmarked ▸ Here @WebRTCSummit conference site can be bookmarked ▸ Here
SYS-CON Events announced today that Streamlyzer will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Streamlyzer is a powerful analytics for video streaming service that enables video streaming providers to monitor and analyze QoE (Quality-of-Experience) from end-user devices in real time.
You have great SaaS business app ideas. You want to turn your idea quickly into a functional and engaging proof of concept. You need to be able to modify it to meet customers' needs, and you need to deliver a complete and secure SaaS application. How could you achieve all the above and yet avoid unforeseen IT requirements that add unnecessary cost and complexity? You also want your app to be responsive in any device at any time. In his session at 19th Cloud Expo, Mark Allen, General Manager of...
@ThingsExpo has been named the Top 5 Most Influential Internet of Things Brand by Onalytica in the ‘The Internet of Things Landscape 2015: Top 100 Individuals and Brands.' Onalytica analyzed Twitter conversations around the #IoT debate to uncover the most influential brands and individuals driving the conversation. Onalytica captured data from 56,224 users. The PageRank based methodology they use to extract influencers on a particular topic (tweets mentioning #InternetofThings or #IoT in this ...
Today every business relies on software to drive the innovation necessary for a competitive edge in the Application Economy. This is why collaboration between development and operations, or DevOps, has become IT’s number one priority. Whether you are in Dev or Ops, understanding how to implement a DevOps strategy can deliver faster development cycles, improved software quality, reduced deployment times and overall better experiences for your customers.
SYS-CON Events announced today that Super Micro Computer, Inc., a global leader in Embedded and IoT solutions, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 7-9, 2017, at the Javits Center in New York City, NY. Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology, is a premier provider of advanced server Building Block Solutions® for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and ...
Cloud based infrastructure deployment is becoming more and more appealing to customers, from Fortune 500 companies to SMEs due to its pay-as-you-go model. Enterprise storage vendors are able to reach out to these customers by integrating in cloud based deployments; this needs adaptability and interoperability of the products confirming to cloud standards such as OpenStack, CloudStack, or Azure. As compared to off the shelf commodity storage, enterprise storages by its reliability, high-availabil...
The IoT industry is now at a crossroads, between the fast-paced innovation of technologies and the pending mass adoption by global enterprises. The complexity of combining rapidly evolving technologies and the need to establish practices for market acceleration pose a strong challenge to global enterprises as well as IoT vendors. In his session at @ThingsExpo, Clark Smith, senior product manager for Numerex, will discuss how Numerex, as an experienced, established IoT provider, has embraced a ...
Explosive growth in connected devices. Enormous amounts of data for collection and analysis. Critical use of data for split-second decision making and actionable information. All three are factors in making the Internet of Things a reality. Yet, any one factor would have an IT organization pondering its infrastructure strategy. How should your organization enhance its IT framework to enable an Internet of Things implementation? In his session at @ThingsExpo, James Kirkland, Red Hat's Chief Arch...
@DevOpsSummit has been named the ‘Top DevOps Influencer' by iTrend. iTrend processes millions of conversations, tweets, interactions, news articles, press releases, blog posts - and extract meaning form them and analyzes mobile and desktop software platforms used to communicate, various metadata (such as geo location), and automation tools. In overall placement, @DevOpsSummit ranked as the number one ‘DevOps Influencer' followed by @CloudExpo at third, and @MicroservicesE at 24th.
SYS-CON Events announced today that Cloudbric, a leading website security provider, will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Cloudbric is an elite full service website protection solution specifically designed for IT novices, entrepreneurs, and small and medium businesses. First launched in 2015, Cloudbric is based on the enterprise level Web Application Firewall by Penta Security Sys...
We are reaching the end of the beginning with WebRTC, and real systems using this technology have begun to appear. One challenge that faces every WebRTC deployment (in some form or another) is identity management. For example, if you have an existing service – possibly built on a variety of different PaaS/SaaS offerings – and you want to add real-time communications you are faced with a challenge relating to user management, authentication, authorization, and validation. Service providers will w...
When people aren’t talking about VMs and containers, they’re talking about serverless architecture. Serverless is about no maintenance. It means you are not worried about low-level infrastructural and operational details. An event-driven serverless platform is a great use case for IoT. In his session at @ThingsExpo, Animesh Singh, an STSM and Lead for IBM Cloud Platform and Infrastructure, will detail how to build a distributed serverless, polyglot, microservices framework using open source tec...
November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Penta Security is a leading vendor for data security solutions, including its encryption solution, D’Amo. By using FPE technology, D’Amo allows for the implementation of encryption technology to sensitive data fields without modification to schema in the database environment. With businesses having their data become increasingly more complicated in their mission-critical applications (such as ERP, CRM, HRM), continued ...