Big Data Needs a Thought Collective

Big Data should follow the lead of the scientific method to put greater emphasis on sharing and reusing data

Sharing data is a cornerstone of the scientific method because it makes it possible to replicate work. That foundation is mostly absent from data science, which makes obtaining and reusing knowledge more difficult than it should be.

Job postings for data scientists increased 15,000 percent between 2011 and 2012, and Gartner predicted that 63 percent of organizations would invest in Big Data this year. The communications, consumer, education, financial, healthcare, government, manufacturing, and retail sectors are all adopting data science to inform their activities and improve operations.

A number of companies are building solutions that visualize and uncover insights from large volumes of data, and robust platforms are in operation worldwide. Vast volumes of data, from application logs to network and business activity, are well served by today's analytics technologies; computation isn't the issue. What's missing is the ability to model data work as experiments, each with its data sources and conclusions, and that gap is an emerging problem for businesses.

Gartner has observed that those same organizations are now "struggling" to manage Big Data and derive value from it (depending on organizational maturity). That could be due to what famed microbiologist Ludwik Fleck deemed an "empty mind" as he explored the sociology of science during the 1930s. What is that exactly? Fleck postulated that a mind must be filled with initial knowledge before it can perceive or think. This logic applies to organizations too.

Fleck's theory was that participating in a "thought collective" of institutional knowledge would fill minds. His works concluded that cognition is a collaborative activity because a body of knowledge is acquired from a group. It could be argued that making it possible to reuse data experiments would have the same effect. Organizations that can't find value in data have an empty mind.

Big Data should follow the lead of the scientific method (whose collaborative character Fleck's ideas helped explain) to put greater emphasis on sharing and reusing data. Why is that important for businesses? Scientific data is easy to share among different organizations. Having the ability to do the same with data science could solve what's emerging as a major pain point. Employees change roles and organizations, but what happens to the knowledge, experiments, and patterns?

Whether the academic model would also function in the enterprise is a fascinating question for data scientists, operations professionals and industries. The next great "open source" horizon could be the exchange of knowledge.

It would be interesting to see companies take on the challenge of building systems that organize and share experiments more liberally to put an end to the empty mind problem. After all, data science is still science. Why should it be treated differently?

More Stories By Haim Koshchitzky

Haim Koshchitzky is the Founder and CEO of XpoLog and has over 20 years of experience in complex technology development and software architecture. Prior to XpoLog, he spent several years as the tech lead for Mercury Interactive (acquired by HP) and other startups. He has a passion for data analytics and technology, and is also an avid marathon runner and Judo black belt.
