How Many Data Scientists are out there?

Editor’s note: This post by Gregory Piatetsky first appeared at KDnuggets.com. In it he dives into a key question regarding the possible shortage of data scientists. -bg

Many people have read the McKinsey report on Big Data (May 2011), which predicted:

The United States alone faces a shortage of 140,000 to 190,000 people with analytical expertise and 1.5 million managers and analysts with the skills to understand and make decisions based on the analysis of big data.


However, so far the shortage seems to be much smaller.

The job title “Data Scientist” has grown tremendously in popularity, according to job site indeed.com:

Figure: Job trend for Data Scientist positions, 2006-2014

However, notice that the demand stopped increasing sometime in 2013. 

As of March 13, 2014, a search for “Data Scientist” jobs (US-based) on indeed.com returns only about 1,000 positions. We find about 10,000 jobs when searching for Data Scientist without quotes, but many of these jobs have the title “Scientist” or have something to do with data, and do not necessarily represent “Data Scientist” positions.

Of course, many people may do similar work without having the title of “data scientist”. 

Several estimates may be relevant. 

Kaggle is the leading platform for data science competitions and claims to be the world’s largest community of data scientists. Kaggle reached 100,000 members in July 2013, and reported 110,000 in Sep 2013, 120,000 on Oct 23, 2013, and 140,000 on Feb 24, 2014.
The latest numbers, from Kaggle CEO Anthony Goldbloom, are: 157,142 Kaggle members, of whom 67,776 were active in the last 6 months.
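
To put these Kaggle figures in proportion, here is a minimal sketch of the arithmetic in Python. It uses only the numbers quoted above; the variable names and the `share` helper are my own illustrative choices, not anything published by Kaggle.

```python
# Kaggle membership figures as quoted in this post.
KAGGLE_MEMBERS = 157_142      # latest total reported by Kaggle's CEO
KAGGLE_ACTIVE_6M = 67_776     # members active in the last 6 months
KAGGLE_JULY_2013 = 100_000    # membership milestone reached in July 2013


def share(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


# Fraction of all members who were recently active.
active_pct = share(KAGGLE_ACTIVE_6M, KAGGLE_MEMBERS)

# Membership growth relative to the July 2013 milestone.
growth_pct = share(KAGGLE_MEMBERS - KAGGLE_JULY_2013, KAGGLE_JULY_2013)

print(f"Active in the last 6 months: {active_pct:.1f}%")   # 43.1%
print(f"Growth since July 2013: {growth_pct:.1f}%")        # 57.1%
```

In other words, fewer than half of the registered members were recently active, so the headline membership count likely overstates the number of practicing participants.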

A quick examination of the top 10 ranked Kagglers shows that only one has the title “Data Scientist”. The top 10 include neuroscience researchers and PhD mathematicians and physicists, and while they are clearly talented competitors on Kaggle, their actual jobs may not involve data science.

LinkedIn has many groups related to data science, Big Data and Analytics – see my analysis “Top 2013 LinkedIn Groups for Analytics, Big Data, Data Mining, and Data Science”.

The two largest of these groups are:


Most members of these groups do not have the job title “Data Scientist”. There is a “Data Scientists” LinkedIn group, but at present it has only 6,750 members.

LinkedIn Data Scientist Peter Skomoroch (@PeteSkomoroch) wrote:

Using the public LinkedIn search interface, with the job title in quotes – I see 12,170 members with the phrase “data scientist” anywhere in their profile. Using the advanced search facet to look only at profiles with a current or past title containing the phrase “data scientist”, I see 6,896 results. Doing a plain keyword search will return many members that mention the words “data” or “scientist” anywhere in their profile, but the majority of those people have nothing to do with data science.


He further estimated that perhaps 150-250K people would be a match for a data scientist based on their skills and education. 

I remain optimistic that data scientist is a great profession, but I doubt that there is a demand for 100,000 new data scientist positions. There may be a re-branding of existing positions, or creation of teams which collectively do the data science job.

 

Gregory Piatetsky-Shapiro, Ph.D., is a well-known expert in Business Analytics, Data Mining, and Data Science. Gregory is the Editor and Publisher of KDnuggets.com, a Business Analytics “Guru” on Twitter, and a Top Influencer in Big Data, Data Mining, and Data Science. Gregory is a co-founder of KDD (the Knowledge Discovery and Data Mining conferences) and SIGKDD, the professional organization for Knowledge Discovery and Data Mining. Gregory has over 60 publications and has edited several books and collections on data mining and knowledge discovery.


Bob Gourley writes on enterprise IT. He is a founder and partner at Cognitio Corp and publisher of CTOvision.com.
