Databricks Launches “Certified Spark Distribution” Program to Recognize Vendors Committed to Supporting the Apache Spark Application Ecosystem

Databricks, the company founded by the creators of Apache Spark, the next-generation Big Data engine, today announced the “Certified Spark Distribution” program for vendors with a commercial Spark distribution. Certification indicates that the vendor’s Spark distribution is compatible with the open source Apache Spark distribution, enabling “Certified on Spark” applications - certified to work with Apache Spark - to run on the vendor’s Spark distribution out-of-the-box.

“One of Databricks’ goals is to ensure users have a fantastic experience. Our belief is that having the community work together to maintain compatibility and therefore facilitate a vibrant application ecosystem is crucial to this vision,” said Ion Stoica, Databricks CEO. “We first launched the ‘Certified on Spark’ program to help build a robust ecosystem of innovative applications on top of Apache Spark. The ‘Certified Spark Distribution’ program is the other half of the equation, recognizing vendors that are committed to providing a home for these applications to allow the ecosystem to flourish.”

In keeping with the open source nature of Spark, the certification process is fully transparent with open-source tests, lightweight, and 100% free - a mirror image of the “Certified on Spark” process for Spark applications. Vendors fill out a short questionnaire and then simply execute a set of open-source tests - developed and maintained by the community and used to test each release of Apache Spark - against their build of Spark to demonstrate compatibility.

“Certification shouldn’t be used as a tool for lock-in: Certified Spark Distributions are not required to ship all the bits of Apache Spark, or be open source, or prevented from innovating significantly within and around Spark,” said Arsalan Tavakoli-Shiraji, Business Development Lead at Databricks. “They simply need to maintain compatibility with Apache Spark to provide support for the application ecosystem.”

As part of the certification program launch, five vendors have completed the certification process: DataStax, Hortonworks, IBM, Oracle, and Pivotal - industry leaders that have recognized and embraced the power of Spark when integrated with their respective platforms. Each of these vendors put their distributions through the certification process, which included a host of integration tests to ensure full compatibility with the latest Apache Spark release.

“One of the big risks faced by open source projects is fragmentation among distributors. Fragmentation is bad for both users and application developers, and ultimately for the growth of the project,” said Matei Zaharia, Databricks CTO and VP of the Spark project at Apache. “We are delighted that these partners - along with others in the certification pipeline - share our vision of an undivided Spark platform based directly around Apache, and will ensure that all applications built on Apache Spark run on their distributions.”

Vendors interested in certifying their Spark distribution should visit www.databricks.com and select "Apply for Certification." Enterprise users can also visit the Databricks site regularly to see the latest set of certified distributions and applications, and read “spotlight” blog articles that provide deep-dives on the Spark ecosystem by newly certified vendors.

All the inaugural members will be on hand at the upcoming Spark Summit, June 30 to July 2 in San Francisco, to share more about how Spark helps them better serve their customers. The summit will also feature an “Application Spotlight” segment highlighting innovative “Certified on Spark” applications.

Supporting Quotes:

"DataStax is strongly committed to making Cassandra and Spark the best combination for today's online applications," said Robin Schumacher, VP of products at DataStax. "We have demonstrated that commitment with the integration work we have contributed back to both open source communities as well as the certified versions of Spark and Cassandra we provide in DataStax Enterprise for production environments."

“We support the fact that the Apache Spark project provides enterprises with an additional processing engine in Hadoop to execute in-memory algorithms for advanced analytics,” said John Kreisa, vice president of strategic marketing at Hortonworks. “We applaud Databricks’ vision to ensure Spark is fully integrated on YARN, which enterprises have adopted as the data OS for Hadoop.”

"Pivotal's open source credentials are quite extensive - Apache-compatible Hadoop, MADLib, RabbitMQ, CloudFoundry - and now we've added Spark to that set," said Sarabjeet Chugh, Head of Hadoop Product Management at Pivotal. "Additionally, we recognize the importance of a unified community to enable the ecosystem to grow and so are thrilled to back this effort."

About Databricks

Databricks was founded by the creators of Apache Spark, who have been working for the past six years on cutting-edge systems to analyze and process Big Data. They believe that Big Data is a tremendous opportunity that is still largely untapped, and are actively working to revolutionize what enterprises can do with it. Databricks is venture-backed by Andreessen Horowitz. For more information, visit http://www.databricks.com.

Copyright © 2009 Business Wire. All rights reserved. Republication or redistribution of Business Wire content is expressly prohibited without the prior written consent of Business Wire. Business Wire shall not be liable for any errors or delays in the content, or for any actions taken in reliance thereon.