News Feed Item

Databricks Launches “Certified Spark Distribution” Program to Recognize Vendors Committed to Supporting the Apache Spark Application Ecosystem

Databricks, the company founded by the creators of Apache Spark, the next generation Big Data engine, today announced the “Certified Spark Distribution” program for vendors with a commercial Spark distribution. Certification indicates that the vendor’s Spark distribution is compatible with the open source Apache Spark distribution, enabling “Certified on Spark” applications - certified to work with Apache Spark - to run on the vendor’s Spark distribution out-of-the-box.

“One of Databricks’ goals is to ensure users have a fantastic experience. Our belief is that having the community work together to maintain compatibility and therefore facilitate a vibrant application ecosystem is crucial to this vision,” said Ion Stoica, Databricks CEO. “We first launched the ‘Certified on Spark’ program to help build a robust ecosystem of innovative applications on top of Apache Spark. The ‘Certified Spark Distribution’ program is the other half of the equation, recognizing vendors that are committed to providing a home for these applications to allow the ecosystem to flourish.”

In keeping with the open source nature of Spark, the certification process is fully transparent with open-source tests, lightweight, and 100% free - a mirror image of the “Certified on Spark” process for Spark applications. Vendors fill out a short questionnaire and then simply execute a set of open-source tests - developed and maintained by the community and used to test each release of Apache Spark - against their build of Spark to demonstrate compatibility.

“Certification shouldn’t be used as a tool for lock-in: Certified Spark Distributions are not required to ship all the bits of Apache Spark, or be open source, or prevented from innovating significantly within and around Spark,” said Arsalan Tavakoli-Shiraji, Business Development Lead at Databricks. “They simply need to maintain compatibility with Apache Spark to provide support for the application ecosystem.”

As part of the certification program launch, five vendors have completed the certification process: DataStax, Hortonworks, IBM, Oracle, and Pivotal - industry leaders that have recognized and embraced the power of Spark when integrated with their respective platforms. Each of these vendors put their distributions through the certification process, which included a host of integration tests to ensure full compatibility with the latest Apache Spark release.

“One of the big risks faced by open source projects is fragmentation among distributors. Fragmentation is bad for both users and application developers, and ultimately for the growth of the project,” said Matei Zaharia, Databricks CTO and VP of the Spark project at Apache. “We are delighted that these partners - along with others in the certification pipeline - share our vision of an undivided Spark platform based directly around Apache, and will ensure that all applications built on Apache Spark run on their distributions.”

Vendors interested in certifying their Spark distribution should visit www.databricks.com and select "Apply for Certification." Enterprise users can also visit the Databricks site regularly to see the latest set of certified distributions and applications, and read “spotlight” blog articles that provide deep-dives on the Spark ecosystem by newly certified vendors.

All the inaugural members will be on hand at the upcoming Spark Summit from June 30th to July 2nd in San Francisco to provide greater information on the role of Spark in helping better serve their customers. Additionally, there will be an “Application Spotlight” segment that will highlight innovative “Certified on Spark” applications.

Supporting Quotes:

"DataStax is strongly committed to making Cassandra and Spark the best combination for today's online applications," said Robin Schumacher, VP of products at DataStax. "We have demonstrated that commitment with the integration work we have contributed back to both open source communities as well as the certified versions of Spark and Cassandra we provide in DataStax Enterprise for production environments."

“We support the fact that Apache Spark project provides enterprises with an additional processing engine in Hadoop to execute in-memory algorithms for advanced analytics,” said John Kreisa, vice president of strategic marketing at Hortonworks. “We applaud Databricks’ vision to ensure Spark is fully integrated on YARN, which enterprises have adopted as the data OS for Hadoop.”

"Pivotal's open source credentials are quite extensive - Apache-compatible Hadoop, MADLib, RabbitMQ, CloudFoundry - and now we've added Spark to that set," said Sarabjeet Chugh, Head of Hadoop Product Management at Pivotal. "Additionally, we recognize the importance of a unified community to enable the ecosystem to grow and so are thrilled to back this effort."

About Databricks

Databricks was founded by the creators of Apache Spark, who have been working for the past six years on cutting-edge systems to analyze and process Big Data. They believe that Big Data is a tremendous opportunity that is still largely untapped, and are actively working to revolutionize what enterprises can do with it. Databricks is venture-backed by Andreessen Horowitz. For more information, visit http://www.databricks.com.

More Stories By Business Wire

Copyright © 2009 Business Wire. All rights reserved. Republication or redistribution of Business Wire content is expressly prohibited without the prior written consent of Business Wire. Business Wire shall not be liable for any errors or delays in the content, or for any actions taken in reliance thereon.

Latest Stories
“Media Sponsor” of SYS-CON's 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. CloudBerry Backup is a leading cross-platform cloud backup and disaster recovery solution integrated with major public cloud services, such as Amazon Web Services, Microsoft Azure and Google Cloud Platform.
In the next forty months – just over three years – businesses will undergo extraordinary changes. The exponential growth of digitization and machine learning will see a step function change in how businesses create value, satisfy customers, and outperform their competition. In the next forty months companies will take the actions that will see them get to the next level of the game called Capitalism. Or they won’t – game over. The winners of today and tomorrow think differently, follow different...
@DevOpsSummit has been named the ‘Top DevOps Influencer' by iTrend. iTrend processes millions of conversations, tweets, interactions, news articles, press releases, blog posts - and extract meaning form them and analyzes mobile and desktop software platforms used to communicate, various metadata (such as geo location), and automation tools. In overall placement, @DevOpsSummit ranked as the number one ‘DevOps Influencer' followed by @CloudExpo at third, and @MicroservicesE at 24th.
SYS-CON Events announced today that Super Micro Computer, Inc., a global leader in Embedded and IoT solutions, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 7-9, 2017, at the Javits Center in New York City, NY. Supermicro (NASDAQ: SMCI), the leading innovator in high-performance, high-efficiency server technology, is a premier provider of advanced server Building Block Solutions® for Data Center, Cloud Computing, Enterprise IT, Hadoop/Big Data, HPC and ...
Without lifecycle traceability and visibility across the tool chain, stakeholders from Planning-to-Ops have limited insight and answers to who, what, when, why and how across the DevOps lifecycle. This impacts the ability to deliver high quality software at the needed velocity to drive positive business outcomes. In his general session at @DevOpsSummit at 19th Cloud Expo, Eric Robertson, General Manager at CollabNet, will discuss how customers are able to achieve a level of transparency that e...
Web Real-Time Communication APIs have quickly revolutionized what browsers are capable of. In addition to video and audio streams, we can now bi-directionally send arbitrary data over WebRTC's PeerConnection Data Channels. With the advent of Progressive Web Apps and new hardware APIs such as WebBluetooh and WebUSB, we can finally enable users to stitch together the Internet of Things directly from their browsers while communicating privately and securely in a decentralized way.
So you think you are a DevOps warrior, huh? Put your money (not really, it’s free) where your metrics are and prove it by taking The Ultimate DevOps Geek Quiz Challenge, sponsored by DevOps Summit. Battle through the set of tough questions created by industry thought leaders to earn your bragging rights and win some cool prizes.
For basic one-to-one voice or video calling solutions, WebRTC has proven to be a very powerful technology. Although WebRTC’s core functionality is to provide secure, real-time p2p media streaming, leveraging native platform features and server-side components brings up new communication capabilities for web and native mobile applications, allowing for advanced multi-user use cases such as video broadcasting, conferencing, and media recording.
DevOps is being widely accepted (if not fully adopted) as essential in enterprise IT. But as Enterprise DevOps gains maturity, expands scope, and increases velocity, the need for data-driven decisions across teams becomes more acute. DevOps teams in any modern business must wrangle the ‘digital exhaust’ from the delivery toolchain, "pervasive" and "cognitive" computing, APIs and services, mobile devices and applications, the Internet of Things, and now even blockchain. In this power panel at @...
SYS-CON Events announced today that LeaseWeb USA, a cloud Infrastructure-as-a-Service (IaaS) provider, will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. LeaseWeb is one of the world's largest hosting brands. The company helps customers define, develop and deploy IT infrastructure tailored to their exact business needs, by combining various kinds cloud solutions.
SYS-CON Events announced today that Hitrons Solutions will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Hitrons Solutions Inc. is distributor in the North American market for unique products and services of small and medium-size businesses, including cloud services and solutions, SEO marketing platforms, and mobile applications.
Successful digital transformation requires new organizational competencies and capabilities. Research tells us that the biggest impediment to successful transformation is human; consequently, the biggest enabler is a properly skilled and empowered workforce. In the digital age, new individual and collective competencies are required. In his session at 19th Cloud Expo, Bob Newhouse, CEO and founder of Agilitiv, will draw together recent research and lessons learned from emerging and established ...
November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Penta Security is a leading vendor for data security solutions, including its encryption solution, D’Amo. By using FPE technology, D’Amo allows for the implementation of encryption technology to sensitive data fields without modification to schema in the database environment. With businesses having their data become increasingly more complicated in their mission-critical applications (such as ERP, CRM, HRM), continued ...
SYS-CON Events announced today that Enzu will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Enzu’s mission is to be the leading provider of enterprise cloud solutions worldwide. Enzu enables online businesses to use its IT infrastructure to their competitive advantage. By offering a suite of proven hosting and management services, Enzu wants companies to focus on the core of their online busine...
Traditional on-premises data centers have long been the domain of modern data platforms like Apache Hadoop, meaning companies who build their business on public cloud were challenged to run Big Data processing and analytics at scale. But recent advancements in Hadoop performance, security, and most importantly cloud-native integrations, are giving organizations the ability to truly gain value from all their data. In his session at 19th Cloud Expo, David Tishgart, Director of Product Marketing ...