Databricks Expands Platform for Turnkey Production Apache Spark Deployments in the Cloud

Company Launches Enhanced Reliability and Security Capabilities for Data Engineering on its Managed Spark Platform

LAS VEGAS, NV -- (Marketwired) -- 11/30/16 -- Databricks®, the company founded by the team that created the popular Apache® Spark™ project, announced new capabilities for its platform that further simplify the production deployment of Spark in the cloud. The production enhancements complement the existing Databricks environment for data science, which enables users to collaboratively analyze data in real time with data science notebooks and immediately deploy them as production Spark jobs and workflows. The announcement was made today at the 2016 Amazon Web Services (AWS) re:Invent conference.

The production features announced today enable users to effortlessly set up and run Spark jobs and workflows via APIs without humans in the loop, monitor performance and troubleshoot errors with detailed logs, manage AWS EC2 costs with AWS Tags, control access to resources with AWS IAM Roles, and increase the scalability of long-running workloads with encrypted Amazon Elastic Block Store (EBS) volumes. Databricks is the first and only vendor to offer a SOC2- and HIPAA-compliant Spark platform that provides turnkey deployment of both real-time analysis and production Spark workloads with a seamless transition from analysis to production.

As organizations across industries deploy Apache Spark in the public cloud, the task of minimizing costly downtime for mission-critical workloads, such as applications that predict equipment failure, falls on data engineering teams. Yet building sophisticated systems around Spark to ensure that such workloads are resilient, easy to troubleshoot, and secure requires a high level of technical expertise and meticulous effort that most organizations struggle to spare.

"As enterprises increasingly rely on Apache Spark to power more diverse production workloads supporting more people, it becomes critical to prevent business system outages that could cost millions of dollars," said Nik Rouda, Senior Analyst at Enterprise Strategy Group.

In Databricks' production environment, data engineers can bypass the difficult and tedious tasks of developing, configuring, tuning and securing infrastructure to easily achieve production requirements with features such as:

  • HIPAA- and SOC2-compliant Apache Spark clusters fully managed and tuned by the Spark committers at Databricks;
  • REST APIs to orchestrate and monitor sophisticated Spark jobs and workflows programmatically, without humans in the loop;
  • End-to-end logs and performance metrics to easily debug and fine-tune Spark workloads, accessible via APIs programmatically or in the Databricks user interface;
  • Customizable AWS tags to manage the AWS EC2 usage of each Spark cluster;
  • Encrypted Amazon Elastic Block Store (EBS) volumes to increase the reliability of long-running Spark jobs on AWS EC2 instances by automatically providing additional storage;
  • AWS IAM Roles integration to provide secure access to AWS resources to diverse user groups in the same organization;
  • Direct integration with the data science environment to let organizations instantly move exploratory work to production without re-engineering;
  • SSH access to give engineers a direct way into the production environment to troubleshoot and inspect the Spark clusters.

"Databricks is experiencing unprecedented demand for a robust and secure Apache Spark platform in the cloud to run production workloads," says Ali Ghodsi, CEO and Co-Founder of Databricks. "We are proud to enable one of our core user groups, the data engineers, to meet the most stringent of operational requirements."

Visit databricks.com or Booth #1341 at AWS re:Invent to learn more.

Contact Databricks to get started: http://go.databricks.com/contact-databricks.

About Databricks
Databricks' vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team that created Apache® Spark™, a powerful open source data processing engine built for sophisticated analytics, ease of use, and speed. Databricks is the largest contributor to the open source Apache Spark project. The company has also trained over 20,000 users on Apache Spark and has the largest number of customers deploying Spark to date. Databricks provides a just-in-time data platform to simplify data integration, real-time experimentation, and robust deployment of production applications. Databricks is venture-backed by Andreessen Horowitz and NEA. For more information, contact [email protected].

© Databricks 2016. All rights reserved. Apache, Apache Spark and Spark are trademarks of the Apache Software Foundation.

Copyright © 2009 Marketwired. All rights reserved. All the news releases provided by Marketwired are copyrighted. Any forms of copying other than an individual user's personal reference without express written permission is prohibited. Further distribution of these materials is strictly forbidden, including but not limited to, posting, emailing, faxing, archiving in a public database, redistributing via a computer network or in a printed form.
