By Srinivasan Sundara Rajan
February 14, 2011 06:00 PM EST
Data Warehousing As A Cloud Candidate
Over the past year, major vendors have stepped up their support for the Cloud, and the Cloud is here to stay. More importantly, the path for enterprises to adopt the Cloud is now clearly drawn. With this in mind, it is time to identify which existing data center applications are candidates for migration to the Cloud.
Most major IT vendors predict that HYBRID delivery will be the future: enterprises will need a delivery model in which some workloads run on Clouds, others remain in data centers, and the two are integrated.
Before we go further into a blueprint of how data warehouses fit within a HYBRID Cloud environment, let us look at the salient features of data warehouses and how the tenets of the Cloud make them a very viable workload to move to the Cloud.
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.
Data Warehousing Usage | Cloud Tenet Value Proposition
The ETL (Extract, Clean, Transform, Load) process is subject to variable load patterns. Typically, large files arrive over the weekend or at night to be processed and loaded. | It is better to use COMPUTE resources on demand, as the ETL process requires them, than to maintain a fixed capacity.
OLAP (Online Analytical Processing) and the related processing for MOLAP (Multidimensional OLAP) and/or ROLAP (Relational OLAP) are highly compute intensive and require substantial processing power. | High-performance computing and the ability to scale up on demand, both tenets of the Cloud, are closely aligned with this need.
Physical architecture needs are complex in a data warehousing environment. | Most IaaS and PaaS offerings, such as the Azure platform and Amazon EC2, have built-in provisions for a highly available architecture, with most of the day-to-day administration abstracted away from the enterprise.
The below are some of the advantages of SQL Azure Platform
Multiple software and platform needs. | The product stack of a data warehousing environment is huge, and most organizations find it difficult to arrive at an ideal list of software, platforms and tools for their BI platform. SaaS for applications like data cleansing or address validation, and PaaS for reporting, such as Microsoft SQL Azure Reporting, are well suited to solving the tools and platform maze.
The following are the ideal steps for migrating an on-premise data warehouse system to a cloud platform. For the sake of this case study, the Microsoft Windows Azure platform is chosen as the target platform.
1. Create Initial Database / Allocate Storage / Migrate Data
The existing STAR schema design of the data warehousing system can be migrated to the Cloud platform as is, and migrating to a relational database platform like SQL Azure should be straightforward. To migrate the data, the storage allocated to the existing database in the data center needs to be calculated, and the same amount of storage resources then allocated on the Cloud.
You can store any amount of data, from kilobytes to terabytes, in SQL Azure. However, individual databases are limited to 10 GB in size. To create solutions that store more than 10 GB of data, you must partition large data sets across multiple databases and use parallel queries to access the data.
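The partitioning approach described above can be sketched as follows. This is a minimal illustration, not SQL Azure code: in-memory SQLite databases stand in for the individual size-capped databases, and the names `SHARD_COUNT`, `shard_for`, `sales` and `customer_id` are hypothetical. It shows a stable hash routing each row to one database and a fan-out query that runs against all databases in parallel and merges the partial results.

```python
import sqlite3
import zlib
from concurrent.futures import ThreadPoolExecutor

# Hypothetical number of databases; in practice it would be derived from
# the total data size divided by the per-database size limit.
SHARD_COUNT = 3


def shard_for(customer_id: str) -> int:
    """Map a partition key to a shard index using a stable hash."""
    return zlib.crc32(customer_id.encode("utf-8")) % SHARD_COUNT


def open_shards():
    """Open one in-memory SQLite database per shard (stand-in for real databases)."""
    shards = []
    for _ in range(SHARD_COUNT):
        # check_same_thread=False lets a pool thread use the connection later;
        # each connection is still only used by one thread at a time.
        conn = sqlite3.connect(":memory:", check_same_thread=False)
        conn.execute("CREATE TABLE sales (customer_id TEXT, amount REAL)")
        shards.append(conn)
    return shards


def insert_sale(shards, customer_id, amount):
    """Route a row to the shard that owns its partition key."""
    conn = shards[shard_for(customer_id)]
    conn.execute("INSERT INTO sales VALUES (?, ?)", (customer_id, amount))


def total_sales(shards):
    """Run the same aggregate on every shard in parallel and merge the results."""
    def query(conn):
        row = conn.execute("SELECT COALESCE(SUM(amount), 0) FROM sales").fetchone()
        return row[0]

    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        return sum(pool.map(query, shards))


shards = open_shards()
for cid, amount in [("c1", 10.0), ("c2", 20.0), ("c3", 5.5), ("c1", 4.5)]:
    insert_sale(shards, cid, amount)
print(total_sales(shards))
```

Hashing on a single key keeps all rows for one customer in the same database, so per-customer queries touch one shard while warehouse-wide aggregates fan out to all of them.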
Once a highly scalable database infrastructure is set up on the SQL Azure platform, the following are some of the methods by which data from existing on-premise data warehouses can be moved to SQL Azure.
Traditional BCP Tool : BCP is a command line utility that ships with Microsoft SQL Server. It bulk copies data between SQL Azure (or SQL Server) and a data file in a user-specified format. The bcp utility that ships with SQL Server 2008 R2 is fully supported by SQL Azure. You can use BCP to back up and restore your data on SQL Azure. You can also import large numbers of new rows into SQL Azure tables, or export data out of tables into data files, by using the bcp utility.
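As a hedged illustration of a bcp invocation, the sketch below only assembles the command line and prints it; it does not run bcp. The table, file, server and login values are placeholders invented for the example, and the helper `build_bcp_command` is hypothetical. The options used are the standard bcp switches for character mode (`-c`), server (`-S`), login (`-U`), password (`-P`) and batch size (`-b`).

```python
import shlex


def build_bcp_command(table, direction, data_file, server, user, password,
                      batch_size=None):
    """Assemble a bcp argument list for a bulk import ("in") or export ("out")."""
    if direction not in ("in", "out"):
        raise ValueError("direction must be 'in' or 'out'")
    args = [
        "bcp", table, direction, data_file,
        "-c",            # character (text) data format
        "-S", server,    # server to connect to
        "-U", user,      # login name
        "-P", password,  # password
    ]
    if batch_size is not None:
        args += ["-b", str(batch_size)]  # rows committed per batch
    return args


# All values below are placeholders, not real endpoints or credentials.
cmd = build_bcp_command(
    table="SalesDW.dbo.FactSales",
    direction="in",
    data_file="fact_sales.dat",
    server="tcp:myserver.database.windows.net",
    user="loader@myserver",
    password="<password>",
    batch_size=10000,
)
print(shlex.join(cmd))
```

Building the arguments as a list rather than a single string means the same list could later be handed to `subprocess.run` without shell quoting problems.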
The following tools are also useful if your existing data warehouse is in SQL Server within the data center.
You can transfer data to SQL Azure by using SQL Server 2008 Integration Services (SSIS). SQL Server 2008 R2 or later supports the Import and Export Data Wizard and bulk copy for the transfer of data between an instance of Microsoft SQL Server and SQL Azure.
SQL Server Migration Assistant (SSMA for Access v4.2) supports migrating your schema and data from Microsoft Access to SQL Azure.
2. Set Up ETL & Integration With Existing On Premise Data Sources
After the initial load of the data warehouse on the Cloud, it needs to be continuously refreshed with operational data. This process extracts data from different data sources (such as flat files, legacy databases, RDBMS, ERP, CRM and SCM application packages).
The process also carries out the necessary transformations, such as joining tables, sorting, and applying various filters.
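The three transformations named above (join, filter, sort) can be illustrated with a small, tool-independent sketch. The record layout, the `transform` function and the `min_amount` threshold are assumptions made for the example; a real ETL tool would express the same steps in its own pipeline components.

```python
from operator import itemgetter

# Hypothetical extracted records from two operational sources.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 120.0},
    {"order_id": 2, "customer_id": "c2", "amount": 15.0},
    {"order_id": 3, "customer_id": "c1", "amount": 60.0},
    {"order_id": 4, "customer_id": "c9", "amount": 75.0},  # no matching customer
]
customers = [
    {"customer_id": "c1", "region": "EMEA"},
    {"customer_id": "c2", "region": "APAC"},
]


def transform(orders, customers, min_amount):
    """Join orders to customers, filter small orders, sort by amount descending."""
    region_by_customer = {c["customer_id"]: c["region"] for c in customers}

    # Join: attach the region to each order; orders without a customer are dropped.
    joined = [
        {**order, "region": region_by_customer[order["customer_id"]]}
        for order in orders
        if order["customer_id"] in region_by_customer
    ]

    # Filter: keep only orders at or above the threshold.
    filtered = [row for row in joined if row["amount"] >= min_amount]

    # Sort: largest orders first, ready for loading.
    return sorted(filtered, key=itemgetter("amount"), reverse=True)


rows = transform(orders, customers, min_amount=50.0)
for row in rows:
    print(row)
```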
The following are typical options available on the SQL Azure platform for building an ETL platform between on-premise sources and the data warehouse hosted on the Cloud. The tools mentioned above for the initial data load also hold good as ETL tools; they are not repeated here.
SQL Azure Data Sync :
- Cloud to cloud synchronization
- Enterprise (on-premise) to cloud
- Cloud to on-premise.
- Bi-directional or sync-to-hub or sync-from-hub synchronization
The following diagram, courtesy of the vendor, gives an overview of how SQL Azure Data Sync can be used for ETL purposes.
Windows Azure AppFabric Integration provides common BizTalk Server integration capabilities (e.g. pipeline, transforms, adapters) on Windows Azure, using out-of-box integration patterns to accelerate and simplify development. It also delivers higher-level business user enablement capabilities such as Business Activity Monitoring and Rules, as well as a self-service trading partner community portal and provisioning of business-to-business pipelines. The following diagram, courtesy of the vendor, shows how Windows Azure AppFabric Integration can be used as an ETL platform.
3. Create CUBES & Other Analytics Structures
The multidimensional nature of OLAP requires an analytical engine to process the underlying data and create a multidimensional view. The success of OLAP has resulted in a large number of vendors offering OLAP servers using different architectures.
MOLAP : A proprietary multidimensional database designed for performance.
ROLAP : Relational OLAP is a technology that provides sophisticated multidimensional analysis that is performed on open relational databases. ROLAP can scale to large data sets in the terabyte range.
HOLAP : Hybrid OLAP is an attempt to combine some of the features of MOLAP and ROLAP technology.
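To make the ROLAP idea concrete, the sketch below runs a multidimensional roll-up as plain SQL over a tiny star schema. SQLite is used only as a convenient stand-in for a relational engine, and the tables `dim_product` and `fact_sales`, their columns and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A minimal star schema: one dimension table and one fact table.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (product_id INTEGER, year INTEGER, amount REAL);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Games');
INSERT INTO fact_sales VALUES
    (1, 2010, 100.0), (1, 2011, 150.0), (2, 2010, 40.0), (2, 2010, 60.0);
""")

# ROLAP-style analysis: join the fact table to its dimension and
# aggregate along two dimensions (category and year).
rollup = conn.execute("""
SELECT p.category, f.year, SUM(f.amount) AS total
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
GROUP BY p.category, f.year
ORDER BY p.category, f.year
""").fetchall()

for category, year, total in rollup:
    print(category, year, total)
```

Because the aggregation is ordinary SQL over relational tables, this style of analysis does not depend on a separate multidimensional engine.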
SQL Azure Database does not support all of the features and data types found in SQL Server. Analysis Services, Replication, and Service Broker are not currently provided as services on the Windows Azure platform.
At this time there is no direct support for OLAP and CUBE processing on SQL Azure; however, with the HPC (High Performance Computing) attributes of multiple Worker roles, manual aggregation of the data can be achieved.
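One way to picture manual aggregation across several workers is the split, aggregate, merge pattern sketched below. This is an assumption-laden illustration rather than Windows Azure code: threads from Python's standard library stand in for Worker roles, and the fact rows, `aggregate_chunk` and `aggregate_parallel` are hypothetical names.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def aggregate_chunk(rows):
    """Work done by one worker: sum amounts per (region, year) cell."""
    partial = Counter()
    for region, year, amount in rows:
        partial[(region, year)] += amount
    return partial


def aggregate_parallel(rows, workers=2):
    """Split the fact rows across workers, then merge their partial sums."""
    chunks = [rows[i::workers] for i in range(workers)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(aggregate_chunk, chunks):
            total.update(partial)  # Counter.update adds values per key
    return dict(total)


# Hypothetical fact rows: (region, year, amount).
facts = [
    ("EMEA", 2010, 10.0),
    ("EMEA", 2010, 5.0),
    ("APAC", 2010, 7.0),
    ("EMEA", 2011, 3.0),
]
cube_cells = aggregate_parallel(facts)
print(cube_cells)
```

Summation is associative, so the partial results can be merged in any order; measures such as averages would need each worker to return both a sum and a count.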
4. Generate Reports
Reporting consists of analyzing the data stored in the data warehouse in multiple dimensions and generating both standard reports for business intelligence and ad-hoc reports. These reports present data in graphical or tabular form and also provide statistical analysis features. They should be rendered as Excel, PDF and other formats.
It is better to utilize a SaaS-based or PaaS-based reporting infrastructure than to custom code all the reports.
SQL Azure Reporting enables developers to enhance their applications by embedding cloud based reports on information stored in a SQL Azure database. Developers can author reports using familiar SQL Server Reporting Services tools and then use these reports in their applications which may be on-premises or in the cloud.
Currently, SQL Azure Reporting can connect only to SQL Azure databases.
The above steps provide a path to migrate on-premise data warehousing applications to the Cloud. Because this requires a lot of vendor support in terms of IaaS, PaaS and SaaS, the Microsoft Azure platform was chosen to support the case study. With several features integrated as part of it, the Microsoft Cloud platform is positioned to be one of the leading platforms for BI on the Cloud.
The following diagram shows a blueprint of a typical Cloud BI organization on the Microsoft Azure platform.