Most enterprise customers today face a host of challenges when migrating their massive applications to the cloud. First, many enterprise applications are so large that one of the easiest ways to migrate a customer environment is to physically ship the customer's storage disks to the cloud provider. Second, some enterprise applications hold sensitive data that must comply with regulations, such as Sarbanes-Oxley for financial data and HIPAA for healthcare data. This data must be masked before it leaves the current infrastructure for the cloud. Third, these applications often run as production copies within the customer environment while the reporting, development and test copies are hosted in the cloud, which requires the cloud copies to be refreshed from the current production copy.
Three Good Reasons for Migrating to the Cloud
While many cloud benefits have been touted in the media over the past few years, three essential reasons are usually put forth for why customers should migrate their current applications to a public, private or hybrid cloud environment:
- Cost: From both a capital expenditure (CapEx) and operating expenditure (OpEx) perspective, customers save time and money by migrating their physical environments and applications to a cloud infrastructure. The savings, in both short-term operating costs and long-term capital expenses, are quite attractive. For example, by migrating a large 20TB Oracle Financials database to the cloud, the customer can save millions of dollars in storage and administration labor costs compared with supporting the previous physical infrastructure. The NTT 2013 Security survey reported that 70% of mature cloud adopters obtained a major financial benefit from their migration to the cloud.
- Experience: Instead of reinventing the wheel, customers benefit from successful frameworks and processes worked out by experts with invaluable industry experience. The heavy lifting is already done: the processes have been developed, tested, and proven.
- Scalability: Running your systems in the cloud allows you the elasticity to expand quickly, either permanently or on an as-needed “burst” basis. Conversely, it also allows for scaling back, with commensurate cost savings, should that be necessary.
Surprisingly, though, the main benefit of cloud migration is not cost savings but innovation and agility:
- Speed: fast, easy implementation of new ideas and business functionality
- Time: more time to think strategically instead of performing maintenance
While cost savings are certainly desirable, in the bigger picture, the most important benefits are innovation and agility. In fact, there is significant evidence that innovation and agility are driving migration (2011 Cloud Computing Survey, CIO magazine, November 2010).
When one comprehends all the services, packages and functionality offered by a cloud provider such as Amazon, with the many options for computing resources, storage, databases, analytics, monitoring, application services and recovery, one may arrive at the conclusion that this has to be the future. Why would anyone want to run these functions in-house unless doing so were their core business, as it is for Amazon or Google?
Data Virtualization for Cloud Migration
Enter data virtualization, an enterprise software solution that solves many of the pain points of customers who need to migrate to the cloud. Migrating enterprise applications to the cloud poses unique challenges in terms of provisioning virtualized and converged infrastructure environments to host the cloud applications. While hardware vendors such as IBM, HP, NetApp, EMC and Cisco provide the infrastructure components for hosting enterprise applications in the cloud, they lack the key ingredient to solve the most challenging piece of the puzzle in moving customers to the cloud: how to migrate the applications. The virtualization setup is manual and migrating large enterprise applications is still a painful and laborious process. Data virtualization solutions allow customers to quickly migrate from in-house hosting to cloud providers regardless of storage infrastructure.
Migration Challenges for Customers Moving to the Cloud
Customers often are at a deadlock when it comes time to migrate their environments to the cloud. One challenge is lack of experienced resources to perform the actual migration. In-house staff often do not have the highly specialized technical cloud skills or experience to perform these complex migrations.
A simple example might be a sole proprietor running a web site on WordPress. In such a case, it would be a matter of uploading a copy of the WordPress directories to a hosted server and a quick import of some MySQL data. A WordPress migration represents a simple migration. As we move up the complexity scale from a single website, there is a broad scope of potential candidates for cloud migration. As cloud migration candidates become larger and larger and more and more complex, does the migration itself become too difficult to manage? Is it possible to move major enterprise applications to the cloud?
Let’s address that elephant in the middle of the room: The concern, voiced by IT leaders, is that cloud providers oversimplify their solutions and fail to appreciate both the complexity of their potential customers’ applications and their fears that migration could fail. (Growing pains in the cloud: 300 CIOs express their views about barriers to cloud adoption, NTT Com, May 2013) Despite the seemingly insurmountable obstacles that enterprise cloud migrations present, you would be hard-pressed to find a business reliant on its data systems that did not recognize and desire the tactical benefits of the cloud. Thus, the will to migrate is there.
But the question still persists: How can large enterprise applications be migrated to the cloud? Enter data virtualization software to the rescue. Data virtualization software provides graphical interfaces with many hooks and tools available to the administrator, and it allows on-premise to cloud migrations to be performed quickly and painlessly.
One large, complex enterprise application that can be hosted in the cloud is Oracle E-Business Suite (EBS). EBS can be hosted in Amazon Web Services (AWS). AWS has Oracle and EBS templates available for the software distribution. Oracle impressively offers live EBS migration to AWS. Even with pre-existing templates and live migration functionality, this migration might be considered too risky an operation for companies whose business depends on these applications. Even if the companies are ready to accept the risk, there still exists the larger challenge of how to move a massive enterprise application that is terabytes in size, such as a 30TB Oracle SAP environment, to the cloud.
So, how do you move such an environment from in-house systems into the AWS cloud? Network performance and latency are key factors to consider. For many customers, network connections to AWS fail to support a sufficiently fast transfer rate. In such a situation it would be difficult to move to the cloud via online migration. A migration would be even harder for big applications that need multiple copies of terabytes of data, such as QA and development. Running development and QA in the cloud is a perfect use case for the cloud, because the amount of resources required varies depending on the development cycle. If the development team has just finished a sprint, then it might require a burst in QA capacity to quickly QA the code.
Solutions for Moving to the Cloud
The cloud migration concerns that companies have been voicing are:
- How do we migrate terabytes of data for development and test environments to a cheaper, more efficient cloud infrastructure?
- How do we keep environments in the cloud in sync with on-premise environments without paying an exorbitant data transfer cost?
- How do we manage burst capacity during high load periods without incurring sustained maintenance/infrastructure costs?
- How do we ensure data is moved securely to the cloud and protected once it is in the cloud?
Moving, syncing, masking and optimally managing terabytes of data in the cloud requires innovation and new technology. The new technology that can address all these issues, and that enables moving data seamlessly and easily to the cloud, is called data virtualization software.
Data virtualization software at its core is a technology that leverages thin cloning along with compression and change tracking. By tracking the changes to all data blocks on a storage system and sharing any duplicate data blocks between data copies, data virtualization enables copies of data to be made almost instantly because there is no copying of data. A new copy of data is simply a new set of pointers to existing data. Copies of large databases can be made in minutes for almost no storage. Once the virtual copy of data has been made, any changes to that new copy are tracked separately from the original and kept private to the copy. Along with the massive data savings of sharing duplicate blocks, data virtualization incorporates data compression, enabling even greater storage savings.
Data virtualization is not just about sharing copies of data at specific points in time but also about sharing data anywhere in a time flow. A multi-week time flow of data is created and managed by tracking the data block changes across virtual copies. For example, if one copy of a source database is made at the beginning of the week and another copy is made at the end of the week, both copies will still share all the duplicate blocks between these two versions of the source data even though the copies are days apart. Typically only a small percentage of data changes over time relative to the total size, so copies of data can share the majority of data blocks.
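The pointer-sharing idea can be illustrated with a toy model. The sketch below is not how any particular product is implemented; the class names `BlockStore` and `VirtualCopy`, the block contents and the block count are all hypothetical, and real systems add compression, snapshots across time and content-based de-duplication that are omitted here.

```python
class BlockStore:
    """Hypothetical shared pool of physical data blocks."""

    def __init__(self):
        self.blocks = {}      # physical block id -> block contents
        self.next_id = 0

    def put(self, data: bytes) -> int:
        """Store a new physical block and return its id."""
        block_id = self.next_id
        self.blocks[block_id] = data
        self.next_id += 1
        return block_id


class VirtualCopy:
    """A copy of the data that is only a table of pointers into the store."""

    def __init__(self, store: BlockStore, block_map: dict):
        self.store = store
        self.block_map = dict(block_map)   # logical block number -> physical id

    def clone(self) -> "VirtualCopy":
        # A thin clone copies the pointer table, never the blocks themselves.
        return VirtualCopy(self.store, self.block_map)

    def read(self, logical_block: int) -> bytes:
        return self.store.blocks[self.block_map[logical_block]]

    def write(self, logical_block: int, data: bytes) -> None:
        # A change goes to a new physical block that is private to this copy;
        # every other copy keeps pointing at the original block.
        self.block_map[logical_block] = self.store.put(data)


store = BlockStore()
# Assumed source "database" of 1000 blocks (illustrative size only).
source = VirtualCopy(store, {i: store.put(f"block-{i}".encode()) for i in range(1000)})

dev = source.clone()   # no data blocks are copied here
qa = source.clone()
dev.write(7, b"dev-change")

physical_blocks = len(store.blocks)
logical_blocks = sum(len(c.block_map) for c in (source, dev, qa))
print(f"{logical_blocks} logical blocks backed by {physical_blocks} physical blocks")
```

Three logical copies are served from barely more physical blocks than one copy needs, and the change made in `dev` is invisible to `source` and `qa`.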
How does data virtualization software help data migrations to the cloud? Data virtualization software has three key components that facilitate and enable cloud migration. Those crucial components are:
- Single copy of all duplicate data blocks
- Compression of unique data blocks
- Streaming replication of changed data blocks
The first point, sharing all duplicate data, is the most impactful. For example, if there is a source database of 9TB and 4 copies of it, then migration to the cloud would normally require copying 5 databases at 9TB each, for a total of 45TB. With data virtualization we would instead move just a single copy of the database along with any changes unique to each of the copies, which are generally just a few percent of the total. The total movement would typically be under 10TB.
The second point, compression, further reduces the amount of data. From industry experience, common LZ4 compression shrinks database data to roughly one third of its original size or less on average, so that 9TB would occupy only about 3TB. Thus the total data to transfer would go down from an original 45TB to only about 3TB.
Now 3TB can still be a lot of data to move into the cloud, depending on the network bandwidth. This is where the third point, streaming replication, comes into play. With streaming replication, data virtualization can take place on premise and then the data can be replicated to the cloud. Data virtualization software typically integrates replication. This replication is done in a managed manner where transfers can be limited to hours of low usage so as to avoid network congestion. The transfer can run automatically over days. Most importantly, since data virtualization tracks changes, all changes that happen over the transfer period, which could be days, can then be transferred and applied at the end of the initial migration. Once a first full replication is accomplished, all changes that happen on premises can continue to be replicated to the cloud.
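The arithmetic behind these figures can be written out as a short calculation. The 9TB source and 4 copies come from the example above; the 2% change rate per copy is an assumed stand-in for "a few percent", and the one-third ratio is the rough average quoted in the text, so the result is an estimate rather than a measurement.

```python
SOURCE_TB = 9.0             # size of the source database (from the example)
COPIES = 4                  # additional copies of the source (from the example)
CHANGE_FRACTION = 0.02      # ASSUMPTION: each copy has changed about 2% of its blocks
COMPRESSED_FRACTION = 1 / 3 # compressed size / original size, rough average from the text

# Traditional migration: every database is moved in full.
full_copy_tb = SOURCE_TB * (1 + COPIES)

# Data virtualization: one shared copy plus the blocks unique to each copy.
virtualized_tb = SOURCE_TB + COPIES * SOURCE_TB * CHANGE_FRACTION

# Compression is applied to what remains after sharing duplicate blocks.
compressed_tb = virtualized_tb * COMPRESSED_FRACTION

print(f"Full copies:        {full_copy_tb:.2f} TB")
print(f"Shared blocks only: {virtualized_tb:.2f} TB")
print(f"After compression:  {compressed_tb:.2f} TB")
```

Under these assumptions the shared-block total stays below 10TB and the compressed total lands a little above 3TB, in line with the rounded figures in the text.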
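To see why change tracking matters for a transfer that runs over several days, the replication schedule can be modeled as a series of passes: the first pass sends the initial data, and each later pass sends only the changes that accumulated during the previous one. The sketch below is a planning estimate only. The function names, the 200 Mbit/s link, the 8-hour nightly window and the 20GB daily change rate are all assumptions chosen for illustration.

```python
def window_days(data_tb: float, link_mbps: float, window_hours: float) -> float:
    """Days needed to send data_tb when the link is used window_hours per day."""
    seconds = data_tb * 8e12 / (link_mbps * 1e6)   # decimal terabytes -> bits
    return seconds / 3600 / window_hours


def plan_replication(initial_tb, daily_change_tb, link_mbps, window_hours,
                     done_tb=0.001):
    """Return a list of (size_tb, days) passes until the backlog is below done_tb."""
    daily_capacity_tb = link_mbps * 1e6 * window_hours * 3600 / 8e12
    if daily_change_tb >= daily_capacity_tb:
        # The backlog would never shrink, so the schedule cannot converge.
        raise ValueError("link cannot keep up with the daily change rate")

    passes = []
    pending_tb = initial_tb
    while pending_tb > done_tb:
        days = window_days(pending_tb, link_mbps, window_hours)
        passes.append((pending_tb, days))
        # Changes made on premise while this pass ran form the next backlog.
        pending_tb = days * daily_change_tb
    return passes


# ASSUMED inputs: 3TB initial transfer, 20GB of changes per day,
# 200 Mbit/s link, transfers allowed 8 hours per night.
passes = plan_replication(initial_tb=3.0, daily_change_tb=0.02,
                          link_mbps=200, window_hours=8)

for number, (size_tb, days) in enumerate(passes, start=1):
    print(f"pass {number}: {size_tb:.4f} TB over {days:.2f} days")
```

Each pass is much smaller than the one before it as long as the link can move more data per day than the source changes per day, which is why the final catch-up before cutover is short.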
Data virtualization software provides an elegant solution to the above challenges for customers who need to quickly, efficiently and cost-effectively migrate to the cloud.
The cloud offers tremendous advantages, but these advantages can be difficult to obtain due to the obstacles inherent in migrating to the cloud. The biggest obstacle to cloud migration for enterprise applications is transferring the large amounts of data these applications depend upon and generate. Transferring data to the cloud and keeping cloud data synchronized with in-house applications can be accomplished easily, though, with a new technology called data virtualization. Data virtualization provides automated and encapsulated data de-duplication, compression and replication from in-house applications to applications in the cloud.
Kyle Hailey is a Technical Evangelist at Delphix. Before Delphix, he designed DB Optimizer at Embarcadero, and worked on a complete redesign of the Oracle Enterprise Manager 10g performance pages. His input shifted the screens away from confusing clutter to simple but powerful graphics-based displays of session load and wait bottlenecks, and this design has continued to be the foundation of OEM 11g and 12c. He has a long and distinguished career in the database world, having worked at Oracle, Quest, and Embarcadero as well as other companies in the industry on database performance tuning and optimization. He has designed tools to improve high-end performance monitoring, such as direct memory attach to bypass SQL and interactive graphic displays of performance data. He speaks regularly at conferences, teaches classes around the world on database performance tuning, and holds a patent on diagnosing database performance problems: 20060059205.