At the recently concluded Hadoop Summit, Microsoft presented its strategy for big data technologies and the work the company is doing to make Hadoop accessible in the cloud. Microsoft announced at the event that Azure HDInsight will now support Hadoop 2.4.
Microsoft says the massive scale, power, elasticity and low cost of storage make the cloud the best place to deploy Hadoop, which is one of the reasons the company has invested heavily in its cloud-based Hadoop solution, Azure HDInsight. HDInsight combines the best of open source with the flexibility of cloud deployment. It also integrates with business intelligence tools, so data can be accessed and processed easily through Excel and Power BI for Office 365.
Hadoop is a cornerstone of Microsoft's data strategy. As part of this commitment, the company has contributed 30,000 lines of code and over 10,000 hours of engineering to support Hadoop projects, including support for Hadoop on Windows. This work was done in collaboration with Hortonworks, a partnership that ensures Microsoft's Hadoop solutions are based on compliant implementations of Hadoop. One result of this collaboration is the engineering work that led to the Hortonworks Data Platform for Windows and Azure HDInsight.
Microsoft is currently updating Azure HDInsight with support for Hadoop 2.4, the latest version of Hadoop. The update includes interactive querying with Hive, using advances based on SQL Server technology that Microsoft is also contributing to the Hadoop ecosystem through the Stinger project. With this update to HDInsight, customers can use the speed and scalability of the cloud to get a 100x improvement in response time.
Azure HDInsight offers enterprise-level security and administrative functions. Microsoft BI tools such as PowerPivot and Power View can be used in conjunction with the service. HDInsight is only one part of the complete Microsoft data platform, which includes the basic components that customers need to process data from anywhere in its native format. These components include Microsoft Azure Intelligent Systems Service to capture machine-generated data from the Internet of Things; SQL Server and Azure SQL Database to store and retrieve data; Azure HDInsight to deploy and provision Hadoop clusters in the cloud; and Excel and Power BI for Office 365 to analyze and visualize data.
The latest release also includes improvements to YARN, a key component of the Hadoop ecosystem. YARN offers more ways to interact with data in HDFS and provides a more general processing platform that goes beyond MapReduce.
Gartner's recent Magic Quadrant report showed that Microsoft is catching up with AWS and giving the market leader a run for its money in cloud solutions. Microsoft's infrastructure and on-premises applications, driven by Azure, Hyper-V, Windows Server, Active Directory and System Center, along with its SaaS offerings, have enabled it to rapidly attain the status of a strategic cloud IaaS provider.
Earlier this month, Microsoft and SAP agreed to increase their collaboration in the areas of cloud, data, and mobility. In addition, Microsoft and Salesforce.com joined forces, announcing a comprehensive strategic partnership to integrate their flagship solutions with each other.