For many, Hadoop is synonymous with big data. Hadoop is not a single application but a collection of open source tools whose ultimate goal is to analyze large volumes of structured and unstructured data.
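At the core of that toolkit sit HDFS for distributed storage and MapReduce for distributed processing. As a concrete illustration, here is a minimal sketch of the classic word-count MapReduce job in Java, which tallies word frequencies across unstructured text files; it assumes the input and output paths are supplied on the command line.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: split each line of raw text into words and emit (word, 1).
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts emitted for each distinct word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // pre-aggregate on the map side
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory on the cluster
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory (must not already exist)
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Packaged as a jar and submitted with the hadoop jar command, the same job runs unchanged whether the input is a single file or petabytes spread across a cluster.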
The research firm IDC conducted a study to understand how companies combine big data analysis systems such as open source Hadoop with other solutions to get more value from their data. The survey, commissioned by Red Hat and entitled “What trends for Hadoop deployments in business,” reveals that 32 percent of the companies surveyed have already deployed Hadoop, 31 percent intend to deploy it within the next 12 months, and 36 percent say they will deploy it more than a year from now.
The IDC report shows that companies combine Hadoop with other databases for big data analysis. Nearly 39 percent of respondents say they use NoSQL databases such as HBase, Cassandra, and MongoDB alongside Hadoop, and nearly 36 percent say they use analytic databases such as Greenplum and Vertica in conjunction with it. This underscores the importance of analyzing causality and correlation, with traditional structured data sets examined alongside unstructured data from new sources.
Nearly four out of ten managers surveyed indicated that they use big data technologies to innovate in products and services, in particular by modeling data to test scenarios. Less frequent uses of Hadoop include deployments that work in conjunction with SQL technologies, while a significant proportion of respondents use Hadoop to replace traditional data warehouse technologies. Finally, enterprises use Hadoop to analyze the large volumes of data generated by the web.
The IDC study has the merit of showing the different ways companies use Hadoop. These range from the analysis of raw data, whether operational data, data from machines, devices, and point-of-sale systems, or customer behavior data collected by e-commerce and retail systems. Approximately 39 percent of respondents said they use Hadoop to develop innovative services, such as analyzing secondary data sets to model scenarios.
Distributed file systems such as IBM's GPFS, Red Hat Storage (GlusterFS), and EMC Isilon OneFS, which have earned a reputation for robustness and scalability, are acceptable alternatives to Hadoop's own HDFS. Of these three, only Red Hat provides a platform that combines an integrated open source distributed file system with a Hadoop connector, enterprise middleware, and the ability to manage Hadoop natively.
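To see how such a connector slots in, the sketch below relies on Hadoop's pluggable FileSystem abstraction: jobs address storage through the URI scheme configured in fs.defaultFS, so pointing that setting at a vendor-supplied connector lets MapReduce code run on a non-HDFS store. The glusterfs scheme and implementation class name used here are assumptions that depend on the connector jar actually installed; hdfs:// is the usual default.

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AlternateFsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // Point Hadoop at a non-HDFS store. The scheme and the implementation
    // class below are hypothetical and depend on the connector shipped by
    // the storage vendor.
    conf.set("fs.defaultFS", "glusterfs:///");
    conf.set("fs.glusterfs.impl",
             "org.apache.hadoop.fs.glusterfs.GlusterFileSystem"); // assumed class name

    // Application code only talks to the FileSystem abstraction,
    // never to HDFS directly, so it is unchanged by the swap.
    FileSystem fs = FileSystem.get(URI.create("glusterfs:///"), conf);
    for (FileStatus status : fs.listStatus(new Path("/data/input"))) {
      System.out.println(status.getPath() + "  " + status.getLen() + " bytes");
    }
  }
}
```

Because jobs only touch the FileSystem interface, swapping the storage layer is largely a configuration and classpath change rather than a code change.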
The three main benefits of using Hadoop cited by the report are improved customer satisfaction, reduced development time, and lower operating costs. The difficulties encountered in implementing Hadoop include cost, a lack of available skills, and the difficulty of choosing among technologies.
A recent MarketsandMarkets research report predicts that the worldwide Hadoop and big data analytics market will grow to about $13.9 billion by 2017, at a CAGR of 54.9% from 2012 to 2017. The market is expected to see growth across four Hadoop software segments: performance monitoring software, management software, application software, and packaged software.
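As a quick sanity check of those figures, the short snippet below works the stated CAGR backwards over the five years from 2012 to 2017 to recover the implied size of the market at the start of the period; only the numbers quoted above are used.

```java
public class CagrCheck {
  public static void main(String[] args) {
    double market2017 = 13.9e9; // projected 2017 market size in USD
    double cagr = 0.549;        // 54.9% compound annual growth rate
    int years = 5;              // 2012 -> 2017

    // Implied 2012 base: value2017 / (1 + CAGR)^years
    double implied2012 = market2017 / Math.pow(1.0 + cagr, years);
    System.out.printf("Implied 2012 market size: $%.2f billion%n",
        implied2012 / 1e9);
    // Prints roughly $1.56 billion, i.e. the forecast assumes a market
    // of about a billion and a half dollars at the start of the period.
  }
}
```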