Oracle jumped on the Big Data and Hadoop bandwagon last October, and everyone expected CEO Larry Ellison to announce Oracle’s own roll-up of the open-source elephant. It turned out, however, that everyone’s assumptions were wrong.
In truth, Oracle had signed an OEM agreement with Cloudera, the largest commercial Hadoop distributor. Cloudera will supply the CDH3 version of Hadoop, with Cloudera Manager 3.7 as an add-on.
Oracle uses CDH3 as its core Hadoop, but customers need not worry about being limited to Oracle’s NoSQL. Cetin Ozbutun, vice president of data warehousing technologies, has announced that customers who do not want to run Oracle’s NoSQL will have access to the Hadoop Distributed File System (HDFS). They will also have access to HBase, which is similar to Google’s BigTable data store. The Big Data Appliance runs the community edition of Oracle’s NoSQL and the Java HotSpot VM atop Oracle Enterprise Linux.
Oracle could easily have grabbed Apache Hadoop, just as it did with Red Hat’s Enterprise Linux, but Ozbutun said the company first had to evaluate other Hadoop providers, such as MapR and Hortonworks.
Ozbutun said, “We did consider a lot of different options, but we thought it best to partner with Cloudera. Cloudera is obviously the leader in this area and we have expertise in other areas that are complementary.”
What is significant is that the Big Data Appliance is more than just an Oracle and Cloudera partnership. Ozbutun takes pride in saying that Oracle’s team spent several months fine-tuning the hardware configuration, algorithms, plug-ins and data storage.
Ozbutun says that the Big Data Appliance has superb features, from its 18 Sun Fire x86 server nodes to its 144GB of memory, along with the right mix of I/O and network bandwidth. It costs $450,000 per rack, which includes licenses for the core Oracle software and a lifetime OEM license for CDH3.
In addition, Oracle hopes to roll out the Big Data Appliance and link it with other data stores through four connectors. The first is the Oracle Loader for Hadoop, which loads data from Hadoop data stores into Oracle 11g R2 databases. The second is the Oracle Data Integrator for Hadoop, which generates MapReduce code automatically. The third is the Direct Connector for HDFS, which allows file system mapping, and the fourth is the R Connector for Hadoop.
These applications have firmly placed Oracle on the map for open-source statistics and analysis, and they cost $2,000 per server processor.