Set up a Hadoop cluster environment for running MapReduce or Spark jobs, fully equipped with tools and services. Here we show how to set up the Cloudera VM to get a Hadoop environment quickly.
Cloudera provides virtualized clusters for easy installation on your desktop.
Cloudera QuickStart VMs (single-node cluster) make it easy to quickly get hands-on with CDH for testing, demo, and self-learning purposes, and include Cloudera Manager for managing your cluster.
Cloudera QuickStart VM also includes a tutorial, sample data, and scripts for getting started. Cloudera QuickStarts, deployed via Docker containers or VMs, are not intended or supported for use in production.
System requirements for this setup:
OS: Ubuntu (used for this tutorial)
RAM: 8 GB or more
CPU: 2 Cores or more
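The requirements above can be checked on the host before installing anything. The following is a minimal sketch, assuming a Linux host such as Ubuntu; the function names `host_resources` and `meets_requirements` are illustrative and not part of any Cloudera or VirtualBox tooling.

```python
import os

MIN_CORES = 2    # minimum CPU cores listed above
MIN_RAM_GB = 8   # minimum RAM listed above


def host_resources():
    """Return (cpu_cores, ram_in_gib) for the current Linux host."""
    cores = os.cpu_count() or 0
    # Physical memory = page size * number of physical pages (Linux sysconf names).
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return cores, ram_bytes / (1024 ** 3)


def meets_requirements(cores, ram_gb, min_cores=MIN_CORES, min_ram_gb=MIN_RAM_GB):
    """True when both the core count and the RAM reach the minimums."""
    return cores >= min_cores and ram_gb >= min_ram_gb


if __name__ == "__main__":
    cores, ram_gb = host_resources()
    print(f"CPU cores: {cores}, RAM: {ram_gb:.1f} GiB")
    print("OK" if meets_requirements(cores, ram_gb) else "Below the listed requirements")
```

A machine sold with 8 GB of RAM usually reports slightly less than 8 GiB here, because part of the memory is reserved by firmware and the kernel, so treat a near miss on the RAM check as a hint rather than a hard failure.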
VirtualBox is a general-purpose full virtualizer for x86 hardware, targeted at server, desktop and embedded use. It lets you run virtually any OS inside another OS.
We have used VirtualBox to run the Cloudera VM on an Ubuntu Linux distribution.
Download VirtualBox from https://www.virtualbox.org/ and install it.
Setup Cloudera VM
Download the Cloudera VM zip from https://www.cloudera.com/downloads/quickstart_vms/5-10.html and extract it. The extracted archive contains a *.ovf file.
Open VirtualBox and use the “File -> Import Appliance” menu to open your downloaded *.ovf file, or simply double-click on the file itself and VirtualBox should handle it from there.
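The import can also be done from a script using the `VBoxManage` command-line tool that ships with VirtualBox. This is a rough sketch, not the procedure used in this tutorial; the helper names and the .ovf file name shown in the usage comment are hypothetical, so substitute the name of the file you actually extracted.

```python
import subprocess


def import_command(ovf_path):
    """Build the VBoxManage command that imports an appliance file."""
    return ["VBoxManage", "import", ovf_path]


def import_appliance(ovf_path):
    """Run the import; raises CalledProcessError if VBoxManage fails."""
    subprocess.run(import_command(ovf_path), check=True)


# Example (hypothetical file name, replace with your extracted *.ovf):
# import_appliance("cloudera-quickstart-vm.ovf")
```

Building the command as a list, rather than a single shell string, avoids quoting problems when the path contains spaces.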
After the import, a new Cloudera VM entry appears in the VirtualBox VM list.
Right-click the entry and select:
Settings -> System -> Processor -> choose at least 2 cores
Settings -> System -> Motherboard -> Base Memory -> choose at least 8 GB
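The same two settings can be applied without the GUI through `VBoxManage modifyvm`, which takes the memory size in megabytes. The sketch below assumes the VM is powered off and uses a hypothetical VM name in the usage comment; `VBoxManage list vms` shows the real name of the imported entry.

```python
import subprocess


def modify_command(vm_name, cpus=2, memory_mb=8192):
    """Build the VBoxManage command that sets CPU count and base memory (MB)."""
    return [
        "VBoxManage", "modifyvm", vm_name,
        "--cpus", str(cpus),
        "--memory", str(memory_mb),
    ]


def configure_vm(vm_name, cpus=2, memory_mb=8192):
    """Apply the settings; the VM must not be running."""
    subprocess.run(modify_command(vm_name, cpus, memory_mb), check=True)


# Example (hypothetical VM name, check `VBoxManage list vms` for yours):
# configure_vm("cloudera-quickstart-vm", cpus=2, memory_mb=8192)
```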
Now we are ready to start the Cloudera VM from VirtualBox. Click Start, or double-click the Cloudera VM that we imported, to start the Cloudera environment.
In some cases the VM freezes while booting; pressing the Escape key resumes the frozen screen.
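Starting the VM can be scripted as well with `VBoxManage startvm`. This is a minimal sketch with a hypothetical VM name in the usage comment. It opens the normal VirtualBox window (`--type gui`), so the Escape key workaround for a frozen boot screen still applies.

```python
import subprocess


def start_command(vm_name, vm_type="gui"):
    """Build the VBoxManage command that starts a VM ("gui" or "headless")."""
    return ["VBoxManage", "startvm", vm_name, "--type", vm_type]


def start_vm(vm_name, vm_type="gui"):
    """Start the VM; raises CalledProcessError if VBoxManage fails."""
    subprocess.run(start_command(vm_name, vm_type), check=True)


# Example (hypothetical VM name, check `VBoxManage list vms` for yours):
# start_vm("cloudera-quickstart-vm")
```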