Cloudera

Before deploying Cloudera to a Gemini Appliance, activate the Cloudera platform from the Integration Center as described in section 5, Cloudera CDH Installation, of this document. A general understanding of Cloudera capabilities and usage is also helpful.

To continue, add the Fully Qualified Domain Names (FQDNs) of the Gemini Appliances on which Cloudera Manager and the Cloudera Agents should be installed.

Add as many agents as required by selecting the Add Agent button. Each time another FQDN is added to the table, Manage confirms proper DNS resolution and connectivity to the host. A green check indicates that both checks succeeded; a red icon indicates that at least one of them failed.

Additionally, the setup verifies whether the provided FQDN for the Cloudera Manager matches the appliance where the administrator is currently logged in. Also, at least one additional Cloudera Agent is required to continue the setup. If Manage cannot resolve the FQDN using a configured Name Server, a red icon indicates that the IP address needs to be added manually. When entering the IP address manually, Manage will again check the connectivity and update the status icon.

  • Example 1: Manage cannot resolve “mycustomfqdn”; the administrator must enter the IP address manually.
  • Example 2: Manually entered IP address is incorrect.
  • Example 3: Manually entered IP address is correct and a valid Manage installation is detected.
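The same two checks can be approximated from any workstation before entering hosts in Manage. The following is a minimal sketch, not the logic Manage itself uses: the function names are hypothetical, and probing TCP port 22 is only an assumed stand-in for the connectivity test. It does not detect whether a valid Manage installation is present on the host.

```python
import socket
from typing import Callable, Optional


def resolve(fqdn: str) -> Optional[str]:
    """Return the first IPv4 address for fqdn, or None if resolution fails."""
    try:
        return socket.gethostbyname(fqdn)
    except OSError:
        return None


def can_connect(ip: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to ip:port succeeds (port is an assumption)."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_host(
    fqdn: str,
    manual_ip: Optional[str] = None,
    resolver: Callable[[str], Optional[str]] = resolve,
    connector: Callable[[str], bool] = can_connect,
) -> str:
    """Mimic the status icon: green only if resolution and connectivity both pass.

    A manually entered IP address takes the place of DNS resolution.
    """
    ip = manual_ip or resolver(fqdn)
    if ip is None:
        return "red: cannot resolve, enter the IP address manually"
    if not connector(ip):
        return "red: host unreachable"
    return "green"
```

Calling `check_host("mycustomfqdn")` corresponds to Example 1 when the name cannot be resolved; passing `manual_ip` corresponds to Examples 2 and 3.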

When all prerequisites are met, the Deploy button is activated and the installation can start. Once deployment has started, the horizontal bar indicates the progress and the current step of the installation. This stage of the installation runs in the background, allowing the administrator to work on other areas while it is in progress. Returning to the Hadoop page will show the current status and allow continuation when the installation is complete.
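The prerequisites that enable the Deploy button can be summarized as a small rule check. This sketch only restates the conditions described in this section; the function name, its parameters, and the message texts are hypothetical and do not reflect how Manage is implemented.

```python
from typing import Dict, List


def unmet_prerequisites(
    manager_fqdn: str,
    local_fqdn: str,
    agent_status: Dict[str, bool],
) -> List[str]:
    """Return unmet prerequisites; an empty list means deployment could start.

    agent_status maps each agent FQDN to True (green check) or False (red icon).
    """
    problems: List[str] = []
    # The Cloudera Manager FQDN must match the appliance the administrator is logged in to.
    if manager_fqdn.lower() != local_fqdn.lower():
        problems.append("Cloudera Manager FQDN does not match this appliance")
    # At least one additional Cloudera Agent is required.
    if not agent_status:
        problems.append("at least one Cloudera Agent is required")
    # Every agent must pass both DNS resolution and connectivity.
    for fqdn, ok in agent_status.items():
        if not ok:
            problems.append(f"agent {fqdn} failed the DNS or connectivity check")
    return problems
```

An empty result corresponds to the state in which the Deploy button is active.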

A successful installation may take up to 20 minutes. The page reloads automatically once the installation completes. Open the Cloudera Manager web interface with the corresponding button or at one of the following URLs:

  • http://:7180/
  • https://:7180/

Follow the instructions to perform a Cloudera Cluster setup.

If any unexpected errors occur during the installation, the setup will stop and show an error message. To restart the setup, reset the Cloudera installation on each appliance separately using the CLI command sbox cloudera --undo.
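Because the reset has to be run on each appliance separately, a small helper can save typing when several nodes are involved. This is a sketch under stated assumptions: it assumes the appliances accept SSH logins and that `sbox` is on the remote user's path, neither of which this document confirms. The `admin` user name and the function names are placeholders. Only the command `sbox cloudera --undo` comes from this document.

```python
import shlex
import subprocess
from typing import List

# CLI command taken from this document.
RESET_COMMAND = ["sbox", "cloudera", "--undo"]


def build_reset_command(host: str, user: str = "admin") -> List[str]:
    """Build an ssh invocation that runs the reset on one appliance.

    SSH access and the user name are assumptions, not documented behavior.
    """
    return ["ssh", f"{user}@{host}", shlex.join(RESET_COMMAND)]


def reset_all(hosts: List[str], user: str = "admin", dry_run: bool = True) -> List[List[str]]:
    """Print (dry run) or execute the reset command on every host in turn."""
    commands = [build_reset_command(host, user) for host in hosts]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands
```

With `dry_run=True` (the default) the helper only prints the commands, so they can be reviewed before anything is reset.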

Accessing and Using Cloudera Manager

The Cloudera Manager web interface is available at:

  • http://:7180/
  • https://:7180/ (for SSL-secured access)
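To confirm that the web interface is up before opening a browser, the URLs can be built and probed from a script. This is a minimal sketch: `cm.example.com` and the function names are placeholders, and only port 7180 comes from this document. With a self-signed certificate the HTTPS probe reports the host as unreachable, because certificate verification fails.

```python
import urllib.error
import urllib.request
from typing import List

CM_PORT = 7180  # port given in this document


def cloudera_manager_urls(manager_fqdn: str) -> List[str]:
    """Return the plain and SSL-secured URLs for the given manager host."""
    return [
        f"http://{manager_fqdn}:{CM_PORT}/",
        f"https://{manager_fqdn}:{CM_PORT}/",
    ]


def is_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if a server answers at url, even with an HTTP error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, just not with a success status
    except (urllib.error.URLError, OSError):
        return False  # DNS failure, refused connection, timeout, or TLS error
```

For example, `[u for u in cloudera_manager_urls("cm.example.com") if is_reachable(u)]` lists the URLs that currently respond.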

The default username and password for Cloudera Manager are both admin. Make sure to accept the Cloudera End User License Terms and Conditions before proceeding.

After choosing the installation type (Cloudera Express, Cloudera Enterprise or Cloudera Enterprise Trial), select the hosts to which Manage has deployed the Cloudera Agents.

Choose Use Parcels (Recommended) as the method and select the latest supported CDH version (currently CDH 5.10). If required, choose any additional parcels.

The Cloudera Cluster Installation will now distribute all required parcels from the Manager to the Agents using a temporary mirror of the Cloudera Archive on the Cloudera Manager appliance. Please wait until all tasks are finished.

After passing the Host Inspector, finish the Cluster Installation and proceed to the Cluster Setup. From the list, choose the Hadoop Services that you wish to use on your cluster. To use Cloudera in conjunction with Splunk Analytics for Hadoop, choose Custom Services with the Service Types HDFS and YARN.

Customize the Role Assignments based on your requirements. Proceed to step three, select Use Custom Databases for production deployments, and enter the connection details.

After the Database Connection test is successful, proceed to step four and update the Cloudera Setup paths according to the scheme in the table below:

Original Path    Path on Manage
/opt/dfs/...     /opt/cloudera/dfs/...
/var/lib/...     /opt/cloudera/lib/...
/opt/yarn/...    /opt/cloudera/yarn/...
/tmp/...         /opt/cloudera/tmp/...
/var/log/...     /opt/cloudera/log/...
/var/run/...     /opt/cloudera/run/...
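The path scheme in the table above amounts to a prefix substitution. The sketch below shows one way to apply it when preparing the values for step four. The function name and the example paths are illustrative, and only the prefix pairs are taken from the table.

```python
from typing import Dict

# Prefix pairs taken from the table above: original prefix -> prefix on Manage.
PATH_PREFIXES: Dict[str, str] = {
    "/opt/dfs/": "/opt/cloudera/dfs/",
    "/var/lib/": "/opt/cloudera/lib/",
    "/opt/yarn/": "/opt/cloudera/yarn/",
    "/tmp/": "/opt/cloudera/tmp/",
    "/var/log/": "/opt/cloudera/log/",
    "/var/run/": "/opt/cloudera/run/",
}


def to_manage_path(original: str) -> str:
    """Rewrite a default Cloudera path to its location on Manage.

    Paths that match none of the known prefixes are returned unchanged.
    """
    for old_prefix, new_prefix in PATH_PREFIXES.items():
        if original.startswith(old_prefix):
            return new_prefix + original[len(old_prefix):]
    return original
```

For instance, `to_manage_path("/var/log/hadoop-hdfs")` yields `/opt/cloudera/log/hadoop-hdfs`, and a path that already starts with `/opt/cloudera/` is left as it is.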

The wizard will run the Cloudera setup commands and, once complete, the Cloudera Cluster can be administered using the Cloudera Manager.

Add Node to Cloudera

The deployment of Cloudera CDH on the Gemini Appliance can be extended by installing additional Cloudera Agents on new appliance nodes. To do so, open the Hadoop section of the Manage web interface and choose Add Node for every additional node required. Manage again performs validation checks, such as DNS resolution and connectivity, and indicates the result with green check marks or red icons.

If all checks are successful, start the installation by choosing Deploy. After the installation, the new node will be propagated automatically to Cloudera Manager where it can be added to existing Cloudera Clusters.

  1. Log in to the Cloudera Manager.
  2. Open Hosts > All Hosts and choose Add New Hosts to Cluster.
  3. If asked, choose Classic Wizard in the next step and continue.
  4. Switch to the Currently Managed Hosts tab to view the newly added Gemini Appliance.
  5. Select all hosts you would like to add to the Cluster and choose Continue.

The Cloudera Manager will distribute required parcels to the new hosts. After passing the Host Inspector, you can optionally choose a Host Template to be applied and finish the wizard.

You can now redistribute the Roles to the new host from the Cluster configuration.

Remove Node from Cloudera

To remove an Agent from a Cloudera deployment, first ensure that the Agent has no Cloudera Services assigned and has been removed from every Cloudera Cluster. Go to Hosts > All Hosts in Cloudera Manager, select the node to be removed and choose Actions for Selected (1) > Delete.

After removing the node from the Cloudera Configuration, uninstall all Cloudera artefacts from that node using the Manage web interface by selecting the red remove icon.