Radoop and Hortonworks Sandbox Connection Problem
Dear friends
I have a problem connecting from RapidMiner 7.3 Radoop to the Hortonworks Sandbox.
I have installed the following Hortonworks Sandbox image on VMware Workstation: HDP_2.5_docker_vmware_25_10_2016_08_59_25_hdp_2_5_0_0_1245_ambari_2_4_0_0_1225.ovf
I have also applied the distribution-specific notes from the Radoop documentation to it:
http://docs.rapidminer.com/radoop/installation/distribution-notes.html#hdp-sandbox
But when I create a connection from Radoop to the sandbox and run the Quick Test, I get the following error (screenshots are included):
[Dec 21, 2016 6:25:57 PM] SEVERE: com.rapidminer.operator.UserError: Could not upload the necessary component to the directory on the HDFS: '/tmp/radoop/_shared/db_default/'
[Dec 21, 2016 6:25:57 PM] SEVERE: Hive jar (with additional functions) upload failed. Please check that the NameNode and DataNodes run and are accessible on the address and port you specified.
[Dec 21, 2016 6:25:57 PM] SEVERE: Test failed: UDF jar upload
[Dec 21, 2016 6:25:57 PM] SEVERE: Connection test for 'Hortonworks_Hadoop' failed.
Regards
When Quick Test is pressed, a file should be created in db_default.
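For anyone hitting the same error: a quick way to see whether this is a basic networking problem is to test the relevant ports from the machine running RapidMiner. A minimal sketch in Python (the hostname and ports are assumptions, taken from the usual sandbox defaults):

    import socket

    # Reachability check for the services Radoop's Quick Test relies on.
    # Host and ports are assumptions: 8020 is the default NameNode RPC port,
    # 50010 the default DataNode transfer port, 10000 the HiveServer2 port.
    HOST = "sandbox.hortonworks.com"  # or the sandbox VM's IP address
    PORTS = {"NameNode": 8020, "DataNode": 50010, "HiveServer2": 10000}

    for name, port in PORTS.items():
        try:
            with socket.create_connection((HOST, port), timeout=5):
                print(f"{name:12} {HOST}:{port} reachable")
        except OSError as err:
            print(f"{name:12} {HOST}:{port} NOT reachable ({err})")

If only the DataNode port fails, that is consistent with the error above: writing the jar to HDFS is the point where the client has to talk to a DataNode, not just to the NameNode.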
Best Answer
phellinger (RapidMiner Engineering)
Hi All,
We have updated the guide for connecting to the latest Hortonworks Sandbox virtual machine. Following the steps thoroughly should solve the issues above.
Please follow the guide at http://docs.rapidminer.com/radoop/installation/distribution-notes.html.
For those interested in the technical details, here is a short explanation. The Hortonworks Sandbox connection problems appeared when Hortonworks updated their Sandbox environment so that Hadoop now runs in Docker inside the virtual machine. After this networking change, a hostname must be used to access the DataNodes, because a hostname can be resolved to either the external or the internal IP depending on where it is resolved. Moreover, not all ports are exposed properly, which is why the permanent iptables rules have to be added as a workaround.
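To illustrate the hostname point, here is a small Python sketch (the hostname is an assumption, the usual sandbox default) that shows what the DataNode hostname resolves to on the client. It should resolve to the VM's external address there, while inside the Docker container the same name resolves to the internal address:

    import socket

    # The same DataNode hostname must resolve differently depending on where
    # it is looked up: to the VM's external IP on the Radoop client, and to
    # the Docker-internal IP inside the sandbox. The hostname below is an
    # assumption; use whatever you put into your /etc/hosts file.
    DATANODE_HOST = "sandbox.hortonworks.com"

    resolved = socket.gethostbyname(DATANODE_HOST)
    print(f"{DATANODE_HOST} resolves to {resolved} on this machine")
    # A Docker-internal address (typically 172.17.x.x) here means the client
    # cannot reach the DataNode, even if the NameNode connection works.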
Best,
Peter
Answers
Hi,
Please try the following Advanced Hadoop Parameter:
Key = dfs.client.use.datanode.hostname
Value = true
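For reference, the same property can also be tried from a plain HDFS client outside of Radoop. A rough sketch with pyarrow, assuming libhdfs and the Hadoop classpath are set up locally and that the NameNode listens on sandbox.hortonworks.com:8020:

    import pyarrow.fs as pafs

    # Same setting as the Advanced Hadoop Parameter above, passed directly to
    # an HDFS client. Host and port are assumptions (the usual sandbox
    # defaults); requires libhdfs and a configured Hadoop classpath.
    hdfs = pafs.HadoopFileSystem(
        "sandbox.hortonworks.com",
        port=8020,
        extra_conf={"dfs.client.use.datanode.hostname": "true"},
    )

    # Writing a small file exercises the DataNode connection, which is what
    # the failing Radoop UDF jar upload does as well.
    with hdfs.open_output_stream("/tmp/radoop_connect_check.txt") as out:
        out.write(b"hello from outside the sandbox\n")
    print(hdfs.get_file_info("/tmp/radoop_connect_check.txt"))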
Best, Zoltan
I also have the same issue.
I tried importing the Hadoop configuration files and also importing the connection from the cluster manager, and I added the extra Advanced Hadoop Parameters as @zprekopcsak instructed.
But I still get the error.