Radoop connection issues in v7.3
Hi,
Recently, I upgraded from RapidMiner v7.2 to v7.3. After the upgrade, Radoop throws java.util.concurrent.TimeoutException while connecting to HiveServer2. In another RapidMiner installation (v7.2), the same configuration works fine.
Current config details:
Hadoop version: Apache Hadoop 2.2+
Hadoop user name: hadoop
Hive Server2 (Hive 0.13 or newer)
default
hive
Spark 1.6
hdfs:///user/spark/spark-assembly.jar
Are there any configuration changes to be made in Radoop for v7.3? I have tried RapidMiner v7.3 with Radoop 7.2 as well as RapidMiner v7.3 with Radoop 7.3. Neither combination works. Please help.
-Kris
Answers
Hi Kris,
It would be a bit surprising if Studio 7.2 and 7.3 behaved differently with the same Radoop version (so it is valuable if we find such a case). Can you reproduce this behaviour consistently?
I'll copy my answer on how to move on with the problem from another topic.
The error states that there was no response from the HiveServer2 instance (specified by either the Master Address or the Hive Server Address field, and the Hive Port) within a given time.
I would try the following:
Best,
Peter
Thanks for the response, Peter.
Yes, the behaviour is consistently reproducible. Yesterday, I created an Amazon EMR cluster and tried connecting through Radoop. The same issue persists even when I open all inbound ports on the EMR master instance.
All URLs (NameNode, history server, Spark, etc.) are accessible remotely. Only the HiveServer2 connection fails. I tried increasing the timeouts earlier, up to 4 minutes, but no luck. Hive works through Beeline (tested locally on the cluster).
Let me know if there are any other tests I can try out.
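For anyone following along, one more test that can be run from the client machine is a plain TCP check against the HiveServer2 port, bypassing Radoop entirely. The snippet below is only a minimal sketch: the function name `can_connect` is made up for illustration, and the host and port in the usage comment are placeholders (10000 is the usual HiveServer2 default, but use whatever is in your Hive Port field).

```python
import socket


def can_connect(host, port, timeout_s=5.0):
    """Return True if a TCP connection to host:port opens within timeout_s.

    This only proves that the port is reachable from this machine; it does
    not speak the HiveServer2 protocol or go through any proxy settings.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (hypothetical host, assumed default port):
#   can_connect("hive.example.com", 10000)
```

If this succeeds from the client machine while Radoop still times out, the network path itself is probably fine and the cause is more likely on the client side (settings, proxies, and so on).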
Peter,
Figured out the issue and resolved it.
The problem is a change made in RapidMiner v7.3 to the proxy settings under System -> Preferences. Earlier, one had to explicitly specify an HTTP proxy under System, and the default was no proxy. In the new version, the proxy is a separate option (under System -> Preferences), and by default it is set to 'System proxy'. Once I changed it to Direct (no proxy), it worked fine. I think the default option should be no proxy.
Sharing this as it might help others who face similar issues after upgrading.
-Kris
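A quick way to see whether a system-wide proxy is configured on the client machine at all is to ask Python's standard library what it detects. This is a rough sketch only: `proxy_summary` and `detected_proxies` are hypothetical helper names, and what Python detects (environment variables, or OS settings on Windows and macOS) is not guaranteed to match what RapidMiner's 'System proxy' option picks up.

```python
import urllib.request


def detected_proxies():
    """Return the scheme -> proxy URL mapping that Python detects."""
    return urllib.request.getproxies()


def proxy_summary(proxies):
    """Return a one-line, human-readable summary of a proxy mapping."""
    if not proxies:
        return "no system proxy detected (direct connection)"
    # Sort by scheme so the output is stable across runs.
    parts = [f"{scheme} -> {url}" for scheme, url in sorted(proxies.items())]
    return "system proxy detected: " + ", ".join(parts)


# Example usage:
#   print(proxy_summary(detected_proxies()))
```

If a proxy shows up here, that is a hint that switching the RapidMiner proxy setting to Direct (no proxy), as described above, may be worth trying.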