Spark job could not succeed for any supported Spark Version on Cloudera

Pavithra_Rao · December 2017

Symptoms

The following error message appears when running the Full Test to connect to a Cloudera cluster from the RapidMiner platform:

"The Spark job could not succeed for any supported Spark Version. It seems that the specified assembly jar or its location is incorrect: local:///opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar"

Diagnosis

Verified that spark-assembly.jar is present on all nodes.
Confirmed that the Spark version selected under Configuration Properties in the Radoop Manage Connections window matches the Spark version of the Hadoop cluster.

Solution

Cloudera's latest Spark builds (shipped with CDH 5.11 and 5.12) differ somewhat from the corresponding Apache Spark versions (they don't accept executor-cores and executor-memory options).

Using an Apache Spark release instead works fine. It can be installed on HDFS with the following or similar commands:

# do a kinit call, if Kerberos is used on the cluster
wget -O /tmp/spark-1.6.3-bin-hadoop2.6.tgz https://d3kbcqa49mib13.cloudfront.net/spark-1.6.3-bin-hadoop2.6.tgz
tar xzvf /tmp/spark-1.6.3-bin-hadoop2.6.tgz -C /tmp/
hadoop fs -mkdir -p /tmp/spark
hadoop fs -put /tmp/spark-1.6.3-bin-hadoop2.6/lib/spark-assembly-1.6.3-hadoop2.6.0.jar /tmp/spark/
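
After the upload, it may be worth confirming that the jar actually landed on HDFS. The following is a minimal sketch, assuming the same version and paths as the commands above. The variable names (SPARK_VERSION, HADOOP_PROFILE, HDFS_DIR, ASSEMBLY_JAR, ASSEMBLY_URI) are illustrative and not part of Radoop or Spark, and the HDFS check is skipped if no hadoop client is on the PATH.

```shell
# Illustrative variables; adjust to the release that was actually uploaded.
SPARK_VERSION=1.6.3
HADOOP_PROFILE=2.6
HDFS_DIR=/tmp/spark

# Jar name as it appears in the lib/ directory of the extracted release.
ASSEMBLY_JAR="spark-assembly-${SPARK_VERSION}-hadoop${HADOOP_PROFILE}.0.jar"
# Full HDFS location of the uploaded jar.
ASSEMBLY_URI="hdfs://${HDFS_DIR}/${ASSEMBLY_JAR}"

if command -v hadoop >/dev/null 2>&1; then
    # 'hadoop fs -test -e' exits with 0 if the path exists on HDFS.
    if hadoop fs -test -e "${HDFS_DIR}/${ASSEMBLY_JAR}"; then
        echo "found: ${ASSEMBLY_URI}"
    else
        echo "missing: ${ASSEMBLY_URI}" >&2
    fi
else
    echo "hadoop client not on PATH; skipping HDFS check" >&2
fi
```

If the check reports the jar as missing, re-run the upload (after kinit, if Kerberos is used) before changing the connection.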

In this case, the specified assembly location in the Radoop connection should be:

"hdfs:///tmp/spark/spark-assembly-1.6.3-hadoop2.6.0.jar"
