
Spark job could not succeed for any supported Spark Version on Cloudera

Pavithra_Rao Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 123 RM Data Scientist
edited November 2018 in Knowledge Base

Symptoms

Running the Full Test on a connection from the RapidMiner platform to a Cloudera cluster fails with the following error message:

"The Spark job could not succeed for any supported Spark Version. It seems that the specified assembly jar or its location is incorrect: local:///opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar"

Diagnosis

  • Verified that spark-assembly.jar is present on all the nodes.
  • Made sure there is no mismatch between the Spark version selected in the Configuration Properties of the Radoop Manage Connections window and the Spark version of the Hadoop cluster.
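
The first check can be scripted and run on each node. This is a minimal sketch, assuming shell access to the node; check_jar is a hypothetical helper name, and the path is the one from the error message with the local:// scheme removed.

```shell
# Hypothetical helper: report whether a jar exists and is readable at a path.
check_jar() {
  if [ -r "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Path taken from the error message, without the local:// scheme prefix.
check_jar /opt/cloudera/parcels/CDH/lib/spark/lib/spark-assembly.jar || true
```

If the jar is reported as found on every node and the error persists, the location itself is probably not the cause.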

Solution

Cloudera's latest Spark builds (shipped with CDH 5.11 and 5.12) differ from the corresponding Apache Spark versions: they do not accept the executor-cores and executor-memory options.

A workaround is to use an Apache Spark release instead, which can be installed on HDFS with the following or similar commands:

# do a kinit call, if Kerberos is used on the cluster
wget -O /tmp/spark-1.6.3-bin-hadoop2.6.tgz https://d3kbcqa49mib13.cloudfront.net/spark-1.6.3-bin-hadoop2.6.tgz
tar xzvf /tmp/spark-1.6.3-bin-hadoop2.6.tgz -C /tmp/
hadoop fs -mkdir -p /tmp/spark
hadoop fs -put /tmp/spark-1.6.3-bin-hadoop2.6/lib/spark-assembly-1.6.3-hadoop2.6.0.jar /tmp/spark/

In this case, the specified assembly location in the Radoop connection should be:

"hdfs:///tmp/spark/spark-assembly-1.6.3-hadoop2.6.0.jar"
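
If a different Spark release or HDFS directory is used, the assembly location follows the same pattern. The sketch below derives it from the target HDFS directory and the local jar path; assembly_uri is a hypothetical helper introduced for illustration, not part of Radoop or Hadoop.

```shell
# Hypothetical helper: build the hdfs:// URI for a jar uploaded to an HDFS directory.
assembly_uri() {
  hdfs_dir="${1%/}"              # strip one trailing slash, if any
  jar_name="$(basename "$2")"    # keep only the jar file name
  printf 'hdfs://%s/%s\n' "$hdfs_dir" "$jar_name"
}

# Same directory and jar as in the commands above.
assembly_uri /tmp/spark/ /tmp/spark-1.6.3-bin-hadoop2.6/lib/spark-assembly-1.6.3-hadoop2.6.0.jar

# Optionally confirm the upload before saving the connection (needs a Hadoop client):
# hadoop fs -ls /tmp/spark/spark-assembly-1.6.3-hadoop2.6.0.jar
```

The printed value is what goes into the assembly location field of the Radoop connection.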
