Radoop "Full Test" (Spark job) connection test error with Hadoop 2.8 and Spark 2.1

m_tarhsaz · June 2017

I installed Hadoop 2.8, the Spark 2.1.0 binaries, RapidMiner 7.5.001, and Radoop 7.5.0.

The Hadoop version in the connection is "Apache Hadoop 2.2+" (the connection XML is attached).

I validated the Spark installation with SparkPi.

The Quick Test finished successfully, but in the Spark job test (Full Test) I got this error in YARN:

17/06/29 02:42:36 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8030
17/06/29 02:42:36 INFO YarnRMClient: Registering the ApplicationMaster
17/06/29 02:42:36 INFO YarnAllocator: Will request 1 executor container(s), each with 1 core(s) and 2432 MB memory (including 384 MB of overhead)
17/06/29 02:42:36 INFO YarnAllocator: Submitted 1 unlocalized container requests.
17/06/29 02:42:36 INFO ApplicationMaster: Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
17/06/29 02:42:37 INFO AMRMClientImpl: Received new token for : snd.hadoop.domain.com:33252
17/06/29 02:42:37 INFO YarnAllocator: Launching container container_1498686711343_0013_02_000002 on host snd.hadoop.domain.com
17/06/29 02:42:37 INFO YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them.
17/06/29 02:42:37 INFO ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
17/06/29 02:42:37 INFO ContainerManagementProtocolProxy: Opening proxy : snd.hadoop.domain.com:33252
17/06/29 02:42:46 INFO YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(null) (192.168.0.14:47894) with ID 1
17/06/29 02:42:46 INFO YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
17/06/29 02:42:46 INFO YarnClusterScheduler: YarnClusterScheduler.postStartHook done
17/06/29 02:42:46 INFO BlockManagerMasterEndpoint: Registering block manager snd.hadoop.domain.com:38359 with 912.3 MB RAM, BlockManagerId(1, snd.hadoop.domain.com, 38359, None)
17/06/29 02:42:47 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 274.9 KB, free 366.0 MB)
17/06/29 02:42:48 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 22.9 KB, free 366.0 MB)
17/06/29 02:42:48 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.0.14:41278 (size: 22.9 KB, free: 366.3 MB)
17/06/29 02:42:48 INFO SparkContext: Created broadcast 0 from textFile at SparkTestCountJobRunner.java:43
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/tmp/radoop/root/tmp_1498687861970_0idqr77
   at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:287)
   at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
   at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
   at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
   at scala.Option.getOrElse(Option.scala:121)
   at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
   at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
   at scala.Option.getOrElse(Option.scala:121)
   at org.apache.spark.rdd.RDD.partitions(RDD.scala:250)
   at org.apache.spark.SparkContext.runJob(SparkContext.scala:1958)
   at org.apache.spark.rdd.RDD.count(RDD.scala:1157)
   at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:455)
   at org.apache.spark.api.java.AbstractJavaRDDLike.count(JavaRDDLike.scala:45)
   at eu.radoop.spark.SparkTestCountJobRunner.main(SparkTestCountJobRunner.java:45)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:498)
   at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)

The error says "org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/tmp/radoop/root/tmp_1498687861970_0idqr77", but I checked HDFS and found the folder "/tmp/radoop/root/tmp_1498687861970_0idqr77", which contains a file with sample data about iris.

The permission for that folder is "drwxrwxrwx".
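
One detail that stands out is the "file:" prefix in the reported path, since the folder I checked is on HDFS. The sketch below is only an illustration, under the assumption that the test job passes the path without a scheme and that it is then qualified against the default filesystem (Hadoop's built-in default for fs.defaultFS is file:///). The qualify helper and the hdfs://namenode:8020 address are hypothetical and are not taken from Radoop, Spark, or my cluster.

```python
from urllib.parse import urlparse


def qualify(path: str, default_fs: str) -> str:
    """Attach the scheme/authority of default_fs to a path that has no scheme.

    Hypothetical helper that only mimics the idea of resolving a bare path
    against a default filesystem URI; it is not Hadoop's implementation.
    """
    parsed = urlparse(path)
    if parsed.scheme:
        # Already fully qualified, e.g. hdfs://host:port/dir
        return path
    base = urlparse(default_fs)
    if base.netloc:
        return f"{base.scheme}://{base.netloc}{parsed.path}"
    # No authority (local filesystem case): scheme followed by the path
    return f"{base.scheme}:{parsed.path}"


tmp_dir = "/tmp/radoop/root/tmp_1498687861970_0idqr77"

# Default filesystem left at the local default
print(qualify(tmp_dir, "file:///"))

# Default filesystem pointing at a (hypothetical) NameNode
print(qualify(tmp_dir, "hdfs://namenode:8020"))
```

With the local default, the result has the same form as the path in the exception, which would fit a job that does not see the cluster's core-site.xml. I have not confirmed that this is what happens here.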

The YARN logs are attached.

So what's the problem?

m_tarhsaz · July 2017

Sorry for the redundant topic.

For the solution, see the following answered topic:

Radoop Full Test (Spark job) test error , Hadoop 2.8 , Spark 2.1.1
