
Radoop problem in executing process

tasbihmr Member Posts: 9 Contributor I
edited November 2018 in Help

Dear All,

I have managed to connect Hive, Spark, and Hadoop and to set up a Radoop connection. I am now working with a Radoop Nest on the "Titanic" example data. I have loaded the Titanic data into Hive and want to run a Radoop validation process on it. The process fails with this error:

 

HiveQL problem Message: Error running query: java.lang.NoClassDefFoundError: scala/collection.Iterable

 

Where do you think the problem is?

 

Regards,

Maziar


Answers

  • ztoth Member Posts: 5 Contributor II

    Dear Maziar,

     

    The issue is probably related to the Hive classpath on your Hadoop cluster. A few details would make troubleshooting easier:

    1. What kind of Hadoop distribution are you using? If it's CDH, do you use it with Hive on Spark? If so, setting "hive.execution.engine" to "mr" as an Advanced Hive Parameter in your connection may solve your problem immediately. It's also possible to fix the Hive on Spark execution, but it will probably require cluster-side configuration steps.
    2. Have you executed the Full Test on your Radoop connection? If not, please do so and share the logs (in case it was unsuccessful).
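To see which engine the cluster itself is configured with before overriding it in the connection, one option is to read the property from hive-site.xml. The following is only a minimal sketch: the sample XML content and the helper name `get_property` are assumptions made for illustration, and a real file would come from the Hive configuration directory on the cluster.

```python
import xml.etree.ElementTree as ET

# Hypothetical hive-site.xml content, used here only for illustration.
SAMPLE_HIVE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hive.execution.engine</name>
    <value>spark</value>
  </property>
</configuration>
"""


def get_property(xml_text, key):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == key:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


engine = get_property(SAMPLE_HIVE_SITE, "hive.execution.engine")
if engine == "spark":
    # Matches the suggestion in the reply: try "mr" as an Advanced Hive Parameter.
    print("Hive on Spark is configured; consider overriding the engine with 'mr'.")
else:
    print("Configured engine:", engine)
```

If the property is missing from the file, the helper returns None and Hive falls back to its built-in default, so the override in the connection would still be the place to force a specific engine.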

     

    Regards,

    Zsolt

  • tasbihmr Member Posts: 9 Contributor I

    Hi Zsolt,

    I changed "hive.execution.engine" to "mr", and RapidMiner responded with "The capabilities are insufficient on the data".

    In the Full Test on the Radoop connection, I received an error at test number 18, "Import job into Hive". I extracted the log file and have attached the full zip file of the test. Is this sufficient as a log, or is there another step whose log I should provide?

     

    Regards,

    Maziar

     

  • ztoth Member Posts: 5 Contributor II

    Hi Maziar,

     

    The logs show that the JobHistoryServer address field in your Connection has a whitespace character after "localhost". Could you re-run the Full Test with the fixed value?
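Stray whitespace like this is easy to miss by eye. A quick check along these lines can flag it; this is a minimal sketch, and the field names in the dictionary are hypothetical, not Radoop's actual setting names.

```python
def find_whitespace_issues(fields):
    """Return the names of fields whose value has leading or trailing whitespace."""
    return [name for name, value in fields.items() if value != value.strip()]


# Hypothetical connection fields; the keys are illustrative only.
connection = {
    "namenode_address": "localhost",
    "jobhistory_server_address": "localhost ",  # trailing space, as found in the logs
}

for name in find_whitespace_issues(connection):
    print(f"Field {name!r} has surrounding whitespace: {connection[name]!r}")
```

Printing the value with `repr` makes the trailing space visible inside the quotes.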

     

    Regards,

    Zsolt

  • tasbihmr Member Posts: 9 Contributor I

    Hi Zsolt,

    You are absolutely right: I had a whitespace after "localhost" in the JobHistory Server address.

    I corrected that and re-ran the Full Test, but I still have the same problem at test 18 of the full Radoop connection test, at the "Job Import" step.

    Could you check the new log zip file? It is attached.

    Just to let you know about my Hadoop, Hive, and YARN setup: I installed Hadoop and Hive myself by downloading the binaries from the Apache site and configured them from scratch, so I am not using Cloudera. It seems that what I have configured is not enough, and some parameters are missing or not configured.

    Regards,

    Maziar

     

  • ztoth Member Posts: 5 Contributor II

    Hi Maziar,

     

    It seems that you've set a few special settings in your connection as Advanced Hadoop Parameters. Radoop automatically sets the commonly required Hadoop properties, so there is no need to define e.g. fs.default.name as an advanced parameter.

    Are you using KMS on your cluster? If you have not configured it, the related properties are most likely not needed, and you can safely turn off all KMS-related settings.

    In general, I'd suggest disabling every Advanced Hadoop Parameter you have in your connection and re-running the Full Test.

    (By the way, are you sure that your NameNode runs on port 54310? This is quite unusual.)
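To double-check the port, the NameNode address can be pulled out of the filesystem URI configured on the cluster. The sketch below assumes the URI is "hdfs://localhost:54310" (host and port as mentioned in this thread); the set of "common" ports is also an assumption, listing 8020 and 9000 as values often seen in Hadoop setups rather than an authoritative list.

```python
from urllib.parse import urlparse

# Assumption: ports frequently used for the NameNode RPC endpoint.
COMMON_NAMENODE_RPC_PORTS = {8020, 9000}


def namenode_endpoint(default_fs):
    """Split an hdfs:// URI into (host, port); port is None if not given."""
    parsed = urlparse(default_fs)
    return parsed.hostname, parsed.port


def is_common_port(port):
    """Return True if the port is in the assumed set of common NameNode ports."""
    return port in COMMON_NAMENODE_RPC_PORTS


# Assumed value of the filesystem URI, based on the port discussed above.
host, port = namenode_endpoint("hdfs://localhost:54310")
if not is_common_port(port):
    print(f"NameNode at {host}:{port} uses an uncommon port; verify it in core-site.xml.")
```

An uncommon port is not an error by itself; the point is only to confirm that the value in the connection matches what the NameNode actually listens on.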

     

    Regards,

    Zsolt
