Unable to read data when a process is run in AIHub.
Hi.
I have tried to follow the suggestions and answers given to other questions, but to no avail.
Is there a set of steps detailed anywhere that I can follow to allow a process developed in Studio to access files when run in AIHub?
Attempt 1: File added through Add Content in the project's Contents tab. Once I updated Studio, I saw the file there. When I drag it to the Process editor in Studio, I get a Read File.
Loop Parameters.input 1 (input 1)
Meta data: File
- Filename: reg1.csv
- Source: //aihub-thesis/reg1.csv
Generated by: Open File (2).file
Data: Repository location: //aihub-thesis/reg1.csv
In Studio, that errors, saying my process is getting the wrong type of data. Fair enough; I tried Read CSV and set some values, but then I get Missing Label. Again, I understand why: I have not set a label. However, when I run the Import Configuration Wizard, I am asked to specify a file on my laptop. I am not asked to set parameters based on the Open File operator feeding into Read CSV.
Attempt 2: Add file through Import Data in Studio. All works fine, I can change types, etc.
Process runs fine in Studio.
When I attempt to run it in AIHub, it asks me to create a snapshot. I do, and in the Contents listing in the AIHub Server interface I can see my file with an rmhdf5table extension.
Immediately upon starting, the request fails with:
The repository did not deliver the requested data. This can be caused by wrong file names, network errors, file system errors or broken entries in the repository.
and
Jul 11, 2021 8:04:40 PM com.rapidminer.tools.ResultService init
INFO: No filename given for result file, using stdout for logging results!
Jul 11, 2021 8:04:40 PM com.rapidminer.execution.jobcontainer.execution.ExecutionProcessListener processStarts
INFO: Execution of process started
Jul 11, 2021 8:04:40 PM com.rapidminer.Process execute
INFO: Process //aihub-thesis/loops_mine_backward.rmp starts
Jul 11, 2021 8:04:40 PM com.rapidminer.execution.jobcontainer.execution.ExecutionProcessListener processStartedOperator
INFO: Started operator : Process
Jul 11, 2021 8:04:40 PM com.rapidminer.execution.jobcontainer.execution.ExecutionProcessListener processStartedOperator
INFO: Started operator : Retrieve multi1-for-aihub
Jul 11, 2021 8:04:40 PM com.rapidminer.execution.jobcontainer.execution.ExecutionProcessListener processEnded
INFO: Execution of process stopped
Jul 11, 2021 8:04:40 PM com.rapidminer.execution.jobcontainer.execution.SimpleExecutor
SEVERE: Cannot retrieve repository data from entry 'multi1-for-aihub'. Reason: Cannot load data from 'multi1-for-aihub': com.rapidminer.versioning.repository.exceptions.DataRetrievalException: com.rapidminer.storage.hdf5.HdfReaderException: No valid HDF5 signature found. Please refer to the 'error.log' file for more details.
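Since the log complains about "No valid HDF5 signature", one quick check is to look at the first bytes of the stored entry. The following is only a minimal sketch, not part of RapidMiner: it assumes the .rmhdf5table file is an ordinary HDF5 container (which the error message suggests but does not prove), and the function names and the file path in the usage note are hypothetical. The 8-byte signature and the offsets 0, 512, 1024, 2048, ... where it may appear come from the HDF5 file format itself.

```python
from pathlib import Path

# Standard 8-byte HDF5 format signature: \x89 H D F \r \n \x1a \n
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(path, max_offset=1 << 20):
    """Return True if the HDF5 signature is found at offset 0, 512, 1024, 2048, ...

    max_offset is an arbitrary cap for this sketch so huge files are not scanned fully.
    """
    size = Path(path).stat().st_size
    with open(path, "rb") as fh:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size and offset <= max_offset:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # Candidate offsets are 0, then 512 and successive doublings.
            offset = 512 if offset == 0 else offset * 2
    return False


def describe_file(path, preview=64):
    """Return (signature_found, first `preview` bytes) to show what the file really holds."""
    with open(path, "rb") as fh:
        head = fh.read(preview)
    return has_hdf5_signature(path), head
```

For example, calling describe_file("multi1-for-aihub.rmhdf5table") on a local copy of the entry (hypothetical path) returns a flag plus the leading bytes. If the flag is False and the bytes look like readable text, the file that reached the server is probably not the binary table that Studio wrote, which would point at how the file was uploaded or snapshotted rather than at the process itself.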
What am I doing wrong?
OR
Is there something I can follow that will allow me to run my process?
Below is my file listing, in case it helps show how I can access the data I am trying to use.
Any advice will be appreciated.
Thank you.
Answers
Dortmund, Germany
I used to have this problem when moving to Projects. I solved it with the following steps:
1. Read the data that you want to retrieve from the repository in the AI Hub project from a database, for example via Radoop or a JDBC connection. This means you need to insert the data into the database first.
2. Use the Store operator to store the data in the Project repository.
3. Don't forget to check and uncheck the box Resolve relative to "XXX". This is tricky; I need to do it every time.
4. Create a snapshot and add it to AI Hub.
5. You MUST run the process on AI Hub only. The key point is to create the repository entry via the AI Hub Server.
After you run the process via the Server, you will see the history.
6. Now you can retrieve the repository entry in the AI Hub Project.