
Unable to read file from disk using Execute Python operator

ccricha Member Posts: 9 Contributor II
edited June 2019 in Help

Hello, I am trying to get a better understanding of how RM Server can interact with its environment. I used an Execute Python operator within RM to write a test log file. I am now trying to use a different Execute Python operator to read the log file from disk (Linux) and then use a Store operator to store this data in the remote repository. All of this is running on the Linux RM Server.

 

What ends up happening is that RM writes an empty dataset. When I look in the server.log file, I see multiple lines like this:

 

WARNING [com.rapidminer.operator.Operator] (scheduledprocess_1503585018370) Read CSV: Could not parse line 0 in input: com.rapidminer.tools.CSVParseException: Value quotes not closed at position 0. Last characters read: ,"

 

Here is my overall process:

[Image: RM_fileread2.png, overall process]
[Image: RM_fileread1.png, Python code]

Is the data frame not being constructed properly? It appears that the Execute Python process is writing a temporary CSV file somewhere that RM is trying to read and is failing to do so.
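
For reference, one way to keep CSV-style parsing out of the Python side is to read the log as plain text and build a single-column DataFrame with one row per line. The sketch below assumes the Execute Python operator's usual `rm_main` entry point with no input ports connected; the path `/tmp/rm_test.log`, the helper `read_log_lines`, and the column name `line` are placeholders, not taken from the process in the screenshots. It has not been verified against this server setup, and whether it avoids the CSVParseException depends on how the operator serializes the returned DataFrame.

```python
from pathlib import Path

# Hypothetical location of the log file written by the first operator.
LOG_PATH = "/tmp/rm_test.log"


def read_log_lines(path):
    """Return the non-empty lines of a text file as a list of one-key dicts."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return [{"line": ln} for ln in text.splitlines() if ln.strip()]


def rm_main():
    # pandas is imported here so read_log_lines stays usable without it.
    import pandas as pd

    # One row per log line, in a single string column named "line".
    return pd.DataFrame(read_log_lines(LOG_PATH), columns=["line"])
```

Keeping each log line as one opaque string means commas and quote characters inside the line are treated as data rather than as delimiters when the file is read.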


Best Answer

  • ccricha Member Posts: 9 Contributor II
    Solution Accepted

    Hello Scott, thanks for your reply. This is mainly in case I need a more detailed Python process for more complex data transformations, one that reads from or writes to a database within the script, and where I want to use Python's logging module to log to disk. There are some cases where detailed logging is necessary, and RM is not going to be a good tool for that. I know that I can successfully log to disk from a script in an Execute Python operator, and I was finally able to read the file using the "Read Document" operator and then store it to the repository. It still seems to me that my original approach should work, since the script returns a DataFrame object, but instead it throws a CSVParseException. Anyway, I will look at using "Read Document" for reading and analyzing log files in the future.

     

    Thanks
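
The logging-to-disk pattern described in this post could look roughly like the sketch below. It uses only Python's standard `logging` module; the path `/tmp/rm_python_step.log`, the logger name, and the helper `get_file_logger` are hypothetical, and `rm_main(data)` assumes a single input is connected to the Execute Python operator.

```python
import logging

# Hypothetical log location; use a path the RM Server user can write to.
LOG_PATH = "/tmp/rm_python_step.log"


def get_file_logger(path, name="rm_python_step"):
    """Return a logger that appends timestamped records to the given file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Reuse an existing handler so repeated calls don't duplicate log lines.
    if not logger.handlers:
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def rm_main(data):
    logger = get_file_logger(LOG_PATH)
    logger.info("received %d rows", len(data))
    # Transformations and database reads/writes would go here.
    return data
```

The resulting file is free-form text, one record per line, which is why an operator that reads plain text suits it better than one that expects delimited columns.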

     

Answers

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hi @ccricha - good to have you here.  I guess my first question is why are you using python scripts to read/write log files?  There are very nice, easy-to-use operators built in to RapidMiner that will do this for you:

     

    [Image: Screen Shot 2017-08-24 at 1.35.31 PM.png]

     

    I have used these operators in RM processes running on an Ubuntu server running RM Server with no problems at all.  Give it a try?

     

    Scott

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Hi @ccricha - OK, that makes sense, and yes, Read Document will do a much better job: it just grabs your text file as-is, whereas Read CSV expects a delimited structure. Good luck.

     

    Scott
