Automating a RM5 Process
Hello,
I am interested in automating an RM5 process.
I have created a very simple RM5 application with the operators Read Model and Read Database, which both feed into Apply Model, followed by Write Database.
What I need is a way to automate this process. My database gets updated once a minute and I have 10 different data sources and models. I also write predictions to 10 different tables.
I would like to loop through this application continuously and change only these parameters: Read Model (input file), Read Database (SQL statement) and Write Database (output table).
I have looked at the various process control loops without success. Should I create 10 different applications with the various configurations and load them through a Scheduled Task (Windows cron job), or is there a better solution?
I am also working with the RM5 beta and have been unable to load this application through the command line. (The OS is Windows XP.)
Thanks in advance,
Cleo
Answers
There will be something like an enterprise server which, among many other things, will support you by running scheduled tasks. It will be released during the first quarter of 2010.
Executing RapidMiner from the command line should work again with the final version of RapidMiner 5.
Besides this, you could wrap your process inside a loop operator and enter 2,000,000,000 as the number of executions. This will execute the inner process 2 billion times. If the inner process runs for one second, this will keep it running for about 63 years.
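As a rough check of the 63-year figure, here is a minimal Python sketch of the arithmetic. It is not RapidMiner code; the function name and the 365.25-day year are assumptions made purely for illustration.

```python
# Assumption: an average year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def loop_runtime_years(iterations, seconds_per_iteration):
    """Total run time of a fixed-count loop, expressed in years."""
    return iterations * seconds_per_iteration / SECONDS_PER_YEAR


if __name__ == "__main__":
    years = loop_runtime_years(2_000_000_000, 1.0)
    print(f"{years:.1f} years")  # prints "63.4 years"
```

Any pause added inside the loop counts toward the time per iteration, so a one-second process with a two-second pause would run roughly three times as long.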
If your inner process executes too fast, you could insert a scripting operator which just waits a little. For example, the following script would wait 2 seconds and return the first input as the first output:
[Script not preserved in this copy.]
Greetings,
Sebastian
Thank you for the quick response and the suggestions. I have really enjoyed working with RM.
I have successfully wrapped my process within a loop operator and added the script. This solves my problem of continuously loading my data.
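The loop-plus-pause arrangement described here can be sketched in plain Python as follows. This only illustrates the control flow and is not the Groovy script used in RapidMiner; `run_repeatedly`, its parameters and the injectable `sleep` argument are hypothetical names chosen for the example.

```python
import time


def run_repeatedly(process, iterations, pause_seconds, sleep=time.sleep):
    """Call `process` a fixed number of times, pausing after each call.

    `sleep` can be swapped out (for example in tests) so that the loop
    does not really wait. Returns the number of completed runs.
    """
    completed = 0
    for _ in range(iterations):
        process()             # stands in for the inner RapidMiner process
        sleep(pause_seconds)  # stands in for the waiting script
        completed += 1
    return completed


if __name__ == "__main__":
    # Short demo: three runs with a 0.1 second pause between them.
    run_repeatedly(lambda: print("scoring new rows"), iterations=3, pause_seconds=0.1)
```

With the database refreshed once a minute, the pause mainly keeps the loop from re-scoring the same rows many times between updates.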
Problem 1:
I have 10 different models, input data sets and prediction tables. My current setup has a loop operator. Inside it there is a Read Model operator with a model file of “model i” and a Read Database operator with a query file of “query i”. These two operators feed an Apply Model operator, which connects to a Write Database operator with a table name of “output_i”. “i” should iterate from 1 to 10.
I can think of two ways to accomplish this:
Setup 1: I have 10 different instances of RM running.
Setup 2: Somehow iterate “i” with one of the loop operators or the script. I would prefer this setup, but I am not sure which operator or combination of operators would accomplish this.
Unrelated script question:
I had not looked at the scripting operator before this, but I think it could perform some custom preprocessing more easily than my current method of using triggers to run SQL queries within my MySQL DB. Is the RM scripting language VBScript? Can I set breakpoints within the script, or what debugging strategies do you use? Have you seen a sample script which does any preprocessing?
Thanks again,
Cleo
Let me first address your problem. You could simply add a Loop Parameters operator. If you insert a Set Macro operator as a child, you can define a list of values for its macro value, which are then inserted one after another into the Set Macro operator during the iteration. You could use this macro to append the index to the file name. Please take a look at the following example process:
[Example process not preserved in this copy.]
This will give you hints on how to solve the problem.
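To illustrate what the macro does on each pass, here is a small Python sketch; it is not RapidMiner code. It assumes the macro is named `i` and is referenced with RapidMiner's `%{i}` placeholder syntax. The concrete values `model_%{i}`, `query_%{i}` and `output_%{i}` are hypothetical stand-ins for Cleo's model files, query files and output tables.

```python
# Hypothetical parameter values, written with a %{i} macro placeholder.
TEMPLATES = {
    "model_file": "model_%{i}",
    "query_file": "query_%{i}",
    "table_name": "output_%{i}",
}


def expand_macros(template, macros):
    """Replace every %{name} placeholder with the matching macro value."""
    result = template
    for name, value in macros.items():
        result = result.replace("%{" + name + "}", value)
    return result


def iteration_parameters(index):
    """Parameter values for one pass of the loop, with macro i = index."""
    macros = {"i": str(index)}
    return {key: expand_macros(tpl, macros) for key, tpl in TEMPLATES.items()}


if __name__ == "__main__":
    for i in range(1, 11):  # i runs from 1 to 10
        print(iteration_parameters(i))
```

The point of the design is that one process definition serves all ten configurations; only the macro value changes between passes.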
On your scripting question: the language we use is Groovy, which is rather Java-like. There are no breakpoints available, so debugging scripts is more difficult. There is also the possibility of writing your own RapidMiner operators; I'm just writing a tutorial on how to do this.
But I guess that RapidMiner's own operators already cope with almost all possible preprocessing steps needed. What are you missing? Maybe you just need to insert a chain of RapidMiner operators...
Greetings,
Sebastian
Thanks again! With a couple of simple modifications, your RM5 file suits my needs perfectly.
The data I am working with is a time series and I am attempting to preprocess the data in three ways.
1) Moving average (average value of the last 2 values of col2 and col3): Average(col2(t), col2(t+1), col3(t), col3(t+1))
2) Percent change: (col1(t) - col1(t-1)) / col1(t) * 100
3) Custom binomial result: if (col1(t) - col2(t+x)) > const1 becomes true before (col1(t) - col3(t+x)) > const2, then Result = Yes, else Result = No, i.e. it depends on which condition is true first
So far I have done this and other preprocessing in the database, but I think RM would be better at it. I believe cases 1 and 2 should be achievable with standard RM operators, but I feel case 3 will require custom coding.
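To pin down the three calculations, here is a minimal Python sketch that follows the formulas exactly as written above. It is plain Python, not RapidMiner or Groovy code, and all function names are made up for illustration. Case 3 leaves several details open, so the sketch assumes that x starts at 1, that a tie on the same row counts as No, and that No is returned when neither condition ever becomes true.

```python
def moving_average(col2, col3):
    """Case 1 as written: Average(col2(t), col2(t+1), col3(t), col3(t+1)).

    Returns one value for every t that has a following row.
    """
    return [
        (col2[t] + col2[t + 1] + col3[t] + col3[t + 1]) / 4.0
        for t in range(len(col2) - 1)
    ]


def percent_change(col1):
    """Case 2 as written: (col1(t) - col1(t-1)) / col1(t) * 100 for t >= 1.

    Assumes col1(t) is never zero.
    """
    return [
        (col1[t] - col1[t - 1]) / col1[t] * 100.0
        for t in range(1, len(col1))
    ]


def first_hit_label(col1, col2, col3, t, const1, const2):
    """Case 3: "Yes" if the col2 condition becomes true before the col3 one."""
    for x in range(1, len(col1) - t):  # assumption: look ahead from x = 1
        col2_hit = (col1[t] - col2[t + x]) > const1
        col3_hit = (col1[t] - col3[t + x]) > const2
        if col3_hit:   # assumption: a tie on the same row counts as "No"
            return "No"
        if col2_hit:
            return "Yes"
    return "No"        # assumption: neither condition was ever true


if __name__ == "__main__":
    print(moving_average([1, 2, 3], [3, 4, 5]))  # prints [2.5, 3.5]
    print(first_hit_label([10, 10, 10, 10], [9, 9, 7, 7], [9, 9, 9, 6], 0, 2, 2))  # prints Yes
```

Note that the percent change above divides by col1(t), as in the post; dividing by col1(t-1) is the more common convention and would give different numbers.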
I have tried, without success, to implement a “Hello world” Groovy example from http://groovy.codehaus.org/.
If possible, I would appreciate a small example script in Groovy. I have included the pseudocode of an example which I could adapt to my own needs. Assume the Execute Script operator is working with an ExampleSet loaded by a Read Excel operator from a file with one sheet and the numbers 1, 2, 3, 4, 5 in column A.
Step 1) Load the ExampleSet from the Read Excel operator
Step 2) Create a loop over each row:
Step 2a) Print the value of the attribute Column A for that row (to a log file, to the screen or anywhere else, for debugging purposes)
Step 2b) Call a function, passing it the row value, and have the function return the value + 10
Step 3) Return to RM the new ExampleSet containing two columns, A and A+10, i.e. (1,11), (2,12), (3,13), (4,14), (5,15)
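The steps above can be sketched in plain Python as follows. This is only a stand-in for the logic, not a Groovy script for the Execute Script operator: the rows are ordinary dictionaries rather than a RapidMiner ExampleSet, and the function and logger names are hypothetical.

```python
import logging

logger = logging.getLogger("row_plus_ten_sketch")


def add_ten(value):
    """Step 2b: the helper function applied to each row's value."""
    return value + 10


def extend_rows(rows):
    """Steps 2 and 3: log each value of column A and build the new table."""
    result = []
    for row in rows:
        logger.info("A = %s", row["A"])  # step 2a: debug output
        result.append({"A": row["A"], "A+10": add_ten(row["A"])})
    return result


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Step 1: stand-in for the data loaded from the Excel sheet.
    rows = [{"A": n} for n in (1, 2, 3, 4, 5)]
    for row in extend_rows(rows):
        print(row["A"], row["A+10"])
```

Keeping the per-row function separate from the loop makes it easy to swap in a different calculation later without touching the iteration or the logging.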
If you would like, I could give you some feedback on the tutorial you are writing.
Thanks again,
Cleo
The first two steps can be handled by the operators of the time series extension, which we will publish with the final version.
The last step can be done without scripting, just using the Construct Attribute operator. It handles if-conditions, and even nested conditions. So it should be possible to extract the nominal target value with that.
As an example for your script, I will quote the still unfinished tutorial:
[Tutorial excerpt not preserved in this copy.]
And here's the resulting code after two types of explanations:
[Code not preserved in this copy.]
Hope that will help you.
Greetings,
Sebastian
Thanks for your help.
Cheers,
Cleo