"Integrating RM Process with Text Processing plugin into Java Application"
dranammari
Member Posts: 13 Contributor II
Hello all,
I am relatively new to RapidMiner integration into Java Applications, though I am familiar with building RapidMiner processes using the RapidMiner GUI platform. I am using RapidMiner 5.1.014.
I have built a process that uses many operators from the "Text Processing" plugin. The process takes input from a MySQL database and produces "wordlist" and "model" outputs, saved as files in a defined RapidMiner repository. The process runs successfully in the RapidMiner GUI.
My problem is integrating the process into a Java application that I created using NetBeans 6.9.1. I have included all the libraries (JARs) from the RapidMiner5\lib folder. I have also found and downloaded rapidminer-Text Processing-5.0.007.jar and added it to the application's libraries. However, the process does not run successfully when I run the main Java class from NetBeans. The Java code that is supposed to launch the RapidMiner process is as follows:
RapidMiner.setExecutionMode(com.rapidminer.RapidMiner.ExecutionMode.EMBEDDED_WITHOUT_UI);
RapidMiner.init();
Process process = new Process(new File("BuildYouTubeNoiseFiltration.rmp"));
process.run();
BuildYouTubeNoiseFiltration.rmp is the XML file that holds all the process information after I built the process using the GUI. Here is what I get in the NetBeans output window:
Dec 20, 2011 2:45:38 PM com.rapid_i.Launcher ensureRapidMinerHomeSet
INFO: Property rapidminer.home is not set. Guessing.
Dec 20, 2011 2:45:38 PM com.rapid_i.Launcher ensureRapidMinerHomeSet
INFO: Trying parent directory of 'C:\Service Documentation\YouTubeNoiseFiltration\Source Code\Java\YouTubeNoiseFiltration\lib\launcher.jar'...gotcha!
Dec 20, 2011 2:45:38 PM com.rapid_i.Launcher ensureRapidMinerHomeSet
INFO: Trying parent directory of 'C:\Service Documentation\YouTubeNoiseFiltration\Source Code\Java\YouTubeNoiseFiltration\lib\rapidminer.jar'...gotcha!
Dec 20, 2011 2:45:38 PM com.rapidminer.tools.ParameterService init
INFO: Reading configuration resource com/rapidminer/resources/rapidminerrc.
Dec 20, 2011 2:45:40 PM com.rapidminer.parameter.ParameterTypePassword decryptPassword
WARNING: Password in XML file looks like unencrypted plain text.
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.OperatorService init
INFO: Loading additional operators specified by RapidMiner.PROPERTY_RAPIDMINER_OPERATORS_ADDITIONAL (${RAPIDMINER_OPERATORS_ADDITIONAL})
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.OperatorService init
SEVERE: Cannot find operator description file '${RAPIDMINER_OPERATORS_ADDITIONAL}'
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.jdbc.JDBCProperties registerDrivers
INFO: JDBC driver org.postgresql.Driver not found. Probably the driver is not installed.
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.jdbc.JDBCProperties registerDrivers
INFO: JDBC driver net.sourceforge.jtds.jdbc.Driver not found. Probably the driver is not installed.
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.jdbc.JDBCProperties registerDrivers
INFO: JDBC driver org.hsqldb.jdbcDriver not found. Probably the driver is not installed.
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.jdbc.JDBCProperties <init>
WARNING: Missing database driver class name for 'ODBC Bridge (e.g. Access)'
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.jdbc.JDBCProperties registerDrivers
INFO: JDBC driver net.sourceforge.jtds.jdbc.Driver not found. Probably the driver is not installed.
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.jdbc.JDBCProperties registerDrivers
INFO: JDBC driver com.ingres.jdbc.IngresDriver not found. Probably the driver is not installed.
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.jdbc.JDBCProperties registerDrivers
INFO: JDBC driver ca.ingres.jdbc.IngresDriver not found. Probably the driver is not installed.
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.jdbc.JDBCProperties registerDrivers
INFO: JDBC driver oracle.jdbc.driver.OracleDriver not found. Probably the driver is not installed.
Dec 20, 2011 2:45:41 PM com.rapidminer.io.process.XMLImporter addMessage
INFO: <em class="error">The operator class 'text:data_to_documents' is unknown. Possibly you must install a plugin for operators of group 'text'.</em>
Dec 20, 2011 2:45:41 PM com.rapidminer.io.process.XMLImporter addMessage
INFO: The parameter 'specify_weights' of type list is unknown for operator 'Data to Documents' (dummy).
Dec 20, 2011 2:45:41 PM com.rapidminer.io.process.XMLImporter addMessage
INFO: <em class="error">The operator class 'text:process_documents' is unknown. Possibly you must install a plugin for operators of group 'text'.</em>
Dec 20, 2011 2:45:41 PM com.rapidminer.io.process.XMLImporter addMessage
INFO: <em class="error">Operator '<class>dummy</class>' may not have children. Ignoring.
Dec 20, 2011 2:45:41 PM com.rapidminer.io.process.XMLImporter addMessage
INFO: <em class="error">The input port <var>example set</var> is unknown at operator <var>Data to Documents</var>.</em>
Dec 20, 2011 2:45:41 PM com.rapidminer.io.process.XMLImporter addMessage
INFO: <em class="error">The output port <var>documents</var> is unknown at operator <var>Data to Documents</var>.</em>
Dec 20, 2011 2:45:41 PM com.rapidminer.io.process.XMLImporter addMessage
INFO: <em class="error">The output port <var>example set</var> is unknown at operator <var>Preprocessing</var>.</em>
Dec 20, 2011 2:45:41 PM com.rapidminer.io.process.XMLImporter addMessage
INFO: <em class="error">The output port <var>word list</var> is unknown at operator <var>Preprocessing</var>.</em>
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.WrapperLoggingHandler log
INFO: No filename given for result file, using stdout for logging results!
Dec 20, 2011 2:45:41 PM com.rapidminer.Process run
INFO: Process BuildYouTubeNoiseFiltration.rmp starts
Dec 20, 2011 2:45:41 PM com.rapidminer.tools.jdbc.DatabaseHandler executeStatement
INFO: Executing query: 'SELECT `commentID`, `commentText`, `binaryScore`
FROM `comments`'
Dec 20, 2011 2:45:42 PM filtration.BuildYouTubeNoiseFiltration buildYouTubeNoiseFiltration
SEVERE: null
com.rapidminer.operator.UserError: The dummy operator Data to Documents (replacing text:data_to_documents) cannot be executed.
    at com.rapidminer.operator.DummyOperator.doWork(DummyOperator.java:88)
    at com.rapidminer.operator.Operator.execute(Operator.java:833)
    at com.rapidminer.operator.execution.SimpleUnitExecutor.execute(SimpleUnitExecutor.java:51)
    at com.rapidminer.operator.ExecutionUnit.execute(ExecutionUnit.java:709)
    at com.rapidminer.operator.OperatorChain.doWork(OperatorChain.java:379)
    at com.rapidminer.operator.Operator.execute(Operator.java:833)
    at com.rapidminer.Process.run(Process.java:925)
    at com.rapidminer.Process.run(Process.java:848)
    at com.rapidminer.Process.run(Process.java:807)
    at com.rapidminer.Process.run(Process.java:802)
    at com.rapidminer.Process.run(Process.java:792)
    at filtration.BuildYouTubeNoiseFiltration.buildYouTubeNoiseFiltration(BuildYouTubeNoiseFiltration.java:29)
    at filtration.BuildYouTubeNoiseFiltration.main(BuildYouTubeNoiseFiltration.java:43)
Can you please tell me what I am missing, so that the RapidMiner process runs from my Java application as it does from the GUI?
Many thanks,
Ahmad
Answers
Please use a different execution mode instead of EMBEDDED_WITHOUT_UI, as the latter will NOT load plugins.
Regards,
Marco
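If plugins are loaded at start-up, their JARs have to be somewhere RapidMiner looks for them. The following is a minimal sketch for checking that, assuming (this is not stated in the thread) that extensions live in a lib/plugins folder under rapidminer.home, as in a typical RapidMiner 5 installation. The class name PluginDirCheck and the helper jarNames are hypothetical, and the code uses only the Java standard library, not the RapidMiner API:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PluginDirCheck {

    /** Returns the sorted subset of file names that end in ".jar" (case-insensitive). */
    static List<String> jarNames(String[] fileNames) {
        List<String> jars = new ArrayList<>();
        if (fileNames == null) {
            return jars; // File.list() returns null when the directory does not exist
        }
        for (String name : fileNames) {
            if (name.toLowerCase().endsWith(".jar")) {
                jars.add(name);
            }
        }
        Collections.sort(jars);
        return jars;
    }

    public static void main(String[] args) {
        // Assumption: plugins sit in <rapidminer.home>/lib/plugins.
        String home = System.getProperty("rapidminer.home", ".");
        File pluginDir = new File(new File(home, "lib"), "plugins");
        System.out.println("Plugin JARs in " + pluginDir.getAbsolutePath() + ":");
        for (String jar : jarNames(pluginDir.list())) {
            System.out.println("  " + jar);
        }
    }
}
```

If the Text Processing JAR does not show up in the listing, that may explain why the 'text:' operators stay unknown even though the JAR is on the application's classpath.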
Actually, I have tried ExecutionMode.COMMAND_LINE too. Unfortunately, the program still throws a set of exceptions. Here is the complete output: What could be wrong? The exact same process runs fine from the RapidMiner GUI.
Many thanks,
Ahmad
1) org.xml.sax.SAXParseException: Content is not allowed in prolog
2) java.io.IOException: Malformed XML operator help bundle: org.xml.sax.SAXParseException: Content is not allowed in prolog.
When I use "Resolve relative to the repository_name" while storing a wordlist from a (Process Documents) operator and a model from an (X-Validation) operator, I also get the following error:
3) com.rapidminer.operator.UserError: Cannot resolve relative repository location 'NoiseFiltrationWordlist'. Process is not associated with a repository.
I would appreciate any help. Here is all the output:
Many thanks,
Ahmad
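For context on errors 1) and 2): the JDK's XML parser typically reports "Content is not allowed in prolog" when it meets non-whitespace characters before the first tag of a document. The sketch below reproduces that situation with the standard javax.xml.parsers API only; the class name PrologCheck and the sample strings are made up for illustration, and it does not show which file RapidMiner failed to read here:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

final class PrologCheck {

    /** Parses the given XML text; returns null on success or a description of the failure. */
    static String parseError(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return null;
        } catch (SAXParseException e) {
            // The parser reports where the malformed content starts.
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        } catch (Exception e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Well-formed document: no error expected.
        System.out.println(parseError("<process version=\"5.1.014\"/>"));
        // Stray characters before the first tag: the parser rejects the prolog.
        System.out.println(parseError("stray text<process version=\"5.1.014\"/>"));
    }
}
```

Running the same kind of check against a suspect XML file (a process file or a resource inside a plugin JAR) can help narrow down which one is malformed.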
Does your process make use of R extension operators? It looks similar to other R extension problems which have been posted in the forums.
If your process does not contain any confidential data, please post it here.
Regards,
Marco
No, not at all. I am not using any R operator or R script in my process.
Here is the process:
Note that the process continues running despite the "Content is not allowed in prolog" error. However, it produces another error later, after trying to query the MySQL database table. Here is the part of the command-line output in NetBeans where the second error occurs:
I think the error now is that the RapidMiner API does not know where to save the 'NoiseFiltrationWordlist' object that the original process generates, because the process is not associated with a repository. How can I solve this problem?
Many thanks,
Ahmad
Either load the process you execute via Java from the repository so it has a repository location, or make sure your parameters use absolute repository locations instead of relative paths.
Regards,
Marco
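To find parameters that still use relative locations, a small check like the one below may help. It is only a sketch, assuming that an absolute RapidMiner repository location starts with "//" followed by the repository name; the repository name MyRepository and the class RepoLocationCheck are hypothetical, and no RapidMiner classes are used:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RepoLocationCheck {

    /** Assumption: absolute locations look like //RepositoryName/path/to/entry. */
    static boolean looksAbsolute(String location) {
        return location != null && location.startsWith("//") && location.length() > 2;
    }

    /** Returns the locations that would need a repository-bound process to be resolved. */
    static List<String> relativeOnly(List<String> locations) {
        List<String> relative = new ArrayList<>();
        for (String location : locations) {
            if (!looksAbsolute(location)) {
                relative.add(location);
            }
        }
        return relative;
    }

    public static void main(String[] args) {
        List<String> parameters = Arrays.asList(
                "NoiseFiltrationWordlist",                  // relative, as in the error message
                "//MyRepository/NoiseFiltrationWordlist");  // absolute, hypothetical repository name
        System.out.println("Relative locations: " + relativeOnly(parameters));
    }
}
```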