"create exampleset with text plugin"
Hi all,
In Java code I would like to create an ExampleSet with the text plugin. I tried WVToolRapidMinerExample.java with my input and it works fine. I then copied the exact code of this example into my own method, and there I get an "access denied" error. Debugging showed that WVToolRapidMinerExample.java treats my input as a directory containing training documents, as it should, but when I use my method the directories are treated as files, which of course results in an exception.
Here is my code. RapidMiner is already initialised when this code is reached:
    private ExampleSet buildTrainExampleSetNieuw(Category category)
            throws OperatorCreationException, OperatorException {
        OperatorChain wvtoolOperator = (OperatorChain) OperatorService
                .createOperator("TextInput");
        wvtoolOperator.setParameter(TextInput.PARAMETER_DEFAULT_CONTENT_TYPE,
                "application/xml");
        wvtoolOperator.setParameter(
                TextInput.PARAMETER_DEFAULT_CONTENT_LANGUAGE, "dutch");
        wvtoolOperator.setParameter(
                TextInput.PARAMETER_DEFAULT_CONTENT_ENCODING, "iso-8859-1");
        wvtoolOperator.setParameter(TextInput.PARAMETER_PRUNE_BELOW, "3");
        wvtoolOperator.setParameter(TextInput.PARAMETER_PRUNE_ABOVE, "10");
        List<Object[]> textList = new LinkedList<Object[]>();
        textList.add(new Object[] { "Ambtenarenrecht",
                "c:/workspace/documentclassification/trainset/Ambtenarenrecht/" });
        textList.add(new Object[] { "non-Ambtenarenrecht",
                "c:/workspace/documentclassification/trainsetnon/Ambtenarenrecht/" });
        wvtoolOperator.addOperator(OperatorService
                .createOperator(SimpleTokenizer.class));
        wvtoolOperator.setListParameter("texts", textList);
        IOContainer out = wvtoolOperator.apply(new IOContainer());
        return out.get(ExampleSet.class);
    }

This is the code in WVToolRapidMinerExample.java that does a good job:
    public static void main(String[] args) throws Exception {
        FileInputStream inputStream = new FileInputStream(
                "C:\\workspace\\textplugin\\resources\\operators.xml");
        RapidMiner.init(inputStream, new File("rm_plugins"), true, false,
                false, true);
        inputStream.close();
        OperatorChain wvtoolOperator = (OperatorChain) OperatorService
                .createOperator("TextInput");
        wvtoolOperator.setParameter(TextInput.PARAMETER_DEFAULT_CONTENT_TYPE,
                "application/xml");
        wvtoolOperator.setParameter(
                TextInput.PARAMETER_DEFAULT_CONTENT_LANGUAGE, "dutch");
        wvtoolOperator.setParameter(
                TextInput.PARAMETER_DEFAULT_CONTENT_ENCODING, "iso-8859-1");
        wvtoolOperator.setParameter(TextInput.PARAMETER_PRUNE_BELOW, "3");
        wvtoolOperator.setParameter(TextInput.PARAMETER_PRUNE_ABOVE, "10");
        List<Object[]> textList = new LinkedList<Object[]>();
        // adjust data input
        textList.add(new Object[] { "Ambtenarenrecht",
                "c:/workspace/documentclassification/trainset/Ambtenarenrecht/" });
        textList.add(new Object[] { "non-Ambtenarenrecht",
                "c:/workspace/documentclassification/trainsetnon/Ambtenarenrecht/" });
        wvtoolOperator.addOperator(OperatorService
                .createOperator(SimpleTokenizer.class));
        wvtoolOperator.setListParameter("texts", textList);
        IOContainer out = wvtoolOperator.apply(new IOContainer());
        System.out.println("klaar");
    }

Any ideas on what I am doing wrong in my code? I use RapidMiner/text plugin 4.2. Any suggestions that help solve this problem will be much appreciated.
Martine
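A possible way to narrow this down, independent of RapidMiner: on Windows, opening a directory as if it were a file typically fails with an "Access is denied" message, which would match the symptom described above. The sketch below only uses java.io.File to report how the JVM itself sees each training path. The class name PathCheck and the method classify are made up for this illustration and are not part of RapidMiner or the text plugin; the two paths are copied from the post.

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of RapidMiner or the text plugin.
class PathCheck {

    /** Reports how the JVM sees a path: "missing", "directory", "file" or "other". */
    static String classify(File path) {
        if (!path.exists()) {
            return "missing";
        }
        if (path.isDirectory()) {
            return "directory";
        }
        if (path.isFile()) {
            return "file";
        }
        return "other";
    }

    public static void main(String[] args) {
        // Class label -> training directory, as used in the "texts" list parameter above.
        Map<String, String> texts = new LinkedHashMap<String, String>();
        texts.put("Ambtenarenrecht",
                "c:/workspace/documentclassification/trainset/Ambtenarenrecht/");
        texts.put("non-Ambtenarenrecht",
                "c:/workspace/documentclassification/trainsetnon/Ambtenarenrecht/");

        for (Map.Entry<String, String> entry : texts.entrySet()) {
            File path = new File(entry.getValue());
            // Print the resolved absolute path so a wrong working directory is visible too.
            System.out.println(entry.getKey() + " -> " + path.getAbsolutePath()
                    + " [" + classify(path) + ", readable=" + path.canRead() + "]");
        }
    }
}
```

Running this from inside the application that calls buildTrainExampleSetNieuw shows whether both entries are reported as readable directories there. If they are, the paths themselves are probably not the cause, and the difference is more likely in how the operator is set up in the two contexts.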
Answers
To be honest, I'm really surprised that anybody still uses such an ancient version of RapidMiner. Version 4.2 is outdated and has not been maintained for two years, so I can't really help you. In any case, I would suggest updating to 5.0 and the new Text Processing Extension, since the code quality, especially in the completely revised Text Extension, is much better now.
After doing this, the white paper "How to extend RapidMiner" from our shop might help to construct new example sets.
Greetings,
Sebastian