"ArrayIndexOutOfBoundsException when loading pdf files"
Hi,
When I load PDF files in my process I get the following exception:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
    at com.rapidminer.operator.TermWeightClusterCharacterizer.apply(Unknown Source)
    at com.rapidminer.operator.Operator.apply(Operator.java:664)
    at com.rapidminer.operator.OperatorChain.apply(OperatorChain.java:377)
    at com.rapidminer.operator.Operator.apply(Operator.java:664)
    at com.rapidminer.Process.run(Process.java:612)
    at com.rapidminer.Process.run(Process.java:582)
    at com.rapidminer.Process.run(Process.java:572)
    at org.behrang.clustering.Main.createProcess(Main.java:77)
    at org.behrang.clustering.Main.main(Main.java:26)
Here's my process:
System.setProperty("rapidminer.home", "C:\\Java\\RapidMiner-4.2");
RapidMiner.init();
Process p = new Process();
OperatorChain textInput = (OperatorChain) OperatorService.createOperator("TextInput");
textInput.setParameter(PARAMETER_DEFAULT_CONTENT_LANGUAGE, "english");
textInput.setParameter(PARAMETER_PRUNE_ABOVE, "15");
textInput.setParameter(PARAMETER_PRUNE_BELOW, "5");
// textInput.setParameter(PARAMETER_DEFAULT_CONTENT_TYPE, "pdf");
List<Object[]> textList = new LinkedList<Object[]>();
for (File f : new File("fit4005").listFiles()) {
    textList.add(new Object[] {
        f.getAbsolutePath(),
        f.getAbsolutePath()
    });
}
// for (File f : new File("newsgroup/graphics").listFiles()) {
//     textList.add(new Object[] {
//         f.getAbsolutePath(),
//         f.getAbsolutePath()
//     });
// }
// for (File f : new File("newsgroup/hardware").listFiles()) {
//     textList.add(new Object[] {
//         f.getAbsolutePath(),
//         f.getAbsolutePath()
//     });
// }
// textList.add(new Object[] {"graphics","newsgroup/graphics"});
// textList.add(new Object[] {"hardware","newsgroup/hardware"});
textInput.setListParameter("texts", textList);
textInput.addOperator(OperatorService.createOperator("StringTokenizer"));
textInput.addOperator(OperatorService.createOperator("EnglishStopwordFilter"));
Operator tlfOperator = OperatorService.createOperator("TokenLengthFilter");
tlfOperator.setParameter("min_chars", "5");
textInput.addOperator(tlfOperator);
textInput.addOperator(OperatorService.createOperator("PorterStemmer"));
p.getRootOperator().addOperator(textInput);
p.getRootOperator().addOperator(OperatorService.createOperator("KMeans"));
p.getRootOperator().addOperator(OperatorService.createOperator("AttributeSumClusterCharacterizer"));
p.save(new File("Process.xml"));
IOContainer io = p.run();
SimpleExampleSet ses = (SimpleExampleSet) io.get(SimpleExampleSet.class);
System.out.println(ses.getExample(0));
System.exit(0);
fit4005 contains the PDF files. If I load text files everything works fine. Any ideas why this is happening and how I can fix it?
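One thing that may be worth ruling out before digging into the operators is the input itself. `ArrayIndexOutOfBoundsException: 0` means some code read the first element of an empty array. That would be consistent with no usable terms reaching the cluster characterizer, although this is an assumption and the stack trace alone does not prove it. The sketch below is plain Java and independent of RapidMiner. It checks that the directory can be listed and that each file starts with the `%PDF-` signature. The class name `PdfInputCheck` and its methods are hypothetical helpers, not part of any library.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper, not part of RapidMiner: sanity-checks the input files
// before they are handed to the TextInput operator.
class PdfInputCheck {

    // A well-formed PDF file starts with the ASCII signature "%PDF-".
    private static final byte[] PDF_SIGNATURE =
            "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the given leading bytes start with the PDF signature.
    static boolean hasPdfHeader(byte[] head) {
        if (head == null || head.length < PDF_SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < PDF_SIGNATURE.length; i++) {
            if (head[i] != PDF_SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads the first bytes of a file and checks them for the signature.
    static boolean looksLikePdf(File f) throws IOException {
        byte[] head = new byte[PDF_SIGNATURE.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(f.toPath())) {
            // read() may return fewer bytes than requested, so loop until
            // the buffer is full or the end of the file is reached.
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
        }
        return total == head.length && hasPdfHeader(head);
    }

    public static void main(String[] args) throws IOException {
        // "fit4005" is the directory name used in the question.
        File dir = new File(args.length > 0 ? args[0] : "fit4005");
        File[] files = dir.listFiles(); // null if dir is missing or not a directory
        if (files == null) {
            System.out.println("Not a readable directory: " + dir.getAbsolutePath());
            return;
        }
        for (File f : files) {
            if (f.isFile()) {
                String verdict = looksLikePdf(f) ? "PDF signature found" : "no PDF signature";
                System.out.println(f.getName() + " -> " + verdict);
            }
        }
    }
}
```

A file that passes this check can still yield no text, for example a scanned PDF that contains only page images, so the check narrows the problem down without settling it.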
Thanks in advance,
Behi
Answers
Sorry, but I do not have a direct solution. I would suggest that you set up the process in the GUI first and use its debugging facilities, such as breakpoints, to track down the problem. If everything works fine in the GUI, you can then simply use [...] or [...] and [...] to deploy the process. It is usually much easier to get things right in GUI mode before you embed the complete process in your own application.
Cheers,
Ingo