"Feature Selection for Text Categorization"
Hi,
I am using the Brown corpus for an experiment and am trying to limit the number of features by using feature selection.
I used the wizard that comes with RapidMiner to set up the process. The data is loaded from a sparse matrix file into a sparse matrix.
How can I prevent RM from running out of memory?
Thank you
P Jan 12, 2009 2:12:21 PM: Initialising process setup
P Jan 12, 2009 2:12:21 PM: [NOTE] No filename given for result file, using stdout for logging results!
P Jan 12, 2009 2:12:21 PM: Checking properties...
P Jan 12, 2009 2:12:21 PM: Properties are ok.
P Jan 12, 2009 2:12:21 PM: Checking process setup...
P Jan 12, 2009 2:12:21 PM: Inner operators are ok.
P Jan 12, 2009 2:12:21 PM: Checking i/o classes...
P Jan 12, 2009 2:12:21 PM: i/o classes are ok. Process output: ExampleSet, AttributeWeights, PerformanceVector.
P Jan 12, 2009 2:12:21 PM: Process ok.
P Jan 12, 2009 2:12:21 PM: Process initialised
P Jan 12, 2009 2:12:21 PM: [NOTE] Process starts
P Jan 12, 2009 2:12:21 PM: Process:
Root[1] (Process)
+- SparseFormatExampleSource[1] (SparseFormatExampleSource)
+- FS[1] (FeatureSelection)
+- FSChain[0] (OperatorChain)
+- XValidation[0] (XValidation)
| +- Learner[0] (LibSVMLearner)
| +- ApplierChain[0] (OperatorChain)
| +- Applier[0] (ModelApplier)
| +- Evaluator[0] (Performance)
+- ProcessLog[0] (ProcessLog)
P Jan 12, 2009 2:12:21 PM: [NOTE] SparseFormatExampleSource: The ID attribute 'id' is defined with a nominal value type but the possible values are not defined! Although this often does not lead to problems (unlike for labels or regular nominal attributes) you might want to specify the possible values by inner tags <value>first</value><value>second</value>....
G Jan 12, 2009 2:13:23 PM: [Fatal] OutOfMemoryError occured in 1st application of FS (FeatureSelection)
G Jan 12, 2009 2:13:23 PM: [Fatal] Process failed: Java heap space
Root[1] (Process)
+- SparseFormatExampleSource[1] (SparseFormatExampleSource)
here ==> +- FS[1] (FeatureSelection)
+- FSChain[0] (OperatorChain)
+- XValidation[0] (XValidation)
| +- Learner[0] (LibSVMLearner)
| +- ApplierChain[0] (OperatorChain)
| +- Applier[0] (ModelApplier)
| +- Evaluator[0] (Performance)
+- ProcessLog[0] (ProcessLog)
G Jan 12, 2009 2:13:24 PM: [Fatal] Java heap space
java.lang.OutOfMemoryError: Java heap space
at com.rapidminer.operator.features.selection.FeatureSelectionOperator.createInitialPopulation(FeatureSelectionOperator.java:172)
at com.rapidminer.operator.features.FeatureOperator.apply(FeatureOperator.java:264)
at com.rapidminer.operator.features.selection.FeatureSelectionOperator.apply(FeatureSelectionOperator.java:151)
at com.rapidminer.operator.Operator.apply(Operator.java:663)
at com.rapidminer.operator.OperatorChain.apply(OperatorChain.java:377)
at com.rapidminer.operator.Operator.apply(Operator.java:663)
at com.rapidminer.Process.run(Process.java:667)
at com.rapidminer.Process.run(Process.java:637)
at com.rapidminer.Process.run(Process.java:627)
at com.rapidminer.gui.ProcessThread.run(ProcessThread.java:61)
Answers
Could you please post your process file (*.xml) so that I can see what RapidMiner does when it runs out of memory?
I also need the following information: the amount of RAM, the size of the data set, and the number of attributes.
Greetings,
Sebastian
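
A note on the memory figure requested above: the "Java heap space" error refers to the heap limit of the JVM, which can be well below the physical RAM of the machine. The following is a minimal sketch for reporting that limit, assuming a standard JVM. The class name `HeapReport` and its methods are hypothetical and not part of RapidMiner; only the standard `java.lang.Runtime` calls are used.

```java
public class HeapReport {

    // Convert a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Build a one-line summary of the heap limits of the current JVM.
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();      // upper bound the heap may grow to
        long total = rt.totalMemory();  // heap currently reserved by the JVM
        long free = rt.freeMemory();    // unused part of the reserved heap
        return "max heap: " + toMiB(max) + " MiB, "
             + "reserved: " + toMiB(total) + " MiB, "
             + "used: " + toMiB(total - free) + " MiB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

On common JVMs the maximum heap can be raised with the `-Xmx` option, for example `java -Xmx1024m HeapReport`. How this option is passed to RapidMiner depends on how it is launched.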