Speeding up Relevance Vector Machine
edwinanto2003
Member Posts: 1 Learner III
Hello,
I am relatively new to RapidMiner and am learning through video tutorials. I am trying to perform text classification on the movie-review dataset, which has about 1000 positive and 1000 negative reviews.
I wanted to know if there is a way to speed up the process. I tried running the algorithm on the whole dataset and it keeps running out of memory. Following earlier threads, I increased MAX_JAVA_MEMORY in the script files to about 4 GB and tried running a subset of 100 files. The process has been running for 2 days now. Please let me know if there is a way to speed it up, or whether it is even possible to classify the whole dataset (2000 files of positive and negative reviews).
Any help would be greatly appreciated. Thank you very much.
Answers
Did you try an SVM with a linear kernel instead? An SVM should have no problem handling 2000 examples. Just keep in mind that for good results with the SVM you have to optimize the C parameter. A good range to try is 1e-6 to 1 or 10, on a logarithmic scale.
Best regards,
Marius
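To make the "logarithmic scale" suggestion above concrete, here is a minimal Python sketch of the C sweep, written outside RapidMiner purely for illustration. The helper names `log_grid` and `best_c` and the scoring function `toy_score` are hypothetical and not part of any library; `toy_score` is only a stand-in for whatever cross-validated accuracy the linear SVM reaches at a given C. In RapidMiner itself this kind of sweep would normally be set up with a parameter optimization operator rather than written by hand.

```python
import math


def log_grid(low, high, steps_per_decade=1):
    """Return values from low to high (inclusive), evenly spaced on a log10 scale."""
    lo_exp = math.log10(low)
    hi_exp = math.log10(high)
    n = round((hi_exp - lo_exp) * steps_per_decade)
    return [10.0 ** (lo_exp + i / steps_per_decade) for i in range(n + 1)]


def best_c(candidates, score_fn):
    """Evaluate score_fn at every candidate and return the best (C, score) pair."""
    scored = [(c, score_fn(c)) for c in candidates]
    return max(scored, key=lambda pair: pair[1])


# Candidate C values: 1e-6, 1e-5, ..., 1, 10 (one value per decade).
c_values = log_grid(1e-6, 10)


def toy_score(c):
    # Placeholder that peaks at C = 1e-2. Replace it with the
    # cross-validated accuracy of the linear SVM trained with this C.
    return -abs(math.log10(c) + 2)


chosen_c, chosen_score = best_c(c_values, toy_score)
print("candidates:", c_values)
print("best C under the placeholder score:", chosen_c)
```

With one step per decade the sweep needs only eight training runs, so it stays cheap even when each run is cross-validated. If the best value lands at either end of the range, extend the range in that direction before trusting the result.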