ModelApplier needs too much memory with high-dimensional data?
Legacy User
Member Posts: 0 Newbie
Hi again,
I was playing around with cross-validation for some time, using one of the templates that come with RapidMiner and the sparse toy data file. On the toy data, the standard XVal with a LibSVM classification learner + ModelApplier + Evaluator runs in less than 2 seconds.
Then I changed the dimension of the data from the current 25 features to something larger (e.g. 100000), simply by adding one additional feature with index 99999 and some value to each of my 10 sparse data vectors.
Unfortunately, the application (!) of the learned model to the test data now takes extremely long and uses an incredible amount of memory. When I do the same without RapidMiner, using a simple Perl script and the standard LibSVM implementation, the XVal is again done in seconds. Am I using the wrong ModelApplier or the wrong options?
Thank you so much,
Mome
Answers
This might result from some internal conversions, but I'm not sure. Could you please send me the example data file and the process?
Greetings,
Sebastian
The SparseFormatExampleSource has a "DataManagement" parameter. When I store 1 million attributes (a very sparse set) for thousands of samples using a double_array, I assume this leads to an extremely large (and extremely sparse) matrix. Choosing "boolean_sparse_array" instead worked well for my problem. I promise to read the operator description more carefully next time.
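To put rough numbers on that assumption, here is a small back-of-the-envelope sketch in Python. It is not RapidMiner code and does not reflect how RapidMiner lays out memory internally. The sample count (5,000 for "thousands"), the 25 non-zero values per sample, and the 12 bytes per sparse entry are all made-up figures for illustration.

```python
BYTES_PER_DOUBLE = 8  # size of one IEEE 754 double-precision value


def dense_bytes(n_examples, n_attributes):
    """Bytes needed if every example stores one double per attribute."""
    return n_examples * n_attributes * BYTES_PER_DOUBLE


def sparse_bytes(n_examples, nonzeros_per_example, bytes_per_entry=12):
    """Rough bytes if only the non-default entries are stored.

    bytes_per_entry is an assumption: a 4-byte int index plus an
    8-byte double value, ignoring any per-object overhead.
    """
    return n_examples * nonzeros_per_example * bytes_per_entry


# Assumed figures: 5,000 samples, 1 million attributes, 25 non-zeros each.
dense = dense_bytes(5_000, 1_000_000)
sparse = sparse_bytes(5_000, 25)

print(f"dense:  {dense / 1e9:.1f} GB")
print(f"sparse: {sparse / 1e6:.1f} MB")
```

Under these assumptions the dense layout needs about 40 GB while the sparse one stays around 1.5 MB, which would fit with what I saw after switching the parameter.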
Thanks a lot
Mome