Nominal2Binominal always throws OutOfMemory
Hello,
The data set I am working on has almost 500k entries: about 500 entries each for about 1000 stock tickers. Each row contains a stock ticker symbol (like C or XIN) and a binominal value (0 or 1).
I am trying to generate frequent item sets using FPGrowth, but it complains that the ticker symbols are not binominal. So I tried running Nominal2Binominal on the data first, but it quickly runs out of heap space. I increased the heap size to 3 GB; it ran a lot longer, but in the end it still threw an OutOfMemory exception because the heap space was exhausted.
So my question is: is there another way to do this? I just started using RapidMiner yesterday, so if I am missing something obvious, please point it out.
Thanks!
Answers
I am not fully sure that I got you right but nevertheless: it seems that you have transactional data and want to transform it into basket data. For that purpose you could use a pivotization. There was a thread about that some days ago:
http://rapid-i.com/rapidforum/index.php/topic,648.0.html
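To illustrate the idea of turning transactional rows into basket data outside of RapidMiner, here is a minimal Python sketch. It is not the RapidMiner pivot operator. It assumes each row also carries some basket identifier (a hypothetical date column here, which the original post does not mention), and it keeps only the tickers flagged 1 for each basket, so no dense 1000-column table is built.

```python
from collections import defaultdict


def pivot_to_baskets(rows):
    """Group (basket_id, ticker, flag) rows into one set of tickers per basket.

    basket_id is a hypothetical grouping key (for example a date);
    only rows whose flag equals 1 contribute an item to the basket.
    """
    baskets = defaultdict(set)
    for basket_id, ticker, flag in rows:
        if flag == 1:
            baskets[basket_id].add(ticker)
    return dict(baskets)


# Made-up sample rows, for illustration only.
rows = [
    ("2008-01-02", "C", 1),
    ("2008-01-02", "XIN", 0),
    ("2008-01-03", "C", 1),
    ("2008-01-03", "XIN", 1),
]

baskets = pivot_to_baskets(rows)
for basket_id, items in sorted(baskets.items()):
    print(basket_id, sorted(items))
```

Storing each basket as a set of items is a sparse representation: memory grows with the number of flagged entries rather than with rows times symbols.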
If you want to apply Nominal2Binominal in memory (not in the database) on your data set, this would result in 500K rows * 1000 symbols * 8 bytes, which is approx. 3.8 GB of raw data without any overhead. This is possible on a 64-bit machine with 8 GB+ memory, but certainly not on a 32-bit machine, where Java can only use about 1.5 GB (no matter what you specify).
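The estimate above can be reproduced with a few lines of Python. This is only back-of-the-envelope arithmetic under the stated assumption of 8 bytes per cell; the function name is made up for illustration, and real memory use would be higher because of object overhead.

```python
def dense_size_bytes(n_rows, n_columns, bytes_per_value=8):
    """Raw size of a dense table with one fixed-width value per cell."""
    return n_rows * n_columns * bytes_per_value


# 500k rows, one binominal column per ticker symbol (about 1000 symbols).
size = dense_size_bytes(500_000, 1_000)
size_gib = size / 2**30

print(f"{size} bytes, about {size_gib:.1f} GiB")
```

With exactly 500,000 rows this comes to 4,000,000,000 bytes, which is in the same range as the figure quoted above and well beyond a 3 GB heap.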
Cheers,
Ingo
I searched the user guide but could not find any mention of it. I am using RapidMiner 4.3.