Automodel - large (?) CSV dataset memory issues
I'm doing research on the CIC IDS 2017 dataset, where a single file contains 200-300 MB of data.
I'm trying to run Auto Model to predict the source IP based on the other attributes. Running this, I get into memory issues (I have 16 GB of RAM), but I assume I used too large a dataset or too many attributes for the modeling.
So my question is: how many rows and attributes can I expect Auto Model to handle?
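For a rough sense of scale, here is a back-of-envelope estimate of what a numeric table costs in memory, assuming 8-byte doubles per cell (the row and attribute counts below are hypothetical placeholders, and any real tool adds overhead on top):

```python
# Back-of-envelope memory estimate for a numeric table.
# Assumes every cell is stored as an 8-byte double; real tools
# (including RapidMiner) add per-column and per-row overhead.
rows, attributes = 1_000_000, 80  # hypothetical example sizes

bytes_needed = rows * attributes * 8
print(f"~{bytes_needed / 1024**2:.0f} MB just for the raw values")
```

By that estimate, a million rows with 80 numeric attributes already need roughly 600 MB before any modeling starts, so the working set during Auto Model can easily be several times the file size.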
Answers
Hello,
RapidMiner is a bit resource-hungry, but loading a large file like that one shouldn't be a problem. I have 4 GB of RAM on my MacBook Air and can load the file.
The thing is, with such a limited amount of memory, I usually do a few things to keep the footprint down.
Beyond that, I also tune my RapidMiner Studio installation to use more memory. In this case: go to Preferences > System > Data Management and set that number to at least twice the amount of training data.
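To pick a concrete value for that setting, here is a minimal sketch in Python (the file path is a hypothetical placeholder; the in-memory representation is larger than the file on disk, so treat the result as a lower bound):

```python
import os

# Hypothetical path to one CIC IDS 2017 CSV file; adjust to your setup.
csv_path = "cic_ids_2017_friday.csv"

# Size of the raw training data on disk, in megabytes.
size_mb = os.path.getsize(csv_path) / (1024 * 1024)

# Rule of thumb from above: give RapidMiner at least twice the
# size of the training data.
print(f"Data on disk: {size_mb:.0f} MB -> memory setting >= {2 * size_mb:.0f} MB")
```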
Hope this helps,
Hi,
can you tell us in which step of Auto Model this happens? If you read the file and store it as an IOObject in the repository, do you still have the same problems?
You can also try avoiding Deep Learning or Gradient Boosting models, which are very resource-intensive.
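If the dataset size itself turns out to be the problem, one workaround is to downsample the CSV before importing it. A sketch with pandas, assuming a plain random sample keeps enough signal for your task (file names are hypothetical):

```python
import pandas as pd

# Hypothetical file names; adjust to your dataset.
df = pd.read_csv("cic_ids_2017_friday.csv")

# Keep a random 10% of the rows to cut memory use during modeling;
# fixing the seed makes the sample reproducible.
sample = df.sample(frac=0.10, random_state=42)
sample.to_csv("cic_ids_2017_friday_sample.csv", index=False)
print(f"Reduced {len(df)} rows to {len(sample)} rows")
```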
Cheers,
Sebastian