Locking up during data import
drobertson123
Member Posts: 4 Contributor I
Hello
I am hoping someone has some advice. Being new to RapidMiner, I am not sure if I am missing something, but this doesn't seem to be right.
I am seeing a consistent problem when I attempt to import data from a CSV file. The file contains roughly 5 million rows of data. Each row is comma-separated and contains 3 data items: a date (example: 12/3/2010), an integer representing the time during the day, and a decimal value. Everything seems to go fine during the import specification process. When I actually ask it to finish and do the import, the software freezes. If I go away and come back to it, the program screen is black. It stays that way until I kill the RapidMiner process.
In Task Manager, RapidMiner is not using any CPU cycles and it isn't consuming much RAM. The program just seems to be blocked.
Does anyone have any idea what is happening and what to do to fix it?
Thanks for the help.
Doug
Answers
Assuming you've got enough RAM etc., in your position I'd break the problem down by:
1. Breaking the data into chunks.
2. Cutting down the column separator possibilities in the CSV read operator.
This sort of problem can be caused simply by a wayward column separator, like a space, and scrolling through five million lines is not hugely thrilling!
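If it helps, both steps can be done outside RapidMiner with a few lines of Python. This is only a rough sketch: the function name `split_and_check`, the chunk size, and the output file names are made up for illustration, and it assumes a plain comma delimiter with exactly 3 fields per row, as described in the question. It writes the well-formed rows into smaller chunk files and collects any row whose field count is off, so a stray separator shows up without scrolling.

```python
import csv
from pathlib import Path


def split_and_check(src, out_dir, chunk_rows=500_000, expected_cols=3, delimiter=","):
    """Split a CSV into chunk files; return (chunk_paths, bad_rows).

    bad_rows holds (record_number, fields) for every record whose
    field count differs from expected_cols.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    bad_rows = []
    handle = None
    writer = None
    rows_in_chunk = 0
    try:
        with open(src, newline="") as f:
            reader = csv.reader(f, delimiter=delimiter)
            for record_no, row in enumerate(reader, start=1):
                if len(row) != expected_cols:
                    # Likely a wayward separator (or a blank line): report it, skip it.
                    bad_rows.append((record_no, row))
                    continue
                if handle is None or rows_in_chunk >= chunk_rows:
                    # Current chunk is full (or none is open yet): start a new file.
                    if handle is not None:
                        handle.close()
                    path = out_dir / f"chunk_{len(chunk_paths) + 1:04d}.csv"
                    handle = open(path, "w", newline="")
                    writer = csv.writer(handle, delimiter=delimiter)
                    chunk_paths.append(path)
                    rows_in_chunk = 0
                writer.writerow(row)
                rows_in_chunk += 1
    finally:
        if handle is not None:
            handle.close()
    return chunk_paths, bad_rows
```

Calling `split_and_check("data.csv", "chunks")` (both paths are placeholders) gives you files small enough to import one at a time, plus a list of suspicious records to inspect first.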
Happy hunting, hope that nails it down ;D
I have 8 GB of RAM on a Windows 7 system. I tried a smaller batch of data (4000 records) and it worked. I am trying to nail down where the issue is, but I still can't seem to find it.
Are there limitations on the number of rows imported? I work with large data sets and it would be nice to know what limitations I have. Also, should I be upping the memory settings anywhere to get better performance on large data sets?
I appreciate any advice you can give. This looks like a great tool, but I am still learning a lot.
Doug
As far as I know the limits are OS imposed, and the memory allowed is tweakable in the startup scripts; but I'm on XP and Vista 64 and not familiar with 7.
Good luck!
I would suggest switching to the result perspective while executing the process and watching the memory monitor. If the memory consumption increases steadily and finally the monitor turns red and the GUI starts to hang, then you simply do not have enough memory.
Of course this might be caused by wrong parameter settings of the importer, but this is unlikely if it works with 4000 samples.
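For a rough idea of how much memory to expect, a back-of-envelope estimate can be sketched as follows. It assumes every value is held as an 8-byte number with no per-row overhead, which is an assumption for illustration and not a statement about how RapidMiner stores data internally; the real footprint will be higher. The helper name `estimate_bytes` is hypothetical.

```python
def estimate_bytes(rows, cols, bytes_per_value=8):
    """Lower-bound estimate of the raw table size in bytes.

    Assumes a fixed-width value per cell and ignores all overhead.
    """
    return rows * cols * bytes_per_value


# Figures from the question: about 5 million rows, 3 values per row.
raw = estimate_bytes(5_000_000, 3)
print(f"about {raw / 1024**2:.0f} MiB of raw values")
```

If the number you get is far below the memory available to the process, a steadily growing memory monitor points to overhead or settings, not to the sheer size of the data.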
Greetings,
Sebastian
I figured out what the issue was. I am running Windows 7 x64 and I had the 32-bit RapidMiner installed. Even though it runs in a 32-bit space, it seemed to cause many problems. Please watch out for this in the future.
I now have the 64-bit version installed and it works fine.
I appreciate the good advice people gave.
Thanks,
Doug Robertson