RapidMiner Server: memory management for repository access

MariusHelf · August 2016

Hi all, hi Marco!

We were wondering how RapidMiner Server handles the reading and writing of very large objects from and to the Server repository.

Say we write an ExampleSet or a big model (e.g. a complex RandomForest model) of 2 GB to the Server repository. Does the Server cache the complete object in memory, or does it stream it to the database? What happens when we read it back?

In other words: if the memory of the server is restricted to 2 GB, can we still reliably store bigger objects in the repository? (Whether this is good practice is another question, but sometimes you have no choice...)

Also, does accessing the repository count against the API limit of the free RapidMiner Server, or does the API limit only apply to processes that are exposed as a web service?

Cheers,

Marius

Marco_Boeck · August 2016

Hi Marius!

Let me provide a few more details here:

If you create the objects on RM Server itself, then at that point they already have to be entirely in memory, no matter which type of object. So that naturally becomes tricky if they are larger than the maximum memory of RM Server.
If you store an ExampleSet on Server via Studio or the REST API, it will be streamed. This means yes, you can upload larger sets than your memory limit.
If you store a model (or any other IOObject really), it is stored as a binary blob. Thus it will be completely loaded into memory and as such problematic for objects larger than the memory limit.
Indeed, only actual web service calls count against the limit. Testing those does not count, and neither does using the repository, whether via the REST API or via Studio (SOAP API).
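
To illustrate the difference between streamed storage and blob storage described above, here is a minimal Python sketch. It is not RapidMiner Server code and says nothing about how the Server is actually implemented; the function names `store_streamed` and `store_as_blob` and the chunk size are hypothetical, chosen only for illustration. The point it tries to show: a streamed copy only ever holds one chunk in memory, while a blob copy holds the whole object.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, for illustration only


def store_streamed(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink chunk by chunk.

    Returns (total bytes written, largest buffer held at once).
    """
    total = 0
    peak = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes object means end of stream
            break
        sink.write(chunk)
        total += len(chunk)
        peak = max(peak, len(chunk))
    return total, peak


def store_as_blob(source, sink):
    """Read the whole object into memory first, then write it in one piece.

    Returns (total bytes written, largest buffer held at once).
    """
    blob = source.read()  # the entire object is in memory here
    sink.write(blob)
    return len(blob), len(blob)


if __name__ == "__main__":
    payload = b"x" * (1024 * 1024)  # small stand-in for a much larger object
    print("streamed:", store_streamed(io.BytesIO(payload), io.BytesIO()))
    print("blob:    ", store_as_blob(io.BytesIO(payload), io.BytesIO()))
```

In the streamed case the second number stays at the chunk size however large the object gets, whereas in the blob case it grows with the object. That is why, in this simplified picture, only the blob path runs into the memory limit.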

Cheers,

Marco

IngoRM · August 2016

Hi Marius,

Good to hear from you :-)

Unless I am corrected by one of our Server experts, I think the answers to your questions are "yes" and "no". Yes, you can write larger objects to the repository as part of the process execution on Server. If the result is a data set, though, and you read it back into a free RapidMiner Studio, the row limit would still apply.

And no, the repository access does not count against the API limit. Only web service calls do.

Cheers,

Ingo

MariusHelf · August 2016

Hi Ingo, hi Marco,

thanks for your replies! That answers all my questions.

So basically, if you just need the former Collaboration Tier, the Free Server will do, unless you train overly complex models or otherwise create big IOObjects. The real value of course only comes when you can also execute background and heavy-duty jobs on the Server, so the limit of the Free edition will quickly be reached...

Cheers,

~Marius
