How much RAM, CPU and Disk do I need for RapidMiner Server?
Question
What are the minimum or optimum CPU, Memory and Disk requirements for the effective use of RapidMiner Server?
Answer
It depends on your data and on the analytics methods you run.
Crunching millions of records with hundreds of fields would need more resources than a few thousand records.
This is especially true if you are going to use computationally intensive algorithms such as Gradient Boosted Trees or neural-network-based algorithms without Principal Component Analysis (PCA) or any other type of data reduction in the pre-processing stage.
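To get a feel for the difference in scale, a back-of-the-envelope estimate of the raw data size can help. The sketch below is only an illustration in Python, not part of RapidMiner. The function name `estimate_raw_size_gb` is hypothetical, and the figure of 8 bytes per value is an assumption that fits purely numeric data. Text fields, and the working copies an algorithm makes while it runs, will push the real footprint higher.

```python
def estimate_raw_size_gb(rows, fields, bytes_per_value=8):
    """Rough lower bound on the in-memory size of a numeric table, in GiB.

    Assumes every value occupies `bytes_per_value` bytes (8 is an
    assumption matching a double-precision number); real usage is higher.
    """
    return rows * fields * bytes_per_value / 1024**3


# A few thousand records versus millions of records with hundreds of fields.
small = estimate_raw_size_gb(rows=5_000, fields=20)
large = estimate_raw_size_gb(rows=5_000_000, fields=200)
print(f"small: {small:.4f} GiB, large: {large:.2f} GiB")
```

An estimate like this only bounds the raw data. The memory actually needed depends on the operators and algorithms in the process.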
Typically, a good start is 8 cores with 32GB RAM and a 1TB hard disk. This will happily deal with the demands of a small to medium-sized business.
If the plan is to run something really ambitious on terabytes of data on a RapidMiner Server, then an iterative approach may be better.
That is, try the analyses with a subset first, pick your preprocessing and your algorithm, and test the system's behaviour with more and more data. At that point economic decision making will kick in: trade-offs between cost and benefit, and so on.
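The iterative approach can be sketched in a few lines of Python. This is a generic illustration and does not use any RapidMiner API. `profile_scaling` and `run_analysis` are hypothetical names, with `run_analysis` standing in for whatever preprocessing and modelling you want to test. The sketch measures wall-clock time and peak Python memory for growing subsets of the data, so you can see how resource use grows before committing to hardware.

```python
import time
import tracemalloc


def profile_scaling(run_analysis, data, fractions=(0.01, 0.1, 0.5, 1.0)):
    """Run `run_analysis` on growing prefixes of `data`.

    Returns a list of (subset_size, seconds, peak_bytes) tuples.
    `run_analysis` is a hypothetical callable supplied by the caller.
    """
    results = []
    for fraction in fractions:
        n = max(1, round(len(data) * fraction))
        subset = data[:n]

        tracemalloc.start()                # track Python-level allocations
        start = time.perf_counter()
        run_analysis(subset)
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()

        results.append((n, elapsed, peak))
    return results


if __name__ == "__main__":
    # Toy stand-in workload: sorting a list of numbers.
    data = list(range(100_000, 0, -1))
    for n, seconds, peak in profile_scaling(sorted, data):
        print(f"{n:>7} rows  {seconds:.4f} s  {peak / 1024**2:.2f} MiB peak")
```

If time or memory grows much faster than the subset size, that is a hint to revisit the preprocessing or the algorithm before buying a bigger machine.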
This, however, raises the question: why is your data not in Hadoop? If it is, then your server can run on a minimal installation and push the analytics down to your cluster! Have a look here for more details: https://rapidminer.com/products/radoop/
By the way, the repository database does not need much space: a few hundred megabytes more than the installation itself (if you remember what megabytes are). For example, if you are going to train SVM models on lots of large text data sets, the models can get large, around 100MB each, but this is as big as they get.