Best Practices for Folder Structures in Repositories
RapidMiner repositories let you store anything in folders. This article proposes a best-practice folder layout that makes repositories easier to use.
Use one folder per project. It can sit either at the top level of your Local Repository or inside a projects folder at the top level of a Server repository. Our proposed folder structure is:
- app
  - View 1
  - View 2
- data
- debug
- models
- processes
  - subprocesses
- results
- webservices
Note: Italic folders are not mandatory
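The layout above can be sketched as a small script. This is only an illustration, assuming the project folder is an ordinary directory on disk; whether and how a RapidMiner repository picks up folders created outside the application is not covered here. The names `PROJECT_LAYOUT` and `create_project_skeleton` are hypothetical, and the sketch creates every folder in the list, so drop the ones your project does not need.

```python
from pathlib import Path

# Proposed layout: each key is a top-level project folder,
# each value lists its subfolders.
# "View 1" and "View 2" are the placeholder names from the list above.
PROJECT_LAYOUT = {
    "app": ["View 1", "View 2"],
    "data": [],
    "debug": [],
    "models": [],
    "processes": ["subprocesses"],
    "results": [],
    "webservices": [],
}


def create_project_skeleton(project_root: Path) -> list[Path]:
    """Create the proposed folders under project_root and return them, sorted."""
    created = []
    for folder, subfolders in PROJECT_LAYOUT.items():
        top = project_root / folder
        for path in [top, *(top / sub for sub in subfolders)]:
            # exist_ok=True makes the script safe to re-run on an existing project.
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return sorted(created)


if __name__ == "__main__":
    import tempfile

    # Demonstrate in a throwaway directory rather than a real repository.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_project_skeleton(Path(tmp) / "my_project"):
            print(folder.relative_to(tmp))
```

Keeping the layout in a single dictionary makes it easy to apply the same structure to every new project.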
app
The app folder contains all processes related to an app. For larger apps it makes sense to use one subfolder per View of the app (View 1 and View 2 above). Only the global processes (like !Initialize) stay at the top level of the app folder.
data
The data folder contains all data used in the analysis.
debug
The debug folder holds debug data, which you occasionally need to test things while designing a process. A common example is a database sample that is used in place of the real, full database.
processes
This is the main folder holding all processes of your analysis. It often makes sense to create a subprocesses folder for function-like processes that the main processes call via Execute Process.
results / models
The results folder contains all results of the modelling process, usually performance results and models. If there are multiple models, either because you want to try many different types of models or because you want to predict many labels, it makes sense to have a dedicated folder for each model.
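One way to derive a dedicated folder per model is sketched below. The naming scheme (model type, optionally followed by the predicted label) and the function `model_result_folder` are assumptions made for illustration; the article does not prescribe a naming convention.

```python
from pathlib import PurePosixPath
from typing import Optional


def model_result_folder(model_type: str, label: Optional[str] = None) -> PurePosixPath:
    """Return a hypothetical per-model folder path below 'results'.

    model_type: the kind of model, e.g. "Decision Tree"
    label:      the predicted label, if several labels are modelled
    """
    name = model_type if label is None else f"{model_type} - {label}"
    return PurePosixPath("results") / name


print(model_result_folder("Decision Tree"))           # results/Decision Tree
print(model_result_folder("Decision Tree", "churn"))  # results/Decision Tree - churn
```

Including the label in the folder name keeps the results apart when the same model type is trained for several labels.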
webservices
The webservices folder contains all processes that are used to offer a web service. In rare cases a subprocesses folder might be useful here as well.