How to limit the amount of data coming out of a Radoop Nest
Will this bring all my data out of Hadoop if I am just viewing it?
This is a question often asked about the Radoop Nest operator, especially as it is the controlling operator for all Radoop operators.
To limit the amount of data coming out of a Radoop Nest, we have two options, listed from the narrowest to the broadest scope:
1. In the Parameters panel of the Radoop Nest operator, tick the change sample size box and set the limit there. This applies only to the process in which this particular instance of the Radoop Nest is used.
2. At Settings > Preferences > Radoop, set the Sample size overall value. This sets the limit universally across your processes.
Please bear in mind that the value 0 (zero) is a reserved value which means that the sample size is unlimited.
If, therefore, you do want to bring all of your data out of your cluster, setting Sample size overall to "0" will achieve that.
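
To make the meaning of the reserved value concrete, here is a minimal Python sketch. It is not Radoop's implementation: the function name limit_rows is hypothetical, and taking the first N rows is an assumption made only for simplicity, since how Radoop actually selects its sample is not covered here. It only illustrates the rule that a positive value caps the number of rows while 0 lets everything through.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def limit_rows(rows: Iterable[T], sample_size: int) -> Iterator[T]:
    """Hypothetical helper: return at most sample_size rows, where 0 means unlimited."""
    if sample_size < 0:
        raise ValueError("sample_size must be 0 (unlimited) or a positive integer")
    if sample_size == 0:
        # Reserved value: no cap, every row is passed through.
        return iter(rows)
    # Positive value: stop after sample_size rows (assumed first-N behaviour).
    return islice(rows, sample_size)


if __name__ == "__main__":
    data = range(10)
    print(len(list(limit_rows(data, 3))))  # capped at 3 rows
    print(len(list(limit_rows(data, 0))))  # all 10 rows, since 0 means unlimited
```

The point to take away is that 0 is not "zero rows" but "no limit", so it should be set deliberately when the full data set is really needed.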
For further details please refer to this documentation page: https://docs.rapidminer.com/7.6/radoop/overview/radoop-settings.html