Cluster Sampling in RapidMiner
Hi,
I would like to use the cluster sampling method in RapidMiner (see, for example, the Towards Data Science article on sampling techniques).
Do you have any suggestions?
Thank you very much.
Bes
Comments
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
I am not sure whether RapidMiner has a dedicated operator for this. If cluster sampling is implemented in Python or R, you can use the scripting operators to embed it in your RapidMiner process.
One disadvantage, in my view, is that cluster sampling takes all of the data from a few selected clusters, which can over-represent or under-represent parts of the distribution in the data. The consequence is high variance (low precision) in the results. The major advantage is speed, since it doesn't go through every example in the dataset. If you need more precise results, you can go with stratified sampling.
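To make the contrast concrete, here is a minimal sketch of stratified sampling in plain Python. It is only an illustration: the function name `stratified_sample`, the `group` key, and the toy data are my own assumptions, not anything built into RapidMiner. The point is that every group contributes rows, instead of a few groups contributing all of theirs.

```python
import random
from collections import defaultdict


def stratified_sample(rows, stratum_key, fraction, seed=None):
    """Draw roughly the same fraction of rows from every stratum."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")

    # Group rows by their stratum label.
    groups = defaultdict(list)
    for row in rows:
        groups[row[stratum_key]].append(row)

    rng = random.Random(seed)
    sample = []
    for label in sorted(groups):
        members = groups[label]
        # Take at least one row per stratum so no group disappears.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


if __name__ == "__main__":
    # Hypothetical toy data: 5 groups with 4 rows each.
    data = [{"id": i, "group": "g%d" % (i % 5)} for i in range(20)]
    picked = stratified_sample(data, "group", fraction=0.5, seed=1)
    print(len(picked), sorted({r["group"] for r in picked}))
```

Because each stratum is sampled separately, the group proportions in the sample stay close to those in the full data, which is where the extra precision comes from.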
Based on the concept, one way to do what you need is to run a clustering algorithm to generate clusters, select a few of those clusters, and then test your process on them and see how it goes. I haven't tried this myself; it is just an idea based on the concept.
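As a rough sketch of that idea in plain Python: assume every row already carries a cluster label (for example, one assigned by a clustering step earlier in the process), pick a few labels at random, and keep all rows from the chosen clusters. The function name `cluster_sample`, the `cluster` key, and the toy data are hypothetical. I haven't run this inside a RapidMiner script operator, so it would need adapting to whatever data structure the operator hands to the script.

```python
import random
from collections import defaultdict


def cluster_sample(rows, cluster_key, n_clusters, seed=None):
    """Randomly pick n_clusters whole clusters and return all of their rows."""
    # Group rows by their cluster label.
    groups = defaultdict(list)
    for row in rows:
        groups[row[cluster_key]].append(row)

    if n_clusters > len(groups):
        raise ValueError("n_clusters exceeds the number of available clusters")

    # Draw clusters (not individual rows) without replacement.
    rng = random.Random(seed)
    chosen = rng.sample(sorted(groups), n_clusters)

    # Keep every row that belongs to a chosen cluster.
    sample = [row for label in chosen for row in groups[label]]
    return sample, chosen


if __name__ == "__main__":
    # Hypothetical toy data: 5 clusters with 4 rows each.
    data = [{"id": i, "cluster": "cluster_%d" % (i % 5)} for i in range(20)]
    sample, chosen = cluster_sample(data, "cluster", n_clusters=2, seed=42)
    print(chosen, len(sample))
```

Rows from the clusters that were not drawn never appear in the sample, which is the over- or under-representation risk mentioned above.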
Hope this helps.
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing