Outlier detection (in real-time)
Hello RapidMiner-Community,
I'm quite new to machine learning algorithms. I usually detect outliers by calculating the mean and then flagging a data instance as an outlier if it lies more than 2 or 3 standard deviations away from it.
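For reference, the rule described above can be sketched roughly as follows. This is plain Python rather than a RapidMiner process, and the names `fit_threshold`, `is_outlier`, and the sample values are made up for illustration only:

```python
from statistics import mean, stdev

def fit_threshold(history, k=3.0):
    """Compute mean and sample standard deviation once, offline."""
    mu = mean(history)
    sigma = stdev(history)
    return mu, sigma, k

def is_outlier(value, model):
    """Flag a value lying more than k standard deviations from the mean."""
    mu, sigma, k = model
    return abs(value - mu) > k * sigma

# Hypothetical historical measurements of a single attribute.
history = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7]
model = fit_threshold(history, k=3.0)

print(is_outlier(10.4, model))  # within 3 standard deviations
print(is_outlier(14.0, model))  # far outside the band
```

Checking a new value costs only one subtraction and one comparison, which is why this rule is easy to run in real time but hard to extend to several attributes at once.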
Now I'd like to add and test some other attributes for outlier detection, and the method above no longer works well. The requirement is that the algorithm can detect outliers in real time (or with a short delay). The model itself can be computed during the day.
As I read in Chandola's paper "Anomaly Detection: A Survey", classification algorithms such as a one-class SVM could be used, because the testing phase is fast once the model is trained. Clustering mechanisms may also be possible, because each new instance only has to be tested against a few clusters.
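To make the clustering idea concrete, here is a minimal sketch of what "testing a new instance against a few clusters" could look like. It assumes the clusters were already produced offline by some clustering step; the functions `build_model` and `is_cluster_outlier`, the `slack` factor, and the sample points are hypothetical and not taken from RapidMiner or from the survey:

```python
import math

def centroid(points):
    """Component-wise mean of a list of equally sized tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))

def build_model(clusters, slack=1.5):
    """Offline step: store each cluster's centroid and an allowed radius.

    The radius is the largest member-to-centroid distance times `slack`
    (an assumed tolerance factor, to be tuned on real data).
    """
    model = []
    for members in clusters:
        c = centroid(members)
        radius = max(math.dist(c, p) for p in members)
        model.append((c, radius * slack))
    return model

def is_cluster_outlier(point, model):
    """Online step: outlier if the point lies outside every cluster's radius."""
    return all(math.dist(point, c) > limit for c, limit in model)

# Two hypothetical clusters found during the day.
clusters = [
    [(0, 0), (1, 0), (0, 1), (1, 1)],
    [(10, 10), (11, 10), (10, 11), (11, 11)],
]
model = build_model(clusters)

print(is_cluster_outlier((0.8, 0.4), model))  # inside the first cluster
print(is_cluster_outlier((5, 5), model))      # between the clusters
```

The online check is one distance computation per cluster, so its cost grows with the number of clusters and not with the size of the training data. Attributes on different scales (for example coordinates and timestamps) would need to be normalized before distances are meaningful.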
I would be very thankful for some technical advice on which algorithms in RapidMiner could actually work for detecting outliers in real time, especially with spatio-temporal datasets.
Thanks for your help!
Greetings,
Chis
Answers
Let me know how it turns out,
rk
Did you try Cobweb clustering? It is sensitive to the ordering of the examples, and thus takes the time parameter into account.
Thomas.