W-NaiveBayesMultiNomial vs NaiveBayes
Does anyone already have an idea what are the main differences between the two NB classifier implementations?
There should be some, because I am obtaining totally different results using them.
Most of the time the Weka implementation yields much better results on my dataset, which has around 200 attributes and 100 samples.
Answers
Which RapidMiner version do you use?
If someone could tell me what the WEKA operator does, I could explain the differences. But it seems to me that it isn't doing just plain NaiveBayes...
Greetings,
Sebastian
I am using RM v4.4. I have already looked at the source code of RM NaiveBayes and I agree that it implements a pretty straightforward Naive Bayes. My feeling about Weka (without having read its source code) is in line with yours: it seems to implement some other things (at least more numerical tricks), which help quite a lot to make NB more robust.
I also added a few things to the original RM NB to make it more robust (and/or better suited to my dataset): a homogeneous priors assumption, a Poisson distribution assumption instead of a Gaussian one, and the calculation of log-likelihoods instead of likelihoods. These helped me get better prediction accuracy, but I am still not as good as the Weka implementation.
Does anyone have any idea what exactly Weka NaiveBayes is doing?
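For readers who want to see what these three changes amount to, here is a minimal Python sketch. It is not the modified RapidMiner code, which is not shown in this thread. The function names, the `smoothing` constant, and the reading of "homogeneous priors" as equal class priors are all assumptions made for illustration.

```python
import math

def fit_poisson_nb(X, y, smoothing=1e-3):
    """Estimate one Poisson rate per class and attribute (the mean count).

    `smoothing` is a hypothetical small constant that keeps every rate
    strictly positive so that log(rate) is defined.
    """
    rates = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n_attrs = len(rows[0])
        rates[label] = [
            sum(r[j] for r in rows) / len(rows) + smoothing
            for j in range(n_attrs)
        ]
    return rates

def log_poisson(k, lam):
    # log of the Poisson pmf: lam**k * exp(-lam) / k!
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def predict(rates, x):
    # Equal priors (assumed meaning of "homogeneous priors"): the log prior
    # is the same constant for every class, so it is left out of the score.
    scores = {
        label: sum(log_poisson(k, lam) for k, lam in zip(x, lams))
        for label, lams in rates.items()
    }
    return max(scores, key=scores.get)

# Toy count data, invented for this example.
X = [[5, 0], [6, 1], [0, 4], [1, 5]]
y = ["a", "a", "b", "b"]
rates = fit_poisson_nb(X, y)
```

Summing log-likelihoods instead of multiplying likelihoods is what keeps the score finite when there are a couple of hundred attributes, as in the dataset described here.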
Erk
Actually, there were some issues with the Naive Bayes implementation around the time we released RM 4.4, and we put some effort into stabilising NB numerically. As far as I remember, this was shortly after the release of 4.4. We also added the calculation of log-likelihoods then. If I remember correctly, Weka does not compute log-likelihoods, but rescales the probabilities during the multiplication of the conditional attribute value probabilities if the product becomes too small. Both of these ways to gain numerical stability should work and yield relatively similar results - which is what we observed in the numerous tests we ran on our implementation. To sum up, the current version of NB should be more stable than the 4.4 version and you might want to have a look at it.
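To illustrate the two stabilisation strategies mentioned here, the following Python sketch compares a plain product, a sum of logs, and a product that is rescaled whenever it gets too small. It is a toy written for this thread, not code from RapidMiner or Weka. The function names, the `floor` threshold, and the input layout (a dict mapping each class to its list of conditional attribute probabilities, all assumed strictly positive) are assumptions.

```python
import math

def naive_products(cond_probs):
    # Direct multiplication: underflows to 0.0 with many small factors.
    return {c: math.prod(ps) for c, ps in cond_probs.items()}

def posterior_from_logs(cond_probs):
    # Sum logs, shift by the largest score, then exponentiate and normalise.
    log_scores = {c: sum(math.log(p) for p in ps) for c, ps in cond_probs.items()}
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def posterior_rescaled(cond_probs, floor=1e-75):
    # Multiply attribute by attribute across all classes; whenever the
    # largest running product drops below `floor` (a hypothetical value),
    # divide every product by it. Ratios between classes are preserved.
    classes = list(cond_probs)
    n_attrs = len(cond_probs[classes[0]])
    products = {c: 1.0 for c in classes}
    for j in range(n_attrs):
        for c in classes:
            products[c] *= cond_probs[c][j]
        top = max(products.values())
        if 0.0 < top < floor:
            for c in classes:
                products[c] /= top
    total = sum(products.values())
    return {c: p / total for c, p in products.items()}

# 200 attributes with tiny conditional probabilities; class "b" is three
# times as likely as class "a" on the last attribute only.
cond_probs = {"a": [1e-5] * 200, "b": [1e-5] * 199 + [3e-5]}
```

On this input the plain product underflows to zero for both classes, so no decision is possible, while the log-based and rescaled versions both give class "b" a posterior of about 0.75. That matches the observation above that the two approaches yield similar results.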
Hope that helps,
kind regards,
Tobias
Thanks for the reply. Unfortunately, using RM 5.0 I still observe the same trend: Weka NB significantly outperforms the RM one. I hope a Weka-savvy user can shed some light on this issue.
Erk