Naive Bayes vs Naive Bayes (Kernel)
hi all,
My data set contains numerical values, which are configured as data type "real". I'm able to use both the Naive Bayes and the Naive Bayes (Kernel) operators, with slightly different performance. However, the RM documentation seems to say that only Naive Bayes (Kernel) should be used for numeric attributes.
Should I consider only the NB (Kernel) result, even though RapidMiner accepts the normal Naive Bayes operator too? Or
are both acceptable for numerical attributes?
regds
thiru
Best Answer
BalazsBarany (Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert)
Hi Thiru,
The difference between stock NB and NB (kernel) is the way numeric attributes are put into the model. You can easily see this by comparing the model output charts of the two operators.
Naive Bayes (which can be used with numeric attributes) simply assumes that the numerical inputs are normally distributed within each class, estimates the parameters of this normal distribution, and uses it for assigning likelihoods to classes. You see two (or more) Gaussian curves in the model, one per class.
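To make the Gaussian assumption concrete, here is a minimal sketch in plain Python for a single numeric attribute. It is not RapidMiner's implementation: the function names, the toy data, and the use of the sample standard deviation are all assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def fit_gaussian_nb(values, labels):
    """Estimate (prior, mean, standard deviation) for each class."""
    model = {}
    for c in set(labels):
        xs = [v for v, l in zip(values, labels) if l == c]
        model[c] = (len(xs) / len(values), mean(xs), stdev(xs))
    return model


def predict(model, x):
    """Return the class with the largest prior * Gaussian likelihood."""
    def score(c):
        prior, mu, sigma = model[c]
        return prior * gaussian_pdf(x, mu, sigma)
    return max(model, key=score)


# Hypothetical toy data: one numeric attribute, two classes.
values = [1.0, 1.2, 0.8, 1.1, 3.0, 3.2, 2.9, 3.1]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

model = fit_gaussian_nb(values, labels)
print(predict(model, 1.3), predict(model, 2.8))
```

Each class is summarized by just two numbers per attribute (mean and standard deviation), which is why the model chart shows one bell curve per class. With several attributes, the per-attribute likelihoods would be multiplied together.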
Naive Bayes (kernel) instead tries to fit a smoothed curve to the actual values. That is why the operator has some numeric parameters you can change. If your attribute values don't follow a normal distribution, the kernel version can fit them better, so the prediction will be better, at the cost of a longer calculation time and more complex models (even with the danger of overfitting in some conditions).
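The difference can be sketched with a simple Gaussian kernel density estimate. This is only an illustration under assumptions: the `bandwidth` parameter is a hypothetical stand-in for the kind of numeric parameter mentioned above, the data are made up, and RapidMiner's actual estimator may work differently.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def single_gaussian_density(x, samples):
    """Plain Naive Bayes view: one normal curve fitted to all samples."""
    return gaussian_pdf(x, mean(samples), stdev(samples))


def kernel_density(x, samples, bandwidth):
    """Kernel view: average of small Gaussian bumps, one per training value."""
    return sum(gaussian_pdf(x, s, bandwidth) for s in samples) / len(samples)


# Hypothetical values of one class: clearly bimodal, not normally distributed.
samples = [-0.2, 0.0, 0.1, 0.3, 9.8, 10.0, 10.1, 10.2]

for x in (0.0, 5.0, 10.0):
    print(x,
          round(single_gaussian_density(x, samples), 4),
          round(kernel_density(x, samples, bandwidth=0.5), 4))
```

On data like this, the single Gaussian places most of its mass around 5, where no values were observed, while the kernel estimate follows the two clusters. A very small bandwidth makes the curve hug individual training points, which is one way the overfitting risk shows up.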
If you find a good set of parameters for your use case and cross-validate correctly, both will give you results you can rely on. Depending on your use case, you might want to select the variant giving better results, or the simpler model.
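As a rough sketch of such a comparison, the following plain-Python example runs a 5-fold cross-validation of both variants on one numeric attribute. Everything here is an assumption for illustration: the toy data are deliberately multimodal, the fold assignment is a simple index-based split, and the bandwidth of 0.5 is an arbitrary choice.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def gaussian_likelihood(x, xs):
    """Plain variant: one normal curve per class."""
    return gaussian_pdf(x, mean(xs), stdev(xs))


def kernel_likelihood(x, xs, bandwidth=0.5):
    """Kernel variant: average of Gaussian bumps around the training values."""
    return sum(gaussian_pdf(x, s, bandwidth) for s in xs) / len(xs)


def predict(x, train, likelihood):
    """Return the class with the largest prior * likelihood."""
    by_class = {}
    for value, label in train:
        by_class.setdefault(label, []).append(value)

    def score(label):
        xs = by_class[label]
        return len(xs) / len(train) * likelihood(x, xs)

    return max(by_class, key=score)


def cross_val_accuracy(data, likelihood, k=5):
    """k-fold cross-validation; row i belongs to fold i % k."""
    correct = 0
    for fold in range(k):
        test = data[fold::k]
        train = [row for i, row in enumerate(data) if i % k != fold]
        correct += sum(predict(x, train, likelihood) == y for x, y in test)
    return correct / len(data)


# Hypothetical data: class "a" clusters near 0, 5 and 10; class "b" near 2.5 and 7.5.
offsets = [-0.2, -0.1, 0.0, 0.1, 0.2]
data = [(c + o, "a") for c in (0.0, 5.0, 10.0) for o in offsets]
data += [(c + o, "b") for c in (2.5, 7.5) for o in offsets]

gaussian_acc = cross_val_accuracy(data, gaussian_likelihood)
kernel_acc = cross_val_accuracy(data, kernel_likelihood)
print("Gaussian:", gaussian_acc, "Kernel:", kernel_acc)
```

On this constructed data the kernel variant should score higher, because neither class looks like a single bell curve. On attributes that are roughly normal within each class, the two would be expected to score similarly, and the simpler Gaussian model may then be the more attractive choice.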
Regards,
Balázs