kNN prediction score
Hello.
I am working with a k-NN model for regression, and I would like to evaluate the predictions my model makes on my test set.
I imagine you could evaluate how close a new point (with an unknown label) is to an existing point from the training set. If the new point lies exactly on top of a training point, the prediction score would be 1; otherwise it would fall progressively the farther the new point is from the other points.
Is this possible in RapidMiner?
Best Answer
Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
Hi @onep, you certainly can do something like this.
If you download the "anomaly detection" extension from the marketplace there is an operator called "k-NN global anomaly score". This will produce a value called "outlier" which is the distance to the k-NN you specify. So in your case, you would run your k-NN model with k=1, then run the k-NN global anomaly score also with k=1, and then you can transform the predicted score with the outlier value using whatever function you want (using "generate attributes").
One caution: I am not sure, conceptually, why the prediction would necessarily fall in magnitude the farther it is from its nearest neighbor. I guess it depends on the structure of your dataset and what you are modeling. Perhaps a more intuitive representation here is that as the k-NN distance grows, the confidence in the prediction accuracy falls, but that doesn't necessarily mean that the true value is lower than the predicted value.
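To make the idea concrete outside of RapidMiner, here is a minimal Python sketch of a distance-based confidence score. It is not the implementation of the "k-NN global anomaly score" operator; the function names, the exponential decay, and the `scale` parameter are assumptions chosen only for illustration. Any decreasing function of the distance would serve the same purpose.

```python
import math

def nearest_distance(point, training_points):
    """Euclidean distance from `point` to its closest training point (k=1)."""
    return min(math.dist(point, t) for t in training_points)

def confidence_from_distance(distance, scale=1.0):
    """Map a distance to (0, 1]: exactly 1 at distance 0, decaying as it grows.

    The exponential form and `scale` are illustrative assumptions.
    """
    return math.exp(-distance / scale)

# Hypothetical toy data: three training points and three test points.
training_points = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
test_points = [(1.0, 1.0), (1.0, 0.0), (10.0, 10.0)]

scores = [
    confidence_from_distance(nearest_distance(p, training_points))
    for p in test_points
]
```

The first test point coincides with a training point, so its score is 1; the scores then shrink as the test points move away from the training data. In practice the attributes would need to be normalised first, and `scale` tuned to the typical neighbour distance in the data.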
I hope this is helpful!
Regards,
Answers
Hi Mathias,
Maybe the weighted vote is something for you? It is not exactly what you want to do, but it weights the influence of every neighbour by its distance.
Another option would be to use an SVM with a radial kernel. Even though the math is different and more complex, it often turns out to be similar to a k-NN in terms of decision boundaries.
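As a rough illustration of distance weighting in k-NN regression, here is a small Python sketch. It assumes inverse-distance weights, which is one common choice; the exact weighting RapidMiner applies for its weighted vote may differ, and the function name and `eps` parameter are hypothetical.

```python
import math

def weighted_knn_predict(point, training_points, training_labels, k=3, eps=1e-12):
    """Predict a numeric label as the inverse-distance-weighted mean of the k nearest labels."""
    # Pair each training label with its distance to the query point, keep the k closest.
    neighbours = sorted(
        (math.dist(point, t), y) for t, y in zip(training_points, training_labels)
    )[:k]
    # Closer neighbours get larger weights; eps avoids division by zero on exact matches.
    weights = [1.0 / (d + eps) for d, _ in neighbours]
    return sum(w * y for w, (_, y) in zip(weights, neighbours)) / sum(weights)

# Hypothetical one-dimensional toy data.
training_points = [(0.0,), (1.0,), (3.0,)]
training_labels = [0.0, 10.0, 30.0]

on_top = weighted_knn_predict((1.0,), training_points, training_labels, k=3)
halfway = weighted_knn_predict((0.5,), training_points, training_labels, k=2)
```

A query that sits on a training point is dominated by that point's label, while a query halfway between two neighbours gets the plain average of their labels.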
~Martin
Dortmund, Germany
Thank you for your suggestion to look at k-NN global anomaly score - looks like what I was looking for!
I totally agree with you that it's the confidence we are looking at and not the prediction accuracy. Thanks for making that clear.