"Performance and normalization"

jiri · November 2009

I have a question regarding normalization of data and its impact on (prediction) performance.
I normalize the sample attribute values (Z-transformation, proportion or range) and then use SlidingWindow validation (x,1,1,1) with the OneR classifier.
Normalization improves performance, but I'm not sure it is correct: normalizing the whole dataset seems to project future values into the past.

Question: can I apply normalization to the whole sample (dataset) when using SlidingWindow validation?

ExampleSet
Normalization ? (Z-Transformation)
SlidingWindow (window,1,1)
OneR
ModelApplier
Performance

haddock · November 2009

Hi Jiri,

I think you're right: because the normalisation runs over all the examples, it is data-snooping, and it is quite a fiddle to get right (you have to repeatedly make, save and apply the normalisation model; a quick search on this forum will find some code). On the other hand, well done for spotting the danger. I take it you've already searched this forum on sliding window validation; I've warned elsewhere of the data-snooping risks it entails.
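
To make the "make/save/apply" idea concrete, here is a minimal sketch in plain Python. It is not RapidMiner code and not the code a forum search would turn up. The function names (fit_z, apply_z, sliding_windows, normalized_folds) and the window parameters are made up for illustration, and it assumes a single numeric attribute ordered by time. The point is only that the mean and standard deviation are estimated on each training window and then reused unchanged on the test window that follows it.

```python
from statistics import mean, pstdev


def fit_z(values):
    """Estimate Z-transformation parameters from the training window only."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        sigma = 1.0  # constant window: avoid division by zero
    return mu, sigma


def apply_z(values, params):
    """Apply previously fitted parameters; nothing is re-estimated here."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]


def sliding_windows(series, train_width, test_width=1, step=1):
    """Yield (train, test) slices; the test slice always follows the train slice."""
    start = 0
    while start + train_width + test_width <= len(series):
        split = start + train_width
        yield series[start:split], series[split:split + test_width]
        start += step


def normalized_folds(series, train_width, test_width=1, step=1):
    """Normalise each fold using statistics from its own training window."""
    folds = []
    for train, test in sliding_windows(series, train_width, test_width, step):
        params = fit_z(train)  # "make/save" the normalisation model
        folds.append((apply_z(train, params), apply_z(test, params)))  # "apply"
    return folds
```

With a series such as [1.0, 2.0, 3.0, 100.0] and a training width of 3, the jump to 100.0 shows up as a very large normalised test value, because it never influenced the mean and standard deviation. Normalising the whole series first would have shrunk it, which is exactly the leak from the future into the past.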
