Answers
2. Click on "CSVExampleSource" and change it so that it loads your own dataset.
3. My dataset uses the class attribute label "close", so yours should use the same name.
If not, change all occurrences of "close" to your own label.
4. Change W-REPTree to any regression learning algorithm that suits you.
#### example.xml ####
<operator name="Root" class="Process" expanded="yes">
<operator name="CSVExampleSource" class="CSVExampleSource">
<parameter key="filename" value="D:\wessel\Desktop\testBook1.csv"/>
</operator>
<operator name="MultivariateSeries2WindowExamples (2)" class="MultivariateSeries2WindowExamples">
<parameter key="window_size" value="2"/>
<parameter key="label_attribute" value="close"/>
<parameter key="add_incomplete_windows" value="true"/>
</operator>
<operator name="ChangeAttributeName" class="ChangeAttributeName">
<parameter key="old_name" value="label"/>
<parameter key="new_name" value="close"/>
</operator>
<operator name="FeatureNameFilter" class="FeatureNameFilter">
<parameter key="skip_features_with_name" value="close-0"/>
</operator>
<operator name="FixedSplitValidation" class="FixedSplitValidation" expanded="yes">
<parameter key="training_set_size" value="30"/>
<operator name="W-REPTree" class="W-REPTree">
</operator>
<operator name="OperatorChain" class="OperatorChain" expanded="yes">
<operator name="ModelApplier" class="ModelApplier">
<list key="application_parameters">
</list>
<parameter key="create_view" value="true"/>
</operator>
<operator name="RegressionPerformance" class="RegressionPerformance">
<parameter key="root_mean_squared_error" value="true"/>
<parameter key="absolute_error" value="true"/>
<parameter key="relative_error" value="true"/>
</operator>
</operator>
</operator>
</operator>
Can you explain where exactly you set the prediction item? And how can I control it, i.e. how do I predict one day, two days or three days ahead?
Why do the other attributes get names like open-0, open-1...? What does this mean?
I hope I'm not bothering you, as I'm new to RM.
BR
Please be very careful about the validation operator you use, with what you have you can easily be training on examples that occur after the examples you are testing! It makes more sense to train on the past, or am I missing something? I think I brought this up quite recently, http://rapid-i.com/rapidforum/index.php/topic,908.msg3395.html#msg3395 , and amazingly here as well http://rapid-i.com/rapidforum/index.php/topic,954.msg3593.html#msg3593 so I'm probably wasting my time bringing it up again.
But are we really training on future examples, and testing on past examples?
Let's say we have a simple dataset, made by hand.
It consists of 3 numerical attributes:
t: "hour of the day"
u: "barometer"
v: "wind speed"
And it only has 8 example instances.
## Dataset
t u v
--------------
t1 u1 v1
t2 u2 v2
t3 u3 v3
t4 u4 v4
t5 u5 v5
t6 u6 v6
t7 u7 v7
t8 u8 v8
## Forecasting v9
Let's say we wish to predict the wind speed at the next hour, so the value of v+1.
We already know the value of t+1, since time or "hour of the day" is a fully deterministic attribute.
t+1 = t+0 + 1; if(t+1 == 24) {t+1 = 0}
We do not know the value of u+1, since we cannot look ahead into the future and see what our barometer will read.
## Start simple
So we could start simple, throw away barometer for now, and see how well we can forecast v+1 using just t. Intuitively I would do a “keep order” 66% split to test the performance.
So train on:
t1 v1
t2 v2
t3 v3
t4 v4
t5 v5
And test on:
t6 .
t7 .
t8 .
But I guess training on a “random order” 66% split would work just as well.
So train on: (randomly selected 6,4,8,1,2)
t6 v6
t4 v4
t8 v8
t1 v1
t2 v2
And test on:
t7 .
t5 .
t3 .
Or am I making some thinking error here?
## A bit harder
Now when we make the problem a bit harder, by adding barometer information, I think this still holds.
Let’s say we take a window size of 2, so we are adding the information of the barometer readings from the previous hour, u-1. Then our dataset would look like this:
t-0 u-0 v-0 t-1 u-1 v-1
--------------------------------------------
t1 u1 v1 ? ? ?
t2 u2 v2 t1 u1 v1
t3 u3 v3 t2 u2 v2
t4 u4 v4 t3 u3 v3
t5 u5 v5 t4 u4 v4
t6 u6 v6 t5 u5 v5
t7 u7 v7 t6 u6 v6
t8 u8 v8 t7 u7 v7
## Throw away
I don’t think adding a training example with ? adds any useful information. So I think it’s best to just throw it away. But maybe I’m wrong here.
Secondly we can’t use training examples of the form:
t-0 u-0 v-0 t-1 u-1 v-1
Because u-0 is future information.
(t-0 isn’t because t is fully deterministic)
So we have to throw u-0 away.
But since t-0 and t-1 are 100% correlated, we might as well throw t-0 away too, without losing any information. So then we end up with a new dataset:
v-0 t-1 u-1 v-1
---------------------------
v2 t1 u1 v1
v3 t2 u2 v2
v4 t3 u3 v3
v5 t4 u4 v4
v6 t5 u5 v5
v7 t6 u6 v6
v8 t7 u7 v7
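The transformation above can be sketched in a few lines of Python. This is only an illustration of the idea, not what MultivariateSeries2WindowExamples does internally; the function name `window_examples` and the string placeholders are made up for the example.

```python
def window_examples(rows, lag=1):
    """Pair the current v (the label) with t, u, v from `lag` steps earlier.

    The first `lag` rows are dropped because they have no history,
    which corresponds to throwing away the row full of '?'.
    """
    examples = []
    for i in range(lag, len(rows)):
        t_prev, u_prev, v_prev = rows[i - lag]
        v_now = rows[i][2]  # only v-0 is kept from the current row
        examples.append((v_now, (t_prev, u_prev, v_prev)))
    return examples


# The hand-made dataset: 8 rows of (t, u, v) placeholders.
rows = [(f"t{i}", f"u{i}", f"v{i}") for i in range(1, 9)]
examples = window_examples(rows)

print("v-0 t-1 u-1 v-1")
for label, features in examples:
    print(label, *features)
```

Eight rows give seven usable examples, because the first row has no predecessor.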
Now does it matter in what order you feed your training examples to your learning algorithm? Well that’s a hard question actually. I think it does, depending on the learning algorithm.
And I guess you could test this quite easily.
Evaluate the performance of your learning algorithm 3 times, using cross validation, random order percentage split, and fixed order percentage split. And see if there is a significant difference.
While this doesn't matter if the domain, like the study of the humble Iris, exhibits constant properties, it is less clear that it is in any way appropriate with closing prices. Surely guessing 10 days hence is easier if you have information about day 9?
Anyways, here's the demo.
Dataset:
v-0 t-1 u-1 v-1
---------------------------
v2 t1 u1 v1
v3 t2 u2 v2
v4 t3 u3 v3
v5 t4 u4 v4
v6 t5 u5 v5
v7 t6 u6 v6
v8 t7 u7 v7
Good:
Sampling_type == "linear"
v2 t1 u1 v1 TRAIN
v3 t2 u2 v2
v4 t3 u3 v3
v5 t4 u4 v4
v6 t5 u5 v5
v7 t6 u6 v6 TEST
v8 t7 u7 v7
Bad:
Sampling_type == "shuffled"
v2 t1 u1 v1 TRAIN
v8 t7 u7 v7
v5 t4 u4 v4
v4 t3 u3 v3
v6 t5 u5 v5
v7 t6 u6 v6 TEST
v3 t2 u2 v2
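To make the difference concrete, here is a small Python check, written as a sketch. Each example is represented only by its time index (1 for the row "v2 t1 u1 v1", 7 for the row "v8 t7 u7 v7"), and the helper name `trains_on_future` is invented for this illustration.

```python
def trains_on_future(train_idx, test_idx):
    """True if some training example is later in time than some test example."""
    return max(train_idx) > min(test_idx)


# Linear sampling: the first five rows train, the last two test.
linear_train, linear_test = [1, 2, 3, 4, 5], [6, 7]

# Shuffled sampling, using the exact order shown in the demo above.
shuffled_train, shuffled_test = [1, 7, 4, 3, 5], [6, 2]

print("linear trains on future:  ", trains_on_future(linear_train, linear_test))
print("shuffled trains on future:", trains_on_future(shuffled_train, shuffled_test))
```

In the shuffled case the model sees the example from time 7 before being tested on times 2 and 6, which is the concern raised about the validation operator.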
I don't think this is either GOOD or BAD.
I'm not sure for which learners the order of training examples matters.
I guess it doesn't matter for trees, nearest neighbour, linear regression, or Bayesian networks, because they take an entire example set as input.
I guess it does matter for neural networks and updatable Bayesian networks, because they iterate over the example set.
Of course it ALWAYS matters when you're trying to learn a non-static target function.
Let's say we're trying to predict the temperature for the next day.
Since the earth is getting hotter and hotter, it's no good learning from data from 1000 years ago.
But then you just have to throw away the old data, I guess.
Or putting it more succinctly, if you want credible results in economic time-series forecasting use sliding window validations.
If it generates testing examples the same way as "MultivariateSeries2WindowExamples", then it should be the same?
The same as:
Good:
Sampling_type == "linear"
v2 t1 u1 v1 TRAIN
v3 t2 u2 v2
v4 t3 u3 v3
v5 t4 u4 v4
v6 t5 u5 v5
v7 t6 u6 v6 TEST
v8 t7 u7 v7
But let's say I have 1200 hours of data.
|--------------------------------------------------------------------------------------------------------------|
And I take a
training window width of 120
training window step size -1
test window width of 120
horizon 1
would it then do this?
|--------------------------------------------------------------------------------------------------------------|
||----------||----------||----------||----------||----------||----------||----------||----------||----------||----------||
And how does it split the inner windows into train and test?
|-------...|
train, test
As you have it set up
1. the window first lands when it has filled its training window, so last example =100,
2. Model gets built on training window,
3. Model is run on the test window, in this case examples 101- 200.
4. Then it advances the training window by the step size, which in your case means 100 because -1 makes the step size the same as the training window size, so the training window is now 101-200, and the test window 201-300,
5. Then it goes to step 2 and repeats until there are not enough examples to fill the test set.
If you run the following you'll see the sequence.. In the diagram above |----------||----------| represents a training set and a test set |---Train---||---Test---|
In the case of a horizon of 8 there would be 7 examples not used between the train and test set so like this..
|---Train---|horizon|---Test---|
So the window stops sliding when there are less than training window size + horizon -1 examples left after the last training example.
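The numbered steps above can be written down as a short Python sketch. It follows my reading of those steps and is not taken from the SlidingWindowValidation source; the function name `sliding_windows` and its arguments are assumptions for illustration. Indices are 1-based and inclusive, and the loop stops when the test window can no longer be filled (step 5).

```python
def sliding_windows(n_examples, train_width, test_width, horizon=1, step=-1):
    """Yield ((train_start, train_end), (test_start, test_end)) index pairs."""
    if step == -1:
        step = train_width  # -1 means: advance by the training window width
    start = 1
    while True:
        train_end = start + train_width - 1
        test_start = train_end + horizon  # horizon - 1 examples are skipped
        test_end = test_start + test_width - 1
        if test_end > n_examples:
            break  # not enough examples left to fill the test window
        yield (start, train_end), (test_start, test_end)
        start += step


windows = list(sliding_windows(1200, train_width=100, test_width=100))
for train, test in windows[:3]:
    print("train", train, "test", test)

# With a horizon of 8, seven examples sit unused between train and test.
first_h8 = next(sliding_windows(1200, 100, 100, horizon=8))
print("horizon 8:", first_h8)
```

With 1200 examples and windows of 100 this gives train 1-100 / test 101-200, then train 101-200 / test 201-300, and so on.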
Hope that helps,
Good weekend!
- why isn't the correlation of Zero-R, 0?
- why isn't the relative_error of Zero-R, 100%?
2. How do I output my found models in the final result?
- so when I hit play, without setting break points
3. Is this finally a fair time-series comparison?
Weka Zero-R
PerformanceVector
PerformanceVector: absolute_error: 155.000 +/- 0.000 (mikro: 155.000 +/- 6.922)
relative_error: 19.53% +/- 8.29% (mikro: 19.53% +/- 8.32%)
normalized_absolute_error: 25.833 +/- 0.000 (mikro: 25.833)
correlation: 1.000
prediction_average: 922.500 +/- 325.552 (mikro: 922.500 +/- 325.625)
spearman_rho: 0.000 +/- 0.000 (mikro: 0.000)
kendall_tau: 0.000 +/- 0.000 (mikro: 0.000)
Linear Regression
PerformanceVector
PerformanceVector: absolute_error: 0.000 +/- 0.000 (mikro: 0.000 +/- 0.000)
relative_error: 0.00% +/- 0.00% (mikro: 0.00% +/- 0.00%)
normalized_absolute_error: 0.000 +/- 0.000 (mikro: 0.000)
correlation: 1.000 +/- 0.000 (mikro: 1.000)
prediction_average: 922.500 +/- 325.552 (mikro: 922.500 +/- 325.625)
spearman_rho: 1.000 +/- 0.000 (mikro: 47.000)
kendall_tau: 1.000 +/- 0.000 (mikro: 47.000)
<?xml version="1.0" encoding="windows-1252"?>
<process version="4.4">
<operator name="Root" class="Process" expanded="yes">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="2001"/>
<parameter key="encoding" value="SYSTEM"/>
<operator name="1500 examples" class="ExampleSetGenerator">
<parameter key="target_function" value="random"/>
<parameter key="number_examples" value="1500"/>
<parameter key="number_of_attributes" value="1"/>
<parameter key="attributes_lower_bound" value="-10.0"/>
<parameter key="attributes_upper_bound" value="10.0"/>
<parameter key="local_random_seed" value="-1"/>
<parameter key="datamanagement" value="double_array"/>
</operator>
<operator name="IdTagging" class="IdTagging">
<parameter key="create_nominal_ids" value="false"/>
</operator>
<operator name="make ID regular" class="ChangeAttributeRole">
<parameter key="name" value="id"/>
<parameter key="target_role" value="regular"/>
</operator>
<operator name="rename id to wind" class="ChangeAttributeName">
<parameter key="old_name" value="id"/>
<parameter key="new_name" value="wind"/>
</operator>
<operator name="wind only" class="FeatureNameFilter">
<parameter key="filter_special_features" value="false"/>
<parameter key="skip_features_with_name" value=".*"/>
<parameter key="except_features_with_name" value="wind"/>
</operator>
<operator name="MultivariateSeries2WindowExamples" class="MultivariateSeries2WindowExamples">
<parameter key="series_representation" value="encode_series_by_examples"/>
<parameter key="horizon" value="0"/>
<parameter key="window_size" value="96"/>
<parameter key="step_size" value="1"/>
<parameter key="create_single_attributes" value="true"/>
<parameter key="add_incomplete_windows" value="false"/>
</operator>
<operator name="remove horizon attributes" class="FeatureNameFilter">
<parameter key="filter_special_features" value="false"/>
<parameter key="skip_features_with_name" value="wind-([1-9]|1[0-9]|2[0-3])"/>
</operator>
<operator name="set label: wind-0" class="ChangeAttributeRole">
<parameter key="name" value="wind-0"/>
<parameter key="target_role" value="label"/>
</operator>
<operator name="IOMultiplier" class="IOMultiplier">
<parameter key="number_of_copies" value="1"/>
<parameter key="io_object" value="ExampleSet"/>
<parameter key="multiply_type" value="multiply_one"/>
<parameter key="multiply_which" value="1"/>
</operator>
<operator name="SlidingWindowValidation ZR" class="SlidingWindowValidation" expanded="yes">
<parameter key="keep_example_set" value="false"/>
<parameter key="create_complete_model" value="false"/>
<parameter key="training_window_width" value="240"/>
<parameter key="training_window_step_size" value="-1"/>
<parameter key="test_window_width" value="24"/>
<parameter key="horizon" value="24"/>
<parameter key="cumulative_training" value="false"/>
<parameter key="average_performances_only" value="true"/>
<operator name="W-ZeroR" class="W-ZeroR">
<parameter key="keep_example_set" value="false"/>
<parameter key="D" value="false"/>
</operator>
<operator name="OperatorChain ZR" class="OperatorChain" expanded="yes">
<operator name="ModelApplier ZR" class="ModelApplier">
<parameter key="keep_model" value="true"/>
<list key="application_parameters">
</list>
<parameter key="create_view" value="false"/>
</operator>
<operator name="RegressionPerformance ZR" class="RegressionPerformance">
<parameter key="keep_example_set" value="true"/>
<parameter key="main_criterion" value="relative_error"/>
<parameter key="root_mean_squared_error" value="false"/>
<parameter key="absolute_error" value="true"/>
<parameter key="relative_error" value="true"/>
<parameter key="relative_error_lenient" value="false"/>
<parameter key="relative_error_strict" value="false"/>
<parameter key="normalized_absolute_error" value="true"/>
<parameter key="root_relative_squared_error" value="false"/>
<parameter key="squared_error" value="false"/>
<parameter key="correlation" value="true"/>
<parameter key="squared_correlation" value="false"/>
<parameter key="prediction_average" value="true"/>
<parameter key="spearman_rho" value="true"/>
<parameter key="kendall_tau" value="false"/>
<parameter key="skip_undefined_labels" value="true"/>
<parameter key="use_example_weights" value="true"/>
</operator>
</operator>
</operator>
<operator name="SlidingWindowValidation" class="SlidingWindowValidation" expanded="yes">
<parameter key="keep_example_set" value="false"/>
<parameter key="create_complete_model" value="false"/>
<parameter key="training_window_width" value="240"/>
<parameter key="training_window_step_size" value="-1"/>
<parameter key="test_window_width" value="24"/>
<parameter key="horizon" value="24"/>
<parameter key="cumulative_training" value="false"/>
<parameter key="average_performances_only" value="true"/>
<operator name="LinearRegression" class="LinearRegression">
<parameter key="keep_example_set" value="false"/>
<parameter key="feature_selection" value="M5 prime"/>
<parameter key="eliminate_colinear_features" value="true"/>
<parameter key="use_bias" value="true"/>
<parameter key="min_standardized_coefficient" value="1.5"/>
<parameter key="ridge" value="1.0E-8"/>
</operator>
<operator name="OperatorChain" class="OperatorChain" expanded="yes">
<operator name="ModelApplier" class="ModelApplier">
<parameter key="keep_model" value="true"/>
<list key="application_parameters">
</list>
<parameter key="create_view" value="false"/>
</operator>
<operator name="RegressionPerformance" class="RegressionPerformance">
<parameter key="keep_example_set" value="true"/>
<parameter key="main_criterion" value="relative_error"/>
<parameter key="root_mean_squared_error" value="false"/>
<parameter key="absolute_error" value="true"/>
<parameter key="relative_error" value="true"/>
<parameter key="relative_error_lenient" value="false"/>
<parameter key="relative_error_strict" value="false"/>
<parameter key="normalized_absolute_error" value="true"/>
<parameter key="root_relative_squared_error" value="false"/>
<parameter key="squared_error" value="false"/>
<parameter key="correlation" value="true"/>
<parameter key="squared_correlation" value="false"/>
<parameter key="prediction_average" value="true"/>
<parameter key="spearman_rho" value="true"/>
<parameter key="kendall_tau" value="false"/>
<parameter key="skip_undefined_labels" value="true"/>
<parameter key="use_example_weights" value="true"/>
</operator>
</operator>
</operator>
</operator>
</process>
highest ID is 150, which is the number of samples in the datafile.
for the 2nd breakpoint I get
So the highest trainer is 150, but the lowest tester is somewhere around 1...5 or so.
I
ZeroR models predict that each label in the test set will have the average label value found in the training set. By using ZeroR in a time-series you create a moving average predictor, lagged by the distance between the mid-point of the training set and the mid-point of the test set, put more prosaically as...
Training window/2 + Horizon-1 + Test window/2
So in your case
Lag=240/2 + 24-1 +24/2 = 120+23+12 = 155
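As a sanity check, the lag can be reproduced with a few lines of Python. This is a simplified sketch, not the RapidMiner process: it assumes the labels are simply 1, 2, 3, ..., that ZeroR predicts the mean label of the training window, and that the window slides by the training window width. The function name `zero_r_mean_abs_error` is made up for the example.

```python
def zero_r_mean_abs_error(n_examples, train_width, test_width, horizon):
    """Mean absolute error of a mean-of-training-window predictor on labels 1..n."""
    errors = []
    start = 1
    while True:
        train_end = start + train_width - 1
        test_start = train_end + horizon
        test_end = test_start + test_width - 1
        if test_end > n_examples:
            break
        # ZeroR: predict the average label seen in the training window.
        prediction = sum(range(start, train_end + 1)) / train_width
        errors.extend(abs(label - prediction)
                      for label in range(test_start, test_end + 1))
        start += train_width  # step size -1 means step by the training width
    return sum(errors) / len(errors)


mae = zero_r_mean_abs_error(1500, train_width=240, test_width=24, horizon=24)
print(mae)
```

Under these assumptions the mean absolute error comes out at 155, the same value as the absolute_error reported for ZeroR.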
As each label has a value one more than the one before, we would expect predictions to be, on average, less than the actual values by an amount equal to the lag, and that is what the absolute error shows. Moreover, changes in prediction plot linearly against changes in actual, giving a correlation slope of 1; for these changes to give a correlation of 0.0 you'd need plots like the bottom row here:
http://en.wikipedia.org/wiki/Correlation
Only two conditions produce a relative error of 100%: the first is when the prediction is 0.00, the second is when the prediction is twice the actual value. In your setup neither scenario is possible.
See also http://en.wikipedia.org/wiki/Relative_error
To use the models later within the same process, use IOStore/IORetrieve; to use them in another process, use ModelWriter/ModelLoader.
See above. In this context, how is 'fair' defined?
My example is supposed to be a good one for time series forecasting.
You are using the iris dataset, which is a classification problem, not a forecasting problem.
Using ID tagging and an Example Set Generator, I generate a univariate time series with target function v-0 = v-1 + 1.
Or maybe you would more correctly write it like this: target function v_t=0 = v_t=-1 + 1
The goal of my set-up is to compare the performance of two different methods for time series forecasting.
It currently shows the results for the 2 approaches in 2 different tabs.
I would like this information to be in 1 single tab.
absolute_error
relative_error
normalized_absolute_error
correlation
spearman_rho
kendall_tau
Also it would be nice if it said the number of training / testing examples used.
And the output of the model in text.
And if possible, also if the difference in performance is significant.