Tomorrow and the day after tomorrow..
Hi there!
I ran a validation and test job for stock price forecasting, as described below.
Could you tell me whether there is anything wrong in my understanding?
(1) Data: I have a 780 x 5 data set (as shown below).
-------------------------------------------------------------------------------
Date         ND_C    DJ_C   KSP_O     PL
2006-03-30   0.13   -0.58    0.09   2.80
2006-03-31  -0.04   -0.37    1.85   1.55
2006-04-03  -0.13    0.32    1.04   1.05
2006-04-04   0.37    0.53    0.67   1.05
...
2009-06-01   0.06    3.02    2.57  -3.35
-------------------------------------------------------------------------------
(2) Validation: I trained my PolynomialRegression model with SlidingWindowValidation
and wrote the model to a file.
Here's the XML for validation:
<operator name="Root" class="Process" expanded="yes">
  <operator name="ExcelExampleSource" class="ExcelExampleSource">
    <parameter key="excel_file" value="C:\NDDJ_3cls.xls"/>
    <parameter key="sheet_number" value="2"/>
    <parameter key="first_row_as_names" value="true"/>
    <parameter key="create_label" value="true"/>
    <parameter key="label_column" value="5"/>
    <parameter key="create_id" value="true"/>
  </operator>
  <operator name="ExampleVisualizer" class="ExampleVisualizer">
  </operator>
  <operator name="SlidingWindowValidation" class="SlidingWindowValidation" expanded="yes">
    <parameter key="training_window_width" value="75"/>
    <parameter key="training_window_step_size" value="1"/>
    <parameter key="test_window_width" value="1"/>
    <operator name="OperatorChain (2)" class="OperatorChain" expanded="yes">
      <operator name="PolynomialRegression" class="PolynomialRegression">
      </operator>
      <operator name="ModelWriter" class="ModelWriter">
        <parameter key="model_file" value="C:\DJ_NN_SW.mod"/>
      </operator>
    </operator>
    <operator name="OperatorChain" class="OperatorChain" expanded="yes">
      <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
      </operator>
      <operator name="Performance" class="Performance">
      </operator>
    </operator>
  </operator>
</operator>
(3) Test: I loaded that model and applied it to the SAME data set that was used in (2).
Here's the XML for the test:
<operator name="Root" class="Process" expanded="yes">
  <operator name="ExcelExampleSource" class="ExcelExampleSource">
    <parameter key="excel_file" value="C:\NDDJ_3cls.xls"/>
    <parameter key="sheet_number" value="3"/>
    <parameter key="first_row_as_names" value="true"/>
    <parameter key="label_column" value="4"/>
    <parameter key="create_id" value="true"/>
  </operator>
  <operator name="ModelLoader" class="ModelLoader">
    <parameter key="model_file" value="C:\DJ_NN_SW.mod"/>
  </operator>
  <operator name="ModelApplier" class="ModelApplier">
    <list key="application_parameters">
    </list>
  </operator>
</operator>
(4) SlidingWindow parameters:
1.Training_Window_Width : 75
2.Training_Window_Step_size : 1
3.Test_window_width : 1
4.Horizon : 1
Did I use "the day after tomorrow's data" to predict "tomorrow's price" in the test process, even after the 75th data point?
Answers
Hope that doesn't make things more confusing...
First, the ModelWriter will be executed for each iteration of the SlidingWindowValidation and since you are using a constant file name, your model will be overwritten again and again. Your second process will read only the result of the last iteration. To avoid this behaviour, you can use %{a} in the filename to append the iteration number to the filename. In that case, you will end up with several models, so you have to modify your second process.
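To make the overwriting effect concrete, here is a minimal Python sketch (not RapidMiner code) that enumerates the sliding-window iterations for the settings in the question and mimics writing a model under a constant versus a per-iteration file name. The function `sliding_windows`, the dictionaries standing in for the disk, and the exact window placement (the test window starts `horizon` rows after the last training row) are assumptions made for illustration; the operator's actual indexing may differ in detail.

```python
def sliding_windows(n_rows, train_width, step, test_width, horizon):
    """Yield (train_range, test_range) pairs as half-open index ranges.

    Assumption: the test window begins `horizon` rows after the last
    training row, so horizon=1 means "the very next row".
    """
    start = 0
    while True:
        train_end = start + train_width        # exclusive
        test_start = train_end + horizon - 1
        test_end = test_start + test_width     # exclusive
        if test_end > n_rows:
            break
        yield range(start, train_end), range(test_start, test_end)
        start += step


# Settings from the question: 780 rows, width 75, step 1, test width 1, horizon 1.
windows = list(sliding_windows(780, 75, 1, 1, 1))

constant_name = {}   # stands in for the disk when the file name never changes
numbered_name = {}   # stands in for the disk when the name carries the iteration

for iteration, (train, test) in enumerate(windows, start=1):
    model = f"model trained on rows {train.start}..{train.stop - 1}"
    constant_name["DJ_NN_SW.mod"] = model                # overwritten every time
    numbered_name[f"DJ_NN_SW_{iteration}.mod"] = model   # one file per iteration

print(len(windows))                    # number of iterations
print(len(constant_name))              # files left with a constant name
print(constant_name["DJ_NN_SW.mod"])   # only the last training window survives
print(len(numbered_name))              # files left with numbered names
```

Under these assumptions the sketch runs 705 iterations. With the constant name, the single remaining file holds the model from the final window (zero-based rows 704 to 778), which is the one the second process would load.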
Apart from that, you are not training on time series because your data contains one entry for each point in time. To transform this series into windows, you can, e.g., use the MultivariateSeries2WindowExamples.
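The windowing idea can be illustrated outside RapidMiner with a small Python sketch: each new example is built from `window_size` consecutive rows, and its label is taken `horizon` rows after the window ends. The function `make_windows`, its parameter names, and the choice of PL as the label are hypothetical; this is not the implementation of MultivariateSeries2WindowExamples, only the general technique. The rows are the first four from the question's table.

```python
def make_windows(rows, window_size, horizon, label_index):
    """Turn a multivariate series into (features, label) examples.

    features: all values of `window_size` consecutive rows, flattened.
    label:    the `label_index` column of the row `horizon` steps after
              the window's last row, so no future value leaks into features.
    """
    examples = []
    last_start = len(rows) - window_size - horizon
    for start in range(last_start + 1):
        window = rows[start:start + window_size]
        features = [value for row in window for value in row]
        target_row = rows[start + window_size + horizon - 1]
        examples.append((features, target_row[label_index]))
    return examples


# First four rows of the question's data; columns: ND_C, DJ_C, KSP_O, PL
series = [
    [0.13, -0.58, 0.09, 2.80],    # 2006-03-30
    [-0.04, -0.37, 1.85, 1.55],   # 2006-03-31
    [-0.13, 0.32, 1.04, 1.05],    # 2006-04-03
    [0.37, 0.53, 0.67, 1.05],     # 2006-04-04
]

# Two-day windows predicting the next day's PL (column index 3, assumed label).
for features, label in make_windows(series, window_size=2, horizon=1, label_index=3):
    print(features, "->", label)
```

Each printed example contains only rows that precede its label, which is the property the question about "tomorrow" versus "the day after tomorrow" is concerned with.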
Best,
Simon