Multivariable time series forecasting
Hello all,
I am working on a time series project in RapidMiner and am trying to forecast future values using multiple independent variables.
However, I could not add the independent variables to the process. When the independent variables are windowed, they all have the same value.
I tried "MultivariateSeries2WindowExamples" and many other operators, but could not get it to work.
Can somebody help me in this regard? My process and data are attached below.
Another problem I encountered is with the Optimize Parameters operator: I tried to use it to optimize the horizon, window size, and step size, but despite a long wait, RapidMiner did not respond.
I'd be very grateful if someone could provide any information.
Thanks in advance.
<?xml version="1.0" encoding="UTF-8"?><process version="8.2.000">
<context>
<input/>
<output/>
<macros>
<macro>
<key>futureMonths</key>
<value>15</value>
</macro>
<macro>
<key>horizon</key>
<value>1</value>
</macro>
<macro>
<key>windowSize</key>
<value>6</value>
</macro>
</macros>
</context>
<operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="read_excel" compatibility="8.1.000" expanded="true" height="68" name="Read Excel" width="90" x="45" y="85">
<parameter key="excel_file" value="C:\Users\sony\Desktop\Data.xlsx"/>
<parameter key="imported_cell_range" value="A1:G11"/>
<parameter key="first_row_as_names" value="false"/>
<list key="annotations">
<parameter key="0" value="Name"/>
</list>
<parameter key="date_format" value="yyyy"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="Date.true.date.attribute"/>
<parameter key="1" value="Loaded.true.integer.attribute"/>
<parameter key="2" value="Unloaded.true.integer.attribute"/>
<parameter key="3" value="GDP.true.integer.attribute"/>
<parameter key="4" value="IPI.true.integer.attribute"/>
<parameter key="5" value="POP.true.integer.attribute"/>
<parameter key="6" value="Energy.true.integer.attribute"/>
</list>
</operator>
<operator activated="true" class="subprocess" compatibility="8.2.000" expanded="true" height="82" name="Set Predictions_Params" width="90" x="179" y="85">
<process expanded="true">
<operator activated="true" class="set_macro" compatibility="8.2.000" expanded="true" height="82" name="Set Window_Size" width="90" x="45" y="34">
<parameter key="macro" value="WindowSize"/>
<parameter key="value" value="6"/>
</operator>
<operator activated="true" class="set_macro" compatibility="8.2.000" expanded="true" height="82" name="Set Horizon" width="90" x="179" y="34">
<parameter key="macro" value="horizon"/>
<parameter key="value" value="1"/>
</operator>
<operator activated="true" class="set_macro" compatibility="8.2.000" expanded="true" height="82" name="Set Future_Years" width="90" x="313" y="34">
<parameter key="macro" value="futureYears"/>
<parameter key="value" value="4"/>
</operator>
<connect from_port="in 1" to_op="Set Window_Size" to_port="through 1"/>
<connect from_op="Set Window_Size" from_port="through 1" to_op="Set Horizon" to_port="through 1"/>
<connect from_op="Set Horizon" from_port="through 1" to_op="Set Future_Years" to_port="through 1"/>
<connect from_op="Set Future_Years" from_port="through 1" to_port="out 1"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="set_role" compatibility="5.3.013" expanded="true" height="82" name="Set Role" width="90" x="112" y="289">
<parameter key="attribute_name" value="Date"/>
<parameter key="target_role" value="id"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="8.2.000" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="289">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Loaded"/>
<parameter key="attributes" value="Loaded"/>
</operator>
<operator activated="true" class="filter_examples" compatibility="6.4.000" expanded="true" height="103" name="Filter Examples" width="90" x="447" y="85">
<parameter key="condition_class" value="no_missing_attributes"/>
<list key="filters_list"/>
</operator>
<operator activated="true" breakpoints="after" class="series:windowing" compatibility="5.2.000" expanded="true" height="82" name="Windowing for Training" width="90" x="648" y="238">
<parameter key="window_size" value="%{WindowSize}"/>
<parameter key="create_label" value="true"/>
<parameter key="label_attribute" value="Loaded"/>
<parameter key="horizon" value="%{horizon}"/>
</operator>
<operator activated="true" class="series:sliding_window_validation" compatibility="7.4.000" expanded="true" height="124" name="Validation" width="90" x="782" y="34">
<parameter key="training_window_width" value="1"/>
<parameter key="test_window_width" value="1"/>
<parameter key="horizon" value="%{horizon}"/>
<process expanded="true">
<operator activated="true" class="linear_regression" compatibility="8.2.000" expanded="true" height="103" name="Linear Regression" width="90" x="112" y="85"/>
<connect from_port="training" to_op="Linear Regression" to_port="training set"/>
<connect from_op="Linear Regression" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="8.2.000" expanded="true" height="82" name="Apply Model (2)" width="90" x="112" y="34">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance_regression" compatibility="8.2.000" expanded="true" height="82" name="Performance" width="90" x="380" y="34">
<parameter key="relative_error" value="true"/>
<parameter key="squared_correlation" value="true"/>
</operator>
<connect from_port="model" to_op="Apply Model (2)" to_port="model"/>
<connect from_port="test set" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<operator activated="true" breakpoints="after" class="series:windowing" compatibility="5.2.000" expanded="true" height="82" name="Windowing for Application" width="90" x="849" y="493">
<parameter key="window_size" value="%{WindowSize}"/>
<parameter key="label_attribute" value="inputYt"/>
</operator>
<operator activated="true" class="extract_macro" compatibility="8.2.000" expanded="true" height="68" name="Extract Example Count" width="90" x="983" y="493">
<parameter key="macro" value="exampleCount"/>
<list key="additional_macros"/>
</operator>
<operator activated="true" class="filter_example_range" compatibility="8.2.000" expanded="true" height="82" name="Filter Example Range" width="90" x="1117" y="493">
<parameter key="first_example" value="%{exampleCount}"/>
<parameter key="last_example" value="%{exampleCount}"/>
</operator>
<operator activated="true" class="remember" compatibility="8.2.000" expanded="true" height="68" name="Remember" width="90" x="1251" y="493">
<parameter key="name" value="data"/>
</operator>
<operator activated="true" class="loop" compatibility="8.2.000" expanded="true" height="82" name="Loop" width="90" x="983" y="238">
<parameter key="iterations" value="%{futureYears}"/>
<process expanded="true">
<operator activated="true" class="recall" compatibility="8.2.000" expanded="true" height="68" name="Recall" width="90" x="45" y="136">
<parameter key="name" value="data"/>
</operator>
<operator activated="true" class="apply_model" compatibility="7.1.001" expanded="true" height="82" name="Apply Model" width="90" x="179" y="34">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="multiply" compatibility="8.2.000" expanded="true" height="103" name="Multiply" width="90" x="447" y="30"/>
<operator activated="true" class="materialize_data" compatibility="8.2.000" expanded="true" height="82" name="Materialize Data (2)" width="90" x="179" y="187"/>
<operator activated="true" class="generate_attributes" compatibility="6.4.000" expanded="true" height="82" name="Increase Date (2)" width="90" x="380" y="187">
<list key="function_descriptions">
<parameter key="Date" value="date_add(Date, 1, DATE_UNIT_YEAR)"/>
</list>
</operator>
<operator activated="true" class="set_role" compatibility="5.3.013" expanded="true" height="82" name="Set Role (2)" width="90" x="179" y="340">
<parameter key="attribute_name" value="prediction(label)"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="8.2.000" expanded="true" height="82" name="Select Attributes (2)" width="90" x="313" y="340">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Loaded-5"/>
<parameter key="attributes" value="loaded-4|IPI-4"/>
<parameter key="invert_selection" value="true"/>
</operator>
<operator activated="true" class="rename" compatibility="8.2.000" expanded="true" height="82" name="Rename" width="90" x="447" y="340">
<parameter key="old_name" value="Loaded-4"/>
<parameter key="new_name" value="Loaded-5"/>
<list key="rename_additional_attributes">
<parameter key="Loaded-3" value="Loaded-4"/>
<parameter key="Loaded-2" value="Loaded-3"/>
<parameter key="Loaded-1" value="Loaded-2"/>
<parameter key="Loaded-0" value="Loaded-1"/>
<parameter key="prediction(label)" value="Loaded-0"/>
</list>
</operator>
<operator activated="true" class="remember" compatibility="8.2.000" expanded="true" height="68" name="Remember (2)" width="90" x="581" y="340">
<parameter key="name" value="data"/>
</operator>
<connect from_port="input 1" to_op="Apply Model" to_port="model"/>
<connect from_op="Recall" from_port="result" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_port="output 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="Materialize Data (2)" to_port="example set input"/>
<connect from_op="Materialize Data (2)" from_port="example set output" to_op="Increase Date (2)" to_port="example set input"/>
<connect from_op="Increase Date (2)" from_port="example set output" to_op="Set Role (2)" to_port="example set input"/>
<connect from_op="Set Role (2)" from_port="example set output" to_op="Select Attributes (2)" to_port="example set input"/>
<connect from_op="Select Attributes (2)" from_port="example set output" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_op="Remember (2)" to_port="store"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_output 1" spacing="0"/>
<portSpacing port="sink_output 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="append" compatibility="8.2.000" expanded="true" height="82" name="Append" width="90" x="1184" y="136"/>
<connect from_op="Read Excel" from_port="output" to_op="Set Predictions_Params" to_port="in 1"/>
<connect from_op="Set Predictions_Params" from_port="out 1" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Windowing for Training" to_port="example set input"/>
<connect from_op="Windowing for Training" from_port="example set output" to_op="Validation" to_port="training"/>
<connect from_op="Windowing for Training" from_port="original" to_op="Windowing for Application" to_port="example set input"/>
<connect from_op="Validation" from_port="model" to_op="Loop" to_port="input 1"/>
<connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
<connect from_op="Windowing for Application" from_port="example set output" to_op="Extract Example Count" to_port="example set"/>
<connect from_op="Extract Example Count" from_port="example set" to_op="Filter Example Range" to_port="example set input"/>
<connect from_op="Filter Example Range" from_port="example set output" to_op="Remember" to_port="store"/>
<connect from_op="Loop" from_port="output 1" to_op="Append" to_port="example set 1"/>
<connect from_op="Append" from_port="merged set" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
</process>
</operator>
</process>
Answers
Hi everyone again,
I found a similar question in this post: https://community.rapidminer.com/t5/RapidMiner-Studio-Forum/Prediction-with-several-attributes/m-p/47321#M30517
However, my question still stands. After the Append, all variables except the label keep the same value.
Can anyone help me with this issue?
Thanks
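One possible reading of this symptom, sketched in plain Python rather than RapidMiner: in a recursive forecasting loop, only the target's lags are refreshed with each new prediction, so the other windowed variables can only change if future values for them are supplied (or forecast separately). The names `recursive_forecast`, `future_exog`, and `predict` are hypothetical and stand in for the Loop, the missing future inputs, and the trained model.

```python
from typing import Callable, Dict, List


def recursive_forecast(
    history: Dict[str, List[float]],
    future_exog: Dict[str, List[float]],
    target: str,
    window_size: int,
    steps: int,
    predict: Callable[[List[float]], float],
) -> List[float]:
    """Forecast `steps` values of `target`, feeding each prediction back in."""
    # Work on copies so the caller's history is not modified.
    data = {name: list(values) for name, values in history.items()}
    forecasts = []
    for step in range(steps):
        # Latest window: the last `window_size` values of every series.
        row = []
        for name in data:
            row.extend(data[name][-window_size:])
        y_hat = predict(row)
        forecasts.append(y_hat)
        # The target advances with its own prediction ...
        data[target].append(y_hat)
        # ... but the other series only advance if future values are supplied.
        for name in data:
            if name != target:
                data[name].append(future_exog[name][step])
    return forecasts
```

If no future values exist for the independent variables, this sketch has nothing to append for them, which would match windows where only the label changes from step to step.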
tagging @tftemme...
You can use RapidMiner's free modeling and follow the path it takes...
Best regards.
@elham_calm
Hi,
As far as I know, the series extension only supports univariate models. You can try a windowing approach, aggregating the data in each window and then applying a "normal" model like decision trees. But that's definitely different from applying multivariate time series models.
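To make the windowing idea concrete, here is a minimal Python sketch (not RapidMiner code). It assumes all series are aligned and equally long; the function name `window_multivariate` and the `name-lag` column naming are illustrative choices, with lag 0 as the most recent value in the window.

```python
from typing import Dict, List, Tuple


def window_multivariate(
    series: Dict[str, List[float]],
    target: str,
    window_size: int,
    horizon: int = 1,
) -> Tuple[List[str], List[List[float]], List[float]]:
    """Turn several aligned series into (feature names, feature rows, labels)."""
    n = len(series[target])
    if any(len(values) != n for values in series.values()):
        raise ValueError("all series must have the same length")
    # One column per variable and lag, oldest value first within each variable.
    names = [
        f"{name}-{lag}"
        for name in series
        for lag in range(window_size - 1, -1, -1)
    ]
    rows, labels = [], []
    # The last usable window must leave room for the label `horizon` steps ahead.
    for start in range(n - window_size - horizon + 1):
        end = start + window_size
        row = []
        for name in series:
            row.extend(series[name][start:end])
        rows.append(row)
        labels.append(series[target][end + horizon - 1])
    return names, rows, labels
```

Each row then carries the recent history of every variable, and any ordinary regression learner can be trained on `rows` and `labels`.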
If you really need the multivariate time series approach, your option is to use one of the scripting operators. I have done so successfully with Execute R and the R package "vars".
I hope that VAR models appear in the series extension someday!
Hi,
I disagree. Using Windowing, e.g. with aggregations, together with a multivariate model like an SVM is multivariate time series forecasting. It often also yields better results than multivariate extensions of ARIMA.
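As a rough illustration of windowing with aggregations, the following Python sketch (not RapidMiner code) summarises each window of every variable into a few features. The function name `aggregate_windows` and the choice of mean, min, max, and last value are assumptions made for the example; the resulting rows could be passed to any multivariate learner such as an SVM.

```python
from statistics import mean
from typing import Dict, List


def aggregate_windows(
    series: Dict[str, List[float]], window_size: int
) -> List[Dict[str, float]]:
    """Return one dict of summary features per window position."""
    n = len(next(iter(series.values())))
    rows = []
    for start in range(n - window_size + 1):
        features = {}
        for name, values in series.items():
            window = values[start:start + window_size]
            features[f"{name}_mean"] = mean(window)
            features[f"{name}_min"] = min(window)
            features[f"{name}_max"] = max(window)
            features[f"{name}_last"] = window[-1]
        rows.append(features)
    return rows
```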
BR,
Martin
Dortmund, Germany
Hi Martin,
I don't think it's time series from a modeling point of view (i.e. a model with regressor and noise terms). I think the time series approach could be used if the interest lies in the model itself and its coefficients. Otherwise, a more black-box approach with complex models should give better predictions.
Regards,
Sebastian
Hi @SGolbert,
Ahh, good. So we are on the same page.
BR,
Martin
Dortmund, Germany