How can I use deep learning with the windowing operator when the horizon is larger than one?
Hello Guys,
I am trying to use the example process "s&p 500 regression using windowing and convolution", and it works well for predicting the next day's price when the Windowing operator is set to horizon = 1; however, if the horizon is larger than 1 (a forecast a few days ahead), the Deep Learning operator fails.
Question: Do you have an example where I can combine Deep Learning, Windowing, and horizon > 1? I would be happy if the example "s&p 500 regression using windowing and convolution" could be modified to use horizon > 1. I am aiming to forecast the price a few minutes ahead, so I need horizon > 1.
I have also tried the same Deep Learning operator used in the example above, but this time with a multi-horizon forecast, and the same problem occurs: Deep Learning can't handle a horizon > 1. I am not an expert in the Deep Learning operator, but I think the apparent limitation is related to multi-label handling?
Other operators, such as Gradient Boosted Trees, work well with horizon > 1.
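For clarity, here is a minimal Python sketch of what "horizon > 1" means outside RapidMiner: the windowed table carries several label columns (one per future step), and a single-label learner such as GBT can still be used by training one model per step. The library choices (NumPy, scikit-learn) and all sizes are illustrative assumptions, not part of the original process.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def make_windows(series, window_size=30, horizon=5):
    # Each row: `window_size` past values as features,
    # `horizon` future values as labels (multi-label!).
    X, Y = [], []
    for t in range(len(series) - window_size - horizon + 1):
        X.append(series[t:t + window_size])
        Y.append(series[t + window_size:t + window_size + horizon])
    return np.array(X), np.array(Y)

close = np.cumsum(np.random.randn(1000))     # stand-in for the Close series
X, Y = make_windows(close)

# Direct strategy: one single-label GBT per horizon step, which is why
# GBT-style setups cope with horizon > 1 while the DL operator does not.
models = [GradientBoostingRegressor().fit(X, Y[:, h]) for h in range(Y.shape[1])]
forecast = [m.predict(X[-1:])[0] for m in models]
print(forecast)                               # 5 values, one per step ahead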
Below I have attached the process.
<?xml version="1.0" encoding="UTF-8"?><process version="9.6.000">
<operator activated="true" class="retrieve" compatibility="9.6.000" expanded="true" height="68" name="Retrieve s&p-500-data" width="90" x="45" y="238">
<parameter key="repository_entry" value="//Keras Samples/sp_500_regression/s&p-500-data"/>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="9.6.000">
<operator activated="true" class="subprocess" compatibility="9.6.000" expanded="true" height="103" name="Subprocess" origin="GENERATED_SAMPLE" width="90" x="179" y="238">
<process expanded="true">
<operator activated="true" class="select_attributes" compatibility="9.6.000" expanded="true" height="82" name="Select Attributes" origin="GENERATED_SAMPLE" width="90" x="45" y="136">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Close"/>
<parameter key="attributes" value="Date|Open|Close"/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="attribute_value"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="time"/>
<parameter key="block_type" value="attribute_block"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_matrix_row_start"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<description align="center" color="transparent" colored="false" width="126">Reducing the data to the attribute we want to predict: 'Close' - Which is the closing price of respective stocks.</description>
</operator>
<operator activated="true" class="normalize" compatibility="9.6.000" expanded="true" height="103" name="Normalize" origin="GENERATED_SAMPLE" width="90" x="179" y="136">
<parameter key="return_preprocessing_model" value="false"/>
<parameter key="create_view" value="false"/>
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="Close"/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="numeric"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="real"/>
<parameter key="block_type" value="value_series"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_series_end"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="method" value="Z-transformation"/>
<parameter key="min" value="0.0"/>
<parameter key="max" value="1.0"/>
<parameter key="allow_negative_values" value="false"/>
<description align="center" color="transparent" colored="false" width="126">Often normalizing data helps a neural network to perform better.</description>
</operator>
<operator activated="true" class="time_series:windowing" compatibility="9.6.000" expanded="true" height="82" name="Windowing (2)" origin="GENERATED_SAMPLE" width="90" x="313" y="136">
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="numeric"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="real"/>
<parameter key="block_type" value="value_series"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_series_end"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="has_indices" value="false"/>
<parameter key="indices_attribute" value=""/>
<parameter key="window_size" value="30"/>
<parameter key="no_overlapping_windows" value="false"/>
<parameter key="step_size" value="1"/>
<parameter key="create_horizon_(labels)" value="true"/>
<parameter key="horizon_attribute" value="Close"/>
<parameter key="horizon_size" value="1"/>
<parameter key="horizon_offset" value="0"/>
<description align="center" color="transparent" colored="false" width="126">Using windowing to convert the data into a form, that displays one entry as an attribute with preceeding 30<br/> entries as additional attributes.</description>
</operator>
<operator activated="true" class="split_data" compatibility="9.6.000" expanded="true" height="103" name="Split Data" origin="GENERATED_SAMPLE" width="90" x="447" y="136">
<enumeration key="partitions">
<parameter key="ratio" value="0.9"/>
<parameter key="ratio" value="0.1"/>
</enumeration>
<parameter key="sampling_type" value="linear sampling"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
<description align="center" color="transparent" colored="false" width="126">Split data into training and test.</description>
</operator>
<connect from_port="in 1" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Normalize" to_port="example set input"/>
<connect from_op="Normalize" from_port="example set output" to_op="Windowing (2)" to_port="example set"/>
<connect from_op="Windowing (2)" from_port="windowed example set" to_op="Split Data" to_port="example set"/>
<connect from_op="Split Data" from_port="partition 1" to_port="out 1"/>
<connect from_op="Split Data" from_port="partition 2" to_port="out 2"/>
<portSpacing port="source_in 1" spacing="0"/>
<portSpacing port="source_in 2" spacing="0"/>
<portSpacing port="sink_out 1" spacing="0"/>
<portSpacing port="sink_out 2" spacing="0"/>
<portSpacing port="sink_out 3" spacing="0"/>
</process>
<description align="center" color="transparent" colored="false" width="126">Data Preparation: Normalization, Windowing, Label Setting</description>
</operator>
</process>
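To make the Windowing step in this subprocess concrete: with window_size = 30 and horizon_size = 1 you get 30 feature columns plus one label column, and raising horizon_size adds one label column per step, which is exactly what a single-label learner chokes on. A rough pandas analogue (the column names are illustrative, not the operator's exact naming):

import numpy as np
import pandas as pd

def windowing(series, window_size=30, horizon_size=1):
    # Rough analogue of the Time Series Windowing operator: the current
    # value plus the preceding window_size - 1 values become features,
    # the next horizon_size values become labels.
    cols = {}
    for lag in range(window_size - 1, -1, -1):
        cols[f"Close-{lag}"] = series.shift(lag)
    for h in range(1, horizon_size + 1):
        cols[f"Close+{h}"] = series.shift(-h)  # horizon label columns
    return pd.DataFrame(cols).dropna()

close = pd.Series(np.cumsum(np.random.randn(200)), name="Close")
windowed = windowing(close, window_size=30, horizon_size=3)
print(windowed.shape)  # 30 feature columns + 3 label columns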
<?xml version="1.0" encoding="UTF-8"?><process version="9.6.000">
<operator activated="true" class="deeplearning:dl4j_sequential_neural_network" compatibility="0.9.004" expanded="true" height="145" name="Deep Learning" origin="GENERATED_SAMPLE" width="90" x="313" y="187">
<parameter key="loss_function" value="Mean Squared Error (Linear Regression)"/>
<parameter key="epochs" value="50"/>
<parameter key="use_early_stopping" value="false"/>
<parameter key="condition_strategy" value="score improvement"/>
<parameter key="patience" value="5"/>
<parameter key="minimal_score_improvement" value="0.0"/>
<parameter key="best_epoch_score" value="0.01"/>
<parameter key="max_iteration_score" value="3.0"/>
<parameter key="max_iteration_time" value="10"/>
<parameter key="use_miniBatch" value="false"/>
<parameter key="batch_size" value="32"/>
<parameter key="updater" value="RMSProp"/>
<parameter key="learning_rate" value="0.099"/>
<parameter key="momentum" value="0.9"/>
<parameter key="rho" value="0.95"/>
<parameter key="epsilon" value="1.0E-6"/>
<parameter key="beta1" value="0.9"/>
<parameter key="beta2" value="0.999"/>
<parameter key="RMSdecay" value="0.95"/>
<parameter key="weight_initialization" value="Xavier Uniform"/>
<parameter key="bias_initialization" value="0.0"/>
<parameter key="use_regularization" value="false"/>
<parameter key="l1_strength" value="0.1"/>
<parameter key="l2_strength" value="0.1"/>
<parameter key="optimization_method" value="Conjugate Gradient Line Search"/>
<parameter key="cudnn_algo_mode" value="Prefer fastest"/>
<parameter key="backpropagation" value="Standard"/>
<parameter key="backpropagation_length" value="50"/>
<parameter key="infer_input_shape" value="true"/>
<parameter key="network_type" value="Simple Neural Network"/>
<parameter key="log_each_epoch" value="true"/>
<parameter key="epochs_per_log" value="10"/>
<parameter key="use_local_random_seed" value="false"/>
<parameter key="local_random_seed" value="1992"/>
<process expanded="true">
<operator activated="true" class="deeplearning:dl4j_convolutional_layer" compatibility="0.9.004" expanded="true" height="68" name="Add Convolutional Layer" origin="GENERATED_SAMPLE" width="90" x="112" y="136">
<parameter key="number_of_activation_maps" value="64"/>
<parameter key="kernel_size" value="2.2"/>
<parameter key="stride_size" value="1.1"/>
<parameter key="activation_function" value="ReLU (Rectified Linear Unit)"/>
<parameter key="use_dropout" value="false"/>
<parameter key="dropout_rate" value="0.25"/>
<parameter key="overwrite_networks_weight_initialization" value="false"/>
<parameter key="weight_initialization" value="Normal"/>
<parameter key="overwrite_networks_bias_initialization" value="false"/>
<parameter key="bias_initialization" value="0.0"/>
</operator>
<operator activated="true" class="deeplearning:dl4j_pooling_layer" compatibility="0.9.004" expanded="true" height="68" name="Add Pooling Layer" origin="GENERATED_SAMPLE" width="90" x="313" y="136">
<parameter key="Pooling Method" value="max"/>
<parameter key="PNorm Value" value="1.0"/>
<parameter key="Kernel Size" value="2.2"/>
<parameter key="Stride Size" value="1.1"/>
</operator>
<operator activated="true" class="deeplearning:dl4j_dense_layer" compatibility="0.9.004" expanded="true" height="68" name="Add Fully-Connected Layer" origin="GENERATED_SAMPLE" width="90" x="514" y="136">
<parameter key="number_of_neurons" value="100"/>
<parameter key="activation_function" value="ReLU (Rectified Linear Unit)"/>
<parameter key="use_dropout" value="false"/>
<parameter key="dropout_rate" value="0.25"/>
<parameter key="overwrite_networks_weight_initialization" value="false"/>
<parameter key="weight_initialization" value="Normal"/>
<parameter key="overwrite_networks_bias_initialization" value="false"/>
<parameter key="bias_initialization" value="0.0"/>
<description align="center" color="transparent" colored="false" width="126">Often architectures using convolutional layers end with a fully-connected layer before the last layer.</description>
</operator>
<operator activated="true" class="deeplearning:dl4j_dense_layer" compatibility="0.9.004" expanded="true" height="68" name="Add Fully-Connected Layer (2)" origin="GENERATED_SAMPLE" width="90" x="648" y="136">
<parameter key="number_of_neurons" value="1"/>
<parameter key="activation_function" value="None (identity)"/>
<parameter key="use_dropout" value="false"/>
<parameter key="dropout_rate" value="0.25"/>
<parameter key="overwrite_networks_weight_initialization" value="false"/>
<parameter key="weight_initialization" value="Normal"/>
<parameter key="overwrite_networks_bias_initialization" value="false"/>
<parameter key="bias_initialization" value="0.0"/>
<description align="center" color="transparent" colored="false" width="126">Since regression is performed on neuron and the 'None (identity)' activation function has to be used.</description>
</operator>
<connect from_port="layerArchitecture" to_op="Add Convolutional Layer" to_port="layerArchitecture"/>
<connect from_op="Add Convolutional Layer" from_port="layerArchitecture" to_op="Add Pooling Layer" to_port="layerArchitecture"/>
<connect from_op="Add Pooling Layer" from_port="layerArchitecture" to_op="Add Fully-Connected Layer" to_port="layerArchitecture"/>
<connect from_op="Add Fully-Connected Layer" from_port="layerArchitecture" to_op="Add Fully-Connected Layer (2)" to_port="layerArchitecture"/>
<connect from_op="Add Fully-Connected Layer (2)" from_port="layerArchitecture" to_port="layerArchitecture"/>
<portSpacing port="source_layerArchitecture" spacing="0"/>
<portSpacing port="sink_layerArchitecture" spacing="0"/>
<description align="center" color="gray" colored="true" height="63" resized="false" width="712" x="75" y="448">This network architecture uses convolutional and pooling layers in combination with standard fully-connected layers.</description>
<description align="center" color="yellow" colored="false" height="407" resized="false" width="167" x="75" y="32">A convolutional layer uses a sliding window to only take a subset of provided information into account.<br><br><br><br><br><br>This is done mutiple times (= activation map count), while automatically changing the so called kernel that is used as a mask for windowing.<br/><br/>This method has the advantage of being able to focus on local patterns.</description>
<description align="center" color="yellow" colored="false" height="313" resized="false" width="183" x="269" y="34">A pooling layer eases the training process by reducing the information.<br><br><br/><br/><br/><br/><br><br>Here only the maximum value of each 2x2 kernel window (created in the previous Convolutional Layer) is kept.</description>
</process>
<description align="center" color="transparent" colored="false" width="126">Open the Deep Learning operator by double-clicking on it, to discovere the layer setup.</description>
</operator>
</process>
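For comparison, the same conv-pool-dense architecture handles horizon > 1 without trouble in plain Keras, because the last layer can simply have one linear output neuron per forecast step. A minimal sketch (the framework choice and the 1-D input shape are my assumptions; the layer sizes and RMSProp learning rate mirror the process above):

from tensorflow import keras
from tensorflow.keras import layers

window_size, horizon = 30, 5

# Conv -> MaxPool -> Dense(100, ReLU) -> linear output, as in the process
# above; the only change for horizon > 1 is `horizon` output neurons.
model = keras.Sequential([
    layers.Input(shape=(window_size, 1)),
    layers.Conv1D(64, kernel_size=2, strides=1, activation="relu"),
    layers.MaxPooling1D(pool_size=2, strides=1),
    layers.Flatten(),
    layers.Dense(100, activation="relu"),
    layers.Dense(horizon, activation="linear"),  # one neuron per step
])
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.099),
              loss="mse")
model.summary()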
<?xml version="1.0" encoding="UTF-8"?><process version="9.6.000">
<operator activated="true" class="apply_model" compatibility="9.6.000" expanded="true" height="82" name="Apply Model" origin="GENERATED_SAMPLE" width="90" x="447" y="238">
<list key="application_parameters"/>
<parameter key="create_view" value="false"/>
</operator>
</process>
<?xml version="1.0" encoding="UTF-8"?><process version="9.6.000">
<operator activated="true" class="performance_regression" compatibility="9.6.000" expanded="true" height="82" name="Performance" origin="GENERATED_SAMPLE" width="90" x="581" y="238">
<parameter key="main_criterion" value="first"/>
<parameter key="root_mean_squared_error" value="false"/>
<parameter key="absolute_error" value="false"/>
<parameter key="relative_error" value="true"/>
<parameter key="relative_error_lenient" value="false"/>
<parameter key="relative_error_strict" value="false"/>
<parameter key="normalized_absolute_error" value="false"/>
<parameter key="root_relative_squared_error" value="false"/>
<parameter key="squared_error" value="false"/>
<parameter key="correlation" value="false"/>
<parameter key="squared_correlation" value="false"/>
<parameter key="prediction_average" value="false"/>
<parameter key="spearman_rho" value="false"/>
<parameter key="kendall_tau" value="false"/>
<parameter key="skip_undefined_labels" value="true"/>
<parameter key="use_example_weights" value="true"/>
</operator>
</process>
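For reference, the only criterion enabled in the Performance operator above is relative_error. A plain NumPy sketch of the average relative error (RapidMiner also offers lenient and strict variants, which differ in the denominator; this is just the plain version):

import numpy as np

def relative_error(y_true, y_pred):
    # Mean of |actual - predicted| / |actual| over all examples.
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean(np.abs(y_true - y_pred) / np.abs(y_true))

print(relative_error([100, 102, 98], [101, 101, 99]))  # ~0.01, i.e. ~1%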
Best Answer
jacobcybulski: I see your problem. The issue is that RM Deep Learning (both built-in and extensions) currently does not support returning results as tensors, i.e. multiple labels or their vectors. This means that in Deep Learning forecasting you are limited to a horizon of one, regardless of whether you use a CNN or an LSTM. However, as @Telcontar120 suggested, RM features multi-horizon operators, both in the new Forecasting extension and in the Time Series extension, which could solve your problem using the classical forecasting algorithms.
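If you want to stay with the Deep Learning operator despite that limit, one standard workaround is recursive forecasting: keep horizon = 1 and feed each prediction back into the window to step further ahead. A minimal sketch of the idea (scikit-learn stands in for any single-output learner; note that errors compound with each step):

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def recursive_forecast(model, last_window, steps):
    # Predict one step, append it to the window, drop the oldest value,
    # and repeat -- a horizon-1 model yields a multi-step forecast.
    window = list(last_window)
    preds = []
    for _ in range(steps):
        y = model.predict(np.array(window).reshape(1, -1))[0]
        preds.append(y)
        window = window[1:] + [y]
    return preds

series = np.cumsum(np.random.randn(500))
X = np.array([series[t:t + 30] for t in range(len(series) - 30)])
y = series[30:]                       # horizon-1 labels
model = GradientBoostingRegressor().fit(X, y)
print(recursive_forecast(model, series[-30:], steps=5))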
Answers
You might want to look at the new Forecasting extension, which has some automated operators for both univariate and multivariate forecasting.
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
I agree with Brian, and I cannot resist quoting Pierre Dac:
"...Forecasting is difficult, especially when it comes to the future..."
Regards,
Lionel
My current pipeline, step by step (a condensed sketch follows below):
1. Pull the intraday data using Python.
2. Apply STL to remove some noise (I am using multiple variables).
3. Normalize the series.
4. Weight by PCA to focus on the variables that really matter.
5. Windowing (to train, validate, and use my last row to apply my model), plus a parallel Windowing to enrich the data (feature generation) with attributes such as min, max, standard deviation, etc.
6. Split the data into three parts: train, evaluate the model, and keep the last row as unseen data.
7. Multi-horizon forecasting using GBT (GBT, wow, when you tune it!), with multi-horizon performance.
8. Apply the model to the unseen data and measure multi-horizon performance again.
9. Tune ARIMA and apply it to the sequence to compare its performance/forecasting with the GBT design.
Then you can get some satisfaction when, sometimes, you get it right.
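A condensed Python sketch of that pipeline, as promised above. The library choices (statsmodels for STL/ARIMA, scikit-learn for PCA and GBT) and every hyperparameter are placeholder assumptions about the tooling, not the exact process:

import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL
from statsmodels.tsa.arima.model import ARIMA
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingRegressor

horizon, window = 5, 30
close = pd.Series(np.cumsum(np.random.randn(1000)))  # stand-in intraday data

# STL to strip some noise: keep trend + seasonal, drop the residual.
stl = STL(close, period=78).fit()      # e.g. 78 five-minute bars per day
vals = (stl.trend + stl.seasonal).to_numpy()

# Windowing, then normalize, then PCA to keep what really matters.
n = len(vals) - window - horizon
X = np.array([vals[t:t + window] for t in range(n)])
Y = np.array([vals[t + window:t + window + horizon] for t in range(n)])
X = PCA(n_components=10).fit_transform(StandardScaler().fit_transform(X))

# Hold the last row out as "unseen" data.
X_tr, Y_tr, x_new = X[:-1], Y[:-1], X[-1:]

# Direct multi-horizon GBT: one model per step ahead.
gbt = [GradientBoostingRegressor().fit(X_tr, Y_tr[:, h]) for h in range(horizon)]
gbt_fc = [m.predict(x_new)[0] for m in gbt]

# ARIMA baseline on the smoothed series for comparison.
arima_fc = ARIMA(vals, order=(2, 1, 2)).fit().forecast(steps=horizon)

print("GBT  :", np.round(gbt_fc, 3))
print("ARIMA:", np.round(np.asarray(arima_fc), 3))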
What did I learn?
- RapidMiner did a great job with this Time Series extension.
- Each stock has its own emotion and behavior.
- There is no such thing as a "free lunch": you have to develop one model per stock, because each one has its own personality and emotions.
- A well-tuned GBT can surprise you.
One remark: I am not an expert in time series or the stock market. I am just curious, and the process described above may be missing some steps. Thank you guys for your time.