Arima & R
Hello,
I am trying to replicate an ARIMA example found at www.rapidminer.com. Here is the XML file:
<?xml version="1.0" encoding="UTF-8"?><process version="7.4.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.4.000" expanded="true" name="Process">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="2001"/>
<parameter key="send_mail" value="never"/>
<parameter key="notification_email" value=""/>
<parameter key="process_duration_for_mail" value="30"/>
<parameter key="encoding" value="SYSTEM"/>
<process expanded="true">
<operator activated="true" class="quantx1:yahoo_historical_data_extractor" compatibility="1.0.006" expanded="true" height="82" name="Yahoo Historical Stock Data" width="90" x="45" y="120">
<parameter key="I agree to abide by Yahoo's Terms &amp; Conditions on financial data usage" value="true"/>
<parameter key="Quick Stock Ticker Data" value="true"/>
<parameter key="Stock Ticker" value="S&amp;P"/>
<parameter key="select_fields" value="VOLUME|OPEN|DAY_LOW|DAY_HIGH|CLOSE|ADJUSTED_CLOSE"/>
<parameter key="date_format" value="yyyy-MM-dd"/>
<parameter key="date_start" value="2013-01-01"/>
<parameter key="date_end" value="2015-06-03"/>
<parameter key="data_frequency" value="DAILY"/>
<parameter key="Cache Data in Memory" value="false"/>
</operator>
<operator activated="true" class="rename" compatibility="7.4.000" expanded="true" height="82" name="Rename" width="90" x="179" y="120">
<parameter key="old_name" value="S&amp;P_ADJUSTED_CLOSE"/>
<parameter key="new_name" value="AClose"/>
<list key="rename_additional_attributes">
<parameter key="S&amp;P_CLOSE" value="Close"/>
<parameter key="S&amp;P_DAY_HIGH" value="High"/>
<parameter key="S&amp;P_DAY_LOW" value="Low"/>
<parameter key="S&amp;P_OPEN" value="Open"/>
<parameter key="S&amp;P_VOLUME" value="Volume"/>
</list>
</operator>
<operator activated="true" class="multiply" compatibility="7.4.000" expanded="true" height="124" name="Multiply" width="90" x="313" y="120"/>
<operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Forecasting" width="90" x="715" y="435">
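<parameter key="script" value="### Call this R script to forecast with the chosen ARIMA model&#10;rm_main = function(data)&#10;{&#10;&#9;library(forecast)&#10;&#9;sp &lt;- data&#10;&#9;sp$Date &lt;- as.Date(sp$Date)&#10;&#9;arima &lt;- arima(ts(sp$Close), order=c(3,1,3))&#10;&#9;print(arima)&#10;&#9;# forecast() dispatches to the Arima method; forecast.Arima is not exported by recent forecast releases&#10;&#9;arimaforecast &lt;- forecast(arima, h=5)&#10;&#9;print(arimaforecast)&#10;&#9;return(as.data.frame(arimaforecast))&#10;}&#10;"/>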
</operator>
<operator activated="true" class="optimize_parameters_grid" compatibility="7.4.000" expanded="true" height="103" name="Optimize Parameters (Grid)" width="90" x="514" y="300">
<list key="parameters">
<parameter key="Set p.value" value="[0;3;3;linear]"/>
<parameter key="Set d.value" value="[0.0;2;2;linear]"/>
<parameter key="Set q.value" value="[0.0;4;4;linear]"/>
</list>
<parameter key="error_handling" value="fail on error"/>
<process expanded="true">
<operator activated="true" class="set_macro" compatibility="7.4.000" expanded="true" height="76" name="Set p" width="90" x="112" y="30">
<parameter key="macro" value="p"/>
<parameter key="value" value="3.0"/>
</operator>
<operator activated="true" class="set_macro" compatibility="7.4.000" expanded="true" height="76" name="Set d" width="90" x="112" y="120">
<parameter key="macro" value="d"/>
<parameter key="value" value="2.0"/>
</operator>
<operator activated="true" class="set_macro" compatibility="7.4.000" expanded="true" height="76" name="Set q" width="90" x="112" y="210">
<parameter key="macro" value="q"/>
<parameter key="value" value="4.0"/>
</operator>
<operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="112" name="ARIMA" width="90" x="447" y="75">
<parameter key="script" value="### Call this R script to get AIC from ARIMA models&#10;rm_main = function(data)&#10;{&#10;&#9;sp &lt;- data&#10;&#9;sp$Date &lt;- as.Date(sp$Date)&#10;&#9;arima &lt;- arima(sp$Close, order=c(%{p},%{d},%{q}))&#10;&#9;#print(arima$aic)&#10;&#9;return(as.data.table(arima$aic))&#10;}&#10;"/>
<description align="center" color="transparent" colored="false" width="126">Fit ARIMA model in R with different (p,d,q)</description>
</operator>
<operator activated="true" class="extract_performance" compatibility="7.4.000" expanded="true" height="76" name="Performance" width="90" x="581" y="75">
<parameter key="performance_type" value="data_value"/>
<parameter key="statistics" value="average"/>
<parameter key="attribute_name" value="V1"/>
<parameter key="example_index" value="1"/>
<parameter key="optimization_direction" value="minimize"/>
</operator>
<operator activated="true" class="log" compatibility="7.4.000" expanded="true" height="76" name="Log" width="90" x="715" y="75">
<list key="log">
<parameter key="aic" value="operator.Performance.value.performance"/>
<parameter key="p" value="operator.Set p.parameter.value"/>
<parameter key="d" value="operator.Set d.parameter.value"/>
<parameter key="q" value="operator.Set q.parameter.value"/>
</list>
<parameter key="sorting_type" value="none"/>
<parameter key="sorting_k" value="100"/>
<parameter key="persistent" value="false"/>
</operator>
<connect from_port="input 1" to_op="Set p" to_port="through 1"/>
<connect from_op="Set p" from_port="through 1" to_op="ARIMA" to_port="input 1"/>
<connect from_op="Set d" from_port="through 1" to_op="ARIMA" to_port="input 2"/>
<connect from_op="Set q" from_port="through 1" to_op="ARIMA" to_port="input 3"/>
<connect from_op="ARIMA" from_port="output 1" to_op="Performance" to_port="example set"/>
<connect from_op="Performance" from_port="performance" to_op="Log" to_port="through 1"/>
<connect from_op="Log" from_port="through 1" to_port="performance"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="source_input 2" spacing="0"/>
<portSpacing port="sink_performance" spacing="36"/>
<portSpacing port="sink_result 1" spacing="0"/>
</process>
</operator>
<connect from_op="Yahoo Historical Stock Data" from_port="example set" to_op="Rename" to_port="example set input"/>
<connect from_op="Rename" from_port="example set output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_port="result 1"/>
<connect from_op="Multiply" from_port="output 2" to_op="Optimize Parameters (Grid)" to_port="input 1"/>
<connect from_op="Multiply" from_port="output 3" to_op="Forecasting" to_port="input 1"/>
<connect from_op="Forecasting" from_port="output 1" to_port="result 3"/>
<connect from_op="Optimize Parameters (Grid)" from_port="performance" to_port="result 2"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="90"/>
<portSpacing port="sink_result 2" spacing="162"/>
<portSpacing port="sink_result 3" spacing="126"/>
<portSpacing port="sink_result 4" spacing="36"/>
<description align="center" color="yellow" colored="false" height="62" resized="true" width="816" x="305" y="18">Look at Economic Time Series Data (automatically pulled) from public sites and integrate with ARIMA in R extension</description>
<description align="center" color="yellow" colored="false" height="133" resized="true" width="635" x="490" y="83">Charts for data. Identify any unusual observations for all attributes: day low, high, open, close, adjusted close, volume</description>
<description align="center" color="yellow" colored="false" height="177" resized="true" width="626" x="500" y="228">Find the optimized parameter for ARIMA (iterative, and TAKES TIME!! about 1 min)&lt;br&gt;Use R extension for ARIMA models&lt;br&gt;for this demo data, we have ARIMA(3,1,3) as the best fit&lt;br/&gt;To choose the best fit model: check Log result, rank by AIC&lt;br/&gt;and find the values of p, d, q corresponding to min AIC</description>
<description align="center" color="yellow" colored="false" height="116" resized="true" width="415" x="713" y="414">Apply ARIMA(3,1,3) for forecasting&lt;br&gt;predict the next 5 days close price&lt;br&gt;</description>
</process>
</operator>
</process>
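For reference, the Optimize Parameters (Grid) step boils down to "try every (p, d, q) in the grid and keep the one with the lowest AIC". Below is a minimal Python sketch of that logic only. It assumes each range such as `[0;3;3;linear]` yields steps + 1 evenly spaced values (0, 1, 2, 3), and `toy_aic` is a made-up stand-in for the R call that fits the model and returns its AIC.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

Order = Tuple[int, int, int]


def grid(p_max: int = 3, d_max: int = 2, q_max: int = 4) -> List[Order]:
    """All (p, d, q) combinations in the grid: 4 * 3 * 5 = 60 by default."""
    return list(product(range(p_max + 1), range(d_max + 1), range(q_max + 1)))


def best_order(aic_of: Callable[[Order], float],
               orders: Iterable[Order]) -> Tuple[Order, float]:
    """Score every order and return the one with the smallest AIC."""
    scored = [(aic_of(order), order) for order in orders]
    aic, order = min(scored)
    return order, aic


def toy_aic(order: Order) -> float:
    """Hypothetical AIC surface with its minimum at (3, 1, 3); not a real model fit."""
    p, d, q = order
    return 100.0 + (p - 3) ** 2 + (d - 1) ** 2 + (q - 3) ** 2


if __name__ == "__main__":
    print(best_order(toy_aic, grid()))
```

In the real process the scoring function is the ARIMA operator, and the Log operator records one (aic, p, d, q) row per combination.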
Question 1: At the Rename operator, the message "Attribute not found" pops up. I tried modifying the parameters, but the issue remains. In the end, I removed this operator.
Question 2: With the Rename operator removed, the message "Rscript not found" shows up at the Forecasting (Execute R) operator.
R version 3.5.0 (2018-04-23) is installed.
Can anyone help?
Regards,
Maerkli
Best Answers
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hello @Maerkli,
To answer Question 1:
It seems that the Yahoo Historical Stock Data operator is down. When I set a "breakpoint after" on this operator,
the resulting example set is empty, so the Rename operator cannot find the attributes to rename and raises the error you described.
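As for Question 2, "Rscript not found" reads like the Rscript executable is not visible to the extension, rather than a problem in the process itself. My guess is that the extension needs the path to Rscript set in its preferences, but please verify that against the extension's documentation. A quick way to check whether Rscript is on the PATH at all is a small sketch like this (Python standard library only):

```python
import shutil
from typing import Optional


def find_executable(name: str = "Rscript") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable()
    if path is None:
        print("Rscript is not on PATH; look for it inside the R installation's bin folder.")
    else:
        print(f"Rscript found at {path}")
```

If this prints a path, that path is the one I would try in the R Scripting settings.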
Regards,
Lionel
lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
Hi,
Indeed, this extension is dead. My memory is failing... We talked about that in this thread.
@Maerkli, in that thread you can find a link to an introduction to the Alpha Vantage API (an alternative to the Yahoo API).
This API could be useful if you are looking for financial data.
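To give an idea of what that looks like, here is a rough Python sketch that builds a daily-prices request and parses the CSV reply. The endpoint and the parameter names (`function`, `symbol`, `datatype`, `apikey`) and the column names (`timestamp`, `close`, `volume`) reflect my understanding of the public Alpha Vantage documentation, so treat them as assumptions and check the current docs. The sample row is made up, and no network call is made here.

```python
import csv
import io
from typing import Dict, List, Union
from urllib.parse import urlencode

BASE_URL = "https://www.alphavantage.co/query"  # assumed endpoint


def daily_csv_url(symbol: str, api_key: str) -> str:
    """Build the URL for a daily time series returned as CSV."""
    params = {
        "function": "TIME_SERIES_DAILY",
        "symbol": symbol,
        "datatype": "csv",
        "apikey": api_key,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def parse_daily_csv(text: str) -> List[Dict[str, Union[str, float, int]]]:
    """Turn the CSV reply into a list of {date, close, volume} records."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "date": rec["timestamp"],
            "close": float(rec["close"]),
            "volume": int(rec["volume"]),
        })
    return rows


if __name__ == "__main__":
    print(daily_csv_url("IBM", "demo"))
    sample = "timestamp,open,high,low,close,volume\n2024-01-02,1.0,2.0,0.5,1.5,1000\n"
    print(parse_daily_csv(sample))
```

The downloaded CSV can then be saved to disk and loaded into RapidMiner like any other file.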
Regards,
Lionel
Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
@lionelderkrikor In search engines we trust. My memory is long gone.
Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
You can also check out Quandl if you are willing to pay for a premium subscription. They have very complete financial data available through friendly APIs that work well with RapidMiner.
Maerkli Member Posts: 84 Guru
Hello RapidMiner Community,
Many thanks for your cooperation.
Maerkli
Answers
It is true, Lionel. My data file is empty. Thanks for the support.
Maerkli
@Maerkli and @lionelderkrikor, there has been a lot of discussion in the community (go search for it) about the old Fin/Econ extension. 1) It's dead; no one is updating it anymore. 2) Yahoo changed its internals to make it harder to extract stock data.
So you're left with using something else, or manually downloading the stock prices as a CSV file and loading it with a Read CSV operator.
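If you go the manual CSV route, it can help to sanity-check the file before handing it to RapidMiner. Here is a small Python sketch that pulls (date, close) pairs out of a downloaded file. It assumes a header like `Date,Open,High,Low,Close,Adj Close,Volume` and that missing prices show up as the text `null`; both are assumptions about the download format, and the sample values are invented.

```python
import csv
import io
from typing import List, Tuple


def load_closes(text: str, date_col: str = "Date",
                close_col: str = "Close") -> List[Tuple[str, float]]:
    """Return (date, close) pairs in chronological order, skipping empty or 'null' prices."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        raw = (rec.get(close_col) or "").strip()
        if raw in ("", "null"):
            continue  # no price recorded for this day
        rows.append((rec[date_col], float(raw)))
    rows.sort(key=lambda row: row[0])  # ISO dates (yyyy-MM-dd) sort correctly as strings
    return rows


if __name__ == "__main__":
    sample = (
        "Date,Open,High,Low,Close,Adj Close,Volume\n"
        "2015-06-02,10.0,11.0,9.5,10.5,10.5,1200\n"
        "2015-06-01,9.8,10.2,9.7,10.0,10.0,900\n"
        "2015-05-29,null,null,null,null,null,null\n"
    )
    print(load_closes(sample))
```

An empty result here would correspond to the empty example set that trips up the Rename operator.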
http://investexcel.net/multiple-stock-quote-downloader-for-excel/
This still works and it's great: you can download stock data for, say, 100 AIM tickers, and each one gets written to its own CSV file.
You can merge them and find, say, all stocks priced under X with an average volume of Y over N days,
to quickly whittle the list down to low-cost stocks with growing interest.
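That screen can be sketched in a few lines of Python. This is only an illustration: the data layout (a dict of ticker to a chronological list of (close, volume) pairs), the function name `screen`, and reading "average volume Y" as "at least Y" are my own assumptions, and the numbers in the example are made up.

```python
from statistics import mean
from typing import Dict, List, Tuple

Bar = Tuple[float, int]  # (close, volume) for one trading day


def screen(history: Dict[str, List[Bar]], max_price: float,
           min_avg_volume: float, n_days: int) -> List[str]:
    """Tickers whose last close is under max_price and whose average
    volume over the last n_days (n_days >= 1) is at least min_avg_volume."""
    picks = []
    for ticker, bars in history.items():
        recent = bars[-n_days:]
        if len(recent) < n_days:
            continue  # not enough history to judge
        last_close = recent[-1][0]
        avg_volume = mean(volume for _, volume in recent)
        if last_close < max_price and avg_volume >= min_avg_volume:
            picks.append(ticker)
    return sorted(picks)


if __name__ == "__main__":
    history = {
        "AAA": [(0.30, 1000), (0.28, 3000), (0.27, 5000)],
        "BBB": [(5.00, 9000), (5.10, 9000), (5.20, 9000)],
        "CCC": [(0.10, 10), (0.10, 20), (0.10, 30)],
    }
    print(screen(history, max_price=1.0, min_avg_volume=2000, n_days=3))
```

Comparing the average over the last N days with a longer window would be one way to capture "growing interest", but that is left out here.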
I picked ELan oil and gas with this method at 0.27; it touched 1.37 a couple of days ago.
I have not incorporated this into RapidMiner yet. It would be great to load ticker data for hundreds of stocks and run some filtering analysis on it.
Hi, just a quick update: it seems you sometimes have to click the "get stock quote" button twice to get the cookie.
London stock exchange AIM List Tickers
You can also get cryptocurrency prices with the spreadsheet, using tickers like these:
XRP-USD
BTC-USD
ETH-USD