
RapidMiner and JSON

wirtcalwirtcal Member Posts: 16 Maven
edited November 2018 in Help

Hello!

How can I convert a JSON object into a table that RapidMiner can handle?

 

This is the JSON that I am working on:

[{"date":1465632900,"high":0.00199281,"low":0.00199281,"open":0.00199281,"close":0.00199281,"volume":0.00078269,"quoteVolume":0.39276167,"weightedAverage":0.00199281},{"date":1465633200,"high":0.00199281,"low":0.00199281,"open":0.00199281,"close":0.00199281,"volume":0.00034535,"quoteVolume":0.17329899,"weightedAverage":0.00199281},{"date":1465633500,"high":0.00198761,"low":0.00198761,"open":0.00198761,"close":0.00198761,"volume":0.00126317,"quoteVolume":0.63552206,"weightedAverage":0.00198761},{"date":1465633800,"high":0.00200383,"low":0.00199217,"open":0.00199217,"close":0.00200383,"volume":0.99928894,"quoteVolume":499.17633002,"weightedAverage":0.00200187}]

 

I tried:

Get Page (url) ---> JSON to XML

However I got this message:

"A JSONObject text must begin with '{' at character 1"

 

I also realized that my JSON does not have a root/enclosing name, and I guess this may be the problem.

 

What can I do to read this JSON as a table? Thanks!
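
A note on the error quoted above: that message is what an object-only JSON parser reports when the top-level value is an array (`[...]`) rather than an object (`{...}`). The Python sketch below is not RapidMiner code. It only illustrates the difference and shows one way to give the array an enclosing name. The key `root` and the helper `wrap_top_level_array` are made-up names, and whether JSON to XML accepts the wrapped form is an assumption that would need checking.

```python
import json

def wrap_top_level_array(text, root_key="root"):
    """Return JSON text whose top-level value is an object.

    An array is wrapped as {root_key: [...]}; an object is passed through.
    root_key is an arbitrary, hypothetical name.
    """
    value = json.loads(text)
    if isinstance(value, list):
        value = {root_key: value}
    return json.dumps(value)

# Shortened version of the JSON from the question (two records, two fields).
sample = '[{"date":1465632900,"close":0.00199281},{"date":1465633200,"close":0.00199281}]'
wrapped = wrap_top_level_array(sample)
print(wrapped[0])  # first character of the wrapped document
```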


Best Answer

  • MartinLiebigMartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Solution Accepted

    Hi wirtcal,

     

    JSON To Data does the job. The data needs to be de-pivoted afterwards. A process is attached.

     

    ~Martin

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.2.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.2.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="text:create_document" compatibility="7.2.000" expanded="true" height="68" name="Create Document" width="90" x="112" y="34">
    <parameter key="text" value="[{&quot;date&quot;:1465632900,&quot;high&quot;:0.00199281,&quot;low&quot;:0.00199281,&quot;open&quot;:0.00199281,&quot;close&quot;:0.00199281,&quot;volume&quot;:0.00078269,&quot;quoteVolume&quot;:0.39276167,&quot;weightedAverage&quot;:0.00199281},{&quot;date&quot;:1465633200,&quot;high&quot;:0.00199281,&quot;low&quot;:0.00199281,&quot;open&quot;:0.00199281,&quot;close&quot;:0.00199281,&quot;volume&quot;:0.00034535,&quot;quoteVolume&quot;:0.17329899,&quot;weightedAverage&quot;:0.00199281},{&quot;date&quot;:1465633500,&quot;high&quot;:0.00198761,&quot;low&quot;:0.00198761,&quot;open&quot;:0.00198761,&quot;close&quot;:0.00198761,&quot;volume&quot;:0.00126317,&quot;quoteVolume&quot;:0.63552206,&quot;weightedAverage&quot;:0.00198761},{&quot;date&quot;:1465633800,&quot;high&quot;:0.00200383,&quot;low&quot;:0.00199217,&quot;open&quot;:0.00199217,&quot;close&quot;:0.00200383,&quot;volume&quot;:0.99928894,&quot;quoteVolume&quot;:499.17633002,&quot;weightedAverage&quot;:0.00200187}]&#10;&#10; "/>
    </operator>
    <operator activated="true" class="text:json_to_data" compatibility="7.2.000" expanded="true" height="82" name="JSON To Data" width="90" x="246" y="34"/>
    <operator activated="true" class="de_pivot" compatibility="7.2.001" expanded="true" height="82" name="De-Pivot" width="90" x="380" y="34">
    <list key="attribute_name">
    <parameter key="close" value="\[\d+\]\.close"/>
    <parameter key="high" value="\[\d+\]\.high"/>
    <parameter key="low" value="\[\d+\]\.low"/>
    <parameter key="open" value="\[\d+\]\.open"/>
    <parameter key="date" value="\[\d+\]\.date"/>
    <parameter key="volume" value="\[\d+\]\.volume"/>
    <parameter key="quoteVolume" value="\[\d+\]\.quoteVolume"/>
    <parameter key="weightedAverage" value="\[\d+\]\.weightedAverage"/>
    </list>
    <parameter key="index_attribute" value="id"/>
    </operator>
    <connect from_op="Create Document" from_port="output" to_op="JSON To Data" to_port="documents 1"/>
    <connect from_op="JSON To Data" from_port="example set" to_op="De-Pivot" to_port="example set input"/>
    <connect from_op="De-Pivot" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
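
For readers who want to see the idea behind the process above outside of RapidMiner, here is a rough Python sketch. It assumes that JSON To Data names each value by its path (for example `[0].close`), which is what the De-Pivot regular expressions such as `\[\d+\]\.close` suggest; the exact naming and index base in RapidMiner may differ. The functions `flatten` and `depivot` are illustrative names, not RapidMiner APIs.

```python
import json
import re
from collections import defaultdict

def flatten(value, path=""):
    """Yield (path, scalar) pairs, naming each value by its path, e.g. '[0].date'."""
    if isinstance(value, list):
        for i, item in enumerate(value):
            yield from flatten(item, f"{path}[{i}]")
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from flatten(item, f"{path}.{key}")
    else:
        yield path, value

def depivot(wide):
    """Group '[<index>].<field>' columns into one row per index."""
    pattern = re.compile(r"\[(\d+)\]\.(\w+)$")
    rows = defaultdict(dict)
    for name, val in wide.items():
        match = pattern.search(name)
        if match:
            rows[int(match.group(1))][match.group(2)] = val
    return [rows[i] for i in sorted(rows)]

# Shortened version of the JSON from the question.
text = ('[{"date":1465632900,"high":0.00199281,"close":0.00199281},'
        '{"date":1465633200,"high":0.00199281,"close":0.00199281}]')
wide = dict(flatten(json.loads(text)))   # one very wide "example"
table = depivot(wide)                    # one row per array element
for row in table:
    print(row)
```

The wide dictionary corresponds to the single-row example set, and the grouped rows correspond to the de-pivoted table.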

Answers

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Just an FYI. The JSON to Data operator is found in the Text Mining extension. Download that first.

  • wirtcalwirtcal Member Posts: 16 Maven

    Thank you both!

     

    I was just about to ask whether this operator was only available in RapidMiner Pro.

     

    I will install the Text Mining extension to check this out.

     

    Cheers!

  • AndrewB1AndrewB1 Member Posts: 3 Contributor I

    Is there a straightforward way to handle non-uniform JSON data, i.e. JSON with dynamic fields that don't appear in every example? I get an error in this case, but would prefer to get back null for that example/attribute.

  • BalazsBaranyBalazsBarany Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn

    Well, it depends on your definition of straight forward ;-)

     

    JSON to Data gives you a very wide example set, usually with only one example. You can work on it with Loop Attributes, but I found it sometimes easier to transpose or rotate the example set (using the Transpose operator). It might be easier to extract stuff like example indexes and so on with the transposed structure. However, Transpose will convert numeric data to text with a standard formatting. You might want to do that conversion yourself before the Transpose, with the operator of your choosing (Format Numbers, Numerical to Polynominal).
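
As an illustration of the "null for missing fields" behaviour asked about above, here is a small Python sketch. It is not a RapidMiner process; it only shows the target result. The sample records are adapted from the thread's data with fields dropped on purpose, and `to_uniform_rows` is a hypothetical helper name.

```python
import json

def to_uniform_rows(records):
    """Build rows over the union of all keys, using None where a key is missing."""
    columns = []
    for rec in records:
        for key in rec:
            if key not in columns:
                columns.append(key)   # keep first-seen column order
    rows = [{col: rec.get(col) for col in columns} for rec in records]
    return columns, rows

# Two records with different fields (made-up variation of the thread's sample).
text = '[{"date":1465632900,"close":0.00199281},{"date":1465633200,"volume":0.00034535}]'
columns, rows = to_uniform_rows(json.loads(text))
print(columns)
for row in rows:
    print(row)
```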

     

  • shoebjoardershoebjoarder Member Posts: 1 Learner III

    Hi, where do you put this process code?

    I have a JSON file extracted from Elasticsearch, but I get only one row of data when I convert it with JSON to Data. I would like to separate the data into a proper table.

    Could you please explain it in a bit more detail?

     

     

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    If you want to use the XML process code, check out this KB article on how to do it: http://community.rapidminer.com/t5/RapidMiner-Studio-Knowledge-Base/How-can-I-share-processes-without-RapidMiner-Server/ta-p/37047

  • BalazsBaranyBalazsBarany Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn

    If you see XML code in the Community and want to use it, just copy everything, activate the XML tab in Studio (View/Show Panel/XML) and paste it there.

     

    JSON can have a very complex structure; it is not guaranteed to be a "proper table". So what JSON to Data does is take all elements and name them by the "path" to the element. It's something you need to get used to, but it avoids a lot of complexity for simple documents. 

     

    To extract tabular data, look at the metadata of the table or transpose it using the Transpose operator. You'll see that the attribute names (or the ID of the transposed line) have a structure that you can extract. E. g. you might have a name like example[1][1]. You can use Generate Attributes or Replace for extracting the index numbers (1, 1). Then you do some filtering, maybe joining or pivoting to come up with the structure you need.

     

    You might not like this approach. There's another: JSON to XML in the Web Mining extension. 

    Take your JSON document, use JSON to XML and Write Document to export the generated XML file. Then you can use the Read XML operator's wizard to extract contents in a more structured way. However, this doesn't always work as JSON is more flexible than XML, so there are many JSON documents that can't be converted.

     

    Regards,

    Balázs
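
The index-extraction step described in this post (pulling `1, 1` out of a name like `example[1][1]`) can be sketched in Python as follows. In RapidMiner the same idea would be expressed with Generate Attributes or Replace, as the post says; the regular expression and the helper name `split_path` here are only an illustration.

```python
import re

INDEX = re.compile(r"\[(\d+)\]")

def split_path(name):
    """Split a path-style attribute name into its base name and index numbers.

    'example[1][1]' -> ('example', (1, 1))
    """
    indexes = tuple(int(n) for n in INDEX.findall(name))
    base = INDEX.sub("", name).lstrip(".")
    return base, indexes

print(split_path("example[1][1]"))
print(split_path("[0].close"))
```

Once the indexes are separate values, filtering, joining or pivoting on them gives one row per index.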

  • thapli_64thapli_64 Member Posts: 18 Maven

    Just what I was looking for! Thank you!

  • websiteguywebsiteguy Member Posts: 24 Maven

    @mschmitz

    Hi

    Can you help me out with the timestamp? I can't work out how to convert it from a string of numbers to a date. I just keep ending up with 1970.

     

    https://poloniex.com/public?command=returnChartData&currencyPair=USDT_XRP&end=9999999999&period=1440...

    I spent hours trying to work out how to get it from a string to a date-time format.

     

     

    I did post here

     

    https://community.rapidminer.com/t5/RapidMiner-Studio-Forum/JSON-to-data-and-de-pivot-for-exampleset/m-p/47827#M30657 as well

    thanks in advance, lee

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @websiteguy the problem is that the date is an integer and all the other values are real. The trick is to use a Numerical to Real operator. You'll have to convert the dates back to a readable format downstream. 

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="false" class="text:create_document" compatibility="8.1.000" expanded="true" height="68" name="Create Document" width="90" x="45" y="136">
    <parameter key="text" value="[{&quot;date&quot;:1465632900,&quot;high&quot;:0.00199281,&quot;low&quot;:0.00199281,&quot;open&quot;:0.00199281,&quot;close&quot;:0.00199281,&quot;volume&quot;:0.00078269,&quot;quoteVolume&quot;:0.39276167,&quot;weightedAverage&quot;:0.00199281},{&quot;date&quot;:1465633200,&quot;high&quot;:0.00199281,&quot;low&quot;:0.00199281,&quot;open&quot;:0.00199281,&quot;close&quot;:0.00199281,&quot;volume&quot;:0.00034535,&quot;quoteVolume&quot;:0.17329899,&quot;weightedAverage&quot;:0.00199281},{&quot;date&quot;:1465633500,&quot;high&quot;:0.00198761,&quot;low&quot;:0.00198761,&quot;open&quot;:0.00198761,&quot;close&quot;:0.00198761,&quot;volume&quot;:0.00126317,&quot;quoteVolume&quot;:0.63552206,&quot;weightedAverage&quot;:0.00198761},{&quot;date&quot;:1465633800,&quot;high&quot;:0.00200383,&quot;low&quot;:0.00199217,&quot;open&quot;:0.00199217,&quot;close&quot;:0.00200383,&quot;volume&quot;:0.99928894,&quot;quoteVolume&quot;:499.17633002,&quot;weightedAverage&quot;:0.00200187}]&#10;&#10; "/>
    </operator>
    <operator activated="true" class="web:get_webpage" compatibility="7.3.000" expanded="true" height="68" name="Get Page" width="90" x="45" y="34">
    <parameter key="url" value="https://poloniex.com/public?command=returnChartData&amp;currencyPair=USDT_XRP&amp;end=9999999999&amp;period=14400&amp;start=1405699200"/>
    <parameter key="random_user_agent" value="true"/>
    <list key="query_parameters"/>
    <list key="request_properties"/>
    </operator>
    <operator activated="true" class="text:json_to_data" compatibility="8.1.000" expanded="true" height="82" name="JSON To Data" width="90" x="179" y="34"/>
    <operator activated="true" class="numerical_to_real" compatibility="8.1.001" expanded="true" height="82" name="Numerical to Real" width="90" x="313" y="34"/>
    <operator activated="true" class="de_pivot" compatibility="8.1.001" expanded="true" height="82" name="De-Pivot" width="90" x="447" y="34">
    <list key="attribute_name">
    <parameter key="close" value="\[\d+\]\.close"/>
    <parameter key="high" value="\[\d+\]\.high"/>
    <parameter key="low" value="\[\d+\]\.low"/>
    <parameter key="open" value="\[\d+\]\.open"/>
    <parameter key="date" value="\[\d+\]\.date"/>
    <parameter key="volume" value="\[\d+\]\.volume"/>
    <parameter key="quoteVolume" value="\[\d+\]\.quoteVolume"/>
    <parameter key="weightedAverage" value="\[\d+\]\.weightedAverage"/>
    </list>
    <parameter key="index_attribute" value="id"/>
    </operator>
    <connect from_op="Get Page" from_port="output" to_op="JSON To Data" to_port="documents 1"/>
    <connect from_op="JSON To Data" from_port="example set" to_op="Numerical to Real" to_port="example set input"/>
    <connect from_op="Numerical to Real" from_port="example set output" to_op="De-Pivot" to_port="example set input"/>
    <connect from_op="De-Pivot" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
  • websiteguywebsiteguy Member Posts: 24 Maven

    Hi Tom, and @sgenzer (this process below), thanks for the reply.

    I already have that part resolved after following the tutorial in this thread. The issue I'm having is that the date seems to be a timestamp, a long string of numbers. After converting it using Numerical to Real, how do I then convert the date into a format like the one below? Reason: Alpha Vantage does not have intraday prices for this ticker.

    Using an ARIMA process I found on the forum :)


    @Thomas_Ott wrote:

    @websiteguy the problem is that the date is an integer and all the other values are real. The trick is to use a Numerical to Real operator. You'll have to convert the dates back to a readable format downstream. 

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="false" class="text:create_document" compatibility="8.1.000" expanded="true" height="68" name="Create Document" width="90" x="45" y="136">
    <parameter key="text" value="[{&quot;date&quot;:1465632900,&quot;high&quot;:0.00199281,&quot;low&quot;:0.00199281,&quot;open&quot;:0.00199281,&quot;close&quot;:0.00199281,&quot;volume&quot;:0.00078269,&quot;quoteVolume&quot;:0.39276167,&quot;weightedAverage&quot;:0.00199281},{&quot;date&quot;:1465633200,&quot;high&quot;:0.00199281,&quot;low&quot;:0.00199281,&quot;open&quot;:0.00199281,&quot;close&quot;:0.00199281,&quot;volume&quot;:0.00034535,&quot;quoteVolume&quot;:0.17329899,&quot;weightedAverage&quot;:0.00199281},{&quot;date&quot;:1465633500,&quot;high&quot;:0.00198761,&quot;low&quot;:0.00198761,&quot;open&quot;:0.00198761,&quot;close&quot;:0.00198761,&quot;volume&quot;:0.00126317,&quot;quoteVolume&quot;:0.63552206,&quot;weightedAverage&quot;:0.00198761},{&quot;date&quot;:1465633800,&quot;high&quot;:0.00200383,&quot;low&quot;:0.00199217,&quot;open&quot;:0.00199217,&quot;close&quot;:0.00200383,&quot;volume&quot;:0.99928894,&quot;quoteVolume&quot;:499.17633002,&quot;weightedAverage&quot;:0.00200187}]&#10;&#10; "/>
    </operator>
    <operator activated="true" class="web:get_webpage" compatibility="7.3.000" expanded="true" height="68" name="Get Page" width="90" x="45" y="34">
    <parameter key="url" value="https://poloniex.com/public?command=returnChartData&amp;currencyPair=USDT_XRP&amp;end=9999999999&amp;period=14400&amp;start=1405699200"/>
    <parameter key="random_user_agent" value="true"/>
    <list key="query_parameters"/>
    <list key="request_properties"/>
    </operator>
    <operator activated="true" class="text:json_to_data" compatibility="8.1.000" expanded="true" height="82" name="JSON To Data" width="90" x="179" y="34"/>
    <operator activated="true" class="numerical_to_real" compatibility="8.1.001" expanded="true" height="82" name="Numerical to Real" width="90" x="313" y="34"/>
    <operator activated="true" class="de_pivot" compatibility="8.1.001" expanded="true" height="82" name="De-Pivot" width="90" x="447" y="34">
    <list key="attribute_name">
    <parameter key="close" value="\[\d+\]\.close"/>
    <parameter key="high" value="\[\d+\]\.high"/>
    <parameter key="low" value="\[\d+\]\.low"/>
    <parameter key="open" value="\[\d+\]\.open"/>
    <parameter key="date" value="\[\d+\]\.date"/>
    <parameter key="volume" value="\[\d+\]\.volume"/>
    <parameter key="quoteVolume" value="\[\d+\]\.quoteVolume"/>
    <parameter key="weightedAverage" value="\[\d+\]\.weightedAverage"/>
    </list>
    <parameter key="index_attribute" value="id"/>
    </operator>
    <connect from_op="Get Page" from_port="output" to_op="JSON To Data" to_port="documents 1"/>
    <connect from_op="JSON To Data" from_port="example set" to_op="Numerical to Real" to_port="example set input"/>
    <connect from_op="Numerical to Real" from_port="example set output" to_op="De-Pivot" to_port="example set input"/>
    <connect from_op="De-Pivot" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>

     

    thanks for your time, appreciated.

     Lee


     

    Sat, 17 Mar 2018 12:00:00 +0000

    [Attached images: date-timestamp-date.png, date-timestamp.png, date.png]

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @websiteguy could you use a Numerical to Date operator and set the Offset parameter?
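
One possible explanation for the 1970 results, offered as an assumption to verify rather than a confirmed diagnosis: if the conversion treats the number as milliseconds since 1970-01-01 while the API returns seconds, every value lands in January 1970. The Python sketch below uses the first `date` value from the thread's sample JSON to show the difference.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ts = 1465632900  # first "date" value from the sample JSON in this thread

as_millis = EPOCH + timedelta(milliseconds=ts)   # falls in January 1970
as_seconds = EPOCH + timedelta(seconds=ts)       # falls in June 2016

print(as_millis.isoformat())
print(as_seconds.isoformat())
```

If that assumption holds for your version, scaling the value by 1000 before the conversion would be one way to get a sensible date.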

  • websiteguywebsiteguy Member Posts: 24 Maven

    Hi @Thomas_Ott

     

    Wow, that was quick. I went back to edit my reply and you had already replied.

     

    Could you please provide an example in a process? (I have been trying to resolve this all night, with limited experience.)

     

    Do you have to define the date as an attribute first?

     

    Found this

    https://community.rapidminer.com/t5/RapidMiner-Studio-Forum/Date-Time-Formatting-UTC/td-p/28160

     

    Is this the right method?

    date-reading.png

     

    As a side note,

     

    I intend to print the manual. What's the best written document to read? I don't have much maths or coding, but I'm learning from examples, by butchering different processes together.

     

     

     

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @websiteguy try this process, it uses a Generate Attributes operator. 


    Gives you this nice chart of Ripple.  You should buy a lot of it. I own many units of it at $0.73.

    XRP.png

     

     

     

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="false" class="text:create_document" compatibility="8.1.000" expanded="true" height="68" name="Create Document" width="90" x="45" y="136">
    <parameter key="text" value="[{&quot;date&quot;:1465632900,&quot;high&quot;:0.00199281,&quot;low&quot;:0.00199281,&quot;open&quot;:0.00199281,&quot;close&quot;:0.00199281,&quot;volume&quot;:0.00078269,&quot;quoteVolume&quot;:0.39276167,&quot;weightedAverage&quot;:0.00199281},{&quot;date&quot;:1465633200,&quot;high&quot;:0.00199281,&quot;low&quot;:0.00199281,&quot;open&quot;:0.00199281,&quot;close&quot;:0.00199281,&quot;volume&quot;:0.00034535,&quot;quoteVolume&quot;:0.17329899,&quot;weightedAverage&quot;:0.00199281},{&quot;date&quot;:1465633500,&quot;high&quot;:0.00198761,&quot;low&quot;:0.00198761,&quot;open&quot;:0.00198761,&quot;close&quot;:0.00198761,&quot;volume&quot;:0.00126317,&quot;quoteVolume&quot;:0.63552206,&quot;weightedAverage&quot;:0.00198761},{&quot;date&quot;:1465633800,&quot;high&quot;:0.00200383,&quot;low&quot;:0.00199217,&quot;open&quot;:0.00199217,&quot;close&quot;:0.00200383,&quot;volume&quot;:0.99928894,&quot;quoteVolume&quot;:499.17633002,&quot;weightedAverage&quot;:0.00200187}]&#10;&#10; "/>
    </operator>
    <operator activated="true" class="web:get_webpage" compatibility="7.3.000" expanded="true" height="68" name="Get Page" width="90" x="45" y="34">
    <parameter key="url" value="https://poloniex.com/public?command=returnChartData&amp;currencyPair=USDT_XRP&amp;end=9999999999&amp;period=14400&amp;start=1405699200"/>
    <parameter key="random_user_agent" value="true"/>
    <list key="query_parameters"/>
    <list key="request_properties"/>
    </operator>
    <operator activated="true" class="text:json_to_data" compatibility="8.1.000" expanded="true" height="82" name="JSON To Data" width="90" x="179" y="34"/>
    <operator activated="true" class="numerical_to_real" compatibility="8.1.001" expanded="true" height="82" name="Numerical to Real" width="90" x="313" y="34"/>
    <operator activated="true" class="de_pivot" compatibility="8.1.001" expanded="true" height="82" name="De-Pivot" width="90" x="447" y="34">
    <list key="attribute_name">
    <parameter key="close" value="\[\d+\]\.close"/>
    <parameter key="high" value="\[\d+\]\.high"/>
    <parameter key="low" value="\[\d+\]\.low"/>
    <parameter key="open" value="\[\d+\]\.open"/>
    <parameter key="date" value="\[\d+\]\.date"/>
    <parameter key="volume" value="\[\d+\]\.volume"/>
    <parameter key="quoteVolume" value="\[\d+\]\.quoteVolume"/>
    <parameter key="weightedAverage" value="\[\d+\]\.weightedAverage"/>
    </list>
    <parameter key="index_attribute" value="id"/>
    </operator>
    <operator activated="true" breakpoints="after" class="real_to_integer" compatibility="8.1.001" expanded="true" height="82" name="Real to Integer" width="90" x="581" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="date"/>
    </operator>
    <operator activated="true" class="generate_attributes" compatibility="8.1.001" expanded="true" height="82" name="Generate Attributes" width="90" x="715" y="34">
    <list key="function_descriptions">
    <parameter key="Coverted Date" value="date_add(date_parse(&quot;01/01/1970&quot;),date,DATE_UNIT_SECOND)"/>
    </list>
    </operator>
    <operator activated="false" class="numerical_to_date" compatibility="8.1.001" expanded="true" height="82" name="Numerical to Date" width="90" x="581" y="187">
    <parameter key="attribute_name" value="date"/>
    </operator>
    <connect from_op="Get Page" from_port="output" to_op="JSON To Data" to_port="documents 1"/>
    <connect from_op="JSON To Data" from_port="example set" to_op="Numerical to Real" to_port="example set input"/>
    <connect from_op="Numerical to Real" from_port="example set output" to_op="De-Pivot" to_port="example set input"/>
    <connect from_op="De-Pivot" from_port="example set output" to_op="Real to Integer" to_port="example set input"/>
    <connect from_op="Real to Integer" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
    <connect from_op="Generate Attributes" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process> 
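
For comparison, the Generate Attributes expression in the process above (`date_add(date_parse("01/01/1970"), date, DATE_UNIT_SECOND)`) adds the timestamp, in seconds, to the start of 1970. The same arithmetic can be sketched in Python, including the `Sat, 17 Mar 2018 12:00:00 +0000` style of output asked for earlier in the thread. Two assumptions are made here: the epoch is pinned to UTC (the process's `date_parse` may use the local time zone instead), and the value `1521288000` was computed for this illustration to match that example date; it is not taken from the API response. `epoch_seconds_to_text` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_seconds_to_text(seconds):
    """Add epoch seconds to 1970-01-01 UTC and format the result RFC 2822 style."""
    moment = EPOCH + timedelta(seconds=int(seconds))
    return format_datetime(moment)

print(epoch_seconds_to_text(1521288000))   # illustrative value, see note above
print(epoch_seconds_to_text(1465632900))   # first "date" value from the sample JSON
```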

     

  • websiteguywebsiteguy Member Posts: 24 Maven

    Thanks @Thomas_Ott

     

    you just made my day :)

  • websiteguywebsiteguy Member Posts: 24 Maven

    Hi @Thomas_Ott, do you have this process combined with ARIMA for predictions?

    I've been into XRP since 2014... waiting patiently ...

    Do you have any other crypto analysis processes you can share (or PM)? :)

    This is cool, have you seen it?

    https://blog.patricktriest.com/analyzing-cryptocurrencies-python/

     

    Cheers Lee

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @websiteguy I haven't seen that post but I'll check it out later. Thanks.

     

    WRT doing ARIMA: I adapted this ARIMA process that @luc_bartkowski put together. Just take the process above with the JSON and timestamp conversions and attach it to this. It might take a few minutes to process. 

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.1.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="generate_macro" compatibility="8.1.001" expanded="true" height="68" name="Current Date" width="90" x="916" y="289">
    <list key="function_descriptions">
    <parameter key="CurrentDate" value="date_now()"/>
    </list>
    </operator>
    <operator activated="true" class="set_macro" compatibility="8.1.001" expanded="true" height="68" name="Training To Date" width="90" x="916" y="187">
    <parameter key="macro" value="TrainingDateTo"/>
    <parameter key="value" value="%{CurrentDate}"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="8.1.001" expanded="true" height="68" name="Prediction Horizon" width="90" x="849" y="85">
    <parameter key="macro" value="PredictionHorizon"/>
    <parameter key="value" value="20"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="8.1.001" expanded="true" height="68" name="Training From Date" width="90" x="782" y="187">
    <parameter key="macro" value="AnalysesDateFrom"/>
    <parameter key="value" value="2016/02/11"/>
    </operator>
    <operator activated="true" class="set_macro" compatibility="8.1.001" expanded="true" height="68" name="Optimization Cycles" width="90" x="782" y="289">
    <parameter key="macro" value="OptimizeCycles"/>
    <parameter key="value" value="50"/>
    </operator>
    <operator activated="false" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Get/Join Data" width="90" x="514" y="85">
    <process expanded="true">
    <operator activated="false" class="subprocess" compatibility="8.1.001" expanded="true" height="82" name="Oil Futures" width="90" x="112" y="136">
    <process expanded="true">
    <operator activated="false" class="jdbc_connectors:read_database" compatibility="8.1.001" expanded="true" height="68" name="Read Database (2)" width="90" x="45" y="34">
    <parameter key="connection" value="MySQL"/>
    <parameter key="query" value="SELECT *&#10;FROM `oil`&#10;ORDER BY Date desc&#10;limit 9999"/>
    <enumeration key="parameters"/>
    </operator>
    <operator activated="false" class="store" compatibility="8.1.001" expanded="true" height="68" name="Store (11)" width="90" x="45" y="136">
    <parameter key="repository_entry" value="//Cloud Repository/Samples/data/oilfuturesvw"/>
    </operator>
    <operator activated="false" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve (2)" width="90" x="179" y="136">
    <parameter key="repository_entry" value="../data/oilfuturesvw"/>
    </operator>
    <operator activated="false" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve AAPL" width="90" x="179" y="34">
    <parameter key="repository_entry" value="../data/AAPL"/>
    </operator>
    <operator activated="false" class="rename" compatibility="8.1.001" expanded="true" height="82" name="Rename (8)" width="90" x="782" y="85">
    <parameter key="old_name" value="Date"/>
    <parameter key="new_name" value="oilDate"/>
    <list key="rename_additional_attributes">
    <parameter key="High" value="oilHigh"/>
    <parameter key="Low" value="oilLow"/>
    <parameter key="Open" value="oilOpen"/>
    <parameter key="Previous Day Open Interest" value="oilPrevDayOpenInt"/>
    <parameter key="Settle" value="oilSettle"/>
    <parameter key="Volume" value="oilVolume"/>
    <parameter key="Last" value="oilLast"/>
    </list>
    </operator>
    <operator activated="false" class="retrieve" compatibility="8.1.001" expanded="true" height="68" name="Retrieve NVDA (1)" width="90" x="313" y="136">
    <parameter key="repository_entry" value="../data/NVDA (1)"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes (2)" width="90" x="514" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="Date|High|Low|Open|Volume|Last|Adj Close"/>
    </operator>
    <operator activated="true" class="nominal_to_date" compatibility="8.1.001" expanded="true" height="82" name="Nominal to Date (8)" width="90" x="648" y="34">
    <parameter key="attribute_name" value="Date"/>
    <parameter key="date_format" value="yyyy-MM-dd"/>
    </operator>
    <connect from_port="in 1" to_op="Select Attributes (2)" to_port="example set input"/>
    <connect from_op="Select Attributes (2)" from_port="example set output" to_op="Nominal to Date (8)" to_port="example set input"/>
    <connect from_op="Nominal to Date (8)" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="read_csv" compatibility="8.1.000" expanded="true" height="68" name="Read CSV" width="90" x="112" y="34">
    <parameter key="csv_file" value="C:\Users\TomOtt\Downloads\AAPL.csv"/>
    <parameter key="column_separators" value=","/>
    <parameter key="first_row_as_names" value="false"/>
    <list key="annotations">
    <parameter key="0" value="Name"/>
    </list>
    <parameter key="encoding" value="windows-1252"/>
    <list key="data_set_meta_data_information"/>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes (3)" width="90" x="246" y="34">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="Date|High|Low|Open|Volume|Last|Adj Close"/>
    </operator>
    <operator activated="true" class="nominal_to_date" compatibility="8.1.001" expanded="true" height="82" name="Nominal to Date (2)" width="90" x="380" y="34">
    <parameter key="attribute_name" value="Date"/>
    <parameter key="date_format" value="yyyy-MM-dd"/>
    </operator>
    <connect from_op="Read CSV" from_port="output" to_op="Select Attributes (3)" to_port="example set input"/>
    <connect from_op="Select Attributes (3)" from_port="example set output" to_op="Nominal to Date (2)" to_port="example set input"/>
    <connect from_op="Nominal to Date (2)" from_port="example set output" to_port="out 1"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="productivity:execute_process" compatibility="8.1.001" expanded="true" height="68" name="Execute XRP Crypto" width="90" x="45" y="85">
    <parameter key="process_location" value="XRP Crypto"/>
    <list key="macros"/>
    </operator>
    <operator activated="true" class="rename" compatibility="8.1.001" expanded="true" height="82" name="Rename" width="90" x="179" y="85">
    <parameter key="old_name" value="Coverted Date"/>
    <parameter key="new_name" value="Date"/>
    <list key="rename_additional_attributes">
    <parameter key="close" value="Adj Close"/>
    </list>
    </operator>
    <operator activated="true" class="sort" compatibility="8.1.001" expanded="true" height="82" name="Sort" width="90" x="313" y="85">
    <parameter key="attribute_name" value="Date"/>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.1.001" expanded="true" height="82" name="Set Role" width="90" x="112" y="238">
    <parameter key="attribute_name" value="Adj Close"/>
    <parameter key="target_role" value="label"/>
    <list key="set_additional_roles">
    <parameter key="Date" value="id"/>
    </list>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Select Attributes" width="90" x="246" y="238">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="oilLast|oilHigh|oilLow|oilOpen|oilSettle|oilPrevDayOpenInt|oilVolume|Volume|Open|Low|High|Adj Close"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="8.1.001" expanded="true" height="103" name="Filter Start of Trend" width="90" x="380" y="238">
    <parameter key="parameter_expression" value="date_after(Date, date_parse_custom(%{AnalysesDateFrom}, &quot;yyyy/MM/dd&quot;))"/>
    <parameter key="condition_class" value="expression"/>
    <list key="filters_list"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="8.1.001" expanded="true" height="103" name="Train until Hold-off" width="90" x="514" y="238">
    <parameter key="parameter_expression" value="date_before(Date, date_now())"/>
    <parameter key="condition_class" value="expression"/>
    <list key="filters_list"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="8.1.001" expanded="true" height="124" name="Multiply (3)" width="90" x="112" y="544"/>
    <operator activated="true" class="subprocess" compatibility="8.1.001" expanded="true" height="103" name="ARIMA Predict Last" width="90" x="246" y="442">
    <process expanded="true">
    <operator activated="true" class="optimize_parameters_evolutionary" compatibility="8.1.001" expanded="true" height="145" name="Optimize Parameters (Evolutionary)" width="90" x="112" y="34">
    <list key="parameters">
    <parameter key="ARIMA Trainer.q:_order_of_the_moving-average_model" value="[0.0;100.0]"/>
    <parameter key="ARIMA Trainer.p:_order_of_the_autoregressive_model" value="[0.0;100.0]"/>
    </list>
    <parameter key="error_handling" value="ignore error"/>
    <parameter key="max_generations" value="%{OptimizeCycles}"/>
    <parameter key="use_early_stopping" value="true"/>
    <process expanded="true">
    <operator activated="true" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer" width="90" x="246" y="136">
    <parameter key="time_series_attribute" value="Adj Close"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="Date"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="94"/>
    <parameter key="q:_order_of_the_moving-average_model" value="56"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="true" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast" width="90" x="380" y="34">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA process to forecast the next 10 values of the time series</description>
    </operator>
    <connect from_port="input 1" to_op="ARIMA Trainer" to_port="example set"/>
    <connect from_op="ARIMA Trainer" from_port="forecast model" to_op="Apply Forecast" to_port="forecast model"/>
    <connect from_op="ARIMA Trainer" from_port="performance" to_port="performance"/>
    <connect from_op="Apply Forecast" from_port="example set" to_port="result 1"/>
    <connect from_op="Apply Forecast" from_port="original" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    <operator activated="false" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer (6)" width="90" x="313" y="187">
    <parameter key="time_series_attribute" value="oilLast"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="oilDate"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="2"/>
    <parameter key="q:_order_of_the_moving-average_model" value="92"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="false" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast (6)" width="90" x="447" y="187">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA process to forecast the next 10 values of the time series</description>
    </operator>
    <connect from_port="in 1" to_op="Optimize Parameters (Evolutionary)" to_port="input 1"/>
    <connect from_op="Optimize Parameters (Evolutionary)" from_port="performance" to_port="out 1"/>
    <connect from_op="Optimize Parameters (Evolutionary)" from_port="result 1" to_port="out 2"/>
    <connect from_op="ARIMA Trainer (6)" from_port="forecast model" to_op="Apply Forecast (6)" to_port="forecast model"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    <portSpacing port="sink_out 3" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="subprocess" compatibility="8.1.001" expanded="true" height="103" name="ARIMA Predict Low" width="90" x="246" y="748">
    <process expanded="true">
    <operator activated="true" class="optimize_parameters_evolutionary" compatibility="8.1.001" expanded="true" height="124" name="Optimize Parameters (3)" width="90" x="112" y="34">
    <list key="parameters">
    <parameter key="ARIMA Trainer.q:_order_of_the_moving-average_model" value="[0.0;100.0]"/>
    <parameter key="ARIMA Trainer.p:_order_of_the_autoregressive_model" value="[0.0;100.0]"/>
    </list>
    <parameter key="error_handling" value="ignore error"/>
    <parameter key="max_generations" value="%{OptimizeCycles}"/>
    <parameter key="use_early_stopping" value="true"/>
    <process expanded="true">
    <operator activated="true" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer (5)" width="90" x="112" y="85">
    <parameter key="time_series_attribute" value="Low"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="Date"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="94"/>
    <parameter key="q:_order_of_the_moving-average_model" value="56"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="true" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast (5)" width="90" x="380" y="238">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA process to forecast the next 10 values of the time series</description>
    </operator>
    <connect from_port="input 1" to_op="ARIMA Trainer (5)" to_port="example set"/>
    <connect from_op="ARIMA Trainer (5)" from_port="forecast model" to_op="Apply Forecast (5)" to_port="forecast model"/>
    <connect from_op="ARIMA Trainer (5)" from_port="performance" to_port="performance"/>
    <connect from_op="Apply Forecast (5)" from_port="example set" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    <operator activated="false" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer (2)" width="90" x="112" y="238">
    <parameter key="time_series_attribute" value="oilLow"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="oilDate"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="94"/>
    <parameter key="q:_order_of_the_moving-average_model" value="56"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="false" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast (2)" width="90" x="246" y="238">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA process to forecast the next 10 values of the time series</description>
    </operator>
    <connect from_port="in 1" to_op="Optimize Parameters (3)" to_port="input 1"/>
    <connect from_op="Optimize Parameters (3)" from_port="performance" to_port="out 1"/>
    <connect from_op="Optimize Parameters (3)" from_port="result 1" to_port="out 2"/>
    <connect from_op="ARIMA Trainer (2)" from_port="forecast model" to_op="Apply Forecast (2)" to_port="forecast model"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    <portSpacing port="sink_out 3" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="subprocess" compatibility="8.1.001" expanded="true" height="103" name="ARIMA Predict High" width="90" x="246" y="595">
    <process expanded="true">
    <operator activated="true" class="optimize_parameters_evolutionary" compatibility="8.1.001" expanded="true" height="145" name="Optimize Parameters (2)" width="90" x="112" y="34">
    <list key="parameters">
    <parameter key="ARIMA Trainer.q:_order_of_the_moving-average_model" value="[0.0;100.0]"/>
    <parameter key="ARIMA Trainer.p:_order_of_the_autoregressive_model" value="[0.0;100.0]"/>
    </list>
    <parameter key="error_handling" value="ignore error"/>
    <parameter key="max_generations" value="%{OptimizeCycles}"/>
    <parameter key="use_early_stopping" value="true"/>
    <process expanded="true">
    <operator activated="true" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer (4)" width="90" x="246" y="34">
    <parameter key="time_series_attribute" value="High"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="Date"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="94"/>
    <parameter key="q:_order_of_the_moving-average_model" value="56"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="true" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast (4)" width="90" x="380" y="34">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA process to forecast the next 10 values of the time series</description>
    </operator>
    <connect from_port="input 1" to_op="ARIMA Trainer (4)" to_port="example set"/>
    <connect from_op="ARIMA Trainer (4)" from_port="forecast model" to_op="Apply Forecast (4)" to_port="forecast model"/>
    <connect from_op="ARIMA Trainer (4)" from_port="performance" to_port="performance"/>
    <connect from_op="Apply Forecast (4)" from_port="example set" to_port="result 1"/>
    <connect from_op="Apply Forecast (4)" from_port="original" to_port="result 2"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="source_input 2" spacing="0"/>
    <portSpacing port="sink_performance" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    <operator activated="false" class="timeseries:arima_trainer" compatibility="0.1.002" expanded="true" height="103" name="ARIMA Trainer (7)" width="90" x="112" y="238">
    <parameter key="time_series_attribute" value="oilHigh"/>
    <parameter key="has_indices" value="true"/>
    <parameter key="indices_attribute" value="oilDate"/>
    <parameter key="p:_order_of_the_autoregressive_model" value="94"/>
    <parameter key="q:_order_of_the_moving-average_model" value="56"/>
    <parameter key="estimate_constant" value="false"/>
    </operator>
    <operator activated="false" class="timeseries:apply_forecast" compatibility="0.1.002" expanded="true" height="82" name="Apply Forecast (7)" width="90" x="246" y="238">
    <parameter key="forecast_horizon" value="%{PredictionHorizon}"/>
    <description align="center" color="transparent" colored="false" width="126">Applying the ARIMA process to forecast the next 10 values of the time series</description>
    </operator>
    <connect from_port="in 1" to_op="Optimize Parameters (2)" to_port="input 1"/>
    <connect from_op="Optimize Parameters (2)" from_port="performance" to_port="out 1"/>
    <connect from_op="Optimize Parameters (2)" from_port="result 1" to_port="out 2"/>
    <connect from_op="ARIMA Trainer (7)" from_port="forecast model" to_op="Apply Forecast (7)" to_port="forecast model"/>
    <portSpacing port="source_in 1" spacing="0"/>
    <portSpacing port="source_in 2" spacing="0"/>
    <portSpacing port="sink_out 1" spacing="0"/>
    <portSpacing port="sink_out 2" spacing="0"/>
    <portSpacing port="sink_out 3" spacing="0"/>
    </process>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.1.001" expanded="true" height="82" name="Set Role (3)" width="90" x="447" y="442">
    <parameter key="attribute_name" value="forecast of Adj Close"/>
    <list key="set_additional_roles">
    <parameter key="Adj Close and forecast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.1.001" expanded="true" height="82" name="Set Role (4)" width="90" x="447" y="595">
    <parameter key="attribute_name" value="forecast of High"/>
    <list key="set_additional_roles">
    <parameter key="High and forecast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="set_role" compatibility="8.1.001" expanded="true" height="82" name="Set Role (5)" width="90" x="447" y="748">
    <parameter key="attribute_name" value="forecast of Low"/>
    <list key="set_additional_roles">
    <parameter key="Low and forecast" value="regular"/>
    </list>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="8.1.001" expanded="true" height="103" name="Filter Graph Last" width="90" x="581" y="442">
    <parameter key="parameter_expression" value="date_after(Date, date_set(date_now(), -eval(%{PredictionHorizon})-1, DATE_UNIT_DAY))"/>
    <parameter key="condition_class" value="expression"/>
    <list key="filters_list"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="8.1.001" expanded="true" height="103" name="Filter Graph High" width="90" x="581" y="595">
    <parameter key="parameter_expression" value="date_after(Date, date_set(date_now(), -eval(%{PredictionHorizon})-1, DATE_UNIT_DAY))"/>
    <parameter key="condition_class" value="expression"/>
    <list key="filters_list"/>
    </operator>
    <operator activated="true" class="filter_examples" compatibility="8.1.001" expanded="true" height="103" name="Filter Graph Low" width="90" x="581" y="748">
    <parameter key="parameter_expression" value="date_after(Date, date_set(date_now(), -eval(%{PredictionHorizon})-1, DATE_UNIT_DAY))"/>
    <parameter key="condition_class" value="expression"/>
    <list key="filters_list"/>
    </operator>
    <operator activated="true" class="join" compatibility="8.1.001" expanded="true" height="82" name="Join" width="90" x="715" y="493">
    <parameter key="use_id_attribute_as_key" value="false"/>
    <list key="key_attributes">
    <parameter key="Date" value="Date"/>
    </list>
    </operator>
    <operator activated="true" class="join" compatibility="8.1.001" expanded="true" height="82" name="Join (2)" width="90" x="715" y="595">
    <parameter key="use_id_attribute_as_key" value="false"/>
    <list key="key_attributes">
    <parameter key="Date" value="Date"/>
    </list>
    </operator>
    <operator activated="true" class="select_attributes" compatibility="8.1.001" expanded="true" height="82" name="Forecast" width="90" x="849" y="493">
    <parameter key="attribute_filter_type" value="subset"/>
    <parameter key="attributes" value="Date|Adj Close|High|Low|forecast of Adj Close|forecast of High|forecast of Low|Adj Close and forecast"/>
    </operator>
    <connect from_op="Execute XRP Crypto" from_port="result 1" to_op="Rename" to_port="example set input"/>
    <connect from_op="Rename" from_port="example set output" to_op="Sort" to_port="example set input"/>
    <connect from_op="Sort" from_port="example set output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
    <connect from_op="Select Attributes" from_port="example set output" to_op="Filter Start of Trend" to_port="example set input"/>
    <connect from_op="Filter Start of Trend" from_port="example set output" to_op="Train until Hold-off" to_port="example set input"/>
    <connect from_op="Train until Hold-off" from_port="example set output" to_op="Multiply (3)" to_port="input"/>
    <connect from_op="Multiply (3)" from_port="output 1" to_op="ARIMA Predict Last" to_port="in 1"/>
    <connect from_op="Multiply (3)" from_port="output 2" to_op="ARIMA Predict High" to_port="in 1"/>
    <connect from_op="Multiply (3)" from_port="output 3" to_op="ARIMA Predict Low" to_port="in 1"/>
    <connect from_op="ARIMA Predict Last" from_port="out 2" to_op="Set Role (3)" to_port="example set input"/>
    <connect from_op="ARIMA Predict Low" from_port="out 2" to_op="Set Role (5)" to_port="example set input"/>
    <connect from_op="ARIMA Predict High" from_port="out 2" to_op="Set Role (4)" to_port="example set input"/>
    <connect from_op="Set Role (3)" from_port="example set output" to_op="Filter Graph Last" to_port="example set input"/>
    <connect from_op="Set Role (4)" from_port="example set output" to_op="Filter Graph High" to_port="example set input"/>
    <connect from_op="Set Role (5)" from_port="example set output" to_op="Filter Graph Low" to_port="example set input"/>
    <connect from_op="Filter Graph Last" from_port="example set output" to_op="Join" to_port="left"/>
    <connect from_op="Filter Graph High" from_port="example set output" to_op="Join" to_port="right"/>
    <connect from_op="Filter Graph Low" from_port="example set output" to_op="Join (2)" to_port="right"/>
    <connect from_op="Join" from_port="join" to_op="Join (2)" to_port="left"/>
    <connect from_op="Join (2)" from_port="join" to_op="Forecast" to_port="example set input"/>
    <connect from_op="Forecast" from_port="example set output" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <description align="center" color="green" colored="true" height="166" resized="true" width="558" x="83" y="212">Select Time Series Scope</description>
    <description align="center" color="gray" colored="true" height="141" resized="true" width="551" x="85" y="49">Get source data</description>
    <description align="center" color="orange" colored="true" height="481" resized="true" width="278" x="81" y="395">Generate Future Predictions</description>
    <description align="center" color="blue" colored="true" height="481" resized="true" width="611" x="423" y="397">Reporting</description>
    <description align="center" color="yellow" colored="true" height="331" resized="true" width="361" x="692" y="36">Process Configuration (training example set, horizon, cycles ARIMA optimization, prediction date)</description>
    </process>
    </operator>
    </process>
  • websiteguywebsiteguy Member Posts: 24 Maven

    @Thomas_Ott cheers Tom ... Time for a RAM upgrade :)

  • sgenzersgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    thanks @Thomas_Ott for all the great help here.

     

    as for "I intend to print the manual, what's the best written document to read? I don't have much maths, or coding, but I'm learning from example, by butchering different processes together." I would strongly recommend "Data Mining for the Masses" by Matt North. When I started out, I literally took the PDF, went to Staples and got it printed, and went through every page. It gives you a great foundation with step-by-step instructions using RapidMiner. You can still find the PDF online here, or you can purchase the 2nd edition on Amazon here.

     

    There is also an excellent playlist of intro videos you can find on our main website: https://rapidminer.com/training/videos/

     

    Scott

     

  • websiteguywebsiteguy Member Posts: 24 Maven

    @Thomas_Ott

    @sgenzer

    Hi, thanks for the help. I will be printing off or buying the manual (found it on Amazon UK) as soon as I get near a printer. I also found the manual is 900 pages... "ouch"... will print that too... it will keep me busy :)

     

    Just one more bit of help if possible: I can't run that data through Luc's process. The date format seems to be incorrect, so it hangs.

    [Screenshots: date-time-trouble.png, date-time-trouble-gmt-bst.png]

    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="numerical_to_real" compatibility="8.1.001" expanded="true" height="82" name="Numerical to Real" width="90" x="313" y="34">
    <parameter key="attribute_filter_type" value="all"/>
    <parameter key="attribute" value=""/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="numeric"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="real"/>
    <parameter key="block_type" value="value_series"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_series_end"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="de_pivot" compatibility="8.1.001" expanded="true" height="82" name="De-Pivot" width="90" x="447" y="34">
    <list key="attribute_name">
    <parameter key="close" value="\[\d+\]\.close"/>
    <parameter key="high" value="\[\d+\]\.high"/>
    <parameter key="low" value="\[\d+\]\.low"/>
    <parameter key="open" value="\[\d+\]\.open"/>
    <parameter key="date" value="\[\d+\]\.date"/>
    <parameter key="volume" value="\[\d+\]\.volume"/>
    <parameter key="quoteVolume" value="\[\d+\]\.quoteVolume"/>
    <parameter key="weightedAverage" value="\[\d+\]\.weightedAverage"/>
    </list>
    <parameter key="index_attribute" value="id"/>
    <parameter key="create_nominal_index" value="false"/>
    <parameter key="keep_missings" value="false"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="real_to_integer" compatibility="8.1.001" expanded="true" height="82" name="Real to Integer" width="90" x="581" y="34">
    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="date"/>
    <parameter key="attributes" value=""/>
    <parameter key="use_except_expression" value="false"/>
    <parameter key="value_type" value="real"/>
    <parameter key="use_value_type_exception" value="false"/>
    <parameter key="except_value_type" value="real"/>
    <parameter key="block_type" value="value_series_end"/>
    <parameter key="use_block_type_exception" value="false"/>
    <parameter key="except_block_type" value="value_series_end"/>
    <parameter key="invert_selection" value="false"/>
    <parameter key="include_special_attributes" value="false"/>
    <parameter key="round_values" value="false"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="generate_attributes" compatibility="8.1.001" expanded="true" height="82" name="Generate Attributes" width="90" x="715" y="34">
    <list key="function_descriptions">
    <parameter key="Coverted Date" value="date_add(date_parse(&quot;01/01/1970&quot;),date,DATE_UNIT_SECOND)"/>
    </list>
    <parameter key="keep_all" value="true"/>
    </operator>
    </process>
    <?xml version="1.0" encoding="UTF-8"?><process version="8.1.001">
    <operator activated="true" class="numerical_to_date" compatibility="8.1.001" expanded="true" height="82" name="Numerical to Date" width="90" x="715" y="238">
    <parameter key="attribute_name" value="date"/>
    <parameter key="keep_old_attribute" value="false"/>
    <parameter key="time_offset" value="0"/>
    </operator>
    </process>

    thanks for your feedback

    much appreciated

     

    regards

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @websiteguy you'd have to use one of the Date operators to fix that. Probably Date to Nominal and then Nominal to Date. Or, for the time being, just use a Generate ID after you sorted the time series to introduce a new ID and toss out the date. That should speed things up for testing. 
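
    For reference, the conversion that the Generate Attributes step above is attempting (epoch seconds to a date) can be sketched outside RapidMiner in Python. This is only an illustration, not part of the posted process: the sample values are the `date` fields from the JSON in the original question, and the helper name `epoch_to_utc` is made up for this sketch. It pins the result to UTC, which may help if the GMT/BST screenshot refers to a time-zone offset.

```python
from datetime import datetime, timezone


def epoch_to_utc(seconds):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime.

    Hypothetical helper for illustration; not a RapidMiner function.
    """
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# `date` values copied from the JSON in the original question
sample_dates = [1465632900, 1465633200, 1465633500, 1465633800]

converted = [epoch_to_utc(s) for s in sample_dates]
formatted = [d.strftime("%Y-%m-%d %H:%M:%S") for d in converted]

for raw, text in zip(sample_dates, formatted):
    print(raw, "->", text)
```

    Because the datetime is created with an explicit UTC time zone, the result does not depend on the local time zone or daylight-saving setting of the machine running it.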

  • websiteguywebsiteguy Member Posts: 24 Maven

    @Thomas_Ott

    By luck rather than design, I did sort it out and got the date into the right format.

    Why are the ARIMA predictions so disparate? I'm using Luc's process, and I've only got a laptop with 8 GB RAM, so it's torture.

    I was looking at that correlation matrix (the link I sent you the other day). It seems that there are correlations and divergences between different cryptocurrencies.

    Therefore, I was thinking that if you could run two processes on two divergent tokens, that would give a confirmation of the trend.

    As you would expect to see divergence in the predictions of future prices.

    Could these divergent tokens be processed in sequence with the same model, and the inverse correlation between the outcomes then used to establish a tighter prediction?

    How can the ARIMA model be improved, is there a way to speed it up, and why can't ARIMA be run in the cloud?

     

    cheers, lee

     

    https://www.sifrdata.com/cryptocurrency-correlation-matrix/

     

    divergent.png

     

    ARIMA.png

  • Thomas_OttThomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @websiteguy you can speed up the process by tightening the optimization parameters in the ARIMA subprocesses (for example, the p and q ranges and the number of generations). It uses optimization and that takes a long time. W.R.T. processing in sequence, that should be doable. You might be able to update the model with the Update Model operator.
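
    To get a feel for why the optimization is slow: the ARIMA subprocesses in the process posted above let both p and q range over [0;100]. Below is a rough Python sketch of how the number of candidate (p, q) pairs shrinks when those ranges are narrowed. The tighter bound of 5 is an arbitrary example value, and the function name is made up for this sketch. The evolutionary optimizer samples the space rather than enumerating it, so treat the counts as an indication of scale only.

```python
def count_pq_pairs(p_max, q_max):
    """Number of integer (p, q) combinations with 0 <= p <= p_max and 0 <= q <= q_max."""
    return (p_max + 1) * (q_max + 1)


full_range = count_pq_pairs(100, 100)  # ranges used in the posted process
narrow_range = count_pq_pairs(5, 5)    # hypothetical tighter bounds

print("pairs in [0;100] x [0;100]:", full_range)
print("pairs in [0;5] x [0;5]:", narrow_range)
print("reduction factor:", full_range / narrow_range)
```

    High orders such as p = 94 also tend to make each individual model fit more expensive, so narrowing the ranges is likely to help on both counts.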

  • FrankHofstedeFrankHofstede Member Posts: 1 Learner I
    People, I have been trying for 3 days now. The issue I have is exactly the same as a previous user's, but it was never answered:

    "I have a JSON file extracted from Elasticsearch but I get only one row of data when i convert it from json to data. I would like to separate the data in form of a proper table.

    Could you please explain it in a bit detail?"


    My JSON looks like this:


    -------

    {

      "took": 904,

      "timed_out": false,

      "_shards": {

        "total": 5,

        "successful": 5,

        "failed": 0

      },

      "hits": {

        "total": 1233,

        "max_score": 1.0,

        "hits": [

          {

            "_index": "prd_www-asadventure-com_nl",

            "_type": "content",

            "_id": "_content_www-asadventure-com_nl_expertise-tips_travel_avontuur-met-twee",

            "_score": 1.0,

            "_source": {

              "contents": "Avontuur met twee: papa Gunther en zoontje Felix bedwingen de Noorse bergen op de fiets Deel dit Delen Tweet De appel valt meestal niet ver van de boom. Zo ook bij reisjournalist Gunther Hauspie en zijn vijfjarig zoontje Felix. Ze kropen allebei op hun mountainbike en fietsten met zijn tweetjes de Rallarvegen in Noorwegen af. Niet bepaald een typisch kinderuitstapje, maar wel eentje waar Gunther en Felix nog lang van zullen dromen. Avontuur zoek? Avontuurlijke reizen maken, het is niet vanzelfsprekend als je kleine bengels in huis hebt. Dat gedachte bekroop Gunther ook. De laatste jaren vond het avontuur minder zijn weg naar de laatste vezels van mijn lijf. Heel veel praktische bezwaren herleidden avontuur dan tot een speelbos of een holle weg in de buurt.",

              "description": "<p>Tot de verbeelding sprekende avonturen beleven met je kids, kan dat wel? Absoluut! Onze journalist Gunther fietste met zijn vijfjarige zoon over de Noorse Rallarvegen.</p>\n",

              "image": "/content/dam/asadventure/contentpages/travel/avontuur-met-twee/Rallarvegen2_square.jpg",

              "lastReplicated": "2019-07-11T14:02:02+0000",

              "path": "/content/www-asadventure-com/nl/expertise-tips/travel/avontuur-met-twee",

              "sortOrder": 3590,

              "tags": [

                "vader-zoonavontuur",

                "fietsen",

                "noorwegen0",

                "reis",

                "rallarvegen",

                "mountainbike",

                "bergen",

                "fiets",

                "fietsroute",

                "kinderen",

                "avontuurlijk_reizen",

                "uitstap",

                "vakantie",

                "hardangervidda"

              ],

              "title": "Avontuur met twee: papa Gunther en zoontje Felix bedwingen de Noorse bergen op de fiets",

              "instigator": "PageEventListener",

              "lastIndexed": "2019-07-31 07:14"

            }

          },

          {

            "_index": "prd_www-asadventure-com_nl",

            "_type": "content",

            "_id": "_content_www-asadventure-com_nl_expertise-tips_travel_india-voor-dummies",

            "_score": 1.0,

            "_source": {

              "contents": "Op reis naar Indi? Dankzij deze tips beleef je een onvergetelijke ervaring! Deel dit Delen Tweet India is een prachtig en uitgestrekt land waar je nooit op uitgekeken raakt, maar op sommige vlakken heeft het zn reputatie een beetje tegen. Niet helemaal terecht, als je het ons vraagt, want met een beetje voorbereiding en voorkennis kom je jouw eerste bezoek aan India haast zeker zonder kleerscheuren door. Doe je voordeel met deze tips! Je visum: regel het drie maanden op voorhand India is enkel toegankelijk voor toeristen met een geldig visum. Dat moet je dus tijdig aanvragen.",

              "description": "<p>Wil je graag op reis naar India? Check dan zeker deze reistips voor een zorgeloze ervaring!&nbsp;</p>\n",

              "image": "/content/dam/asadventure/contentpages/travel/india-voor-dummies/ancient-arch-architecture-290643.jpg",

              "lastReplicated": "2019-07-11T14:02:05+0000",

              "path": "/content/www-asadventure-com/nl/expertise-tips/travel/india-voor-dummies",

              "sortOrder": 4750,

              "tags": [

                "reisgids_india",

                "india_tips",

                "delhi_belly",

                "op_reis_india",

                "asadventure_department:travel",

                "reis0",

                "advies",

                "india0",

                "reisadvies",

                "toerisme0",

                "reistips0",

                "type:inspiration",

                "india_tips_reizen",

                "taj_mahal",

                "india_bezoeken",

                "india_tips_restaurant",

                "op_reis_naar_india",

                "india_reisadvies"

              ],

              "title": "India voor dummies",

              "instigator": "PageEventListener",

              "lastIndexed": "2019-07-31 07:16"

            }

          },

          {

            "_index": "prd_www-asadventure-com_nl",

            "_type": "content",

            "_id": "_content_www-asadventure-com_nl_expertise-tips_travel_wat-is-deet",

            "_score": 1.0,

            "_source": {

              "contents": "Wat is DEET? En tegen welke insecten beschermt het je? Deel dit Delen Tweet Wil je je vakantie niet al jeukend en krabbend doorbrengen, dan kan je maar beter een goed anti-insectenmiddel meenemen. Door de sterke geur die de muggenmelk verspreidt, blijven de stekende beestjes op een veilige afstand. Ben je een echte muggenmagneet of vertrek je op reis naar de tropen? Kies dan voor een product met DEET, het sterkste insectenwerende middel voor de huid. 1. Wat is DEET eigenlijk? ",

              "description": "<p>Ben jij een echte muggenmagneet of trek je naar tropische oorden, dan gebruik je best een anti-insectenmiddel met DEET. Maar wat is DEET eigenlijk?</p>\n",

              "image": "/content/dam/asadventure/contentpages/travel/deet/Openingsbeeld_thumb.jpg",

              "lastReplicated": "2019-07-24T13:07:38+0000",

              "path": "/content/www-asadventure-com/nl/expertise-tips/travel/wat-is-deet",

              "sortOrder": 5310,

              "title": "Wat is DEET?",

              "instigator": "PageEventListener",

              "lastIndexed": "2019-07-31 07:16"

            }

          }

        ]

      }

    }



    ------

    What I want is a table with 2 columns: title and tags ("tag, tag, tag, ...").

    Not all of them have tags.

    And really, whatever I try, I can't get it to work. I tried transposing, extracting, depivoting, range selections, etc. I tried every Google result and studied all the videos.

    Can somebody please help?

    Kind regards,
    Frank
  • Telcontar120Telcontar120 RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    There is a new extension from Old World Computing that makes dealing with JSON data much easier than the native operators from RapidMiner, which do work but require a lot of ETL work after the initial parsing.
    I suggest you take a look at the Web Automation extension and reach out to OWC with any questions about it.
    Brian T.
    Lindon Ventures 
    Data Science Consulting from Certified RapidMiner Experts
  • sgenzersgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
    yup thx @Telcontar120. You can read more on OWC's website here: https://oldworldcomputing.com/en/webautomation-extension/ and there's a nice sample process in the Community Repo.



    Scott

  • BalazsBaranyBalazsBarany Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 955 Unicorn
    Hi!

    jq is a great tool for processing this kind of complex JSON document.

    At jqplay.org you can experiment with your query string and your document until you get the result you're looking for.
    This expression converts your data into a CSV:
    {hits: .hits.hits[]} | { title: .hits._source.title, tag: .hits._source.tags[]? } | [.title, .tag] | @csv
    I blogged about using JQ expressions in RapidMiner processes. Maybe this is a good solution for you, too.

    Regards,

    Balázs
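    As far as I can tell, the jq expression above emits one row per title/tag pair and produces nothing for hits that have no tags array. If the goal is the two-column table Frank described (one row per title, tags joined as "tag, tag, tag", empty when missing), a small Python sketch could look like this. The function names are made up for illustration, and the sample is a heavily trimmed copy of the Elasticsearch response posted above, with only the first three tags of the India entry kept.

```python
import csv
import io
import json


def hits_to_rows(response):
    """Return one {'title', 'tags'} row per hit; tags are joined with ', '."""
    rows = []
    for hit in response.get("hits", {}).get("hits", []):
        source = hit.get("_source", {})
        tags = source.get("tags") or []  # hits without tags get an empty list
        rows.append({"title": source.get("title", ""), "tags": ", ".join(tags)})
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "tags"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Trimmed sample in the same shape as the response in the question.
sample = json.loads("""
{
  "hits": {
    "total": 1233,
    "hits": [
      {"_source": {"title": "India voor dummies",
                   "tags": ["reisgids_india", "india_tips", "delhi_belly"]}},
      {"_source": {"title": "Wat is DEET?"}}
    ]
  }
}
""")

if __name__ == "__main__":
    print(rows_to_csv(hits_to_rows(sample)))
```

    The resulting CSV can then be loaded with Read CSV, which sidesteps the depivoting work after JSON to Data.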