Rapidminer and JSON
Hello!
How can I convert a JSON object into a table that Rapidminer can handle?
This is the JSON that I am working on:
[{"date":1465632900,"high":0.00199281,"low":0.00199281,"open":0.00199281,"close":0.00199281,"volume":0.00078269,"quoteVolume":0.39276167,"weightedAverage":0.00199281},{"date":1465633200,"high":0.00199281,"low":0.00199281,"open":0.00199281,"close":0.00199281,"volume":0.00034535,"quoteVolume":0.17329899,"weightedAverage":0.00199281},{"date":1465633500,"high":0.00198761,"low":0.00198761,"open":0.00198761,"close":0.00198761,"volume":0.00126317,"quoteVolume":0.63552206,"weightedAverage":0.00198761},{"date":1465633800,"high":0.00200383,"low":0.00199217,"open":0.00199217,"close":0.00200383,"volume":0.99928894,"quoteVolume":499.17633002,"weightedAverage":0.00200187}]
I tried:
Get Page (url) ---> JSON to XML
However I got this message:
"A JSONObject text must begin with '{' at character 1"
I also realized that my JSON has not any root/enclosing name and I guess this may be the problem.
What can I do to read this JSON as a table? Thanks!
Best Answer
-
MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
Hi wirtcal,
json to data does the job. The data needs to be depivoted afterwards. A process is attached.
~Martin
<?xml version="1.0" encoding="UTF-8"?><process version="7.2.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.2.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="text:create_document" compatibility="7.2.000" expanded="true" height="68" name="Create Document" width="90" x="112" y="34">
<parameter key="text" value="[{"date":1465632900,"high":0.00199281,"low":0.00199281,"open":0.00199281,"close":0.00199281,"volume":0.00078269,"quoteVolume":0.39276167,"weightedAverage":0.00199281},{"date":1465633200,"high":0.00199281,"low":0.00199281,"open":0.00199281,"close":0.00199281,"volume":0.00034535,"quoteVolume":0.17329899,"weightedAverage":0.00199281},{"date":1465633500,"high":0.00198761,"low":0.00198761,"open":0.00198761,"close":0.00198761,"volume":0.00126317,"quoteVolume":0.63552206,"weightedAverage":0.00198761},{"date":1465633800,"high":0.00200383,"low":0.00199217,"open":0.00199217,"close":0.00200383,"volume":0.99928894,"quoteVolume":499.17633002,"weightedAverage":0.00200187}] "/>
</operator>
<operator activated="true" class="text:json_to_data" compatibility="7.2.000" expanded="true" height="82" name="JSON To Data" width="90" x="246" y="34"/>
<operator activated="true" class="de_pivot" compatibility="7.2.001" expanded="true" height="82" name="De-Pivot" width="90" x="380" y="34">
<list key="attribute_name">
<parameter key="close" value="\[\d+\]\.close"/>
<parameter key="high" value="\[\d+\]\.high"/>
<parameter key="low" value="\[\d+\]\.low"/>
<parameter key="open" value="\[\d+\]\.open"/>
<parameter key="date" value="\[\d+\]\.date"/>
<parameter key="volume" value="\[\d+\]\.volume"/>
<parameter key="quoteVolume" value="\[\d+\]\.quoteVolume"/>
<parameter key="weightedAverage" value="\[\d+\]\.weightedAverage"/>
</list>
<parameter key="index_attribute" value="id"/>
</operator>
<connect from_op="Create Document" from_port="output" to_op="JSON To Data" to_port="documents 1"/>
<connect from_op="JSON To Data" from_port="example set" to_op="De-Pivot" to_port="example set input"/>
<connect from_op="De-Pivot" from_port="example set output" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany3
Answers
Just an FYI. The JSON to Data operator is found in the Text Mining extension. Download that first.
Thank you both!
I was just about to ask if this operator was only available in the Rapidminer Pro.
I will install the Text Mining extension to check this out.
Cheers!
is there a straight forward way to handle non uniform JSON data. ie json with dynamic fields that don't appear on each example. I get an error in this case but would prefere to get back null for that example / attribute.
Well, it depends on your definition of straight forward ;-)
JSON to Data gives you a very wide example set, usually with only one example. You can work in it with Loop Attributes, but I found it sometimes easier to transpose or rotate the example set (using the Transpose operator). It might be easier to extract stuff like example indexes and so on with the transposed structure. However, it will convert numeric data to text with a standard formatting. You might want to do that before the Transpose yourself with the operator of your choosing (Format Numbers, Numerical to Polynominal).
Hi, where do you put this process code?
I have a JSON file extracted from Elasticsearch but I get only one row of data when i convert it from json to data. I would like to separate the data in form of a proper table.
Could you please explain it in a bit detail?
If you want to use the XML process code, check out this KB article on how to do it: http://community.rapidminer.com/t5/RapidMiner-Studio-Knowledge-Base/How-can-I-share-processes-without-RapidMiner-Server/ta-p/37047
If you see XML code in the Community and want to use it, just copy everything, activate the XML tab in Studio (View/Show Panel/XML) and paste it there.
JSON can have a very complex structure, it is not guaranteed to be a "proper table". So what JSON to Data does is taking all elements and naming them by the "path" to the element. It's something you need to get used to, but it avoids a lot of complexity for simple documents.
To extract tabular data, look at the metadata of the table or transpose it using the Transpose operator. You'll see that the attribute names (or the ID of the transposed line) have a structure that you can extract. E. g. you might have a name like example[1][1]. You can use Generate Attributes or Replace for extracting the index numbers (1, 1). Then you do some filtering, maybe joining or pivoting to come up with the structure you need.
You might not like this approach. There's another: JSON to XML in the Web Mining extension.
Take your JSON document, use JSON to XML and Write Document to export the generated XML file. Then you can use the Read XML operator's wizard to extract contents in a more structured way. However, this doesn't always work as JSON is more flexible than XML, so there are many JSON documents that can't be converted.
Regards,
Balázs
Just what I was looking for! Thank you!
@mschmitz
Hi
can you help me out with the timestampI cant work out how to convert from string of numbers to date. I just keep ending up with 1970 ?
https://poloniex.com/public?command=returnChartData¤cyPair=USDT_XRP&end=9999999999&period=1440...
to get it from a string to a date time format, I spent hours trying to suss it out?
I did post here
https://community.rapidminer.com/t5/RapidMiner-Studio-Forum/JSON-to-data-and-de-pivot-for-exampleset/m-p/47827#M30657 as well
thanks in advance, lee
@websiteguy the problem is that the date is an integer and all the other values are real. The trick is to use a Numerical to Real operator. You'll have to convert the dates back to a readable format downstream.
Hi Tom, and @sgenzer (this process below) thanks for the reply.
I have that part already resolved after folowing the tut in this thread, the issue im having is the date seems to be a time stamp. A long string of numbers, after converting it using the Numerical to real, how to then convert the date in to a format as below. Reason - Alpha Vantage does not have intra day prices for this ticker.
Using an AREMA process I found on the forum
thanks for your time, appreciated.
Lee
Sat, 17 Mar 2018 12:00:00 +0000
@websiteguy could you use a Numerical to Date operator and set the Offset parameter?
Hi @Thomas_Ott
Wow that was quick, went backt to edit my reply.. any you already replied.
Could you provide an example please in a process? (I been trying to resolve, with limited exsperiance all night)
Do you have to define the date as an attribute first?
Found this
https://community.rapidminer.com/t5/RapidMiner-Studio-Forum/Date-Time-Formatting-UTC/td-p/28160
Is this the right method?
As a side note,
I intend to prind the manual, whats the best written document to read? I dont have much maths, or coding, but im learnig from example, by butchering different processes together.
@websiteguy try this process, it uses a Generate Attributes operator.
Gives you this nice chart of Ripple. You should buy a lot of it. I own many units of it at $0.73.
Thanks @Thomas_Ott
you just made my day
Hi @Thomas_Ott do you have this processed with the ARIMA for predictions?
Ive been in to XRP since 2014... waiting patiently ...
Do you have any other processes crypto analysis you can share / (PM) ???
this is cool you seen this?
https://blog.patricktriest.com/analyzing-cryptocurrencies-python/
Cheers Lee
@websiteguy i haven't seen that post but I'll check it out later. Thanks.
WRT to doing ARIMA. I adapted this ARIMA process that @luc_bartkowski put together. Just take the process above with the JSON and timestamp conversions and attached it to this. Might take a few minutes to process.
@Thomas_Ott cheers tom ... Time for a RAM ugrade
thanks @Thomas_Ott for all the great help here.
as for "I intend to prind the manual, whats the best written document to read? I dont have much maths, or coding, but im learnig from example, by butchering different processes together." I would strongly recommend "Data Mining for the Masses" by Matt North. When I started out, I literally took the PDF, went to Staples and got it printed, and went through every page. It gives you a great foundation with step-by-step instructions using RapidMiner. You can still find the PDF online here, or you can purchase the 2nd edition on Amazon here.
There is also an excellent playlist of intro videos you can find on our main website: https://rapidminer.com/training/videos/
Scott
@Thomas_Ott
@sgenzer
Hi thanks for the help, will be printing off buying the manual (found it on Amazon UK) as soon as I get near a printer. Also found the manual 900 pages..."ouch" will print that to... will keep me busy
Just one more bit of help if possible, I cant run that data through the luc process, the date format seems to be inccorect? so it hangs.
thanks for your feedback
much appreciated
regards
@websiteguy you'd have to use one of the Date operators to fix that. Probably Date to Nominal and then Nominal to Date. Or, for the time being, just use a Generate ID after you sorted the time series to introduce a new ID and toss out the date. That should speed things up for testing.
@Thomas_Ott
By luck rather than design I did sort it out, to get the date in the right format.
Why are the ARIMA predictions so disparate? im using Lucs process, I’ve only got a laptop with 8gb ram, so its torture.
I was looking at that correlation matrix (link I sent you the other day) It seems that there are correlations and divergences of different crypto.
Therefore, I was thinking if you could run two processes of two divergent tokens, then that would give a confirmation of trend.
As you would expect to see divergence in the predictions of future prices.
Could these divergent tokens be processed in sequence with the same model then inverse correlation between outcomes used to establish a tighter prediction?
How to improve the ARIMA model + is there a way to speed it up and why cant ARIMA be run in the cloud?
cheers, lee
https://www.sifrdata.com/cryptocurrency-correlation-matrix/
@websiteguy you can speed up the process by looking at the optimization parameters in the ARIMA subprocesses. It uses optimization and that takes a long time. W.R.T to processing in sequence, that should be do-able. You might be able to update the model with the Update Model operator.
"I have a JSON file extracted from Elasticsearch but I get only one row of data when i convert it from json to data. I would like to separate the data in form of a proper table.
Could you please explain it in a bit detail?"
My JSON looks like this:
-------
{
"took": 904,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1233,
"max_score": 1.0,
"hits": [
{
"_index": "prd_www-asadventure-com_nl",
"_type": "content",
"_id": "_content_www-asadventure-com_nl_expertise-tips_travel_avontuur-met-twee",
"_score": 1.0,
"_source": {
"contents": "Avontuur met twee: papa Gunther en zoontje Felix bedwingen de Noorse bergen op de fiets Deel dit Delen Tweet De appel valt meestal niet ver van de boom. Zo ook bij reisjournalist Gunther Hauspie en zijn vijfjarig zoontje Felix. Ze kropen allebei op hun mountainbike en fietsten met zijn tweetjes de Rallarvegen in Noorwegen af. Niet bepaald een typisch kinderuitstapje, maar wel eentje waar Gunther en Felix nog lang van zullen dromen. Avontuur zoek? Avontuurlijke reizen maken, het is niet vanzelfsprekend als je kleine bengels in huis hebt. Dat gedachte bekroop Gunther ook. De laatste jaren vond het avontuur minder zijn weg naar de laatste vezels van mijn lijf. Heel veel praktische bezwaren herleidden avontuur dan tot een speelbos of een holle weg in de buurt.",
"description": "<p>Tot de verbeelding sprekende avonturen beleven met je kids, kan dat wel? Absoluut! Onze journalist Gunther fietste met zijn vijfjarige zoon over de Noorse Rallarvegen.</p>\n",
"image": "/content/dam/asadventure/contentpages/travel/avontuur-met-twee/Rallarvegen2_square.jpg",
"lastReplicated": "2019-07-11T14:02:02+0000",
"path": "/content/www-asadventure-com/nl/expertise-tips/travel/avontuur-met-twee",
"sortOrder": 3590,
"tags": [
"vader-zoonavontuur",
"fietsen",
"noorwegen0",
"reis",
"rallarvegen",
"mountainbike",
"bergen",
"fiets",
"fietsroute",
"kinderen",
"avontuurlijk_reizen",
"uitstap",
"vakantie",
"hardangervidda"
],
"title": "Avontuur met twee: papa Gunther en zoontje Felix bedwingen de Noorse bergen op de fiets",
"instigator": "PageEventListener",
"lastIndexed": "2019-07-31 07:14"
}
},
{
"_index": "prd_www-asadventure-com_nl",
"_type": "content",
"_id": "_content_www-asadventure-com_nl_expertise-tips_travel_india-voor-dummies",
"_score": 1.0,
"_source": {
"contents": "Op reis naar Indi? Dankzij deze tips beleef je een onvergetelijke ervaring! Deel dit Delen Tweet India is een prachtig en uitgestrekt land waar je nooit op uitgekeken raakt, maar op sommige vlakken heeft het zn reputatie een beetje tegen. Niet helemaal terecht, als je het ons vraagt, want met een beetje voorbereiding en voorkennis kom je jouw eerste bezoek aan India haast zeker zonder kleerscheuren door. Doe je voordeel met deze tips! Je visum: regel het drie maanden op voorhand India is enkel toegankelijk voor toeristen met een geldig visum. Dat moet je dus tijdig aanvragen.",
"description": "<p>Wil je graag op reis naar India? Check dan zeker deze reistips voor een zorgeloze ervaring! </p>\n",
"image": "/content/dam/asadventure/contentpages/travel/india-voor-dummies/ancient-arch-architecture-290643.jpg",
"lastReplicated": "2019-07-11T14:02:05+0000",
"path": "/content/www-asadventure-com/nl/expertise-tips/travel/india-voor-dummies",
"sortOrder": 4750,
"tags": [
"reisgids_india",
"india_tips",
"delhi_belly",
"op_reis_india",
"asadventure_department:travel",
"reis0",
"advies",
"india0",
"reisadvies",
"toerisme0",
"reistips0",
"type:inspiration",
"india_tips_reizen",
"taj_mahal",
"india_bezoeken",
"india_tips_restaurant",
"op_reis_naar_india",
"india_reisadvies"
],
"title": "India voor dummies",
"instigator": "PageEventListener",
"lastIndexed": "2019-07-31 07:16"
}
},
{
"_index": "prd_www-asadventure-com_nl",
"_type": "content",
"_id": "_content_www-asadventure-com_nl_expertise-tips_travel_wat-is-deet",
"_score": 1.0,
"_source": {
"contents": "Wat is DEET? En tegen welke insecten beschermt het je? Deel dit Delen Tweet Wil je je vakantie niet al jeukend en krabbend doorbrengen, dan kan je maar beter een goed anti-insectenmiddel meenemen. Door de sterke geur die de muggenmelk verspreidt, blijven de stekende beestjes op een veilige afstand. Ben je een echte muggenmagneet of vertrek je op reis naar de tropen? Kies dan voor een product met DEET, het sterkste insectenwerende middel voor de huid. 1. Wat is DEET eigenlijk? ",
"description": "<p>Ben jij een echte muggenmagneet of trek je naar tropische oorden, dan gebruik je best een anti-insectenmiddel met DEET. Maar wat is DEET eigenlijk?</p>\n",
"image": "/content/dam/asadventure/contentpages/travel/deet/Openingsbeeld_thumb.jpg",
"lastReplicated": "2019-07-24T13:07:38+0000",
"path": "/content/www-asadventure-com/nl/expertise-tips/travel/wat-is-deet",
"sortOrder": 5310,
"title": "Wat is DEET?",
"instigator": "PageEventListener",
"lastIndexed": "2019-07-31 07:16"
}
}
]
}
}
------
What I want is a table with 2 columns, title and tags ("tag, tag, tag, ..")
Not all of them have tags..
And really, whatever I try.. Can't get it to work. Tried transposing, extracting, depivoting, range selections, etc. etc. Tried every Google results and studied all the video's..
Can somebody please help?
Kind regards,
Frank
I suggest you take a look at the Web Automation extension and reach out to OWC with any questions about it.
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
Scott
jq is a great tool for processing this kind of complex JSON document.
At jqplay.org you can play with your query string and your document to get the result you're searching @csv2000
This expression converts your data into a CSV:
I blogged about using JQ expressions in RapidMiner processes. Maybe this is a good solution for you, too.
Regards,
Balázs