[solved] Issue with Extract Information operator JsonPath query type
I'm currently trying a basic example of the Extract Information operator using the JsonPath Query type. No matter how I structure the jsonpath query expression(s), I get either the entire document or an error:
- $.store.book yields the entire document, not just the books.
- $.store.book[0] yields: Process Failed. net.minidev.json.JSONObject cannot be cast to net.minidev.json.JSONArray.
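For reference, here is a minimal Python sketch of what these two expressions would be expected to select under standard JsonPath semantics. It uses plain dictionary and list access instead of a JsonPath library, and the document is shortened to two books for brevity. Note that `$.store.book[0]` selects a single object rather than an array, which would be consistent with the cast error above if the operator expects an array result (that reading is an assumption, not something confirmed in this thread).

```python
import json

# Abbreviated copy of the document used in the process (two of the four books).
doc = json.loads("""
{ "store": {
    "book": [
      { "category": "reference", "author": "Nigel Rees",
        "title": "Sayings of the Century", "price": 8.95 },
      { "category": "fiction", "author": "Evelyn Waugh",
        "title": "Sword of Honour", "price": 12.99 }
    ],
    "bicycle": { "color": "red", "price": 19.95 }
} }
""")

# Equivalent of $.store.book: the array of book objects only (no bicycle).
books = doc["store"]["book"]

# Equivalent of $.store.book[0]: the first book, a single object, not an array.
first_book = books[0]

print(type(books).__name__, len(books))
print(type(first_book).__name__, first_book["author"])
```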
Any direction on how the JsonPath query expressions should look for RapidMiner is appreciated.
Here's the process:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="6.3.000">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="6.3.000" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="text:create_document" compatibility="6.1.000" expanded="true" height="60" name="Create Document" width="90" x="45" y="30">
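<parameter key="text" value="{ &quot;store&quot;: { &quot;book&quot;: [ { &quot;category&quot;: &quot;reference&quot;, &quot;author&quot;: &quot;Nigel Rees&quot;, &quot;title&quot;: &quot;Sayings of the Century&quot;, &quot;price&quot;: 8.95 }, { &quot;category&quot;: &quot;fiction&quot;, &quot;author&quot;: &quot;Evelyn Waugh&quot;, &quot;title&quot;: &quot;Sword of Honour&quot;, &quot;price&quot;: 12.99 }, { &quot;category&quot;: &quot;fiction&quot;, &quot;author&quot;: &quot;Herman Melville&quot;, &quot;title&quot;: &quot;Moby Dick&quot;, &quot;isbn&quot;: &quot;0-553-21311-3&quot;, &quot;price&quot;: 8.99 }, { &quot;category&quot;: &quot;fiction&quot;, &quot;author&quot;: &quot;J. R. R. Tolkien&quot;, &quot;title&quot;: &quot;The Lord of the Rings&quot;, &quot;isbn&quot;: &quot;0-395-19395-8&quot;, &quot;price&quot;: 22.99 } ], &quot;bicycle&quot;: { &quot;color&quot;: &quot;red&quot;, &quot;price&quot;: 19.95 } } }"/>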
</operator>
<operator activated="true" class="text:extract_information" compatibility="6.1.000" expanded="true" height="60" name="Extract Information" width="90" x="447" y="30">
<parameter key="query_type" value="JsonPath"/>
<list key="string_machting_queries"/>
<list key="regular_expression_queries"/>
<list key="regular_region_queries"/>
<list key="xpath_queries"/>
<list key="namespaces"/>
<list key="index_queries"/>
<list key="jsonpath_queries">
<parameter key="booksOnly" value="$.store.book"/>
</list>
</operator>
<connect from_op="Create Document" from_port="output" to_op="Extract Information" to_port="document"/>
<connect from_op="Extract Information" from_port="document" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
RapidMiner Studio 6.3.0000 (rev: 251598) - Professional Plus
Windows 8.1
Answers
The problem here is that you need a "Documents to Data" operator in order to make use of the metadata that "Extract Information" generates. Even then, only the first item of a list is shown. You can use "Cut Document" to get a collection of those items and "Combine Documents" to merge them into one line. Here is a process that shows how to do that:
Cheers,
Helge
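As a rough illustration of the flow described in the answer, the sketch below mimics the three steps in plain Python: split the document into one piece per book, extract fields from each piece, then merge the per-item values into one line. It is only an analogy under stated assumptions, not the RapidMiner process itself. The variable names (`segments`, `rows`, `combined`) and the `"; "` separator are hypothetical, and the document is shortened to two books.

```python
import json

# Abbreviated copy of the document used in the question (two of the four books).
doc = json.loads("""
{ "store": {
    "book": [
      { "category": "reference", "author": "Nigel Rees",
        "title": "Sayings of the Century", "price": 8.95 },
      { "category": "fiction", "author": "Evelyn Waugh",
        "title": "Sword of Honour", "price": 12.99 }
    ],
    "bicycle": { "color": "red", "price": 19.95 }
} }
""")

# Step 1 (analogous to "Cut Document"): one segment per book.
segments = doc["store"]["book"]

# Step 2 (analogous to "Extract Information" on each segment):
# pull the fields of interest out of every segment, not just the first one.
rows = [
    {"author": s.get("author"), "title": s.get("title"), "price": s.get("price")}
    for s in segments
]

# Step 3 (analogous to merging the items into one line).
combined = "; ".join(row["title"] for row in rows)

for row in rows:
    print(row)
print(combined)
```

Splitting first means each extraction sees a single book, which sidesteps the "only the first item of a list is shown" limitation mentioned in the answer.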