How to use the Read XML operator
Hello.
I am having trouble using the Read XML operator.
I am reading this XML file:
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
<RECORD>
<ID>id1</ID>
<TEXT name="text1">
<KEYWORD>kw1</KEYWORD>
<KEYWORD>kw2</KEYWORD>
<KEYWORD>kw3</KEYWORD>
</TEXT>
</RECORD>
<RECORD>
<ID>ID2</ID>
<TEXT name="text2">
<KEYWORD>kw4</KEYWORD>
<KEYWORD>kw5</KEYWORD>
<KEYWORD>kw6</KEYWORD>
</TEXT>
</RECORD>
</ROOT>
and want to get this table:
ID | name | KEYWORD |
id1 | text1 | kw1 |
id1 | text1 | kw2 |
id1 | text1 | kw3 |
ID2 | text2 | kw4 |
ID2 | text2 | kw5 |
ID2 | text2 | kw6 |
How can I do it?
The XML version of my process is here:
<?xml version="1.0" encoding="UTF-8"?><process version="7.2.003">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="advanced_file_connectors:read_xml" compatibility="7.2.003" expanded="true" height="68" name="Read XML (2)" width="90" x="45" y="136">
<parameter key="file" value="D:\1\new 4.xml"/>
<parameter key="xpath_for_examples" value="//ROOT/RECORD"/>
<enumeration key="xpaths_for_attributes">
<parameter key="xpath_for_attribute" value="TEXT[1]/attribute::name"/>
<parameter key="xpath_for_attribute" value="TEXT[1]/KEYWORD[1]/text()"/>
</enumeration>
<list key="namespaces"/>
<parameter key="use_default_namespace" value="false"/>
<list key="annotations"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="TEXT[1]/attribute::name.true.attribute_value.attribute"/>
<parameter key="1" value="TEXT[1]/KEYWORD[1]/text().true.attribute_value.attribute"/>
</list>
</operator>
<operator activated="true" class="advanced_file_connectors:read_xml" compatibility="7.2.003" expanded="true" height="68" name="Read XML" width="90" x="45" y="34">
<parameter key="file" value="D:\1\new 4.xml"/>
<parameter key="xpath_for_examples" value="//ROOT/RECORD"/>
<enumeration key="xpaths_for_attributes">
<parameter key="xpath_for_attribute" value="ID[1]/text()"/>
<parameter key="xpath_for_attribute" value="TEXT[1]/attribute::name"/>
</enumeration>
<list key="namespaces"/>
<parameter key="use_default_namespace" value="false"/>
<list key="annotations"/>
<list key="data_set_meta_data_information">
<parameter key="0" value="ID[1]/text().true.attribute_value.attribute"/>
<parameter key="1" value="TEXT[1]/attribute::name.true.attribute_value.attribute"/>
</list>
</operator>
<operator activated="true" class="join" compatibility="7.2.003" expanded="true" height="82" name="Join" width="90" x="313" y="34">
<parameter key="use_id_attribute_as_key" value="false"/>
<list key="key_attributes">
<parameter key="TEXT[1]/attribute::name" value="TEXT[1]/attribute::name"/>
</list>
</operator>
<connect from_op="Read XML (2)" from_port="output" to_op="Join" to_port="right"/>
<connect from_op="Read XML" from_port="output" to_op="Join" to_port="left"/>
<connect from_op="Join" from_port="join" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
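For reference, the target table has one row per KEYWORD, with ID and name repeated from the enclosing RECORD and TEXT elements. Below is a minimal Python sketch of that flattening logic, outside RapidMiner and using only the standard library. The function name `flatten_records` and the inlined `SAMPLE_XML` are hypothetical, chosen just for illustration; this is not how the Read XML operator is implemented.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question (XML declaration omitted for brevity).
SAMPLE_XML = """
<ROOT>
  <RECORD>
    <ID>id1</ID>
    <TEXT name="text1">
      <KEYWORD>kw1</KEYWORD>
      <KEYWORD>kw2</KEYWORD>
      <KEYWORD>kw3</KEYWORD>
    </TEXT>
  </RECORD>
  <RECORD>
    <ID>ID2</ID>
    <TEXT name="text2">
      <KEYWORD>kw4</KEYWORD>
      <KEYWORD>kw5</KEYWORD>
      <KEYWORD>kw6</KEYWORD>
    </TEXT>
  </RECORD>
</ROOT>
"""


def flatten_records(xml_text):
    """Return one (ID, name, KEYWORD) tuple per KEYWORD element."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.findall("RECORD"):
        record_id = record.findtext("ID")
        for text in record.findall("TEXT"):
            name = text.get("name")
            # The innermost repeated element defines the rows;
            # ID and name are carried down from the outer levels.
            for keyword in text.findall("KEYWORD"):
                rows.append((record_id, name, keyword.text))
    return rows


if __name__ == "__main__":
    print("ID | name | KEYWORD")
    for row in flatten_records(SAMPLE_XML):
        print(" | ".join(row))
```

The point of the sketch is that the rows are defined by the innermost repeated element (KEYWORD), not by RECORD. In XPath terms that would mean selecting examples at the KEYWORD level and reading ID and name from the ancestors; whether Read XML accepts parent-axis expressions such as `../@name` is not confirmed in this thread.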
Answers
Hi dimons91,
Have you tried the Import Configuration Wizard for Read XML? It generates the XPaths for the attributes automatically. All you need to do is select the beans for the attributes you want in step 4 of the wizard.
Import Configuration Wizard Step 4: select beans
Here is a sample process for you.
HTH,
YY
Hi yyhuang,
Thank you! It's really helpful.
But I made the XML file more complex, and I ran into difficulties again.
I added some levels of nesting.
Because of this, an error occurs in the De-Pivot operator.
Maybe I need to use more than one Read XML operator and then join the tables? When I try it this way, I don't have any attribute that lets me join them correctly.
Here is the new XML:
This is the table I need to make:
This is the process I made:
Thanks for giving me the new XML data. I fixed some of the function expressions for 'de-pivot'. It is always tricky to make the regular expressions work for that.
Happy RapidMining,
YY
Hi yyhuang,
How can I open that wizard? I have seen references to it several times, but I cannot find any documentation. There is the "File">"Add Data" import dialog, but that one only allows me to import CSV and Excel. Is it part of some plugin?
Thanks,
Michael
The wizard is available directly inside the Read XML operator. See the attached parameter view.
You'll need to save a local copy of the XML file to run the wizard, but afterwards you can point the resulting operator back to a web address if you want.
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts