[SOLVED] outer Join
tennenrishin
Member Posts: 177 Contributor II
The value of b in row 2 is missing in the output of this example. Is that the correct behavior for an outer join?
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
    <process expanded="true" height="659" width="1043">
      <operator activated="true" class="generate_data_user_specification" compatibility="5.2.008" expanded="true" height="60" name="a=1 b=1" width="90" x="45" y="120">
        <list key="attribute_values">
          <parameter key="a" value="1"/>
          <parameter key="b" value="1"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="generate_data_user_specification" compatibility="5.2.008" expanded="true" height="60" name="a=2 b=2" width="90" x="45" y="255">
        <list key="attribute_values">
          <parameter key="a" value="2"/>
          <parameter key="b" value="2"/>
        </list>
        <list key="set_additional_roles"/>
      </operator>
      <operator activated="true" class="join" compatibility="5.2.008" expanded="true" height="76" name="Join" width="90" x="447" y="165">
        <parameter key="join_type" value="outer"/>
        <parameter key="use_id_attribute_as_key" value="false"/>
        <list key="key_attributes">
          <parameter key="a" value="a"/>
        </list>
      </operator>
      <connect from_op="a=1 b=1" from_port="output" to_op="Join" to_port="left"/>
      <connect from_op="a=2 b=2" from_port="output" to_op="Join" to_port="right"/>
      <connect from_op="Join" from_port="join" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="126"/>
      <portSpacing port="sink_result 2" spacing="72"/>
    </process>
  </operator>
</process>
Answers
This is a bit confusing because you gave the second attribute ("b") the same name in both sets.
What happens is the following: the parameter "remove_double_attributes" is enabled, which means that if attributes with the same name exist on the left side and the right side, RapidMiner always uses the one from the left side. Since the left set has no example with a=2, there is no left-side value of b for that row, so it is missing.
If you disable that parameter, everything will be as expected.
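As a rough sketch, the Join operator from the process above might look like this with the parameter switched off. Everything except the "remove_double_attributes" line is copied from the posted process. The position of that line among the other parameters is an assumption, and the names the two "b" attributes receive in the output are not shown here.

```xml
<operator activated="true" class="join" compatibility="5.2.008" expanded="true" height="76" name="Join" width="90" x="447" y="165">
  <parameter key="join_type" value="outer"/>
  <parameter key="use_id_attribute_as_key" value="false"/>
  <!-- Assumed placement: keep same-named attributes from both sides instead of only the left one -->
  <parameter key="remove_double_attributes" value="false"/>
  <list key="key_attributes">
    <parameter key="a" value="a"/>
  </list>
</operator>
```

Alternatively, renaming "b" in one of the two input sets before the join should avoid the name clash altogether.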
Happy Mining!
~Marius