"[ACKED] Aggregate operator's use default aggregation behavior"

tennenrishin Member Posts: 177 Contributor II
edited June 2019 in Help
I need to generate aggregates (e.g. sum) for each attribute whose name complies with a given regular expression.

However, the Aggregate operator (with "use default aggregation" checked) seems to ignore the attribute filter. For example, the process below generates an ExampleSet with both sum(a) and sum(b), rather than just sum(a).
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.006">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.2.006" expanded="true" name="Process">
   <process expanded="true" height="460" width="748">
     <operator activated="true" class="generate_data_user_specification" compatibility="5.2.006" expanded="true" height="60" name="Generate Data by User Specification" width="90" x="122" y="121">
       <list key="attribute_values">
         <parameter key="a" value="1"/>
         <parameter key="b" value="2"/>
       </list>
       <list key="set_additional_roles"/>
     </operator>
     <operator activated="true" class="multiply" compatibility="5.2.006" expanded="true" height="94" name="Multiply" width="90" x="251" y="120"/>
     <operator activated="true" class="union" compatibility="5.2.006" expanded="true" height="76" name="Union" width="90" x="380" y="120"/>
     <operator activated="true" class="aggregate" compatibility="5.2.006" expanded="true" height="76" name="Aggregate" width="90" x="514" y="120">
       <parameter key="use_default_aggregation" value="true"/>
       <parameter key="attribute_filter_type" value="regular_expression"/>
       <parameter key="attributes" value="a|"/>
       <parameter key="regular_expression" value="a"/>
       <parameter key="default_aggregation_function" value="sum"/>
       <list key="aggregation_attributes"/>
     </operator>
     <connect from_op="Generate Data by User Specification" from_port="output" to_op="Multiply" to_port="input"/>
     <connect from_op="Multiply" from_port="output 1" to_op="Union" to_port="example set 1"/>
     <connect from_op="Multiply" from_port="output 2" to_op="Union" to_port="example set 2"/>
     <connect from_op="Union" from_port="union" to_op="Aggregate" to_port="example set input"/>
     <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
   </process>
 </operator>
</process>
Am I misunderstanding the purpose of this operator's attribute filter?

(In the simple case above, I could, of course, just filter the unwanted attributes out using Select Attributes, but in cases where "group-by" attributes are needed, this workaround is not possible.)
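To make the expected behavior concrete, here is a minimal Python sketch (not RapidMiner code) of what I would expect the filter to do: sum only the attributes whose names match the regular expression, optionally per group. The function name `aggregate_sum`, the `group_by` argument, and the group attribute `g` are hypothetical, and the assumption that the regular expression must match the whole attribute name is mine.

```python
import re
from collections import defaultdict


def aggregate_sum(rows, pattern, group_by=()):
    """Sum every attribute whose full name matches `pattern`, per group.

    Hypothetical helper illustrating the expected semantics only;
    it is not how the Aggregate operator is implemented.
    """
    regex = re.compile(pattern)
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        # The group key is the tuple of group-by values (empty if ungrouped).
        key = tuple(row[g] for g in group_by)
        for name, value in row.items():
            # Assumption: the filter must match the whole attribute name.
            if name not in group_by and regex.fullmatch(name):
                totals[key][f"sum({name})"] += value
    return {key: dict(sums) for key, sums in totals.items()}


# Same data as the process above: one example (a=1, b=2), duplicated by Union.
rows = [{"a": 1, "b": 2}, {"a": 1, "b": 2}]
ungrouped = aggregate_sum(rows, "a")

# Grouped case, where dropping attributes beforehand is not an option.
grouped_rows = [
    {"g": "x", "a": 1, "b": 2},
    {"g": "x", "a": 3, "b": 4},
    {"g": "y", "a": 5, "b": 6},
]
grouped = aggregate_sum(grouped_rows, "a", group_by=("g",))
```

With the filter set to `a`, only `sum(a)` should appear in the result, with or without group-by attributes; `sum(b)` should not.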
Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Thanks for the report. I created an internal bug report for this.

    Best,
      Marius
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    This has been fixed, at least in the latest development version. I'm not sure whether the fix also made it into RM 5.2.8, but I think so.

    Best,
      Marius