Buggy warning on FP-Growth "non-binominal attribute detected"
Hello,
With the latest version of RapidMiner 9.10.1, I have noticed an erroneous warning on FP-Growth that was not there before. Here is a sample process that illustrates the problem:
<?xml version="1.0" encoding="UTF-8"?><process version="9.10.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="9.10.001" expanded="true" name="Process">
<parameter key="logverbosity" value="init"/>
<parameter key="random_seed" value="1234"/>
<parameter key="send_mail" value="never"/>
<parameter key="notification_email" value=""/>
<parameter key="process_duration_for_mail" value="30"/>
<parameter key="encoding" value="SYSTEM"/>
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="9.10.001" expanded="true" height="68" name="Retrieve Transactions" width="90" x="45" y="34">
<parameter key="repository_entry" value="//Samples/Templates/Market Basket Analysis/Transactions"/>
</operator>
<operator activated="true" class="blending:pivot" compatibility="9.10.001" expanded="true" height="82" name="Pivot" width="90" x="179" y="34">
<parameter key="group_by_attributes" value="Invoice"/>
<parameter key="column_grouping_attribute" value="product 1"/>
<list key="aggregation_attributes">
<parameter key="Orders" value="count"/>
</list>
<parameter key="use_default_aggregation" value="false"/>
<parameter key="default_aggregation_function" value="first"/>
</operator>
<operator activated="true" class="rename_by_replacing" compatibility="9.10.001" expanded="true" height="82" name="Rename by Replacing" width="90" x="313" y="136">
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="attribute_value"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="time"/>
<parameter key="block_type" value="attribute_block"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_matrix_row_start"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="replace_what" value="count\(Orders\)_"/>
<parameter key="replace_by" value=""/>
</operator>
<operator activated="true" class="set_role" compatibility="9.10.001" expanded="true" height="82" name="Set Role" width="90" x="447" y="136">
<parameter key="attribute_name" value="Invoice"/>
<parameter key="target_role" value="id"/>
<list key="set_additional_roles"/>
</operator>
<operator activated="true" class="replace_missing_values" compatibility="9.10.001" expanded="true" height="103" name="Replace Missing Values" width="90" x="581" y="136">
<parameter key="return_preprocessing_model" value="false"/>
<parameter key="create_view" value="false"/>
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="attribute_value"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="time"/>
<parameter key="block_type" value="attribute_block"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_matrix_row_start"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="default" value="zero"/>
<list key="columns"/>
</operator>
<operator activated="true" class="numerical_to_binominal" compatibility="9.10.001" expanded="true" height="82" name="Numerical to Binominal" width="90" x="715" y="136">
<parameter key="attribute_filter_type" value="all"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="numeric"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="real"/>
<parameter key="block_type" value="value_series"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_series_end"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
<parameter key="min" value="0.0"/>
<parameter key="max" value="0.0"/>
</operator>
<operator activated="true" class="select_attributes" compatibility="9.10.001" expanded="true" height="82" name="Select Attributes" width="90" x="849" y="136">
<parameter key="attribute_filter_type" value="value_type"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="binominal"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="time"/>
<parameter key="block_type" value="attribute_block"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_matrix_row_start"/>
<parameter key="invert_selection" value="false"/>
<parameter key="include_special_attributes" value="false"/>
</operator>
<operator activated="true" class="concurrency:fp_growth" compatibility="9.10.001" expanded="true" height="82" name="FP-Growth" width="90" x="983" y="136">
<parameter key="input_format" value="items in dummy coded columns"/>
<parameter key="item_separators" value="|"/>
<parameter key="use_quotes" value="false"/>
<parameter key="quotes_character" value="&quot;"/>
<parameter key="escape_character" value="\"/>
<parameter key="trim_item_names" value="true"/>
<parameter key="min_requirement" value="support"/>
<parameter key="min_support" value="0.05"/>
<parameter key="min_frequency" value="100"/>
<parameter key="min_items_per_itemset" value="1"/>
<parameter key="max_items_per_itemset" value="0"/>
<parameter key="max_number_of_itemsets" value="1000000"/>
<parameter key="find_min_number_of_itemsets" value="true"/>
<parameter key="min_number_of_itemsets" value="100"/>
<parameter key="max_number_of_retries" value="15"/>
<parameter key="requirement_decrease_factor" value="0.9"/>
<enumeration key="must_contain_list"/>
</operator>
<connect from_op="Retrieve Transactions" from_port="output" to_op="Pivot" to_port="input"/>
<connect from_op="Pivot" from_port="output" to_op="Rename by Replacing" to_port="example set input"/>
<connect from_op="Pivot" from_port="original" to_port="result 1"/>
<connect from_op="Rename by Replacing" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="Replace Missing Values" to_port="example set input"/>
<connect from_op="Replace Missing Values" from_port="example set output" to_op="Numerical to Binominal" to_port="example set input"/>
<connect from_op="Numerical to Binominal" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="FP-Growth" to_port="example set"/>
<connect from_op="FP-Growth" from_port="example set" to_port="result 2"/>
<connect from_op="FP-Growth" from_port="frequent sets" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
As you can see, even though Select Attributes passes only binominal attributes to FP-Growth, I still get a warning that a "non-binominal attribute" is detected.
After some testing, the problem seems to be that the attribute with the ID role (Invoice) is triggering this warning. That is, the FP-Growth operator detects that the ID is not binominal and so flags it. However, the false warning does not seem to affect the operation of the FP-Growth operator in 9.10.1; it runs just fine, despite the warning.
When I enable "include special attributes" in Select Attributes (so that the value-type filter also applies to special attributes and therefore removes the non-binominal ID attribute), the FP-Growth warning goes away.
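For reference, here is a sketch of that adjustment, assuming nothing else in the sample process changes. It is the same Select Attributes operator as above with only include_special_attributes switched to true. Note that this also drops Invoice from the data passed on to FP-Growth, so it sidesteps the warning rather than fixing it.

```xml
<operator activated="true" class="select_attributes" compatibility="9.10.001" expanded="true" height="82" name="Select Attributes" width="90" x="849" y="136">
<parameter key="attribute_filter_type" value="value_type"/>
<parameter key="attribute" value=""/>
<parameter key="attributes" value=""/>
<parameter key="use_except_expression" value="false"/>
<parameter key="value_type" value="binominal"/>
<parameter key="use_value_type_exception" value="false"/>
<parameter key="except_value_type" value="time"/>
<parameter key="block_type" value="attribute_block"/>
<parameter key="use_block_type_exception" value="false"/>
<parameter key="except_block_type" value="value_matrix_row_start"/>
<parameter key="invert_selection" value="false"/>
<!-- Only change from the sample process: apply the binominal filter to special attributes too, which removes the Invoice ID -->
<parameter key="include_special_attributes" value="true"/>
</operator>
```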
So, this seems to be a buggy false warning that does not otherwise affect the operator's correct operation. Could someone please confirm that this is indeed a bug, that is, that I am not the one who misunderstands the correct operation of the operator? And is this the correct place to report such a bug?
Best Answer
MartinLiebig (Administrator, Moderator, Employee-RapidMiner; Sr. Director Data Solutions, Altair RapidMiner), Dortmund, Germany