The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here
Count Price Changes
gianluca_scheid
Member Posts: 10 Learner II
in Help
Dear RapidMiner Community
I have a big data set with product name, retailer name and price at different points in time (time stamps).
My goal is to group the data by product and retailer name and count the number of price changes across all time stamps.
Is there an easy way to do this?
Thank you for your help
Best regards
GL
I have a big data set with product name, retailer name and price at different points in time (time stamps).
My goal is to group the data by product and retailer name and count the number of price changes across all time stamps.
Is there an easy way to do this?
Thank you for your help
Best regards
GL
Tagged:
0
Best Answer
-
MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data ScientistAhh!The attached process should do the trick. It needs Operator Toolbox to run. Sadly i have no time to add additional comments right now. I hope this works for you.
<?xml version="1.0" encoding="UTF-8"?><process version="9.1.000"><br> <context><br> <input/><br> <output/><br> <macros/><br> </context><br> <operator activated="true" class="process" compatibility="9.1.000" expanded="true" name="Process"><br> <parameter key="logverbosity" value="init"/><br> <parameter key="random_seed" value="2001"/><br> <parameter key="send_mail" value="never"/><br> <parameter key="notification_email" value=""/><br> <parameter key="process_duration_for_mail" value="30"/><br> <parameter key="encoding" value="SYSTEM"/><br> <process expanded="true"><br> <operator activated="true" class="operator_toolbox:create_exampleset" compatibility="1.8.000-SNAPSHOT" expanded="true" height="68" name="Create ExampleSet" width="90" x="112" y="85"><br> <parameter key="generator_type" value="comma_separated_text"/><br> <parameter key="number_of_examples" value="100"/><br> <parameter key="use_stepsize" value="false"/><br> <list key="function_descriptions"/><br> <parameter key="add_id_attribute" value="false"/><br> <list key="numeric_series_configuration"/><br> <list key="date_series_configuration"/><br> <list key="date_series_configuration (interval)"/><br> <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/><br> <parameter key="input_csv_text" value="product, retailer, price Product X, Retailer Y, 500 Product X, Retailer Y, 500 Product X, Retailer Y, 550 Product X, Retailer Y, 600 Product X, Retailer Y, 500 Product X, Retailer Y, 600 Product Y, Retailer Y, 500 Product Y, Retailer Y, 500 Product Y, Retailer Y, 550 Product Y, Retailer Y, 600 Product Y, Retailer Y, 500 Product Y, Retailer Y, 600"/><br> <parameter key="column_separator" value=","/><br> <parameter key="parse_all_as_nominal" value="false"/><br> <parameter key="decimal_point_character" value="."/><br> <parameter key="trim_attribute_names" value="true"/><br> </operator><br> <operator activated="true" class="generate_concatenation" compatibility="9.1.000" expanded="true" height="82" name="Generate Concatenation" width="90" x="246" y="85"><br> <parameter key="first_attribute" value="product"/><br> <parameter key="second_attribute" value="retailer"/><br> <parameter key="separator" value="_"/><br> <parameter key="trim_values" value="false"/><br> </operator><br> <operator activated="true" class="operator_toolbox:group_into_collection" compatibility="1.8.000-SNAPSHOT" expanded="true" height="82" name="Group Into Collection" width="90" x="447" y="85"><br> <parameter key="group_by_attribute" value="product_retailer"/><br> <parameter key="group_by_attribute (numerical)" value=""/><br> <parameter key="sorting_order" value="none"/><br> </operator><br> <operator activated="true" class="loop_collection" compatibility="9.1.000" expanded="true" height="82" name="Loop Collection" width="90" x="581" y="85"><br> <parameter key="set_iteration_macro" value="false"/><br> <parameter key="macro_name" value="iteration"/><br> <parameter key="macro_start_value" value="1"/><br> <parameter key="unfold" value="false"/><br> <process expanded="true"><br> <operator activated="true" class="operator_toolbox:generate_session_id" compatibility="1.8.000-SNAPSHOT" expanded="true" height="82" name="Generate Session ID" width="90" x="112" y="34"><br> <parameter key="date_attribute" value="price"/><br> <parameter key="gap_threshold" value="0.0"/><br> <parameter key="gap_unit" value="none"/><br> <parameter key="use_absolutes" value="false"/><br> </operator><br> <operator activated="true" class="remove_duplicates" compatibility="9.1.000" expanded="true" height="103" name="Remove Duplicates" width="90" x="246" y="34"><br> <parameter key="attribute_filter_type" value="single"/><br> <parameter key="attribute" value="Session id"/><br> <parameter key="attributes" value=""/><br> <parameter key="use_except_expression" value="false"/><br> <parameter key="value_type" value="attribute_value"/><br> <parameter key="use_value_type_exception" value="false"/><br> <parameter key="except_value_type" value="time"/><br> <parameter key="block_type" value="attribute_block"/><br> <parameter key="use_block_type_exception" value="false"/><br> <parameter key="except_block_type" value="value_matrix_row_start"/><br> <parameter key="invert_selection" value="false"/><br> <parameter key="include_special_attributes" value="true"/><br> <parameter key="treat_missing_values_as_duplicates" value="false"/><br> </operator><br> <operator activated="true" class="aggregate" compatibility="9.1.000" expanded="true" height="82" name="Aggregate" width="90" x="447" y="34"><br> <parameter key="use_default_aggregation" value="false"/><br> <parameter key="attribute_filter_type" value="all"/><br> <parameter key="attribute" value=""/><br> <parameter key="attributes" value=""/><br> <parameter key="use_except_expression" value="false"/><br> <parameter key="value_type" value="attribute_value"/><br> <parameter key="use_value_type_exception" value="false"/><br> <parameter key="except_value_type" value="time"/><br> <parameter key="block_type" value="attribute_block"/><br> <parameter key="use_block_type_exception" value="false"/><br> <parameter key="except_block_type" value="value_matrix_row_start"/><br> <parameter key="invert_selection" value="false"/><br> <parameter key="include_special_attributes" value="false"/><br> <parameter key="default_aggregation_function" value="average"/><br> <list key="aggregation_attributes"><br> <parameter key="product" value="count"/><br> </list><br> <parameter key="group_by_attributes" value="retailer|product"/><br> <parameter key="count_all_combinations" value="false"/><br> <parameter key="only_distinct" value="false"/><br> <parameter key="ignore_missings" value="true"/><br> </operator><br> <connect from_port="single" to_op="Generate Session ID" to_port="exa"/><br> <connect from_op="Generate Session ID" from_port="exa" to_op="Remove Duplicates" to_port="example set input"/><br> <connect from_op="Remove Duplicates" from_port="example set output" to_op="Aggregate" to_port="example set input"/><br> <connect from_op="Aggregate" from_port="example set output" to_port="output 1"/><br> <portSpacing port="source_single" spacing="0"/><br> <portSpacing port="sink_output 1" spacing="0"/><br> <portSpacing port="sink_output 2" spacing="0"/><br> </process><br> </operator><br> <operator activated="true" class="append" compatibility="9.1.000" expanded="true" height="82" name="Append" width="90" x="715" y="85"><br> <parameter key="datamanagement" value="double_array"/><br> <parameter key="data_management" value="auto"/><br> <parameter key="merge_type" value="all"/><br> </operator><br> <connect from_op="Create ExampleSet" from_port="output" to_op="Generate Concatenation" to_port="example set input"/><br> <connect from_op="Generate Concatenation" from_port="example set output" to_op="Group Into Collection" to_port="exa"/><br> <connect from_op="Group Into Collection" from_port="col" to_op="Loop Collection" to_port="collection"/><br> <connect from_op="Loop Collection" from_port="output 1" to_op="Append" to_port="example set 1"/><br> <connect from_op="Append" from_port="merged set" to_port="result 1"/><br> <portSpacing port="source_input 1" spacing="0"/><br> <portSpacing port="sink_result 1" spacing="0"/><br> <portSpacing port="sink_result 2" spacing="0"/><br> </process><br> </operator><br></process><br><br>
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany8
Answers
Dortmund, Germany
Thank you for your suggestion.
I had already considered this option, however, it will only count the number of distinct prices but not the number of price changes:
For example:
t1: Product X, Retailer Y, 500USD
t2: Product X, Retailer Y, 500USD
t3: Product X, Retailer Y, 550USD
t4: Product X, Retailer Y, 600USD
t5: Product X, Retailer Y, 500USD
t6: Product X, Retailer Y, 600USD
When counting prices using the Aggregate function I'd get: Product X, Retailer Y, 3 distinct prices (500, 550, 600)
However, the price changed 4 times (500 to 550, 550 to 600, 600 to 500, 500 to 600)
Best regards
Gianluca
Thank you for your help. Unfortunately, I don't know (yet) how to implement the xml code into my process...
Best regards
Gianluca
You just need to open new process then go to menu bar View--> Show panel --> XML. Now you will get an XML window, delete code in XML window and just copy the complete code from @mschmitz and paste it there. Once you paste it click green tick mark so that you can see the processes in the process window. hope this helps
Thanks,
Varun
Varun
https://www.varunmandalapu.com/
Be Safe. Follow precautions and Maintain Social Distancing
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
Dortmund, Germany
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts