
Merging prices from various product lists

helpmemine Member Posts: 3 Contributor I
edited November 2018 in Help
Hello everyone!

As a supply chain manager, I have a lot of market data in my hands, and I am under a lot of pressure to buy products at the right time and from the right wholesaler. My business field is consumer goods (toiletries, detergents, cosmetics, etc.).

Over the last few years our network of suppliers has grown, and it is time-consuming to go through all the product offers to find the best price or an undiscovered bargain, as we carry thousands of articles in our assortment. Fortunately, most of the suppliers include a unique identifier in their offers (either an EAN or a UPC code), which could be used for data mining purposes.

Here is the short description of data from one of the suppliers:
4250925360284 (EAN Code)
Adidas deodorant (Product name)
150 ML (Size)
4,75 (Price)
1200 pcs (Available quantity)

My idea is to consolidate all the offers from wholesalers into a sound repository/database of the products offered on the market, with an overview of current price minimums, maximums and averages. Having this price information at hand would give me a competitive edge when placing purchase orders, and it should raise revenues too.

Although I went through several tutorials and videos in the "Resources" section, I have a hard time figuring out which Operators to use or how to build the Workflow to solve this problem.

I am a quick learner, so any help that points me in the right direction is much appreciated.

Thank you!

Answers

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi,

    What you want is to join your data sources, either inside a database or in RapidMiner with the Join operator. Afterwards, use an Aggregate operator and group by your EAN codes. With that you can compute the min/max/std_dev of your prices.

    If you have an Excel file with 10-20 examples, I might find some time for an example process.
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • helpmemine Member Posts: 3 Contributor I
    Hello,

    Thank you for such a prompt reply!

    The Join Operator lead was highly useful! I am currently going through the Join Operator tutorial and playing with the example data to understand how the process is built up and how the key attributes and operators work together. Good stuff!  :)

    I will try to move towards Aggregate and Group By later today.

    Also, I compiled some example data, as requested:
    https://dl.dropboxusercontent.com/u/90694307/Example%20data.xls



  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi,

    Attached is a really small process that calculates price statistics grouped by your categories. If you want to use Generate Attributes afterwards, you need to replace the brackets in the aggregated attribute names (for example, average(Euro)).

    You can simply copy this XML code into the XML view of your RapidMiner Studio. Be sure to change the path in Read Excel.

    Best,

    Martin

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="6.2.000">
     <context>
       <input/>
       <output/>
       <macros/>
     </context>
     <operator activated="true" class="process" compatibility="6.2.000" expanded="true" name="Process">
       <process expanded="true">
         <operator activated="true" class="read_excel" compatibility="6.2.000" expanded="true" height="60" name="Read Excel" width="90" x="112" y="30">
           <parameter key="excel_file" value="C:\Users\Martin\Downloads\Example data.xls"/>
           <parameter key="imported_cell_range" value="A1:F17"/>
           <parameter key="first_row_as_names" value="false"/>
           <list key="annotations">
             <parameter key="0" value="Comment"/>
             <parameter key="1" value="Name"/>
           </list>
           <list key="data_set_meta_data_information">
             <parameter key="0" value="House.true.polynominal.attribute"/>
             <parameter key="1" value="Category.true.polynominal.attribute"/>
             <parameter key="2" value="Description.true.polynominal.attribute"/>
             <parameter key="3" value="Euro.true.real.attribute"/>
             <parameter key="4" value="Quantity.true.integer.attribute"/>
             <parameter key="5" value="EAN.true.real.attribute"/>
           </list>
         </operator>
         <operator activated="true" class="aggregate" compatibility="6.2.000" expanded="true" height="76" name="Aggregate" width="90" x="380" y="30">
           <list key="aggregation_attributes">
             <parameter key="Euro" value="average"/>
             <parameter key="Euro" value="maximum"/>
             <parameter key="Euro" value="standard_deviation"/>
           </list>
           <parameter key="group_by_attributes" value="Category"/>
         </operator>
         <connect from_op="Read Excel" from_port="output" to_op="Aggregate" to_port="example set input"/>
         <connect from_op="Aggregate" from_port="example set output" to_port="result 1"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
         <portSpacing port="sink_result 2" spacing="0"/>
       </process>
     </operator>
    </process>

  • helpmemine Member Posts: 3 Contributor I
    Hi,

    Thank you, this really helped a lot! After hours of integrating, I finally managed to get it working.

    I have already run into some questions related to the available quantity, but I found a lead in the tutorial again.

    All the best!
  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi again,

    Great that it helped you. I don't know where you are based, but you might be interested in our training courses. We currently offer them in the US, UK and Germany. For details see: https://rapidminer.com/learning/training/ The Basic courses might be well suited for getting into Predictive Analytics/Data Mining/ETL in RapidMiner.

    If you have to learn it on your own, you might have a look at "Data Mining for the Masses", a free book available here: https://rapidminer.com/wp-content/uploads/2013/10/DataMiningForTheMasses.pdf
    It is really helpful for beginners.

    If you have any further questions, feel free to ask!

    Best,

    Martin
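For readers who want to see the idea from this thread outside RapidMiner, here is a minimal sketch in plain Python of what the answers describe: combine the offers from all suppliers, group them by EAN code, and compute the minimum, maximum, average and standard deviation of the price. It is only an illustration under stated assumptions, not the process posted above. The function names (`parse_price`, `consolidate`), the supplier names, the second EAN, and every price except the 4,75 from the original post are made up. Note that the sketch stacks the supplier lists row-wise (closer to an append than to a column-wise join) and assumes prices use a decimal comma, as in the example record.

```python
from collections import defaultdict
from statistics import mean, stdev


def parse_price(text):
    """Convert a price string with a decimal comma, e.g. '4,75', to a float."""
    return float(text.strip().replace(",", "."))


def consolidate(offers):
    """Group offers by EAN and compute price statistics per product.

    `offers` is a list of dicts with the (hypothetical) keys
    'supplier', 'ean' and 'price'.
    """
    entries_by_ean = defaultdict(list)
    for offer in offers:
        entry = (parse_price(offer["price"]), offer["supplier"])
        entries_by_ean[offer["ean"]].append(entry)

    summary = {}
    for ean, entries in entries_by_ean.items():
        prices = [price for price, _ in entries]
        # min() on (price, supplier) tuples picks the lowest price first.
        best_price, best_supplier = min(entries)
        summary[ean] = {
            "min": best_price,
            "max": max(prices),
            "average": mean(prices),
            # The sample standard deviation needs at least two offers.
            "std_dev": stdev(prices) if len(prices) > 1 else 0.0,
            "cheapest_supplier": best_supplier,
            "offers": len(prices),
        }
    return summary


# Hypothetical offers; only the first EAN and the price 4,75 come from the post.
offers = [
    {"supplier": "Supplier A", "ean": "4250925360284", "price": "4,75"},
    {"supplier": "Supplier B", "ean": "4250925360284", "price": "4,50"},
    {"supplier": "Supplier C", "ean": "4250925360284", "price": "5,00"},
    {"supplier": "Supplier B", "ean": "0000000000017", "price": "2,10"},
]

summary = consolidate(offers)
for ean, stats in summary.items():
    print(ean, stats)
```

The EAN is kept as a string here on purpose: identifiers such as UPC codes can start with zeros, and treating them as numbers would drop those digits or display them in scientific notation.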