
Accounting for number of observations / evidence

User23400 Member Posts: 3 Learner III
edited December 2019 in Help
Dear RM-Enthusiasts,

Working on an online advertising dataset, I have a list of Product-IDs. Every product has a number of attributes, and I want to predict one of them. A fairly basic Decision Tree model is already yielding acceptable results.

However, I still have one source of predictive potential that is not yet used: the number of observations. The data for some Product-IDs is based on a single observation, while for others it is based on 20 or more. Obviously, I would like to weight the IDs with many observations more heavily than those with few.

Can anybody direct me to a way of handling this? Maybe a tutorial or YouTube video?

Any advice would be greatly appreciated. Thanks in advance!

Best,
Marc

Answers

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hi,
    you can use Aggregate to generate this count and then set the role of the resulting attribute to weight. Examples with a higher weight then count more in learners.

    Be a bit careful with it, though. It may bias the model towards the products that are already well known, i.e. those with many observations.

    Best,
    martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
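
To illustrate what a weight does inside a learner, here is a minimal Python sketch (not RapidMiner code) of how a decision tree leaf might pick its prediction once every example carries a weight. The labels, the observation counts, and the `weighted_majority` helper are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

def weighted_majority(labels, weights):
    """Return the label with the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

# Hypothetical leaf: three products labelled "low", one labelled "high".
labels = ["low", "low", "low", "high"]
# Hypothetical number of observations behind each product, used as weight.
observations = [1, 1, 2, 20]

# With equal weights, every product gets one vote.
unweighted = weighted_majority(labels, [1] * len(labels))
# With observation counts as weights, well-observed products get more votes.
weighted = weighted_majority(labels, observations)
```

In this made-up leaf, the single well-observed product outvotes the three sparsely observed ones. That is the intended effect of weighting, and also the kind of bias towards well-known products that the answer above cautions about.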
  • User23400 Member Posts: 3 Learner III

    Dear Martin,

    Thanks a lot for your response. I understand the concept, and I found 3 “Aggregate” operators: Generate Aggregation, Aggregate and Extract aggregates. I chose “Aggregate”.

    Next, I chose number_observations as “aggregation attribute”. When selecting the corresponding “aggregation_function” (average, concatenation, count etc.) though, I could not find “weight”.

    Do you have any idea where I’m going wrong?


    Best,

    Marc

  • lionelderkrikor RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Hi @User23400,

    You have to choose "count" as the aggregation function in the parameters of the Aggregate operator.
    Then add a Set Role operator to your process. In its parameters, select the attribute you just created as the attribute name and set "weight" as the target role.

    Regards,

    Lionel
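
The idea behind a role can be sketched in plain Python (this is not RapidMiner code): the weight column is taken out of the regular attributes, so a learner uses it to scale each example's influence rather than as a predictor. The rows, column names, and the `split_roles` helper below are assumptions made for illustration.

```python
def split_roles(rows, label_name, weight_name):
    """Split rows into regular attributes, labels, and weights."""
    features, labels, weights = [], [], []
    for row in rows:
        labels.append(row[label_name])
        weights.append(row[weight_name])
        # Everything that is neither label nor weight stays a regular attribute.
        features.append(
            {k: v for k, v in row.items() if k not in (label_name, weight_name)}
        )
    return features, labels, weights

# Hypothetical example set with a count column produced by an aggregation.
rows = [
    {"category": "shoes", "price": 40, "label": "low", "count": 1},
    {"category": "bags", "price": 90, "label": "high", "count": 20},
]
features, labels, weights = split_roles(rows, "label", "count")
```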
  • User23400 Member Posts: 3 Learner III
    Thanks Lionel,

    Clear. It has worked so far, but I now only have the aggregated attribute on the output port of the Aggregate operator. The other attributes are not passed through. I tried a few things but couldn't get it to work. Any idea?

    Thanks,
    Marc
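
One possible way to keep the other attributes is to compute the count per Product-ID and then join it back onto the original rows by that ID. The Python sketch below (not RapidMiner code) shows the pattern; the column names and the `attach_counts` helper are assumptions for illustration. In RapidMiner this would presumably mean grouping by the ID in Aggregate and joining the result with the original example set, although the thread itself does not confirm that.

```python
from collections import Counter

def attach_counts(rows, id_name, count_name="count"):
    """Return copies of rows with a per-ID observation count added."""
    # Step 1: aggregate, one count per ID (the other attributes are dropped here).
    counts = Counter(row[id_name] for row in rows)
    # Step 2: join the count back onto every original row via its ID.
    return [{**row, count_name: counts[row[id_name]]} for row in rows]

# Hypothetical raw data: one row per observation.
rows = [
    {"product_id": "A", "category": "shoes"},
    {"product_id": "B", "category": "bags"},
    {"product_id": "B", "category": "bags"},
]
with_counts = attach_counts(rows, "product_id")
```

Each returned row keeps its original attributes and gains the count, which can then be given the weight role.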