Accounting for number of observations / evidence
Dear RM-Enthusiasts,
I am working on an online advertising dataset that contains a list of Product-IDs. Every product has a number of attributes, and I want to predict one of them. A fairly basic Decision Tree model already yields acceptable results.
However, there is one source of predictive potential that I am not using yet: the number of observations. The data for some Product-IDs are based on 1 observation, while others are based on 20 or more. I would like to weight the data for IDs with many observations more heavily than the data for IDs with few observations.
Can anybody point me to a way of handling this? Maybe a tutorial or YouTube video?
Any advice would be greatly appreciated. Thanks in advance!
Best,
Marc
Answers
Be a bit careful with weighting, though. It may bias the model towards well-known products.
Dortmund, Germany
Dear Martin,
Thanks a lot for your response. I understand the concept and I found 3 “Aggregate” operators: Generate Aggregation, Aggregate and Extract aggregates. I chose “Aggregate”.
Next, I chose number_observations as “aggregation attribute”. When selecting the corresponding “aggregation_function” (average, concatenation, count etc.), though, I could not find “weight”.
Do you have any idea where I’m going wrong?
Best,
Marc
In the parameters of the Aggregate operator, choose count as the aggregation function.
Then add a Set Role operator to your process. In its parameters, select the attribute you just created as the attribute name and set weight as the target role.
Regards,
Lionel
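To illustrate what a weight role means conceptually, here is a minimal Python sketch. It is not RapidMiner code, and it only assumes that the learner honours example weights. It computes a Gini impurity in which each row counts in proportion to its number_observations value. The rows, labels and the helper name weighted_gini are hypothetical and exist only for illustration.

```python
from collections import defaultdict

def weighted_gini(labels, weights):
    """Gini impurity where each example counts in proportion to its weight."""
    total = sum(weights)
    if total == 0:
        return 0.0
    mass = defaultdict(float)
    for label, w in zip(labels, weights):
        mass[label] += w  # accumulate weight per class instead of counting rows
    return 1.0 - sum((m / total) ** 2 for m in mass.values())

# Hypothetical rows: (product_id, label to predict, number_observations)
rows = [
    ("P1", "high", 20),
    ("P2", "low", 1),
    ("P3", "high", 15),
    ("P4", "low", 2),
]
labels = [r[1] for r in rows]
weights = [r[2] for r in rows]

# Every row counts once: two "high" and two "low" rows.
unweighted = weighted_gini(labels, [1] * len(rows))
# Rows backed by many observations dominate the class proportions.
weighted = weighted_gini(labels, weights)

print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this toy data the weighted impurity is much lower than the unweighted one, because the two products with many observations dominate. The same effect is why weighting can pull a model towards the well-observed products.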
Clear. It has worked so far, but I now only have the aggregated attribute on the output port of the Aggregate operator. The other attributes are not passed through. I tried a few things but couldn't get it to work. Any idea?
Thanks,
Marc
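One general way to think about the problem above, sketched in Python rather than as a RapidMiner process: compute the count separately, then attach it back to the original rows by ID so that the other attributes are kept. This is only an illustration under assumed data. The field names product_id, category, clicks and weight are hypothetical, and the thread does not confirm this as the solution.

```python
from collections import Counter

# Hypothetical raw data: one dict per observation.
observations = [
    {"product_id": "P1", "category": "shoes", "clicks": 3},
    {"product_id": "P1", "category": "shoes", "clicks": 5},
    {"product_id": "P2", "category": "hats", "clicks": 1},
]

# Aggregation step: number of observations per product ID.
counts = Counter(row["product_id"] for row in observations)

# Join step: copy every original row and add the count as a weight field,
# so no existing attribute is lost.
with_weight = [{**row, "weight": counts[row["product_id"]]} for row in observations]

for row in with_weight:
    print(row)
```

The key design point is that the aggregate lives in its own lookup table and is joined back on the ID, instead of replacing the original table.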