Predict (assign) viewers to emissions
Hi all!
I have a data set that contains TV commercial emissions. It has many properties, like
- date & time of emission,
- GRP value
- TV channel
- TV show
- commercial position (beginning of the block, middle, or end)
- Channel subject group (cooking, traveling, etc)
Each property is important. Date and time determine whether the emission aired during prime time, at night, etc.; the GRP value indicates the reach of the emission; and so on.
On the other hand, I have counts of new website visitors (based on Google Analytics), so I can clearly see how many people each emission brought to the site and how effective it was.
The visitors data set is aggregated to minutes, so I have information like
- 2020-05-10 13:30:00 - 7 visitors
- 2020-05-10 13:31:00 - 10 visitors
- 2020-05-10 13:32:00 - 8 visitors
- 2020-05-10 13:33:00 - 2 visitors
so I can estimate that this particular emission brought 27 new visitors to my website.
The problem arises when emissions interfere. When two (or more) emissions collide, all I know is that together they brought, e.g., 57 visitors.
Is it possible to estimate how many visitors came from a particular interfering emission, using information from "clean" (non-colliding) emissions? Each emission is described by many properties. How can I achieve this with RapidMiner? I have been trying the Impute Missing Values and k-NN operators, with no luck.
Any help will be appreciated!
Answers
Dortmund, Germany
For marketing attribution with overlapping interactions, probably the most common approach is to give each type of interaction its own normalized canonical curve that describes the expected responses. These curves are then used to attribute responses. For any period in which multiple interactions operate simultaneously, the fit is first estimated as the sum of the individual effects, and then interaction effects are added to explain any remaining discrepancies.
So, in your example, you have some TV transmissions that occur when nothing else is happening. From these transmissions, you would develop a set of data to describe the typical number of responses and the timing of those responses, for each combination of other characteristics that characterize those transmissions, based on something like "minutes since broadcast".
Then when you have overlapping transmissions, each of the underlying canonical curves for the components is used to attribute responses based on its characteristics and broadcast time, and the difference between the sum of those curves and the actual results is reviewed, and if there are discrepancies, then new interaction terms (which may either be additive or subtractive in nature) are added to account for the differences.
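To make the idea above more concrete, here is a minimal sketch in Python with NumPy (not a RapidMiner process). It makes several assumptions that are not in the thread: the function names are hypothetical, the second clean emission and the per-minute overlapping counts are made-up illustrative values (only the 7/10/8/2 series and the total of 57 come from the question), and the proportional split per minute is just one simple way to attribute, not necessarily how you would model interaction terms.

```python
import numpy as np

def canonical_curve(clean_responses):
    """Average per-minute visitor counts over clean emissions of one type
    and normalize so the curve sums to 1 (index = minutes since broadcast)."""
    mean_curve = np.mean(np.asarray(clean_responses, dtype=float), axis=0)
    return mean_curve / mean_curve.sum()

def expected_response(curve, expected_total, start_minute, n_minutes):
    """Scale a normalized curve to an expected total and place it on a
    timeline of n_minutes, starting at start_minute."""
    timeline = np.zeros(n_minutes)
    end = min(start_minute + len(curve), n_minutes)
    timeline[start_minute:end] = expected_total * curve[: end - start_minute]
    return timeline

def attribute(observed, expected_by_emission):
    """Split observed visitors in each minute in proportion to each
    emission's expected response; also return observed minus expected sum.
    Minutes where no emission expects any response are left unattributed."""
    expected = np.vstack(expected_by_emission)
    total = expected.sum(axis=0)
    shares = np.divide(expected, total, out=np.zeros_like(expected), where=total > 0)
    attributed = (shares * observed).sum(axis=1)
    residual = observed - total
    return attributed, residual

# Clean emission from the question (27 visitors) and a made-up second type.
curve_a = canonical_curve([[7, 10, 8, 2]])
curve_b = canonical_curve([[2, 6, 9, 3]])  # hypothetical values

# Made-up overlapping window: A airs at minute 0, B at minute 2, 57 visitors total.
observed = np.array([8, 11, 12, 10, 11, 5], dtype=float)
exp_a = expected_response(curve_a, expected_total=27, start_minute=0, n_minutes=6)
exp_b = expected_response(curve_b, expected_total=20, start_minute=2, n_minutes=6)

attributed, residual = attribute(observed, [exp_a, exp_b])
print(attributed, residual)
```

With these illustrative numbers the 57 visitors are split into roughly 31.1 for the first emission and 25.9 for the second. The residual (observed minus the sum of the expected curves) is the part you would review and, if it is systematic, try to explain with interaction terms.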
This is fairly complicated and takes a good deal of manual work. It is not merely a matter of running the combined dataset through a machine learning algorithm, I'm afraid. It can all be done in RapidMiner, but it would be a fairly extensive project involving a lot of manual setup and attribute generation.
There are other rule-based approaches that use simple heuristics for marketing attribution as well, such as last interaction, first interaction, time-decay allocation, equal allocation, etc., but those are not based on ML either, just on the application of simple assumptions about how interactions generate responses.
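For illustration, the heuristics just listed can be sketched in a few lines of Python. The function name, the `rule` labels, and the `half_life` parameter are all hypothetical choices for this sketch, and the exponential half-life form of time decay is an assumption; other decay shapes are equally valid.

```python
def attribution_weights(emission_minutes, response_minute, rule="equal", half_life=2.0):
    """Return one weight per emission (summing to 1) for a response observed
    at response_minute. Only emissions aired at or before the response are
    eligible; if none are, all weights are 0."""
    weights = [0.0] * len(emission_minutes)
    eligible = [i for i, m in enumerate(emission_minutes) if m <= response_minute]
    if not eligible:
        return weights

    if rule == "first":
        # All credit to the earliest eligible emission.
        weights[min(eligible, key=lambda i: emission_minutes[i])] = 1.0
    elif rule == "last":
        # All credit to the most recent eligible emission.
        weights[max(eligible, key=lambda i: emission_minutes[i])] = 1.0
    elif rule == "equal":
        # Credit shared evenly among eligible emissions.
        for i in eligible:
            weights[i] = 1.0 / len(eligible)
    elif rule == "time_decay":
        # Credit halves for every half_life minutes elapsed since broadcast.
        raw = {i: 0.5 ** ((response_minute - emission_minutes[i]) / half_life)
               for i in eligible}
        total = sum(raw.values())
        for i, w in raw.items():
            weights[i] = w / total
    else:
        raise ValueError(f"unknown rule: {rule}")
    return weights

# Two emissions aired at minutes 0 and 2; a visitor arrives at minute 4.
for rule in ("first", "last", "equal", "time_decay"):
    print(rule, attribution_weights([0, 2], 4, rule=rule))
```

In this small example, time decay with a two-minute half-life gives the later emission twice the credit of the earlier one, while the other rules give all-or-nothing or even splits.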
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts