"How to create an association matrix instead of the rules?"
Hello everyone,
The example set I have contains the transitions of customers between different hotels over four years. I already ran a basket analysis but am not satisfied with the result due to its lack of visualization.
What I want to achieve is a kind of association matrix, for example:
Product A was bought again 80 times.
Product B was bought again 100 times.
20 Customers who bought Product B also bought Product A.
The Matrix (in percentage) would look like this:
     A      B
A    1      0.2
B    0.25   1
This gives an asymmetric matrix, which could then be visualised as an XY scatter plot with different circle sizes.
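To make the arithmetic behind this example explicit, here is a minimal Python sketch (outside RapidMiner). It assumes each off-diagonal cell is the number of customers who bought both products divided by the count of the column product, which is the convention that reproduces the two off-diagonal values above. The function name association_matrix is hypothetical.

```python
# Counts taken from the example in this post.
counts = {"A": 80, "B": 100}
# Customers who bought both products (the same number in either direction).
both = {("A", "B"): 20, ("B", "A"): 20}


def association_matrix(counts, both):
    """Build a nested dict matrix[row][col] of co-purchase shares."""
    products = sorted(counts)
    matrix = {}
    for row in products:
        matrix[row] = {}
        for col in products:
            if row == col:
                # The diagonal is 1 by definition in the example.
                matrix[row][col] = 1.0
            else:
                # Assumed convention: divide by the column product's count.
                matrix[row][col] = both.get((row, col), 0) / counts[col]
    return matrix


matrix = association_matrix(counts, both)
for row, cells in matrix.items():
    print(row, cells)
```

Dividing by the row product's count instead would simply transpose the off-diagonal values.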
The problem is that I don't know how to get to this matrix. My starting point would be to pivot and aggregate the data so that I end up with the matrix format.
Thank you
Answers
Did you use the Association Rules to Exampleset operator?
I was just faced with creating a similar type of matrix this morning but I haven't solved it yet.
Thank you Thomas,
Yes, I already used it, but are association rules really suitable for that? Isn't it just a matter of aggregating or counting?
Best
Philipp
Off the top of my head this morning I don't know how the matrix would look for more products than your example.
For your example, there is an operator in the Statistics extension that does exactly this, so you can loop it to produce one for each product.
See the rather rushed example below.
Thank you Edward,
but as you said, the operator only works for two attributes, and thus two products.
I found a website that explains exactly what I want to achieve, in Excel. Would this also be possible in RapidMiner somehow?
https://help.xlstat.com/customer/en/portal/articles/2062425-how-can-associations-rules-help-for-market-basket-analysis?b_id=9283
The photo shows this "influence matrix".
Thank you
Sure,
it's just a pivot and a Replace Missing Values operator.
~Martin
Dortmund, Germany
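As a rough illustration of the pivot-plus-fill idea, here is a small Python sketch rather than a RapidMiner process. The long-format data, the customer IDs and all variable names are made up for illustration. It only shows the shape of the result: a customer-by-product table with 0 wherever a product was never bought.

```python
# Hypothetical long-format purchases: (customer, product).
purchases = [
    ("c1", "A"), ("c1", "B"),
    ("c2", "B"),
    ("c3", "A"), ("c3", "C"),
]

products = sorted({product for _, product in purchases})
customers = sorted({customer for customer, _ in purchases})

# Pivot: one row per customer, one column per product, 1 where bought.
pivot = {customer: {} for customer in customers}
for customer, product in purchases:
    pivot[customer][product] = 1

# Replace missing values: products a customer never bought become 0.
for row in pivot.values():
    for product in products:
        row.setdefault(product, 0)

for customer in customers:
    print(customer, [pivot[customer][p] for p in products])
```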
Not as easy, though?
My example set contains 4 attributes (different years), and the examples represent the product that each customer bought in each year. What I have done now (since time doesn't play a role in the association matrix) is remove duplicates within each row (customer). So if a customer has bought products A, B, C, A within four years, this is now reduced to A, B, C, since the only important information is that these products were bought together.
Now something like counting every combination has to happen, but I'm stuck at this point, because the normal Association Rules to ExampleSet operator doesn't allow me to convert the result to a matrix.
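The de-duplication and counting step described above could look roughly like this in Python. This is only a sketch outside RapidMiner: the rows are hypothetical, and dividing by the column product's customer count is an assumption taken from the example matrix in the first post.

```python
from collections import Counter
from itertools import permutations

# Hypothetical rows: one per customer, one product per year (4 years).
rows = [
    ["A", "B", "C", "A"],
    ["B", "B", "A", "B"],
    ["C", "C", "C", "C"],
    ["A", "A", "A", "A"],
]

single = Counter()  # number of customers per product
pair = Counter()    # number of customers per ordered product pair

for row in rows:
    basket = set(row)  # drop duplicates within one customer
    single.update(basket)
    # Count every ordered pair once per customer, e.g. (A, B) and (B, A).
    pair.update(permutations(sorted(basket), 2))

products = sorted(single)
# Assumed convention: cell (row, col) = customers with both / customers with col.
matrix = {
    r: {c: 1.0 if r == c else pair[(r, c)] / single[c] for c in products}
    for r in products
}

for r in products:
    print(r, matrix[r])
```

Each cell could then be written out as a (row, column, value) triple, which is the layout a bubble or scatter chart usually expects.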
Thank you
Do you mean something like this, then?
Thank you, Edward, that is the kind of influence matrix I was looking for. But is it also possible to have just the single attributes in both the columns and the examples?
Best
Philipp