Basics of FP-Growth
bernardo_pagnon (Member, University Professor)
Hello all,
I am struggling quite a bit with the FP-Growth operator. I got all sorts of errors (no binomial attributes even though I manually set them to binomial, outputs that I cannot understand, etc.). I am trying to run the smallest possible example: 2 transactions, 3 products (juice, meat and milk)! My Excel file looks like this:
0 0 1
0 0 1
What am I doing wrong? What are the basic errors one should avoid when using FP-Growth? I read the RM help page on this operator and found it extremely confusing as well. Any help is appreciated; I just want to use the operator in the simplest possible way.
Regards,
Bernardo
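For reference, the per-item support that a frequent-itemset miner starts from can be checked by hand on the two rows above. The following is a minimal Python sketch, not RapidMiner code; the column order (juice, meat, milk) and the helper name `item_support` are assumptions made for illustration. It also lists the columns that hold only one distinct value. In this example that is every column, which may be worth ruling out as a cause of the binomial errors, although nothing in this thread confirms it.

```python
# Rows are transactions, columns are products: 1 = bought, 0 = not bought.
# These are the two rows from the question; the column order is assumed.
items = ["juice", "meat", "milk"]
transactions = [
    [0, 0, 1],
    [0, 0, 1],
]

def item_support(rows, names):
    """Return the fraction of transactions that contain each item."""
    n = len(rows)
    return {name: sum(row[i] for row in rows) / n for i, name in enumerate(names)}

support = item_support(transactions, items)
print(support)

# Columns with a single distinct value carry no true/false contrast.
constant = [
    name for i, name in enumerate(items)
    if len({row[i] for row in transactions}) == 1
]
print(constant)
```

With these two rows, milk appears in every transaction and the other two products in none, so no column contains both values.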
Best Answer
bernardo_pagnon (Member, University Professor)
Oh, now I see: this option has two modes, and when "find min number of itemsets" is checked, it ignores this minimum value. Solved!
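To illustrate the two modes mentioned in this answer, here is a rough Python sketch. It assumes that the second mode starts from a high support threshold and lowers it step by step until at least the requested number of itemsets is found, so a user-supplied minimum support plays no role. The function names, the 0.8 reduction factor, the retry cap, and the sample baskets are all hypothetical and are not taken from RapidMiner; the brute-force enumeration merely stands in for the FP-Growth algorithm itself.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of itemsets with support >= min_support."""
    n = len(transactions)
    all_items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, len(all_items) + 1):
        for candidate in combinations(all_items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                found[candidate] = count / n
    return found

def find_min_number(transactions, min_itemsets, start=1.0, factor=0.8, max_retries=15):
    """Hypothetical second mode: lower the threshold until enough itemsets appear."""
    support = start
    result = frequent_itemsets(transactions, support)
    for _ in range(max_retries):
        if len(result) >= min_itemsets:
            break
        support *= factor  # relax the threshold and try again
        result = frequent_itemsets(transactions, support)
    return support, result

# Hypothetical baskets, made up for this sketch.
baskets = [{"juice", "milk"}, {"milk"}, {"meat", "milk"}, {"juice", "meat"}]

fixed = frequent_itemsets(baskets, 0.75)                        # mode 1: fixed threshold
threshold, adaptive = find_min_number(baskets, min_itemsets=3)  # mode 2: adaptive threshold
print(fixed)
print(threshold, adaptive)
```

In the first call the threshold decides how many itemsets come back; in the second, the requested number of itemsets decides where the threshold ends up.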
Answers
I think there is something weird going on: following exactly the same steps as the author suggests, I got the same result as he did. For instance, the frequency of "juices" as a single item was 0.780, while the one for desserts was 0.312. Then I implemented the same situation, but this time I used the "Read CSV" and "Numerical to Binominal" operators. The resulting frequencies were 0.220 for juices and 0.312 for desserts. I checked in Excel using COUNTIF, and the latter results seem to be the correct ones. Strange. It seems that RM is not counting those singletons properly, or some operator inverts a few of the values. I would appreciate it if someone could check that.
Best,
Bernardo
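The two juice figures in this post, 0.780 and 0.220, add up to 1. That is what one would see if the share of zeros were reported in place of the share of ones, which would fit the suspicion of inverted values, although nothing in this thread confirms it. The Python sketch below mirrors the COUNTIF check on a made-up 0/1 column; the data and the helper name `share_of` are hypothetical and are not the book's data set.

```python
def share_of(values, target):
    """Fraction of entries equal to target, like COUNTIF(range, target) / COUNT(range)."""
    return sum(1 for v in values if v == target) / len(values)

# Hypothetical 0/1 column: 11 purchases in 50 transactions.
juice = [1] * 11 + [0] * 39

support_ones = share_of(juice, 1)   # share of transactions with juice
support_zeros = share_of(juice, 0)  # share of transactions without juice

print(round(support_ones, 3), round(support_zeros, 3))
```

If a tool reports the second number where the first is expected, the likely place to look is which of the two values was treated as the positive one during the conversion.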
I tested on the same market data, downloaded from http://rapidminerbook.com/index.php/chapter-downloads/chapter-8/
The frequency output for "juices" is shown as 0.219613, which matches your Excel COUNTIF result.
YY
You have opened duplicate threads on the same question. To keep the discussion in one place and make the issue easier to track down, please continue at
https://community.rapidminer.com/discussion/45849/fp-growth-itemset-one-of-the-items-is-oversupported#latest