"Preprocessing for FPGrowth"
guilhermecr
I am working on basket analysis. I am already generating the binomial format using other programs.
What RM operator can I use to transform the dataset from this format:
1,3
2,3,4
1,2,3
to this:
1,0,1,0
0,1,1,1
1,1,1,0
Thanks in advance :)
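For reference, the transformation asked about here can be sketched outside RapidMiner in a few lines of Python. This is only an illustration under stated assumptions: the function name `to_binary_matrix` is hypothetical, items are assumed to be positive integers separated by commas as in the example, and the number of columns is taken to be the highest item index that occurs.

```python
def to_binary_matrix(lines, sep=","):
    """Turn item-index lists such as "1,3" into 0/1 rows, one column per item."""
    # Parse each non-blank line into a list of integer item indices.
    baskets = [[int(tok) for tok in line.split(sep) if tok.strip()]
               for line in lines if line.strip()]
    # Assumption: the column count equals the highest item index seen.
    width = max((max(b) for b in baskets if b), default=0)
    rows = []
    for basket in baskets:
        present = set(basket)
        # Columns are numbered from 1, matching the item indices.
        rows.append([1 if col in present else 0 for col in range(1, width + 1)])
    return rows


matrix = to_binary_matrix(["1,3", "2,3,4", "1,2,3"])
for row in matrix:
    print(",".join(str(v) for v in row))
```

Run on the three example baskets, this prints the three binary rows shown in the question.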
Answers
Your data format is called sparse, because it stores only the indices of the non-zero columns. RapidMiner supports a sparse format, but it differs slightly from yours. If you can bring your data into the following format, you can easily load it:
1:1 3:1
2:1 3:1 4:1
1:1 2:1 3:1
If you then use the operator SparseFormatExampleSource with the parameter format set to no_label and the parameter dimension set to the number of dimensions (the highest number occurring in your file), it will work.
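If the file is produced by another program anyway, the rewrite into the `index:1` layout above could be done with a small script. The following is a minimal sketch, not part of RapidMiner: the function names `to_sparse_lines` and `dimension` are hypothetical, and it assumes comma-separated positive integer item indices as in the original example.

```python
def to_sparse_lines(lines, sep=","):
    """Rewrite lines like "2,3,4" as "2:1 3:1 4:1" (index:value pairs)."""
    out = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Deduplicate and sort the item indices of this transaction.
        items = sorted({int(tok) for tok in line.split(sep) if tok.strip()})
        out.append(" ".join(f"{i}:1" for i in items))
    return out


def dimension(lines, sep=","):
    """Highest item index in the file, i.e. the value for the dimension parameter."""
    return max(int(tok) for line in lines
               for tok in line.split(sep) if tok.strip())


raw = ["1,3", "2,3,4", "1,2,3"]
for sparse_line in to_sparse_lines(raw):
    print(sparse_line)
print("dimension:", dimension(raw))
```

For the three example baskets this yields the three lines shown above and a dimension of 4.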
Greetings,
 Sebastian
I have used the 'retail' data set available at http://fimi.cs.helsinki.fi/data/retail.dat, which, like the data above, lists only the item indices of each transaction.
But since I will get my own data from a friend's shop, my question is:
What is the best format for a market basket analysis with RM?
Thanks
PS: I will probably use Apriori and FPGrowth.