Using FP-Growth and Weka-Apriori
Hi, all
To flatten the learning curve, would it be possible to make a short step-by-step tutorial for beginners? I mean really new beginners.
I'm not able to build even a simple example of FP-Growth and Weka-Apriori with a generated transaction data set, although this should be a really easy process.
Does anyone know whether such a tutorial exists? Or could you give a step-by-step tutorial for the above example?
I spent 2 days getting to know the general layout and running some processes, but it seems it will take a month before I can do what I want.
Thanks and regards.
Hoping to be understood and not taken for a lazy "user".
Firstly, welcome to the world of pattern mining. As to finding the tutorial, this might be rather an embarrassing answer for you, but from within RapidMiner try Help->RapidMiner Tutorial. Then do -> Next -> Next in the window that shows and you will see a working example of FP-Growth. It is a smart move to go through that tutorial several times, and to be familiar with all the examples.
Have fun!
Thank you very much.
Sometimes this kind of "pointing" can save a lot of time.
I've tried to do the same as in the tutorial, but it's not working. I went step by step without FP-Growth and wrote the output after each step, and that works OK. But as soon as I insert FP-Growth, it gives the following error: So, basically, it means that it can do the nominal-to-binominal conversion without FP-Growth. Is this a bug, or am I doing something wrong?
Thanks in advance.
My file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
  <context>
    <input>
      <location/>
    </input>
    <output>
      <location/>
      <location/>
      <location/>
    </output>
    <macros/>
  </context>
  <operator activated="true" class="process" expanded="true" name="Process">
    <parameter key="logverbosity" value="all"/>
    <parameter key="logfile" value="C:\AfterRuleAccoss.log"/>
    <parameter key="resultfile" value="C:\afterRuleAccos.res"/>
    <process expanded="true" height="601" width="784">
      <operator activated="true" class="read_aml" expanded="true" height="60" name="Read AML" width="90" x="45" y="120">
        <parameter key="attributes" value="C:\labor-negotiations.aml"/>
      </operator>
      <operator activated="true" class="replace_missing_values" expanded="true" height="94" name="Replace Missing Values" width="90" x="45" y="300">
        <parameter key="attributes" value="duration|wage-inc-1st|wage-inc-2nd|wage-inc-3rd|working-hours|standby-pay|shift-differential|statutory-holidays"/>
        <list key="columns"/>
      </operator>
      <operator activated="true" class="discretize_by_frequency" expanded="true" height="94" name="Discretize" width="90" x="179" y="300">
        <parameter key="range_name_type" value="short"/>
      </operator>
      <operator activated="true" class="nominal_to_binominal" expanded="true" height="94" name="Nominal to Binominal" width="90" x="313" y="300"/>
      <operator activated="true" class="fp_growth" expanded="true" height="76" name="FP-Growth" width="90" x="447" y="210"/>
      <operator activated="true" class="write_excel" expanded="true" height="60" name="Write Excel" width="90" x="514" y="30">
        <parameter key="excel_file" value="C:\result_afterRMVDiscretizeNom2BinomFPGrowth.xls"/>
      </operator>
      <connect from_op="Read AML" from_port="output" to_op="Replace Missing Values" to_port="example set input"/>
      <connect from_op="Replace Missing Values" from_port="example set output" to_op="Discretize" to_port="example set input"/>
      <connect from_op="Discretize" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
      <connect from_op="Nominal to Binominal" from_port="example set output" to_op="FP-Growth" to_port="example set"/>
      <connect from_op="FP-Growth" from_port="example set" to_op="Write Excel" to_port="input"/>
      <connect from_op="FP-Growth" from_port="frequent sets" to_port="result 2"/>
      <connect from_op="Write Excel" from_port="through" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
      <portSpacing port="sink_result 3" spacing="0"/>
    </process>
  </operator>
</process>
So close, and yet so far! If you had just ticked the "transform_binominal" tick box in the nominal_to_binominal operator all would have worked fine...like this.
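A minimal sketch of that suggestion, not the attachment from the original reply (which is not preserved here): the Nominal to Binominal operator from the posted process with the tick box switched on. The parameter key `transform_binominal` is taken from the name used in the reply; that it is spelled exactly like this in the process XML is an assumption.

```xml
<operator activated="true" class="nominal_to_binominal" expanded="true" height="94" name="Nominal to Binominal" width="90" x="313" y="300">
  <!-- Assumed key: corresponds to the "transform_binominal" tick box named in the reply -->
  <parameter key="transform_binominal" value="true"/>
</operator>
```

Swapping this element in for the self-closing Nominal to Binominal operator in the posted process should leave everything else, including the connections, unchanged.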
I'm stuck again on step 2.
I'm trying to use W-Apriori on my data:
1) I want to count only True values. For instance, I am not interested in whether someone did not buy something; I'm interested in what else someone bought, given that he/she bought something.
2) Even if I ignore the first requirement (assuming that, since RapidMiner counts the False values, this must be a correct way), there is a problem. I set M=0.4 and expected it to show all itemsets with a minimum support of 0.4, but it shows just some of them.
For the above example it's (I expected beer=True 7, bread=true 9, ...)
What am I doing wrong? What do I need to do to get what I want?
If you want to thin out the Premises or Conclusions you may find this post interesting.
http://rapid-i.com/rapidforum/index.php/topic,1887.msg7366.html#msg7366
Because it shows how you can convert Association Rules to an exampleSet, which of course means that all the regular thinning agents can be applied.
Just a thought.
I tried to understand what you have written, but it does not seem to be the answer or the way. I'm not sure, though.
My problem is that I'm trying to get a result from W-Apriori, but the result is not what I expect.
It's not a minor difference that could come from different implementations; it is totally different from what it should be.
FP-Growth gives: {bread}, {beer}, {jam}, {chips}, {chocolate}, {bread, jam}, {bread, beer}
I expect W-Apriori to give a result at least 50% similar to the above for such a small data set.
This makes me think that I'm doing something wrong, such as ticking some checkbox, which was the case in the problem above.
As you can guess, I have spent a week on this, but still could not solve it.
Any ideas? Or any working processes of W-Apriori?
Thanks in advance.