The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here
Can i do this with rapidminer?
hello,
im just beginning with data mining and i was wondering if i can use rapidminer for my needs.
i have the following data:
ProjectId Contributors Subjects DOP
1 {"joe","Karen"} {"Data mining","BI","html"} 11/16/2007
2 {""} {"modern literature"} 06/05/2000
3 {"michael","roger","jen"} {"medicine"} 09/09/1998
4 {"ken","karen"} {"web design", "html", "flash", "css"} 01/12/2004
5 {"steve", "andrew",ken} {"BI"} 02/06/2003
I want to calculate or predict the probability of each project given the following user selections:
Contributors:
joe
ken
andrew
michael
jen
Subjects
html
BI
modern literature
Can i do this with rapidminer? i've been trying out the examples but i cant really see how i can do this, if anyone can provide any info
i will be greatly appreciated.
thanks.
im just beginning with data mining and i was wondering if i can use rapidminer for my needs.
i have the following data:
ProjectId Contributors Subjects DOP
1 {"joe","Karen"} {"Data mining","BI","html"} 11/16/2007
2 {""} {"modern literature"} 06/05/2000
3 {"michael","roger","jen"} {"medicine"} 09/09/1998
4 {"ken","karen"} {"web design", "html", "flash", "css"} 01/12/2004
5 {"steve", "andrew",ken} {"BI"} 02/06/2003
I want to calculate or predict the probability of each project given the following user selections:
Contributors:
joe
ken
andrew
michael
jen
Subjects
html
BI
modern literature
Can i do this with rapidminer? i've been trying out the examples but i cant really see how i can do this, if anyone can provide any info
i will be greatly appreciated.
thanks.
0
Answers
I cannot figure out your predictiontask. I understand you this way: You want to predict the projectid given a user (or a set of users) and a subject (or a set of subjects).
In this case I am afraid you got to change the way your data is stored. E.g.: Instead of
1 {"joe","Karen"} {"Data mining","BI","html"}
you need something like
ProjectId Joe Karen Michael etc.. DataMining BI html medicine etc...
1 1 1 0 1 1 1 0
understand ?
The resulting matrix will allow you to learn models (or calculate probabilities approximately, e.g. per NaiveBayes), converting the selection of users and subjects to the same format will allow you create predictions.
But:
1. The mentioned conversion task cannot be done in RapidMiner (as far as I see)
2. If you donot have much more data with repetitions of users and subjects, the resulting probabilities will be very small, it is further possible that some learners will crash or calculate strange results
3. I cannot get rid of the feeling that this is a task for discrete mathematics, not for Data Mining = RapidMiner. If you want to calculate the exact probabilities (!) instead of approximations, you have to look for another way. Seems to be more like a job for a sheet of paper instead a tool...
hope this was helpful
Steffen
I understand how i am suppose to change my data i have no problems with that. I am looking for probability estimates rather than exact probabilities, my issue is that i am completely new to the field of data mining and rapidminer and i am not sure which algorithms, learners or classifiers i am suppose to use for this task, which is basically probability estimates for each of the project id's given the search parameters (subjects, contributors...etc).
Thank you again Steffen.
Feel free to come back and ask more questions
greetings
Steffen
actually, the transformation task can be done with RapidMiner. It should be possible with the Nominal2Binominal operator. The result will be the matrix from which the "prediction" models can be learned.
Cheers,
Ingo
The main problem is to split up the sets {...} automatically. This is what not cannot be done by RapidMiner (as far as I know )
1 {"joe","Karen"} {"Data mining","BI","html"} 11/16/2007
once the sets are split up, Nominal2Binominal can be applied.
greetings
Steffen
of course, you are absolutely right. For this step one would need to create a new operator.
Cheers,
Ingo