How can I get a melt-like function in RapidMiner?
I am a beginner in data mining.
I have a list of 10000 rows and about 200 columns, like this:
look,1,2,3,4,5,6,7,8
book,4,5,6,7,8,102,104,107
look,6,7,8,9
hook,100,101,102
cook,7,8,9
build,102,103,104,107
hook,103,104,105
...
First, I need to make a unique list of words, merging the values of repeated words:
look,1,2,3,4,5,6,7,8,9
book,4,5,6,7,8,102,104,107
hook,100,101,102,103,104,105
cook,7,8,9
build,102,103,104,107
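Outside RapidMiner, this merging step could be sketched in plain Python as below. It is only an illustration under assumptions: the function name `merge_rows` is hypothetical, and the rows are assumed to be comma-separated text with the word first and integer values after it.

```python
from collections import defaultdict

def merge_rows(lines):
    """Union the values of all rows that start with the same word."""
    merged = defaultdict(set)
    for line in lines:
        word, *values = line.strip().split(",")
        # Skip empty cells, which appear when rows have different lengths.
        merged[word].update(int(v) for v in values if v)
    # Sort each value set so the output is stable and readable.
    return {word: sorted(vals) for word, vals in merged.items()}

rows = [
    "look,1,2,3,4,5,6,7,8",
    "book,4,5,6,7,8,102,104,107",
    "look,6,7,8,9",
    "hook,100,101,102",
    "cook,7,8,9",
    "build,102,103,104,107",
    "hook,103,104,105",
]

for word, values in merge_rows(rows).items():
    print(word + "," + ",".join(map(str, values)))
```

Words keep the order in which they first appear, so the sample rows come out in the same order as the unique list above.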
Now I need to find lines that share at least 3 (or n) values and generate a new list:
look,1,2,3,4,5,6,7,8,9
book,4,5,6,7,8,102,104,107
cook,7,8,9
*************
book,4,5,6,7,8,102,104,107
build,102,103,104,107
*************
hook,100,101,102,103,104,105
build,102,103,104,107
*************
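As a rough sketch of this second step (not a RapidMiner process), every pair of words can be compared and kept when the two share at least n values. The function name `pairs_sharing` and the dictionary layout are assumptions made for the example.

```python
from itertools import combinations

def pairs_sharing(merged, n=3):
    """Return (word_a, word_b, common_values) for pairs sharing >= n values."""
    result = []
    for (a, va), (b, vb) in combinations(merged.items(), 2):
        common = sorted(set(va) & set(vb))
        if len(common) >= n:
            result.append((a, b, common))
    return result

# The unique list from above, written as word -> values.
merged = {
    "look": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "book": [4, 5, 6, 7, 8, 102, 104, 107],
    "hook": [100, 101, 102, 103, 104, 105],
    "cook": [7, 8, 9],
    "build": [102, 103, 104, 107],
}

for a, b, common in pairs_sharing(merged, n=3):
    print(a, b, common)
```

With the sample data this reports the pairs look/book, look/cook, book/build and hook/build. Comparing all pairs grows quadratically with the number of words, so it may be slow on very large lists.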
Please help me in any way you can.
Thank you.
Answers
I searched the internet and someone said Python's melt can help me, but I don't know how to do it in RapidMiner!
Hi,
from the pandas doc for melt:
I guess it maps to something along the lines of De-Pivot.
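For comparison, here is a small sketch of what pandas `melt` does to a table shaped like the one in the question. The column names `word`, `v1`, `v2`, `v3`, `slot` and `value` are made up for the example, and the data is a shortened version of the sample rows.

```python
import pandas as pd

# Wide table: one row per word, one column per value slot (missing slots are None).
wide = pd.DataFrame(
    [
        ["look", 1, 2, 3],
        ["hook", 100, 101, None],
    ],
    columns=["word", "v1", "v2", "v3"],
)

# melt turns the value columns into rows: one (word, slot, value) row per cell.
long = pd.melt(wide, id_vars=["word"], var_name="slot", value_name="value")

# Drop the empty cells left over from rows shorter than the widest row.
long = long.dropna(subset=["value"]).astype({"value": int})

print(long.sort_values(["word", "value"]).to_string(index=False))
```

The result has one row per word and value, which is the wide-to-long reshaping that a de-pivot step is meant to produce.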
Best,
Martin
Dortmund, Germany
So that's a fun puzzle. I would begin like this (you will need @land's Statistics Extension to run this process):
That said, I am certain there is a cleverer way to do this!
Scott
I updated RapidMiner and installed the Statistics Extension:
but I get an error:
and I cannot find the missing extension:
Would you please help again?
Thank you
Hmm, I'm not sure the extension in the marketplace is up to date (Sebastian?). I would go directly to the website: https://oldworldcomputing.com/products/statistics-extension-for-rapidminer
Scott
This is my CSV file.
Would you please test with it?
So the process I posted was not intended to be a finished product - just something to point you in the right direction. If you take that CSV file and put it in my process, you get the attached result.
Scott
Oh, thank you sir, you are the master.
But these were sample data for testing.
My real data has about 100000 different values. With this method, will I have about 100000 columns?
Is it possible to convert the list to my wanted list?
look,1,2,3,4,5,6,7,8,9
book,4,5,6,7,8,102,104,107
cook,7,8,9
*************
book,4,5,6,7,8,102,104,107
build,102,103,104,107
*************
hook,100,101,102,103,104,105
build,102,103,104,107
*************
I mean, can these columns be converted to rows with the header values?
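One way to avoid a table with one column per distinct value, sketched here in Python rather than as a RapidMiner process, is an inverted index: map each value to the words that contain it, then count how often each pair of words appears together. The function name `pairs_via_index` is hypothetical, and the input is assumed to be the unique word list as a dictionary.

```python
from collections import Counter, defaultdict
from itertools import combinations

def pairs_via_index(merged, n=3):
    """Count shared values per word pair without building a wide table."""
    # Inverted index: value -> words that contain it.
    index = defaultdict(list)
    for word, values in merged.items():
        for v in set(values):
            index[v].append(word)
    # Each value shared by two words adds 1 to that pair's count.
    counts = Counter()
    for words in index.values():
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return {pair: c for pair, c in counts.items() if c >= n}

merged = {
    "look": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "book": [4, 5, 6, 7, 8, 102, 104, 107],
    "hook": [100, 101, 102, 103, 104, 105],
    "cook": [7, 8, 9],
    "build": [102, 103, 104, 107],
}

for (a, b), shared in sorted(pairs_via_index(merged).items()):
    print(a, b, shared)
```

Memory use here depends on the number of (word, value) entries rather than on the number of distinct values, so 100000 different values do not turn into 100000 columns.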
Your flattery is noted and not deserved. There are many here who are far more masterful than I. That said, I think at this point I would recommend getting more knowledgeable with RapidMiner Studio before moving forward with large data sets like the one you describe - actions such as renaming attributes and so forth are the beginning of a long journey. I would highly recommend starting with the "Getting Started with RapidMiner" YouTube playlist. The whole beauty of RapidMiner is that you can learn to create your own processes and be a master yourself!
Scott
Hi all,
I just published the most recent version of our extensions on the marketplace. So if that was the problem, it should be gone now. At least I can use it with the most recent version of RM.
Greetings,
Sebastian