"Importing textdata from csv files"
Gary_Hearne
Member, University Professor, Posts: 6
I teach a data mining course to business management students (with little or no programming experience) using a combination of R and RapidMiner. I try to duplicate the examples from each package in the other so that students can appreciate the differences in usability, available algorithms and results. For obvious reasons I use the graphical process approach in RapidMiner, rather than teaching XML (which I don't know anyway).
I have two csv data sets which I use in R, neither of which I have been able to import in the appropriate format to use in RapidMiner, despite playing around with what look like sensible operator options.
sms_spam.csv has two columns, the first identifies the content of the second as "spam" or "ham", while the second is a text message. (http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/). I want to import this so that I can use Naive Bayes to build a classifier for messages.
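For readers who want to check the file layout before importing it, here is a minimal Python sketch of reading a two-column label/text file. It is only an illustration: the inline sample data, the header names `type` and `text`, and the helper `read_labelled_messages` are all assumptions and are not taken from the real sms_spam.csv. The point it shows is that message text can itself contain commas, so the text column has to be quoted and read with a quote-aware parser.

```python
import csv
import io

# Hypothetical sample standing in for sms_spam.csv. The header names
# "type" and "text" are assumptions, not taken from the real file.
SAMPLE = '''type,text
ham,"Ok, see you at 5"
spam,"WINNER!! Claim your prize now, call 0800"
ham,Running late
'''

def read_labelled_messages(handle):
    """Return (label, message) pairs, rejecting rows with unknown labels."""
    rows = []
    for record in csv.DictReader(handle):
        label = record["type"].strip().lower()
        if label not in {"spam", "ham"}:
            raise ValueError(f"unexpected label: {label!r}")
        rows.append((label, record["text"]))
    return rows

# With a real file, pass open("sms_spam.csv", newline="") instead.
messages = read_labelled_messages(io.StringIO(SAMPLE))
```

If a row raises the label error, that is usually a sign that an unquoted comma or quote inside a message has shifted the columns.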
groceries.csv is an example set that comes with the arules library in R. It has multiple (unlabelled) columns, with each row representing a transaction and using only as many columns as there are items in that transaction, so the rows have different lengths and the file is not a regular table. I want to use association rules and/or fp-growth on this.
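One possible way to make such a ragged file regular is to reshape it into a table with one column per distinct item and one true/false row per transaction. The Python sketch below illustrates that idea under stated assumptions: the inline sample baskets and the helper names `read_transactions` and `to_binary_table` are hypothetical, and whether a one-column-per-item layout is what the downstream operator needs is also an assumption.

```python
import csv
import io

# Hypothetical sample in the same ragged shape as groceries.csv:
# one basket per line, a different number of items on each line.
SAMPLE = '''citrus fruit,semi-finished bread,margarine
tropical fruit,yogurt,coffee
whole milk
yogurt,whole milk
'''

def read_transactions(handle):
    """Read one basket per line; rows may have different lengths."""
    return [[item.strip() for item in row if item.strip()]
            for row in csv.reader(handle) if row]

def to_binary_table(transactions):
    """Return (item names, rows), with one True/False cell per item per basket."""
    items = sorted({item for basket in transactions for item in basket})
    rows = [[item in basket for item in items] for basket in transactions]
    return items, rows

# With a real file, pass open("groceries.csv", newline="") instead.
transactions = read_transactions(io.StringIO(SAMPLE))
items, rows = to_binary_table(transactions)
```

Writing `items` as the header followed by `rows` with `csv.writer` would give a rectangular file with a fixed set of labelled columns.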
Any suggestions on how I can get either or both of these data sets into RapidMiner in a usable form would be greatly appreciated.
Answers
I will give the other a try later.
Do you want me to put these on the community samples repo for your students? It may be easier for you and your students.
Scott
Haha, no, only the community manager can change the user ID. I can change this to whatever you want. Let me know what your preferred 'handle' is.
That code is XML - the backbone of RapidMiner and the way people swap processes. You can learn how to do it here: https://community.rapidminer.com/discussion/37047. It's just a matter of 'copy-and-paste'. I'm attaching the same process as an .rmp file to this message if that's easier for you.
The process looks like this when you load it into RapidMiner:
I also made a folder for you on the Community Samples repo so any student can just load the process and run it.
Scott