removing columns from a dataset/analysis
Hi all,
I'm new to RapidMiner so this is a basic question I'd appreciate your help with.
I'm running a decision tree (ID3Numerical).
How do I:
1) remove columns (variables) from a dataset before processing
2) alternatively, list the subset of the dataset's variables that I want to put through the tree?
In a related issue, how do I store and view a dataset that's been read in without running the whole ETL process again and pausing it after import?!
thanks in advance for all your help.
Richie
Answers
Welcome to the whacky world of RM! Here's an answer to your questions.... Some say that reading the manuals and working through the tutorial and examples helps, others that it takes all the fun out of guessing.
How do I now view the datasets created so I can see the results of my operators? For example, in the example you give, how do I:
- view the first dataset before filtering
- view the last dataset after the FeaturenameFilter
I ask because the input and data preparation may be computationally expensive and I don't want to have to rerun them again.
Where is the reference in the documentation that you mention by the way? I always search the documentation first but found nothing on this basic ETL stuff.
Thanks,
R
That being said I've griped before about the documentation, but believe you me RM is much better and easier to use than the documentation. Being a halfwit myself perhaps I should offer up an idiot's guide to the data underworld...
Good weekend
I ran Help\Rapidminer Tutorial but it has no search feature, and when I closed the tutorial dialog my whole process tree was lost. So I gave up on that pretty quickly! I'll check the PDF you mention.
I think once these folks sort out their documentation they'll have a really excellent product that people will use. For an industry user wanting to get up and running quickly it's pretty lacking alright.
BTW, I have used breakpoints. Once you move on through a breakpoint, however, there's no way to go back and view the datasets. I'm coming from a SAS background where this is easy to do. Also, it's important to be able to start a process at any stage, since it's pointless to have to keep reading in the data before running any analytics.
Any pointers on where I can find out about that?
Thanks and have a good weekend yourself.
R
For serious users I'd really recommend a course up at RM; in two days I learnt more than in the preceding two months of grappling with the guesswork. Besides which Ralf is a very approachable tutor and genial lunch host 8)
Thanks that's exactly what we wanted. We're not short of storage here but can't afford the time to repeatedly run through a long ETL process.
Thanks,
Richie
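For anyone reading this thread from outside RapidMiner, the two ideas discussed above (removing or selecting columns before modelling, and storing the imported dataset so the ETL step is not rerun) can be sketched in plain Python. This is only a tool-agnostic illustration, not the RapidMiner process from the thread: the function names, the column names and the pickle cache file are all assumptions made up for the example.

```python
import os
import pickle
import tempfile


def drop_columns(rows, unwanted):
    """Return copies of the rows without the named columns."""
    unwanted = set(unwanted)
    return [{k: v for k, v in row.items() if k not in unwanted} for row in rows]


def keep_columns(rows, wanted):
    """Return copies of the rows holding only the named columns."""
    return [{k: row[k] for k in wanted} for row in rows]


def load_or_build(cache_path, build):
    """Load the dataset from cache_path if present; otherwise call build() once and store it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    data = build()
    with open(cache_path, "wb") as fh:
        pickle.dump(data, fh)
    return data


def expensive_etl():
    # Stand-in for a slow import step; the columns here are hypothetical.
    return [
        {"id": 1, "age": 34, "income": 52000, "label": "yes"},
        {"id": 2, "age": 51, "income": 38000, "label": "no"},
    ]


if __name__ == "__main__":
    cache = os.path.join(tempfile.mkdtemp(), "dataset.pkl")
    data = load_or_build(cache, expensive_etl)   # runs the ETL and stores the result
    data = load_or_build(cache, expensive_etl)   # second call reads the stored copy
    print(drop_columns(data, ["id"]))            # 1) remove columns
    print(keep_columns(data, ["age", "label"]))  # 2) list the subset to keep
```

The stored copy can be reloaded and inspected at any time, so the analysis step can start from it instead of from the raw import.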