[SOLVED] creating very many models using a single data set?
Hi,
I'm a new user of RapidMiner and was wondering if someone in the community knows a good way to do what I am trying to do.
I am working with a very large data set (millions of examples) in which each example (row) has 1 id attribute, 1 text attribute, 52 numerical attributes, and 1 label attribute. The text attribute takes about 500 unique values across the whole data set. What I would like to do is create a decision tree model (and store it) for the data corresponding to each unique value. That is, for each unique value of the text attribute, I want to select all the examples with that value and then train a decision tree model using the 52 numerical attributes and the label attribute. I could do this manually with the Filter Examples, Decision Tree, and repository Store operators for each unique value, but I would have to repeat it about 500 times. Is there an efficient way to implement this? I could try scripting, but I was wondering whether I could use the built-in operators. Is the Loop operator the answer?
Thanks in advance.
Answers
The Loop Values operator is the one you are looking for.
The tutorial process shows a very similar example to your problem, using the value of the loop_value macro to filter the examples.
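If you do end up going the scripting route instead, the same pattern (loop over each unique value, keep only the matching examples, train, store) might look roughly like the Python sketch below. It is only an illustration under assumptions: the column names `text` and `label`, the file naming, and the `train_majority_model` stand-in are hypothetical and have nothing to do with RapidMiner's own operators. The stand-in just predicts the most common label; a real decision tree learner would be passed in through the `train` argument.

```python
import pickle
import tempfile
from collections import Counter, defaultdict
from pathlib import Path


def train_majority_model(rows, label_key):
    """Hypothetical stand-in for a decision tree learner.

    It only records the most common label in the subset.
    """
    counts = Counter(row[label_key] for row in rows)
    return {"prediction": counts.most_common(1)[0][0], "n_examples": len(rows)}


def train_model_per_value(rows, group_key, label_key, out_dir,
                          train=train_majority_model):
    """Train and store one model per unique value of group_key."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group the examples in a single pass, so the data set is not
    # rescanned once per unique value.
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    stored = {}
    for index, value in enumerate(sorted(groups)):
        model = train(groups[value], label_key)
        # Index-based file names (an assumption) avoid trouble with
        # unusual characters in the text values.
        path = out_dir / f"model_{index:04d}.pkl"
        with path.open("wb") as fh:
            pickle.dump({"value": value, "model": model}, fh)
        stored[value] = path
    return stored


if __name__ == "__main__":
    # Tiny made-up data set, only to show the call.
    examples = [
        {"id": 1, "text": "A", "x1": 0.2, "label": "yes"},
        {"id": 2, "text": "A", "x1": 0.4, "label": "yes"},
        {"id": 3, "text": "A", "x1": 0.9, "label": "no"},
        {"id": 4, "text": "B", "x1": 0.1, "label": "no"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        for value, path in train_model_per_value(examples, "text", "label", tmp).items():
            with path.open("rb") as fh:
                print(value, pickle.load(fh)["model"])
```

Grouping once up front matters with millions of rows: filtering the full data set separately for each of about 500 values would scan it about 500 times.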
I hope this answers your question. If not, don't hesitate to ask.
Best regards,
David