The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here
Ensemble method for multiple data sets
Hi,
I am currently working in RapidMiner 4.6.
I have extracted features from my main data set into 2 sets of features. One is a word vector (53 features) and the other is a set with 10 different features.
I have 2 different classifiers that I would like to combine in an ensemble method:
Logistic Regression on the word vector
W-J48graft on a different set of features
From my understanding I can only use operators such as stacking and voting if I give it one and the same data set as input.
How would I go about combining predictions from both my data sets using an ensemble method?
Thank you in advance!
I am currently working in RapidMiner 4.6.
I have extracted features from my main data set into 2 sets of features. One is a word vector (53 features) and the other is a set with 10 different features.
I have 2 different classifiers that I would like to combine in an ensemble method:
Logistic Regression on the word vector
W-J48graft on a different set of features
From my understanding I can only use operators such as stacking and voting if I give it one and the same data set as input.
How would I go about combining predictions from both my data sets using an ensemble method?
Thank you in advance!
0
Answers
ensemble methods require that your features (in your case the word vector and the 10-feature set) are part of every single instance in your data set. Therefor you could join them together to one example set and inside the vote or stacking operator use one attribute filter for every learner to hide the unwanted features from your learners. Please note that voting operator performs majority voting for classification tasks, therefor you might need more than two learners inside...
BUT, what is really not possible is to combine a regression learner with a classification approach (unless you have both a categorical label and a corresponding numerical value for each example, which is a rather bizarre setup)
Here is an example for a simple stacking approach with different input for all base learners:
Cheers,
Helge