Creating ExcelExampleSet
Hi, I have successfully managed to read an Excel sheet containing 30 rows of unique data (with just one regular attribute).
I am then trying to pipe this into the NaiveBayes (NB) operator, and I get the following error on execution:
Mar 11, 2009 12:53:25 PM: [Fatal] UserError occured in 1st application of NaiveBayes (NaiveBayes)
Mar 11, 2009 12:53:25 PM: [Fatal] Process failed: Input example set has no attributes
Root[1] (Process)
+- ExcelExampleSource[1] (ExcelExampleSource)
here ==> +- NaiveBayes[1] (NaiveBayes)
Any ideas why this is the case?
I want to classify the 30 pieces of text (i.e. each row in the Excel sheet) into associated groups.
Thanks.
Answers
Several remarks:
1. You have to use the Text Plugin to transform your texts into word vectors with the StringTextInput operator.
2. You do not seem to have a label, so clustering seems more appropriate than NaiveBayes, which is a classification method.
Cheers,
Ingo
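To illustrate what "transforming texts into word vectors" means, here is a minimal sketch in plain Python, outside RapidMiner. It is not how the Text Plugin or StringTextInput works internally; the function names (`tokenize`, `build_vocabulary`, `to_word_vector`) and the sample texts are hypothetical. The point is only that each distinct word becomes a numeric attribute, which is what a learner needs as input.

```python
import string
from collections import Counter

def tokenize(text):
    # Split on whitespace, strip surrounding punctuation, lowercase.
    words = (w.strip(string.punctuation).lower() for w in text.split())
    return [w for w in words if w]

def build_vocabulary(texts):
    # Sorted list of every distinct word across all texts.
    return sorted({word for text in texts for word in tokenize(text)})

def to_word_vector(text, vocabulary):
    # One count per vocabulary word: each word acts as a numeric attribute.
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]

# Hypothetical sample rows standing in for the texts in the Excel sheet.
texts = ["The cat sat on the mat.", "The dog chased the cat."]
vocabulary = build_vocabulary(texts)
vectors = [to_word_vector(t, vocabulary) for t in texts]

print(vocabulary)  # ['cat', 'chased', 'dog', 'mat', 'on', 'sat', 'the']
print(vectors)     # [[1, 0, 0, 1, 1, 1, 2], [1, 1, 1, 0, 0, 0, 2]]
```

A single text column has no numeric attributes until a step like this has been applied.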
So which clustering algorithm do you recommend for standard written English text?
There is no standard algorithm; just try several and check which one delivers the results you like best. If performance is an issue, I would start with KMeans. If you want something hierarchical and you do not have too many examples, I would try agglomerative clustering.
Cheers,
Ingo
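For readers who want to see what k-means does, here is a minimal sketch in plain Python. It is not RapidMiner's KMeans operator. It assumes the texts have already been turned into word-count vectors, and it uses the first k points as initial centroids purely to keep the example deterministic (real implementations usually initialize randomly). The vocabulary and counts are made up.

```python
def squared_distance(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    # Component-wise mean of a non-empty list of vectors.
    n = len(points)
    return [sum(column) / n for column in zip(*points)]

def kmeans(points, k, iterations=20):
    # Assumption: the first k points are the initial centroids.
    centroids = [list(p) for p in points[:k]]
    assignment = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            new_centroids.append(mean(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return assignment, centroids

# Hypothetical word counts over the vocabulary ["goal", "match", "election", "vote"].
points = [[2, 1, 0, 0], [0, 0, 2, 1], [1, 2, 0, 0], [0, 0, 1, 2]]
assignment, centroids = kmeans(points, k=2)
print(assignment)  # [0, 1, 0, 1]
```

No label column is involved: the groups come only from which vectors lie close together.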
Going back to your original remarks, however:
1. you have to use the Text Plugin in order to transform your texts into word vectors with the StringTextInput operator
my_response: Will I need to do this for the clustering algorithms too, or just for the classification algorithms?
2. you do not seem to have a label --> clustering seems more appropriate than NaiveBayes which is a classification method
my_response: What are these labels? I have been through the documentation but cannot work out why labels are required. Also, within my Excel sheet, do I need another column for these labels? What are they used for? Ideally, I would like to use supervised learning to produce a model.
Thanks Ingo.
Labels are the classes you provide during the training phase. The different values of the label column will then be predicted by a classification model for new and unseen data (which no longer needs a given label). For supervised learning, you will always need a label (target, class... you name it). If you are not able to provide a label, then you usually perform an unsupervised learning method instead (like clustering).
Cheers,
Ingo
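The role of the label can be sketched with a tiny Naive Bayes classifier in plain Python. This is an illustration only, not RapidMiner's NaiveBayes operator; the function names, training texts, and label values are all made up. The label is the extra column given alongside each training text, and the trained model predicts that column for new text that has no label.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(texts, labels):
    # Count how often each label occurs and how often each word occurs per label.
    label_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(texts, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary

def predict(model, text):
    label_counts, word_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        # Log prior plus log likelihoods with add-one (Laplace) smoothing.
        score = math.log(count / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocabulary))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: one text column plus one label column.
training_texts = [
    "cheap pills buy now",
    "meeting agenda for monday",
    "buy cheap watches",
    "monday project meeting notes",
]
training_labels = ["spam", "work", "spam", "work"]

model = train_naive_bayes(training_texts, training_labels)
prediction = predict(model, "cheap watches now")
print(prediction)  # spam
```

Without `training_labels` there is nothing for the model to learn to predict, which is why an unlabeled sheet points toward clustering instead.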