How to
mauricenew
Member Posts: 4 Learner I
I am running a naive Bayes classification, set up in the simplest way I could find on the internet. The results are... well... weird.
My training data looks like this: 2 columns, column1 = a combination of terms/words, column2 = the category of that combination
Example: column1 => "where to buy a mercedes", column2 => "mercedes"
Example: column1 => "whats the newest mercedes model", column2 => "mercedes"
So basically categorizing into "brands" of cars, let's say.
The dataset that should be classified only has 1 column with combinations of terms/words.
What's the best way to optimize or achieve that?
Answers
What needs to be done is to follow a text processing workflow as described before: use the Process Documents from Data operator, and make sure your string attribute is of type text (not the default nominal). Create a vector set with this operator using TF-IDF (or another scheme) and use the output to train your model.
Results can be further improved by adjusting the settings (for example, increasing or decreasing the pruning) or by adding steps to your tokenizing workflow.
Hope this helps!
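To make the "vector set" and "pruning" ideas a bit more concrete, here is a rough sketch in plain Python of what a TF-IDF step does conceptually. It is not RapidMiner's implementation: the function names, the `min_docs` pruning parameter, the sample texts, and the exact TF-IDF formula (term frequency times log of inverse document frequency) are all assumptions for illustration, and the operator's own weighting may differ.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split on non-letter characters (a simple stand-in for a tokenizer).
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def tfidf_vectors(texts, min_docs=1):
    # min_docs is a hypothetical pruning knob: terms in fewer documents are dropped.
    docs = [tokenize(t) for t in texts]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(term for term, n in df.items() if n >= min_docs)
    n_docs = len(docs)
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1
        # One weight per vocabulary term: relative frequency * log(N / df).
        vectors.append([
            (counts[term] / total) * math.log(n_docs / df[term])
            for term in vocab
        ])
    return vocab, vectors

# Made-up sample rows in the style of the question.
texts = [
    "where to buy a mercedes",
    "whats the newest mercedes model",
    "where to buy a bmw",
]
vocab, vectors = tfidf_vectors(texts, min_docs=2)
```

With `min_docs=2`, words that appear in only one row are pruned from the vocabulary, which is the kind of effect the pruning settings have on the size of the word vector.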
So far I do this:
Training data -> "Nominal to Text" -> "Process Documents from Data" (with a Tokenize operator inside) -> "Set Role" -> "Naive Bayes" -> "Apply Model"
ps: Thanks already for your input!
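For comparison, the same chain (tokenize, mark column2 as the label, train naive Bayes, apply the model to unlabelled text) can be sketched in plain Python. This is only an illustration under assumptions: it uses a multinomial naive Bayes on raw word counts with add-one smoothing, which is not necessarily what the Naive Bayes operator does internally, and the function names, the extra bmw rows, and the test phrases are made up.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase and split on non-letter characters.
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def train_naive_bayes(rows):
    """rows: iterable of (text, label) pairs. Returns the fitted model as a dict."""
    class_docs = Counter()             # number of training rows per label
    term_counts = defaultdict(Counter) # word counts per label
    vocab = set()
    for text, label in rows:
        class_docs[label] += 1
        tokens = tokenize(text)
        term_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_docs": class_docs, "term_counts": term_counts, "vocab": vocab}

def apply_model(model, text):
    n_docs = sum(model["class_docs"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n in model["class_docs"].items():
        counts = model["term_counts"][label]
        total = sum(counts.values())
        score = math.log(n / n_docs)  # log prior of the label
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # words never seen in training are skipped
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((counts[tok] + 1) / (total + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training rows: column1 = text, column2 = label.
training = [
    ("where to buy a mercedes", "mercedes"),
    ("whats the newest mercedes model", "mercedes"),
    ("where to buy a bmw", "bmw"),
    ("whats the newest bmw model", "bmw"),
]
model = train_naive_bayes(training)

# The data to classify has only the text column.
unlabelled = ["cheapest mercedes dealer", "bmw price list"]
predictions = [apply_model(model, t) for t in unlabelled]
```

Note that the unlabelled texts are scored against the vocabulary learned from the training data; words the model never saw contribute nothing, so results can look odd when the two datasets share few words.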