Regarding Text Classification
sudheendra
Member Posts: 22 Maven
Hi,
I have 1000 text documents. I want to classify these records on the basis of certain words in each document, i.e. if a document contains a particular number of words (word1, word2, ..., word10), I need to assign it to a group. I have already tried a clustering algorithm and got around 20 clusters, but I couldn't find any option there for this type of classification. Is there any way to classify the records on the basis of an input word list?
Thanks,
Sudheendra
Answers
Of course. But then nothing is learned at all. You could simply use an attribute construction operator, add the if clauses, and generate a new label attribute.
But this isn't text mining at all...
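As a rough illustration of such if clauses outside the tool, here is a minimal sketch in plain Python. It is not the attribute construction operator itself; the rule table `RULES`, the helpers `tokenize` and `label_document`, and the sample documents are all hypothetical names made up for this example.

```python
import re

# Hypothetical rule table: label -> words that must ALL appear in the document.
RULES = {
    "Type A": {"payment", "claimant"},
}

def tokenize(text):
    """Lowercase the text and return the set of alphabetic words it contains."""
    return set(re.findall(r"[a-z]+", text.lower()))

def label_document(text, rules=RULES, default="Other"):
    """Return the first label whose required words all occur in the text."""
    words = tokenize(text)
    for label, required in rules.items():
        if required <= words:  # subset test: every required word is present
            return label
    return default

# Made-up sample documents, for illustration only.
docs = [
    "The claimant received the payment on Monday.",
    "Payment is still pending.",
]
labels = [label_document(d) for d in docs]
print(labels)
```

Each rule is a fixed word list, so the label is constructed by hand rather than learned from the data.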
Greetings,
Sebastian
I have already worked with the attribute construction operator on numerical attributes. If the same operator can be used on text data, how would I assign the label "Type A" if the text contains "payment" and "claimant"?
Thanks,
Sudheendra
Wasn't it you whom I recommended to read a book about text mining? It will become clear to you then. The word vector representation with TF-IDF is just the very basics. Sorry, but without that knowledge, it doesn't make sense to continue.
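To show what the word vector means here, a minimal sketch in plain Python: each document becomes a set of numeric word attributes, and the if clause is then an ordinary condition on numbers. The weighting used, term frequency times log(N / document frequency), is only one common TF-IDF variant and is an assumption, as are the function names and the sample documents.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return its alphabetic words as a list."""
    return re.findall(r"[a-z]+", text.lower())

def tfidf_vectors(docs):
    """Return one {word: weight} dict per document, using tf * log(N / df)."""
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(token_lists)
    # Document frequency: the number of documents each word occurs in.
    df = Counter(word for tokens in token_lists for word in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return vectors

# Made-up sample documents, for illustration only.
docs = [
    "claimant payment received",
    "payment pending",
    "weather report",
]
vectors = tfidf_vectors(docs)

# Each word is now a numeric attribute, so the rule is a numeric condition.
labels = [
    "Type A" if v.get("payment", 0) > 0 and v.get("claimant", 0) > 0 else "Other"
    for v in vectors
]
print(labels)
```

With this variant, a word that occurs in every document gets weight zero, so a `> 0` test would miss it; a plain occurrence count avoids that if only presence matters.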
Greetings,
Sebastian