Predefined Topic lists
Hello everyone,
I have a question.
I have predefined topic lists, each containing some words, and I want to use them to assign the most suitable topic to each Arabic document.
Would cosine similarity be a good solution for this problem, or would latent Dirichlet allocation (LDA) be better?
Could you please guide me on how to do this in RapidMiner?
Thanks.
Answers
@mschmitz is the resident expert on LDA (well, at least he wrote the operator), but I am pretty sure it is not going to help you here, because I don't think you can feed the LDA algorithm a predefined set of topics.
So I am not actually sure what the best way to accomplish this would be. I guess you could put together a wordlist with the words for each predefined cluster and then try to build a polynominal classification model, but that might not give you the output you really want. @mschmitz, do you have another approach you would recommend here?
P.S. I don't think the language is really an issue; it has more to do with the structure of the problem.
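For what it's worth, here is a minimal sketch of the cosine similarity idea from the question, combined with the per-topic wordlist idea above. It is plain Python rather than a RapidMiner process, and everything in it is an assumption made for illustration: the `TOPICS` dictionary, its example words, and the `best_topic` helper are hypothetical, tokenization is a plain whitespace split, and no Arabic-specific normalization or stemming is applied (so a word with a prefix such as the definite article would not match its bare form).

```python
import math
from collections import Counter

# Hypothetical predefined topic lists; the words are illustrative only.
TOPICS = {
    "sports": ["كرة", "مباراة", "فريق", "لاعب"],
    "economy": ["اقتصاد", "بنك", "سوق", "أسعار"],
}


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def best_topic(text, topics=TOPICS, threshold=0.0):
    """Return (topic or None, scores) for a whitespace-tokenized document."""
    doc_vec = Counter(text.split())
    scores = {name: cosine(doc_vec, Counter(words)) for name, words in topics.items()}
    name, score = max(scores.items(), key=lambda kv: kv[1])
    # Only assign a topic if its score is strictly above the threshold.
    return (name if score > threshold else None), scores


# The document shares three words with the "sports" list and none with "economy".
label, scores = best_topic("فريق كرة يفوز في مباراة")
print(label, scores)
```

With raw counts like this, long documents and short wordlists tend to give low scores even for good matches, so a TF-IDF weighting and a tuned `threshold` would probably be worth trying. Unlike LDA, this keeps the topics fixed to the predefined lists.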
Lindon Ventures
Data Science Consulting from Certified RapidMiner Experts
Dortmund, Germany