Building a sustainability dictionary
Hello,
We are beginners in the world of text mining and analysis.
We would like to create a sustainability dictionary. Our data basis consists of sustainability reports and 10-K reports from firms.
We are supposed to use 70% of the data for the ridge regression, so that we can find the parameters for our dictionary. In the next step, for classification, we should use the other 30% of the data to train the model and classify reports. We should also show exactly how the ridge regression result is applied, i.e. "when one word in a sentence is a sustainability word from our analysis, the sentence is a sustainability sentence". How can we model this in RapidMiner? Are there any tips, models, or templates? We have the data as sentences in Excel.
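To make the idea more concrete, here is a rough sketch of what we have in mind, written in plain Python rather than RapidMiner. Everything in it is an assumption for illustration only: the example sentences and labels are made up, the function names are our own, and the values of `alpha` and `threshold` are arbitrary. It fits a ridge regression on word-presence features, keeps the words with a clearly positive coefficient as the dictionary, and then applies the "one dictionary word makes a sustainability sentence" rule.

```python
import re

def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())

def fit_ridge(sentences, labels, alpha=1.0, lr=0.1, epochs=1000):
    """Fit ridge regression on binary word-presence features.

    Uses plain gradient descent; the bias term is not penalized.
    Returns a dict mapping each training word to its coefficient.
    """
    rows = [set(tokenize(s)) for s in sentences]
    vocab = sorted(set().union(*rows))
    weights = {w: 0.0 for w in vocab}
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad = {w: 0.0 for w in vocab}
        grad_bias = 0.0
        for row, y in zip(rows, labels):
            err = bias + sum(weights[w] for w in row) - y
            for w in row:
                grad[w] += err
            grad_bias += err
        for w in vocab:
            # The alpha term is the L2 (ridge) penalty shrinking weights to zero.
            weights[w] -= lr * (grad[w] + alpha * weights[w]) / n
        bias -= lr * grad_bias / n
    return weights

def build_dictionary(weights, threshold):
    """Keep words whose ridge coefficient exceeds the threshold."""
    return {w for w, coef in weights.items() if coef > threshold}

def is_sustainability_sentence(sentence, dictionary):
    """Rule from the post: one dictionary word is enough."""
    return any(w in dictionary for w in tokenize(sentence))

# Made-up training portion (stands in for the 70% split); 1 = sustainability.
train_sentences = [
    "We reduced carbon emissions this year",
    "Renewable energy lowers emissions",
    "Revenue increased in the fourth quarter",
    "The board approved the quarterly dividend",
    "Recycling cut waste while emissions fell",
    "Net revenue and operating income declined",
]
train_labels = [1, 1, 0, 0, 1, 0]

# Made-up held-out portion (stands in for the 30% split).
test_sentences = ["Emissions targets were met", "Dividend payments rose"]

weights = fit_ridge(train_sentences, train_labels)
dictionary = build_dictionary(weights, threshold=0.1)  # threshold is arbitrary
for s in test_sentences:
    print(s, "->", is_sustainability_sentence(s, dictionary))
```

With real reports, the labels would have to come from somewhere (for example, whether a sentence was taken from a sustainability report or a 10-K), and stop words would probably need to be removed before fitting. What we are looking for is the equivalent of these steps as RapidMiner operators.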
Thank you very much for the further information!
0
Answers
-Noel