"custom word weight in vector"
I'm using the 'Process Documents from Files' operator and I want to use a word weight other than the standard ones (TF-IDF, binary term occurrences, etc.). What is the best way to achieve this? Is it only possible through the API?
Answers
Yes, if you want to use a different word vector creation algorithm, you have to implement it yourself. If you are only interested in the word vector itself and don't need to pass it to other operators, you can probably calculate it inside RapidMiner: after the WordList to Data operator, you can apply operators such as Aggregate to the word vector and to the processed documents.
If you need a proper WordVector object to pass to the next Process Documents operator, you will have to program it in Java, or in Groovy with the help of the Execute Script operator.
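To make the "implement it yourself" part more concrete, here is a minimal sketch in plain Java of what a custom weighting could look like. It deliberately does not use any RapidMiner classes: the class name CustomTermWeights, the sample documents, and the weighting formula (log-scaled term frequency times a smoothed inverse document frequency) are all assumptions chosen for illustration, so swap in whatever scheme you actually need.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CustomTermWeights {

    // Hypothetical sample input: two already tokenized documents.
    static final List<List<String>> SAMPLE_DOCS = Arrays.asList(
            Arrays.asList("text", "mining", "text"),
            Arrays.asList("data", "mining"));

    /** Counts how often each token occurs in one tokenized document. */
    static Map<String, Integer> termCounts(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    /**
     * Hypothetical custom weight (an assumption, not a RapidMiner built-in):
     * (1 + ln(termCount)) * (ln((1 + totalDocs) / (1 + docsWithTerm)) + 1).
     */
    static double weight(int termCount, int docsWithTerm, int totalDocs) {
        if (termCount == 0) {
            return 0.0;
        }
        double tf = 1.0 + Math.log(termCount);
        double idf = Math.log((1.0 + totalDocs) / (1.0 + docsWithTerm)) + 1.0;
        return tf * idf;
    }

    /** Builds one map (term -> weight) per document. */
    static List<Map<String, Double>> weightVectors(List<List<String>> docs) {
        List<Map<String, Integer>> perDoc = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();
        // First pass: per-document counts and document frequencies.
        for (List<String> doc : docs) {
            Map<String, Integer> counts = termCounts(doc);
            perDoc.add(counts);
            for (String term : counts.keySet()) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        // Second pass: turn the counts into weights.
        List<Map<String, Double>> vectors = new ArrayList<>();
        for (Map<String, Integer> counts : perDoc) {
            Map<String, Double> vector = new HashMap<>();
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                int df = docFreq.get(e.getKey());
                vector.put(e.getKey(), weight(e.getValue(), df, docs.size()));
            }
            vectors.add(vector);
        }
        return vectors;
    }

    public static void main(String[] args) {
        // Prints one term -> weight map per sample document.
        System.out.println(weightVectors(SAMPLE_DOCS));
    }
}
```

The same logic should carry over to a Groovy script more or less unchanged, but wiring the result into a real WordVector object depends on the extension's classes, which this sketch does not cover.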
Best, Marius
Did you also download the code of the text processing extension?
Then you can simply search OperatorsTextProcessing.xml for the operator name written with underscores, e.g. process_document_from_file, and you will find an entry with the class name. In this case it is com.rapidminer.operator.text.io.FileDocumentInputOperator.
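If you would rather do that lookup in code than by text search, something like the following sketch could work. It assumes that each entry in the file is an operator element with a key child and a class child; that layout, the class name OperatorClassLookup, and the inline SAMPLE_XML are assumptions for illustration, so check them against your copy of the file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class OperatorClassLookup {

    // Hypothetical excerpt; the element names are an assumed layout.
    static final String SAMPLE_XML =
            "<operators>"
            + "<operator>"
            + "<key>process_document_from_file</key>"
            + "<class>com.rapidminer.operator.text.io.FileDocumentInputOperator</class>"
            + "</operator>"
            + "</operators>";

    /** Returns the trimmed text of the first descendant with this tag, or null. */
    static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    /** Finds the class name listed for an operator key, or null if there is none. */
    static String findClass(String xml, String operatorKey) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList operators = doc.getElementsByTagName("operator");
            for (int i = 0; i < operators.getLength(); i++) {
                Element op = (Element) operators.item(i);
                if (operatorKey.equals(childText(op, "key"))) {
                    return childText(op, "class");
                }
            }
            return null;
        } catch (Exception e) {
            // Parsing problems are rethrown unchecked to keep the sketch short.
            throw new IllegalStateException("Could not parse operator XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(findClass(SAMPLE_XML, "process_document_from_file"));
    }
}
```

To use it on the real file, read the contents of OperatorsTextProcessing.xml into a string and pass that in place of SAMPLE_XML.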
Best, Marius
You have to open the "Ant" view in Eclipse, drag the build.xml from the extension into the view, and double-click the install target.
This will create a jar file for the text processing extension and copy it into the libs folder of RapidMiner. If that file is present, it will have priority over the pre-installed plugin versions.
Before building the extension, you have to drag the build.xml of RapidMiner_Unuk into the same view and double-click "createJar".