problems in tokenizing imported Excel text documents via
I am new to RapidMiner and have been watching videos and reading the postings, but I still don't have the foggiest idea what is going wrong, hence this post.
I want to import text from Excel (with each row = a short document), and then tokenize to generate a wordlist. It's simple, but I don't know why I can't get any wordlist by executing the following steps.
1. Read Excel (one column with 317 rows, and each row stores a short document)
2. Data to Documents
3. Process Documents
3a. Tokenize within Process Documents
The results:
For the ExampleSet (Process Documents), two columns are shown: the first column (Row No.) runs from 1 to 317, but the second column (text) is empty.
The Wordlist (Process Documents) is totally empty.
When I execute only steps 1 and 2, it works and I can see the outputs, but I get nothing from steps 3 and 3a.
I have been trying for the last two days, so I am a bit drained and demoralized, as I believe it's a simple problem.
I would appreciate it if someone could point out my mistake.
Thank you,
George
Answers
Did you convert the attribute's type to text before using the "Data to Documents" operator? Otherwise the attribute is just added as metadata to the document and not set as the document's content.
I just replied to a post dealing with such a problem: http://rapid-i.com/rapidforum/index.php/topic,3457.0.html
This might be the solution to a simple problem (assuming you forgot the conversion). If you already did the type conversion, you could post your process XML here...
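To make the effect concrete, here is a rough toy model in plain Python. It is not RapidMiner code and does not use any RapidMiner API: the `Document` class, `data_to_documents`, `word_list` and the sample rows are all hypothetical names made up for illustration. It only mimics the behaviour described above, where a column that is not of type text ends up as metadata, so the tokenizer has nothing to work on.

```python
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Document:
    """Toy stand-in for a document: tokenizable content plus metadata."""
    content: str = ""
    metadata: dict = field(default_factory=dict)


def data_to_documents(rows, column, column_type):
    """Build one Document per row (hypothetical helper, not a RapidMiner call).

    Only a column typed "text" becomes the document content; any other
    type (for example "nominal") is stored as metadata only.
    """
    docs = []
    for row in rows:
        if column_type == "text":
            docs.append(Document(content=row[column]))
        else:
            docs.append(Document(metadata={column: row[column]}))
    return docs


def word_list(docs):
    """Split each document's content on non-letters and count the tokens."""
    counts = Counter()
    for doc in docs:
        tokens = re.split(r"[^A-Za-z]+", doc.content.lower())
        counts.update(t for t in tokens if t)
    return counts


# Made-up sample rows standing in for the Excel column.
rows = [{"text": "The cat sat."}, {"text": "The dog ran."}]

# Column left as nominal: content stays empty, so the word list is empty.
print(word_list(data_to_documents(rows, "text", "nominal")))

# Column converted to text first: the tokens show up in the word list.
print(word_list(data_to_documents(rows, "text", "text")))
```

In RapidMiner itself the conversion would normally be a separate step between Read Excel and Data to Documents, for example with the Nominal to Text operator, assuming the column was imported as nominal.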
Regards
Matthias