[SOLVED] TextAnnotator - Create new label - Finish tagging
CharlieFirpo
Member Posts: 48 Contributor II
Dear all!
I have a sample training exercise: retrieve a previously created ExampleSet and manually annotate persons, locations, and organizations in the text of the source document. The ExampleSet has a "word" attribute that contains the words of the source document (produced with SentenceTokenizer and WordTokenizer).
Now I use "Generate Empty Attribute" to create an attribute that stores the labels (PER, LOC, ...). Then I have a "Loop Examples" operator that iterates a "Set Data" operator. In "Set Data" I use the attribute name "label" and the value "O". This sets the label attribute to "O" for all examples (rows), that is, for all words of the original document. After "Loop Examples" comes the "TextAnnotator" operator, where the text-attribute is "word" and the label-attribute is "label". I run the process and see that in the ExampleSet result every word has the label value "O". That is OK. Then I switch to the TextAnnotator result. I find only one label, "O", shown in white, and all the text of the document has a white background. That is OK too. On that page there are the options "Create new Label" and "Finish tagging". I'm able to create new labels ("PER", "LOC", "ORG"), but how can I manually set which word of the text is a person, a location, or an organization?
I create, e.g., the "PER" label by typing "PER" into the "labelName" field and clicking the "Create new Label" button. Then I double-click a string (the name of a person) in the text and click "Finish tagging". But nothing happens. The selected string still has a white background.
Can anybody help me?
Thank you!!!!
Answers
After selecting a token (e.g. the name of a person), I have to press the "Alt" key, and the background of the token changes. After tagging all tokens, I have to press the "Finish tagging" button, and then the values of the label attribute change.
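For readers who want to see what the result should look like on the data side, here is a minimal Python sketch of token-level labels. It is not RapidMiner code and does not reproduce the TextAnnotator's internals; the helper names `init_labels` and `tag_tokens` and the sample sentence are made up for illustration. It only mirrors the setup from the question (every word starts with "O") and the expected outcome after tagging (only the selected tokens change).

```python
# Hypothetical stand-in for the ExampleSet: one row per token,
# with a "word" attribute and a "label" attribute.

def init_labels(words, default="O"):
    """Give every token the default label, like the Loop Examples / Set Data step."""
    return [{"word": w, "label": default} for w in words]


def tag_tokens(rows, indices, label):
    """Return a copy of rows in which only the rows at `indices` receive `label`."""
    selected = set(indices)
    return [
        {**row, "label": label} if i in selected else dict(row)
        for i, row in enumerate(rows)
    ]


words = ["Alice", "Smith", "visited", "Paris", "."]  # made-up sample tokens
rows = init_labels(words)

rows = tag_tokens(rows, [0, 1], "PER")  # tag the person name
rows = tag_tokens(rows, [3], "LOC")     # tag the location

labels = [row["label"] for row in rows]
print(labels)  # ['PER', 'PER', 'O', 'LOC', 'O']
```

If the label attribute in the ExampleSet looks different from this after "Finish tagging" (for example, untouched tokens no longer carry "O"), that points to the annotator rather than to the process setup.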
I wonder why there is no "Tag it" button next to the "Create new Label" button...
And also the message is so "meaningful" for a beginner: "Shortcuts: 'Strg' for nex and 'Alt' for previous Label. 'Alt Gr' selects next word."
After creating the "PER" label and tagging a string with it, creating a second label ("LOC") makes RapidMiner crash with this message:
"
gui.dialog.error.Error during logging: .title
gui.dialog.error.Error during logging: .message
Invalid insert
"
This happens even if I press the "Finish tagging" button after tagging a PER token and before creating the LOC label.
Brilliant
1) If I create more label-value pairs (label-PER; label-LOC; label-ORG; label-MISC) in the Set Data suboperator (of the Loop Examples operator) and the last label is not the white-background "O" but, e.g., MISC, the whole text background becomes (let's say) orange. That is OK. But:
If I change only one token's label from MISC to PER (or LOC, ORG, or "O"), the labels of the whole sentence (4-5 tokens) change to PER (or LOC, ...). Yet the PER background color appears only on the selected token, not on the whole sentence.
2) When the whole text has an orange background (because the last label-value pair is MISC, or PER, LOC, or ORG, but not "O"), the spaces between the words are also colored orange. But spaces are not tokens. Or are they?
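On the question of whether spaces are tokens: the sketch below shows what a plain whitespace-based tokenization would produce. This is only an assumption about how the words were split; it is not the actual behavior of the tokenizer operators or of the TextAnnotator view, and the function name `token_spans` and the sample text are invented for the example. Under that assumption, the characters between words belong to no token, so any color shown there would come from how the view renders the text, not from a label on a "space token".

```python
import re


def token_spans(text):
    """Return (token, start, end) for each run of non-whitespace characters."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


text = "Alice Smith visited Paris"  # made-up sample text
spans = token_spans(text)

# Character positions that fall inside some token.
covered = {i for _, start, end in spans for i in range(start, end)}

# Positions not covered by any token: these are the spaces.
uncovered = [i for i in range(len(text)) if i not in covered]

print(spans)      # [('Alice', 0, 5), ('Smith', 6, 11), ('visited', 12, 19), ('Paris', 20, 25)]
print(uncovered)  # [5, 11, 19]
```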