byte address / word location for Textual ETL
Wanttoknow
Member Posts: 6 Contributor II
Hi,
I'm doing fine with the currently provided operators for text processing in RM 5.0 (great, guys! :-*)
However, there is one thing I would like to see during the creation of word vectors from documents: a byte address per word occurrence, used as a key to distinguish one occurrence of a word from another.
This would require a whole new representation of the wordlist, in which every occurrence is listed with its byte address/word location instead of the aggregated number of occurrences per word per document.
This would open up a new range of possibilities such as determining what other words or terms are found in proximity of a certain word/term. This would be of great value to determine the context of documents.
Of course I would be glad to know if this would already be possible with some combination of current operators ::)
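To make the request concrete, here is a minimal sketch in plain Python of the kind of representation described above. It is not RapidMiner code: the function names `positional_index` and `words_near`, the simple `\w+` tokenizer, and the UTF-8 encoding are all assumptions made for illustration only.

```python
import re
from collections import defaultdict

TOKEN = re.compile(r"\w+")  # assumed tokenizer: runs of word characters


def positional_index(text, encoding="utf-8"):
    """Map each lower-cased word to a list of (byte_offset, word_position) pairs.

    Every occurrence gets its own entry, instead of one aggregated count per word.
    """
    index = defaultdict(list)
    for position, match in enumerate(TOKEN.finditer(text)):
        # Byte offset = length of the encoded text that precedes the word.
        # Re-encoding the prefix each time is simple but quadratic; fine for a sketch.
        byte_offset = len(text[:match.start()].encode(encoding))
        index[match.group().lower()].append((byte_offset, position))
    return index


def words_near(text, term, window=2):
    """Return the set of words found within `window` words of any occurrence of `term`."""
    tokens = [m.group().lower() for m in TOKEN.finditer(text)]
    index = positional_index(text)
    neighbours = set()
    for _byte_offset, position in index.get(term.lower(), []):
        start = max(0, position - window)
        neighbours.update(tokens[start:position])
        neighbours.update(tokens[position + 1:position + 1 + window])
    return neighbours


if __name__ == "__main__":
    doc = "the cat sat on the mat"
    print(dict(positional_index(doc)))
    print(words_near(doc, "cat", window=1))
```

With such a per-occurrence list, a proximity query only needs the word positions, while the byte offsets serve as unique keys back into the original document.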
Answers
By coincidence, this is exactly what we are currently working on. Stay tuned :-)
Cheers,
Simon
Great. Looking forward to it.
Thanks for your reply
Apart from that, we have a lot of other ideas concerning the text processing extension, so it will probably take a while until the restructuring is finished. Stay tuned ;-)
Kind regards,
Tobias