Text Tokenization Using Regular Expression For Text Mining
Hello,
I have a problem and I need your help, please.
I want to tokenize an unstructured document using regular expressions. I have a text file where each row contains a sentence such as:
1. String1 String2 String3 String4 String5
2. String6 - String7 - -
...
n. String8 - String9 String10 - (assume string2 and string5 don't exist.)
What I want is for the tokenization to extract each word and write the results to a table in Excel format, such as:
S1 S2 S3 S4 S5
1. String1 String2 String3 String4 String5
2. String6 - String7 - -
3.
..
n. String8 - String9 String10 -
Which operators and which regular expression structure can I use in RapidMiner?
Thank you for your help in advance.
Answers
Best regards,
Marius
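For illustration only, the layout asked for in the question can be sketched outside RapidMiner. The following is a minimal Python sketch, not a RapidMiner process. It assumes that every row starts with a label such as `1.`, that tokens are separated by whitespace, and that `-` marks a missing value. The names `rows_to_table` and `table_to_csv`, the regular expression, and the choice of CSV as the Excel-readable output are all hypothetical choices made for this example.

```python
import csv
import io
import re

# Assumed row shape: a label such as "1." or "n.", then whitespace-separated tokens.
ROW_PATTERN = re.compile(r"^\s*(\S+\.)\s+(.*)$")


def rows_to_table(lines, n_columns=5):
    """Split each labelled row into exactly n_columns tokens."""
    table = []
    for line in lines:
        match = ROW_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not look like "label. tokens"
        label, rest = match.groups()
        # Keep at most n_columns tokens, then pad short rows with "-".
        tokens = re.findall(r"\S+", rest)[:n_columns]
        tokens += ["-"] * (n_columns - len(tokens))
        table.append([label] + tokens)
    return table


def table_to_csv(table, n_columns=5):
    """Render the table as CSV text with a header row S1..Sn."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Row"] + [f"S{i}" for i in range(1, n_columns + 1)])
    writer.writerows(table)
    return buffer.getvalue()


sample = [
    "1. String1 String2 String3 String4 String5",
    "2. String6 - String7 - -",
]
table = rows_to_table(sample)
csv_text = table_to_csv(table)
print(csv_text)
```

Saving `csv_text` to a `.csv` file gives a table that Excel can open, with one column per token position. Rows with fewer than five tokens are padded with `-`, so the columns stay aligned.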