"Regarding Text Mining"
maria_godric
Member Posts: 20 Maven
Hi,
I have a text document. How can I delete the content between two special characters (for example, my document contains #something#)? I want to delete the special characters as well. I tried TextCleaner, but it requires listing the exact content to delete, so I don't think it will work for a large amount of data. Is there an operator available in RM for this?
Thanks,
Maria
Answers
You might add a TokenReplace operator before the Tokenizer during TextProcessing and then use regular expressions to capture whatever you want.
Here's an example process setup:
For more information about regular expressions, you could visit Wikipedia at http://en.wikipedia.org/wiki/Regular_expression. For trying something without executing the process, you could use the online form at http://en.wikipedia.org/wiki/Regular_expression.
Greetings,
Sebastian
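For anyone who wants to try such a pattern outside RapidMiner first, here is a minimal Python sketch of the regular-expression idea. It is not the TokenReplace configuration itself. It assumes "#" is the delimiter and never appears inside the text to be removed; the names DELIMITED and strip_delimited are made up for illustration.

```python
import re

# A '#', then any run of characters that are not '#', then the closing '#'.
# Because [^#] also matches newlines, a span may extend over several lines.
DELIMITED = re.compile(r"#[^#]*#")


def strip_delimited(text: str) -> str:
    """Remove every #...# span, including both '#' characters."""
    return DELIMITED.sub("", text)


sample = "keep this #something# and this"
print(strip_delimited(sample))
```

The whitespace around a removed span is left untouched, so a second pass may be needed if doubled spaces are a problem.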
It worked fine. But I would like to get the edited text in the same format as the original data, i.e. I need to save it in .txt format.
Thanks,
Maria
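If doing this step outside RapidMiner is an option, a plain-text round trip can be sketched in Python as below. This is only an illustration under assumptions, not a RapidMiner operator: the function name clean_file, the file names in the usage comment, and the UTF-8 encoding are all hypothetical, and "#" is assumed not to occur inside the removed content.

```python
import re
from pathlib import Path


def clean_file(src: str, dst: str, encoding: str = "utf-8") -> None:
    """Read src, drop every #...# span, and write the result to dst as plain text."""
    text = Path(src).read_text(encoding=encoding)
    # Remove the delimiters together with everything between them.
    cleaned = re.sub(r"#[^#]*#", "", text)
    Path(dst).write_text(cleaned, encoding=encoding)


# Hypothetical usage (file names are placeholders):
# clean_file("original.txt", "cleaned.txt")
```

Line breaks and all other characters outside the removed spans are written back unchanged, so the output keeps the layout of the original document.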