HELP please-Regular expressions (Replace tokens)
happy_neid
Member Posts: 10 Contributor I
I want to find all tokens that are #hashtags and replace them with the word "mention", but I want to leave a certain subset of those hashtags unchanged.
Example: if I have the words #apple #juice #tree #dog #table, I want to replace #apple and #juice with the word "mention" and leave the tokens #tree, #dog and #table as they are.
How can I do that with the Replace Tokens operator?
I would really appreciate any help...
Answers
To drop the "#", you could select with a pattern like #(.*) and then replace by $1, which inserts whatever the capturing group matched.
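For anyone who wants to try the capture-group idea outside RapidMiner, here is a minimal Python sketch. It is only an illustration, not the operator's implementation: Python writes the back-reference as \1 instead of $1, and because this sketch runs on a whole string rather than on single tokens, it uses \w+ instead of .* so the match stops at the end of each hashtag.

```python
import re

# Sample text taken from the question; in RapidMiner each hashtag would be its own token.
text = "#apple #juice #tree"

# "#(\w+)" captures the word after the "#"; "\1" puts the captured word back without the "#".
stripped = re.sub(r"#(\w+)", r"\1", text)

print(stripped)
```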
If you want to select #apple and replace it with "mention", you could select with the pattern #apple and then replace with mention. This could get very messy if you have a lot of words you want to replace.
What I would suggest is to use the Replace Dictionary operator and pass it a list of the words you want to change to "mention". Everything needs to be in a nominal data format first, and then you have to convert it to text so that Process Documents from Data works. In essence, you do the token replacement before the text processing.
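The dictionary idea can be sketched in plain Python as follows. This only mirrors the concept of a lookup list applied before text processing; the function name and the mapping are hypothetical and are not part of RapidMiner.

```python
# Hypothetical lookup list: each hashtag to change, mapped to its replacement.
replacements = {"#apple": "mention", "#juice": "mention"}


def replace_from_dictionary(text, mapping):
    """Replace every whitespace-separated token found in mapping; keep all others."""
    return " ".join(mapping.get(token, token) for token in text.split())


result = replace_from_dictionary("#apple #juice #tree #dog #table", replacements)
print(result)
```

With this approach you list the hashtags you want to change, whereas an exclusion pattern lists the ones you want to keep; which is less work depends on which list is shorter.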
Hi,
What you're trying to do calls for a so-called "negative lookahead", an advanced regular expression concept.
Take a look at this process:
It seems to do what you want.
The hashtags you don't want to match are given in this expression: \#(?!(tree|dog|table))(\w+)
Regards,
Balázs
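To check the negative-lookahead expression above without opening RapidMiner, here is a small Python sketch. It assumes Python's re module behaves like the Java regex engine for this pattern, which holds for the constructs used here. The second pattern, with an added \b, is a suggested variant and not part of the original answer: without it, a hashtag that merely starts with an excluded word (for example #treehouse) is also left untouched.

```python
import re

text = "#apple #juice #tree #dog #table"

# Pattern from the answer: match "#" only if it is NOT followed by tree, dog or table.
pattern = r"\#(?!(tree|dog|table))(\w+)"
replaced = re.sub(pattern, "mention", text)
print(replaced)

# Variant (assumption, not from the thread): \b makes the exclusion apply to whole words only.
strict_pattern = r"\#(?!(?:tree|dog|table)\b)(\w+)"
print(re.sub(pattern, "mention", "#treehouse"))
print(re.sub(strict_pattern, "mention", "#treehouse"))
```

Note that the replacement is the literal word "mention", not a group reference such as $1.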
Hey, I have the same problem. I did it like you said, but the result is not what I want. Please look at my screen and help me :\
Are you looking to do something like this?
Oh, thanks, Sir!
When I changed $1 to "myword", it worked successfully. Thanks to the RapidMiner family :D