How to clean tweets from hashtags and @
Hi everybody
I have tried for 3 days to clean hashtags and @ mentions from tweets, but I couldn't. Can anybody help?
Answers
Hi,
Do you mean just getting rid of the symbols "@" and "#", or do you also want to remove what follows them, e.g. should "@ingomierswa" and "#datascience" be removed completely?
Both are easily possible with the "Replace" operator and a simple regular expression. Below is a small sample process showing you how this is done.
Hope this helps,
Ingo
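For readers who want to try the two options outside RapidMiner, here is a minimal sketch using Python's `re` module. It is not the sample process mentioned above; the example tweet and both patterns are assumptions chosen for illustration.

```python
import re

# Hypothetical example tweet, made up for this illustration.
tweet = "Thanks @ingomierswa for the #datascience tips"

# Option 1: drop only the symbols and keep the words that follow them.
symbols_only = re.sub(r"[@#]", "", tweet)

# Option 2: drop each symbol together with the word characters after it.
whole_tokens = re.sub(r"[@#]\w+", "", tweet)

print(symbols_only)
print(whole_tokens)
```

Option 2 leaves a gap of two spaces wherever a mention or hashtag was removed, so a follow-up whitespace cleanup is usually worthwhile.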
Thanks
Extend your regex a bit, like this:
[@#][^\s.,]+
It looks a bit ugly, but it basically means: find any 'word' that starts with either @ or #, and select everything up to the next space, dot, or comma. Replace the match with nothing and it's gone.
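As a rough check of this idea, the following Python sketch removes every token that starts with @ or # up to the next whitespace, dot, or comma, then tidies the leftover spaces. The helper name `strip_tags` and the pattern `[@#][^\s.,]+` are assumptions for illustration, not RapidMiner code. Note that the pattern has no leading `\b`: a word boundary does not occur between a space and `@` or `#`, so anchoring with `\b` would skip mentions that follow a space.

```python
import re

# Matches '@' or '#' plus everything up to the next whitespace, dot, or comma.
TAG_PATTERN = re.compile(r"[@#][^\s.,]+")


def strip_tags(text: str) -> str:
    """Remove @mentions and #hashtags, then collapse leftover whitespace."""
    without_tags = TAG_PATTERN.sub("", text)
    return re.sub(r"\s+", " ", without_tags).strip()


print(strip_tags("Learning #datascience with @ingomierswa today"))
```

The same pattern should carry over to the "Replace" operator with an empty replacement, since it uses only basic character-class syntax.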