
"regex not working"

allerkonge Member Posts: 1 Learner III
edited June 2019 in Help

Dear all,

 

I'm trying to clean a dataset, and I'm working with a couple of regexes. If I test a regex on the website regexpal, it works fine, but if I put the same regex into RapidMiner's Replace operator, it says there is a mistake.

 

This is one of the regex I'm testing:

 

(?=\b[\m*#])\w+

 

When I try this one, it says it's incorrect.

 

I tried to correct it by adding backslashes:

 

(?=\\b[\\m*#])\\w+

 

Now it no longer says the regex is incorrect, but it doesn't replace anything. The attribute is gender, so I'd like to replace, for example, "mm" or "male" with "M".

 

Thanks a lot for your help.

 

 


Best Answer

  • IngoRM Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder
    Solution Accepted

    Hi,

     

    regexpal only works for JavaScript, but RapidMiner uses the Java regex parser.  Despite the similar names, the two languages are completely different, so there is no guarantee that a JavaScript regex will work in Java.  See here for example: http://stackoverflow.com/questions/21883629/java-vs-javascript-regex-matching
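
To make the difference concrete, here is a minimal Java sketch that feeds both patterns from the question to java.util.regex. It assumes RapidMiner hands the pattern text to the Java regex engine unchanged; the class name EscapeCheck and the helper compiles are made up for this illustration. In Java, a backslash before a letter that is not a defined escape (such as \m) is an error, and doubling the backslashes turns each one into a literal backslash character, so the second pattern compiles but can only match text that actually contains backslashes.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class EscapeCheck {
    // Hypothetical helper: true if Java's regex engine accepts the pattern text.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pattern text as first typed in the question: (?=\b[\m*#])\w+
        // (backslashes are doubled here only because this is a Java string literal)
        String original = "(?=\\b[\\m*#])\\w+";

        // Pattern text from the second attempt: (?=\\b[\\m*#])\\w+
        String doubled = "(?=\\\\b[\\\\m*#])\\\\w+";

        System.out.println(compiles(original)); // false: \m is not a valid Java escape
        System.out.println(compiles(doubled));  // true: every \\ is a literal backslash
        System.out.println(Pattern.compile(doubled).matcher("male").find()); // false
    }
}
```

This would explain both symptoms in the question: the first pattern is rejected outright, and the second is accepted but never finds anything to replace in values like "male".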

     

    Anyway, in your case can't you just use "m.*" (without quotes) in "replace_what" and "M" in "replace_by"?

     

    I am not sure if it needs to be more complicated than that but I don't know your data of course...
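
As a rough sketch of what the suggestion above does, the following Java snippet applies "m.*" with replacement "M" to a few sample values. It assumes the Replace operator behaves like Java's String.replaceAll, which may not hold for every setting; the sample values, the class name GenderCleanup and the anchored variant are assumptions added for illustration and are not part of the original suggestion.

```java
public class GenderCleanup {
    // The suggestion from the answer: replace_what = m.*  replace_by = M
    static String suggested(String value) {
        return value.replaceAll("m.*", "M");
    }

    // Hypothetical stricter variant: only values that start with m or M are replaced.
    static String anchored(String value) {
        return value.replaceAll("(?i)^m.*", "M");
    }

    public static void main(String[] args) {
        // Sample values are made up; real data may look different.
        for (String v : new String[] {"mm", "male", "Male", "female"}) {
            System.out.println(v + " -> " + suggested(v) + " | " + anchored(v));
        }
        // mm -> M | M
        // male -> M | M
        // Male -> Male | M
        // female -> feM | female
    }
}
```

If the attribute also contains values such as "female" or "Male", the unanchored pattern changes or misses them, so whether the simple version is enough depends on the data.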

     

    Hope this helps,

    Ingo
