"Extract Information Regular Expression query type failed (Text Processing)"
CharlieFirpo
Member Posts: 48 Contributor II
Dear All!
I have a simple process: Create Document + Extract Information. I create a simple text, "string1 string2 string3 string4", and I use a simple regular expression, ^\S*, to extract the first string from my document. RapidMiner gives the following error: Process Failed. No group 1.
If I use the String Matching query type instead of Regular Expression and set string2 and string4 as the query expression, I get string3 as the result. So String Matching works well, but Regular Expression does not.
Can anybody check why this query type does not work? Or did I make a mistake (and if so, what)?
If I use Regular Region and set e.g. ^\S* and .* as the region delimiters, RapidMiner gives the correct result: string1 string2 string3 string4.
Only the plain Regular Expression query type does not work.
Of course, if I use Regular Region with ^\S* and '\ ' as the two delimiters, I get the result I want: string1
But why does the Regular Expression query type not work?
Thank you for reading it and trying to help me!
Answers
But the second Extract Information operator works on the whole original document. I checked that the input document of the second Extract Information operator is 'string2 string3 string4'.
So why does the second Extract Information operator not extract from this, but from the original 'string1 string2 string3 string4'?
Thank you!
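You have to use brackets (parentheses) when using the Regular Expression query type in the Extract Information operator.
For example:
wrong: ^\S*
good: (^\S*)
The brackets do not change what the expression matches. They define capture group 1, which, judging by the error message (No group 1), is the part the operator returns.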
Nice day!
Does anyone know why this 'bracket solution' is required in RapidMiner? As far as I am aware, a regular expression does not need brackets unless it uses capture groups.
Thank you
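A likely explanation, judging only from the error text: the operator seems to return capture group 1 of the match rather than the whole match (group 0), so a pattern without brackets has nothing to return. Here is a minimal sketch of that behaviour in Python (RapidMiner itself is Java-based, and Java's regex engine reports exactly "No group 1" in this situation). The function extract_group_1 is a hypothetical stand-in for the operator, not RapidMiner's actual implementation.

```python
import re

text = "string1 string2 string3 string4"


def extract_group_1(pattern, document):
    """Hypothetical stand-in for the operator: return capture group 1 of the first match."""
    match = re.search(pattern, document)
    if match is None:
        return None
    # Raises IndexError when the pattern defines no capture group 1.
    return match.group(1)


# Without brackets the pattern still matches; the whole match is group 0.
whole = re.search(r"^\S*", text).group(0)

# Asking for group 1 of the same pattern fails, because no group was defined.
try:
    extract_group_1(r"^\S*", text)
    error = None
except IndexError as exc:
    error = str(exc)

# With brackets, group 1 exists and holds the first token.
first = extract_group_1(r"(^\S*)", text)

print(whole)
print(error)
print(first)
```

If this assumption holds, the brackets also let you pick out just a part of a longer match, e.g. ^\S+\s+(\S+) would return the second token while the full match covers the first two.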