The Altair Community is migrating to a new platform to provide a better experience for you. In preparation for the migration, the Altair Community is on read-only mode from October 28 - November 6, 2024. Technical support via cases will continue to work as is. For any urgent requests from Students/Faculty members, please submit the form linked here

Extract Date Part that contain Early KeyWord using regex (replace operator)

sgnarkhede2016sgnarkhede2016 Member Posts: 152 Contributor II
edited May 2023 in Help
From below data extract the date that contains Early keyword using regex

Apr 24, 2022-Apr 30, 2022+Early,Jun 16, 2022-Jun 30, 2022+Delay,Jun 23, 2022-Jun 30, 2022+NA,Jun 26, 2022-Jun 30, 2022+Early

Output:
Apr 24, 2022-Apr 30, 2022,Jun 26, 2022-Jun 30

Answers

  • jwpfaujwpfau Employee-RapidMiner, Member Posts: 303 RM Engineering
    edited May 2023
    Hi,

    This is way easier with the matches method of the Generate Attributes operator.

    But if you really want to replace the non matching part, you should read about negative lookahead.
    ([^+^,]+,)+[^+^,]+\+(?!Early)[^+^,]+,?|(\+[^+^,]+)

    Greetings,
    Jonas
  • sgnarkhede2016sgnarkhede2016 Member Posts: 152 Contributor II
    Thanks, But it not working for me fully

    Input : Apr 24, 2022-Apr 30, 2022+Early,May 16, 2022-May 30, 2022+Delay,Jan 01, 2022-Jan 30, 2022+Early,Jun 26, 2022-Jun 30, 2022+NA

    regex: ([^+^,]+,)+[^+^,]+\+(?!Early)[^,]+,|\+[^,]+
    Early Expected  : Apr 24, 2022-Apr 30, 2022,Jan 01, 2022-Jan 30, 2022
    Getting : Apr 24, 2022-Apr 30, 2022,Jan 01, 2022-Jan 30, 2022,Jun 26, 2022-Jun 30, 2022

    regex: ([^+^,]+,)+[^+^,]+\+(?!Delay)[^,]+,|\+[^,]+
    Delay Expected: May 16, 2022-May 30, 2022
    Delay :May 16, 2022-May 30, 2022,Jun 26, 2022-Jun 30, 2022

    Can you please help me on this
  • jwpfaujwpfau Employee-RapidMiner, Member Posts: 303 RM Engineering
    edited May 2023
    This should work
    ([^+^,]+,)+[^+^,]+\+(?!Delay)[^,]+,?|(\+Delay,(?!.*\+Delay))|\+Delay
    (make sure to remove additional whitespaces after pasting)
  • Eden60Eden60 Member Posts: 7 Contributor II
    To extract the dates containing the "Early" keyword using regular expressions (regex) from the given data, you can use the following regex pattern:

    \b(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\s\d{1,2},\s\d{4}-\b.*?Early\b 

    Here's how this regex pattern works:

    • \b: This is a word boundary anchor to ensure that we match whole words.
    • (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec): This part matches the abbreviated month names (e.g., Apr, Jun).
    • \s: Matches a single whitespace character.
    • \d{1,2},\s\d{4}-: Matches the date in the format "dd, yyyy-" (e.g., "24, 2022-").
    • .*?: Matches any characters (non-greedy- blue prism).
    • Early: Matches the keyword "Early".
    I hope this will help you.

    Regards
    Eden Wheeler
Sign In or Register to comment.