replace hyphen

pb42
pb42 New Altair Community Member
edited November 2024 in Community Q&A
I am trying to replace a hyphen in a Grade attribute by using the Replace operator. I would like to replace it with text indicating that no value has been entered (i.e., Not indicated). The problem is that the attribute includes values such as - (the hyphen I want to replace), A-, B-, and C-. Using the Replace operator replaces all of the hyphens (including those being used as minuses). I tried the regular expression \b[-]\b, but that is not working. I also tried \b["-"]\b without success.

Best Answer

  • Edin_Klapic
    Edin_Klapic New Altair Community Member
    Answer ✓
    Hi @pb42 ,
    in the Replace operator you need to use the expression
    ^-$
    in the replace what parameter and replace it by Not indicated.
    That way only the standalone hyphens are replaced and the minuses (e.g., A-, B-, ...) are kept.
    Short explanation:
    RapidMiner uses the Java RegEx functions: ^ represents the beginning of a line (here, the attribute value) and $ represents its end, so the expression matches only values that consist of a single hyphen.
    Happy Mining,
    Edin
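A minimal sketch of the same idea in plain Java, since the answer above says RapidMiner relies on the Java regex functions. The class and method names (GradeCleaner, clean, wordBoundaryMatches) are hypothetical and only serve the illustration; how the Replace operator calls the regex engine internally is an assumption. The sketch also shows why \b-\b finds nothing in this data: a word boundary needs a word character on exactly one side, and neither a lone - nor the trailing hyphen in A- has one on both sides of the hyphen.

```java
import java.util.regex.Pattern;

class GradeCleaner {
    // ^-$ matches only when the whole value is exactly one hyphen.
    private static final Pattern LONE_HYPHEN = Pattern.compile("^-$");

    // \b-\b requires a word boundary on both sides of the hyphen.
    private static final Pattern BOUNDED_HYPHEN = Pattern.compile("\\b-\\b");

    // Replaces a standalone hyphen and leaves values such as "A-" untouched.
    static String clean(String grade) {
        return LONE_HYPHEN.matcher(grade).replaceAll("Not indicated");
    }

    // Reports whether the word-boundary pattern finds anything in the value.
    static boolean wordBoundaryMatches(String value) {
        return BOUNDED_HYPHEN.matcher(value).find();
    }

    public static void main(String[] args) {
        String[] grades = {"-", "A-", "B-", "C-", "A"};
        for (String grade : grades) {
            System.out.println(grade + " -> " + clean(grade));
        }
    }
}
```

In this sketch only the first value changes; the word-boundary pattern would match a hyphen between two word characters (as in A-B), which does not occur in the Grade values described.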

Answers

  • [Deleted User]
    [Deleted User] New Altair Community Member
    @pb42

    Hello

    This thread is very similar to your question ;) Please take a look at it :)

    https://community.rapidminer.com/discussion/comment/63840#Comment_63840

    I hope this helps
    mbs
  • pb42
    pb42 New Altair Community Member
    Thank you for the direction. I did read this question, but the solution did not make sense to me.
  • varunm1
    varunm1 New Altair Community Member
    Hello @pb42

    Can you provide some sample data?
  • pb42
    pb42 New Altair Community Member
    This is the file
  • sgnarkhede2016
    sgnarkhede2016 New Altair Community Member
    But in the Replace operator I need to pass a regex, and it is not working for me.
    E.g.:
    Sachin N
    John Clara

    I have passed "replace what"  \^(\w+ \w+)
                             "replace by"   \("\w+ \w+")

    I want the strings above to become "Sachin N" and "John Clara".
  • Edin_Klapic
    Edin_Klapic New Altair Community Member
    If I understood you correctly, you want the Attribute values to be wrapped in leading and trailing double quotes: Value => "Value"
    In this case you replace:
    ^(.+)$
    by
    "$1"
    Happy Mining,
    Edin

    P.S.:
    The Generate Attributes operator could also have been used. The expression would be:
    "\"" + AttributeName + "\""
    where AttributeName is the name of the Attribute whose values you want to change.
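The ^(.+)$ / "$1" replacement can be tried out in plain Java as a rough sketch. QuoteWrapper and wrapInQuotes are hypothetical names, and the assumption is that the Replace operator hands both parameters to the Java regex engine unchanged. In Java, $1 in the replacement refers to capture group 1, while everything else in the replacement is literal text. That would also explain why the earlier attempt did not work: \^ matches a literal caret instead of anchoring, and \w+ in the replace by field is not evaluated as a pattern.

```java
import java.util.regex.Pattern;

class QuoteWrapper {
    // Group 1 captures the entire non-empty value.
    private static final Pattern WHOLE_VALUE = Pattern.compile("^(.+)$");

    // Wraps the value in double quotes; $1 inserts the captured group.
    static String wrapInQuotes(String value) {
        return WHOLE_VALUE.matcher(value).replaceAll("\"$1\"");
    }

    public static void main(String[] args) {
        String[] names = {"Sachin N", "John Clara"};
        for (String name : names) {
            System.out.println(wrapInQuotes(name));
        }
    }
}
```

Because .+ needs at least one character, an empty value is left as it is in this sketch.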
  • sara20
    sara20 New Altair Community Member
    edited May 2020
    @Edin_Klapic

    Hello

    I am working on data from a store and want to analyze the customers' baskets. The column names contain a lot of symbols that RM is not able to handle, and I cannot replace them all at once because they are of different types. Could you please tell me how I can solve this?

    Also, I think it would be useful if the RM team could solve this problem in the next version of RM (feature request).

    Thank you in advance
    sara
  • Edin_Klapic
    Edin_Klapic New Altair Community Member
    Hi @sara20 ,

    Although your problem is somewhat similar to the above-mentioned "hyphen" issue, it affects the names of Attributes and not Attribute values.
    Thus, I suggest that in the future you open a new thread when the answers in an existing thread don't provide the help you need. That also makes it easier to find for users who might have a similar problem later.

    You can use "Rename by Replacing" to replace certain patterns represented by regular expressions, but only one at a time.
    So, unfortunately, the solution to your problem is not yet (as of version 9.6) a single-Operator solution. Please find attached a quick solution that uses "Rename by Replacing" in loops together with a self-created dictionary, with which you will hopefully be able to achieve your goal.

    Happy Mining,
    Edin

    Spoiler
    <?xml version="1.0" encoding="UTF-8"?><process version="9.5.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.5.001" expanded="true" name="Process">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="9.5.001" expanded="true" height="68" name="Retrieve Golf" width="90" x="179" y="34">
            <parameter key="repository_entry" value="//Samples/data/Golf"/>
          </operator>
          <operator activated="true" class="concurrency:loop_attributes" compatibility="9.5.001" expanded="true" height="82" name="Loop Attributes" width="90" x="313" y="34">
            <parameter key="attribute_filter_type" value="all"/>
            <parameter key="attribute" value=""/>
            <parameter key="attributes" value=""/>
            <parameter key="use_except_expression" value="false"/>
            <parameter key="value_type" value="attribute_value"/>
            <parameter key="use_value_type_exception" value="false"/>
            <parameter key="except_value_type" value="time"/>
            <parameter key="block_type" value="attribute_block"/>
            <parameter key="use_block_type_exception" value="false"/>
            <parameter key="except_block_type" value="value_matrix_row_start"/>
            <parameter key="invert_selection" value="false"/>
            <parameter key="include_special_attributes" value="false"/>
            <parameter key="attribute_name_macro" value="loop_attribute"/>
            <parameter key="reuse_results" value="true"/>
            <parameter key="enable_parallel_execution" value="true"/>
            <process expanded="true">
              <operator activated="true" class="utility:create_exampleset" compatibility="9.5.001" expanded="true" height="68" name="Create ExampleSet" width="90" x="112" y="85">
                <parameter key="generator_type" value="comma separated text"/>
                <parameter key="number_of_examples" value="100"/>
                <parameter key="use_stepsize" value="false"/>
                <list key="function_descriptions"/>
                <parameter key="add_id_attribute" value="false"/>
                <list key="numeric_series_configuration"/>
                <list key="date_series_configuration"/>
                <list key="date_series_configuration (interval)"/>
                <parameter key="date_format" value="yyyy-MM-dd HH:mm:ss"/>
                <parameter key="time_zone" value="SYSTEM"/>
                <parameter key="input_csv_text" value="old,new&#10;o,-&#10;i,%"/>
                <parameter key="column_separator" value=","/>
                <parameter key="parse_all_as_nominal" value="true"/>
                <parameter key="decimal_point_character" value="."/>
                <parameter key="trim_attribute_names" value="true"/>
              </operator>
              <operator activated="true" class="extract_macro" compatibility="9.5.001" expanded="true" height="68" name="Extract Macro (4)" width="90" x="246" y="85">
                <parameter key="macro" value="number_of_examples"/>
                <parameter key="macro_type" value="number_of_examples"/>
                <parameter key="statistics" value="average"/>
                <parameter key="attribute_name" value=""/>
                <list key="additional_macros"/>
              </operator>
              <operator activated="true" class="concurrency:loop" compatibility="9.5.001" expanded="true" height="103" name="Loop (2)" width="90" x="380" y="187">
                <parameter key="number_of_iterations" value="%{number_of_examples}"/>
                <parameter key="iteration_macro" value="iteration"/>
                <parameter key="reuse_results" value="true"/>
                <parameter key="enable_parallel_execution" value="false"/>
                <process expanded="true">
                  <operator activated="true" class="extract_macro" compatibility="9.5.001" expanded="true" height="68" name="Extract Macro (5)" width="90" x="112" y="34">
                    <parameter key="macro" value="old_character"/>
                    <parameter key="macro_type" value="data_value"/>
                    <parameter key="statistics" value="average"/>
                    <parameter key="attribute_name" value="old"/>
                    <parameter key="example_index" value="%{iteration}"/>
                    <list key="additional_macros"/>
                  </operator>
                  <operator activated="true" class="extract_macro" compatibility="9.5.001" expanded="true" height="68" name="Extract Macro (6)" width="90" x="246" y="34">
                    <parameter key="macro" value="new_character"/>
                    <parameter key="macro_type" value="data_value"/>
                    <parameter key="statistics" value="average"/>
                    <parameter key="attribute_name" value="new"/>
                    <parameter key="example_index" value="%{iteration}"/>
                    <list key="additional_macros"/>
                  </operator>
                  <operator activated="true" class="delay" compatibility="9.5.001" expanded="true" height="103" name="only to ensure execution order (2)" width="90" x="447" y="85">
                    <parameter key="delay" value="none"/>
                    <parameter key="delay_amount" value="1000"/>
                    <parameter key="min_delay_amount" value="0"/>
                    <parameter key="max_delay_amount" value="1000"/>
                  </operator>
                  <operator activated="true" class="rename_by_replacing" compatibility="9.5.001" expanded="true" height="82" name="Rename by Replacing (2)" width="90" x="581" y="136">
                    <parameter key="attribute_filter_type" value="all"/>
                    <parameter key="attribute" value=""/>
                    <parameter key="attributes" value=""/>
                    <parameter key="use_except_expression" value="false"/>
                    <parameter key="value_type" value="attribute_value"/>
                    <parameter key="use_value_type_exception" value="false"/>
                    <parameter key="except_value_type" value="time"/>
                    <parameter key="block_type" value="attribute_block"/>
                    <parameter key="use_block_type_exception" value="false"/>
                    <parameter key="except_block_type" value="value_matrix_row_start"/>
                    <parameter key="invert_selection" value="false"/>
                    <parameter key="include_special_attributes" value="false"/>
                    <parameter key="replace_what" value="%{old_character}"/>
                    <parameter key="replace_by" value="%{new_character}"/>
                  </operator>
                  <connect from_port="input 1" to_op="Extract Macro (5)" to_port="example set"/>
                  <connect from_port="input 2" to_op="only to ensure execution order (2)" to_port="through 2"/>
                  <connect from_op="Extract Macro (5)" from_port="example set" to_op="Extract Macro (6)" to_port="example set"/>
                  <connect from_op="Extract Macro (6)" from_port="example set" to_op="only to ensure execution order (2)" to_port="through 1"/>
                  <connect from_op="only to ensure execution order (2)" from_port="through 1" to_port="output 1"/>
                  <connect from_op="only to ensure execution order (2)" from_port="through 2" to_op="Rename by Replacing (2)" to_port="example set input"/>
                  <connect from_op="Rename by Replacing (2)" from_port="example set output" to_port="output 2"/>
                  <portSpacing port="source_input 1" spacing="0"/>
                  <portSpacing port="source_input 2" spacing="0"/>
                  <portSpacing port="source_input 3" spacing="0"/>
                  <portSpacing port="sink_output 1" spacing="0"/>
                  <portSpacing port="sink_output 2" spacing="0"/>
                  <portSpacing port="sink_output 3" spacing="0"/>
                </process>
              </operator>
              <connect from_port="input 1" to_op="Loop (2)" to_port="input 2"/>
              <connect from_op="Create ExampleSet" from_port="output" to_op="Extract Macro (4)" to_port="example set"/>
              <connect from_op="Extract Macro (4)" from_port="example set" to_op="Loop (2)" to_port="input 1"/>
              <connect from_op="Loop (2)" from_port="output 2" to_port="output 1"/>
              <portSpacing port="source_input 1" spacing="147"/>
              <portSpacing port="source_input 2" spacing="0"/>
              <portSpacing port="sink_output 1" spacing="0"/>
              <portSpacing port="sink_output 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Retrieve Golf" from_port="output" to_op="Loop Attributes" to_port="input 1"/>
          <connect from_op="Loop Attributes" from_port="output 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
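The core idea of the process above (apply each old/new pair from a small dictionary to the attribute names, one replacement at a time) can be sketched in plain Java. This is only an illustration of the logic, not RapidMiner code: AttributeRenamer and renameAll are hypothetical names, and the sample attribute names are assumed for the example. The dictionary mirrors the one in Create ExampleSet (o becomes -, i becomes %), and each old entry is treated as a regular expression, as "Rename by Replacing" does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AttributeRenamer {
    // Applies every dictionary entry, in insertion order, to every name.
    static List<String> renameAll(List<String> names, Map<String, String> dictionary) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            String renamed = name;
            for (Map.Entry<String, String> entry : dictionary.entrySet()) {
                // The key is a regex, the value is the replacement text.
                renamed = renamed.replaceAll(entry.getKey(), entry.getValue());
            }
            result.add(renamed);
        }
        return result;
    }

    public static void main(String[] args) {
        // Same pairs as the "old,new" table in the process.
        Map<String, String> dictionary = new LinkedHashMap<>();
        dictionary.put("o", "-");
        dictionary.put("i", "%");

        List<String> names = Arrays.asList("Outlook", "Humidity", "Wind");
        System.out.println(renameAll(names, dictionary));
    }
}
```

To strip unwanted symbols instead, the dictionary would list each symbol (escaped where it is a regex metacharacter) together with its replacement; a LinkedHashMap keeps the order of the pairs predictable when one replacement could affect a later one.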



  • sara20
    sara20 New Altair Community Member
    Thank you very much
