Problem with StopwordFilterFile

nguyenxuanhau Member Posts: 22 Contributor II
edited November 2018 in Help
My process XML file is as follows:

<process version="4.6">

  <operator name="Root" class="Process" expanded="yes">
      <description text="Text Hau"/>
      <parameter key="logverbosity" value="init"/>
      <parameter key="random_seed" value="2001"/>
      <parameter key="send_mail" value="never"/>
      <parameter key="process_duration_for_mail" value="30"/>
      <parameter key="encoding" value="UTF-8"/>
      <operator name="TextInput" class="TextInput" expanded="yes">
          <list key="texts">
            <parameter key="graphics" value="dulieu"/>
          </list>
          <parameter key="default_content_type" value=""/>
          <parameter key="default_content_encoding" value="utf-8"/>
          <parameter key="default_content_language" value=""/>
          <parameter key="prune_below" value="-1"/>
          <parameter key="prune_above" value="-1"/>
          <parameter key="vector_creation" value="TermOccurrences"/>
          <parameter key="use_content_attributes" value="false"/>
          <parameter key="use_given_word_list" value="false"/>
          <parameter key="return_word_list" value="false"/>
          <parameter key="id_attribute_type" value="short"/>
          <list key="namespaces">
          </list>
          <parameter key="create_text_visualizer" value="false"/>
          <parameter key="on_the_fly_pruning" value="-1"/>
          <parameter key="extend_exampleset" value="false"/>
          <operator name="StringTokenizer" class="StringTokenizer">
          </operator>
          <operator name="StopwordFilterFile" class="StopwordFilterFile">
              <parameter key="file" value="dulieu/stopword.txt"/>
              <parameter key="case_sensitive" value="true"/>
          </operator>
      </operator>
  </operator>

</process>

When I run this process, it doesn't filter out stopwords that are encoded in UTF-8.

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    if you switch to expert mode in RapidMiner's parameters view, you will see that there is an encoding parameter. If you set this parameter to UTF-8, the process will work.

    Greetings,
    Sebastian
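
To illustrate why the encoding setting matters, here is a minimal sketch in Python (not RapidMiner code). It assumes the stopword file is UTF-8 and is being read with a different charset; Latin-1 is used here only as an example of a mismatched default. The function name `filter_stopwords` and the sample Vietnamese words are hypothetical, chosen for illustration.

```python
def filter_stopwords(tokens, stopword_bytes, encoding):
    """Drop every token that appears in the stopword list.

    stopword_bytes is the raw content of a stopword file (one word per
    line); encoding is the charset used to decode it.
    """
    stopwords = set(stopword_bytes.decode(encoding).splitlines())
    return [t for t in tokens if t not in stopwords]


# Tokens from a correctly decoded text: "tôi", "và", "bạn".
tokens = ["t\u00f4i", "v\u00e0", "b\u1ea1n"]

# Raw bytes of a UTF-8 stopword file containing the single word "và".
stopword_bytes = "v\u00e0\n".encode("utf-8")

# Wrong charset: the stopword is decoded into different characters,
# so it never equals the token and nothing is filtered.
unfiltered = filter_stopwords(tokens, stopword_bytes, "latin-1")

# Matching charset: the stopword is recognised and removed.
filtered = filter_stopwords(tokens, stopword_bytes, "utf-8")
```

With the mismatched charset the token list comes back unchanged, which matches the symptom described in the question; with UTF-8 the stopword is removed.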