How to fade out data in the result?

InaIna Member Posts: 6 Contributor II
edited November 2018 in Help
Hello,

Does anybody know whether RapidMiner can hide some of the data in the result? I only need a few specific words from the generated word list in the result.
Is there an operator to which I can pass these words, so that only they appear in the result?

Greetings, Ina

Answers

  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Ina,
    if I understand you correctly, you want to filter the resulting word list? If it's only for viewing purposes, you could use a process like this:
    <operator name="Root" class="Process" expanded="yes">
        <description text="This process writes the word list to a file, reads it back in as an ExampleSet, and filters it."/>
        <parameter key="logverbosity" value="warning"/>
        <parameter key="random_seed" value="2004"/>
        <operator name="TextInput" class="TextInput" expanded="yes">
            <list key="texts">
              <parameter key="graphics" value="C:\Dokumente und Einstellungen\sland\Eigene Dateien\Yale\Workspace 4.5\sample\data\newsGroupTexts\graphics"/>
              <parameter key="hardware" value="C:\Dokumente und Einstellungen\sland\Eigene Dateien\Yale\Workspace 4.5\sample\data\newsGroupTexts\hardware"/>
            </list>
            <parameter key="output_word_list" value="resultWordlist.wordlist"/>
            <list key="namespaces">
            </list>
            <operator name="StringTokenizer" class="StringTokenizer">
            </operator>
        </operator>
        <operator name="CSVExampleSource" class="CSVExampleSource">
            <parameter key="filename" value="resultWordlist.wordlist"/>
            <parameter key="read_attribute_names" value="false"/>
            <parameter key="comment_chars" value="@"/>
            <parameter key="use_quotes" value="false"/>
        </operator>
        <operator name="ExampleFilter" class="ExampleFilter">
            <parameter key="condition_class" value="attribute_value_filter"/>
            <parameter key="parameter_string" value="resultWordlist.wordlist (1)=Path|club"/>
        </operator>
    </operator>
    It first imports the texts and writes the word list to a file. This word list is then read back in as an ExampleSet, so you can filter it like any other ExampleSet. For example, you could use the ExampleFilter as shown above: it keeps only the examples whose specified attribute matches the given regular expression.
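
    To illustrate what such a row filter does, here is a minimal Python sketch (not RapidMiner code). The sample rows and the function name filter_rows are made up for illustration, and it assumes the value has to match the whole regular expression, which may differ from how ExampleFilter behaves in your version:

```python
import re


def filter_rows(rows, column, pattern):
    """Keep only rows whose value in `column` fully matches `pattern`."""
    regex = re.compile(pattern)
    return [row for row in rows if regex.fullmatch(row[column])]


# Hypothetical word list rows: (word, count). Not taken from the sample data.
rows = [("Path", 12), ("club", 7), ("clubs", 3), ("graphics", 40)]

# Same expression as in the parameter_string above: Path|club
kept = filter_rows(rows, 0, "Path|club")
print(kept)  # [('Path', 12), ('club', 7)]
```

    Under the whole-value assumption "clubs" is dropped; to keep it as well, the expression would have to be something like Path|club.*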

    Greetings,
      Sebastian
  • InaIna Member Posts: 6 Contributor II
    Hi Sebastian,

    thanks for your answer!
    But I think your description doesn't solve my problem...

    Today I found the operator FeatureNameFilter. Is it possible to skip all features with "skip_features_with_name" (which expression is needed here?) and to exempt specific features with "except_features_with_name" (which separator do I need to list several features in the input field?)?

    Or is there another operator where I can select only the features (words) that I want to get back in the result?

    So far I have only found operators that skip features/words from the result. But I need an operator where I can specify which features/words I want to keep in the result.

    Greetings,
    Ina
  • landland RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Ina,
    ok, now I understand what you want to do. The operator you need is "AttributeFilter": it lets you select which attributes should be kept. To do so, switch the condition_class to "attribute_name_filter" and then list each attribute name in the parameter_string field, separated by a pipe: |
    If you want to match several similar attributes, you can use regular expressions to specify them. A quick Google search will help with understanding regular expressions. Or search this forum; if I recall correctly, someone posted a link to a good tutorial a few weeks ago.
    <operator name="Root" class="Process" expanded="yes">
        <operator name="AttributeFilter" class="AttributeFilter">
            <parameter key="condition_class" value="attribute_name_filter"/>
            <parameter key="parameter_string" value="att1|att2|att.*"/>
        </operator>
    </operator>
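
    If you have a longer list of words to keep, the pipe-separated expression can be built from that list. The following is only a rough Python sketch of the idea, not RapidMiner code: the helper names build_name_pattern and keep_columns and the sample table are hypothetical, and it assumes an attribute name has to match the whole expression:

```python
import re


def build_name_pattern(words):
    """Join the wanted names with '|' and escape regex metacharacters."""
    return "|".join(re.escape(word) for word in words)


def keep_columns(table, pattern):
    """Keep only the columns whose name fully matches `pattern`."""
    regex = re.compile(pattern)
    return {name: values for name, values in table.items()
            if regex.fullmatch(name)}


# Hypothetical word vector table: one column per word, one value per text.
table = {"graphics": [1, 0], "card": [2, 1], "cards": [0, 3], "the": [5, 4]}

pattern = build_name_pattern(["graphics", "card"])
selected = keep_columns(table, pattern)
print(pattern)         # graphics|card
print(list(selected))  # ['graphics', 'card']
```

    Escaping matters as soon as a word contains characters such as . or +, which would otherwise be read as part of the regular expression.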
    Greetings,
      Sebastian
  • InaIna Member Posts: 6 Contributor II
    Hi Sebastian,

    yes, that's it... Now it's working like it's supposed to. Thanks for your help!

    Greetings, Ina