"About text clustering."

maria_godric Member Posts: 20 Maven
edited May 2019 in Help
Hi,
I am currently working on KMedoids clustering of text data. I have entered 10 different texts in the TextInput operator, but RM splits each text file into 10 rows and applies the clustering to the split data. Is there any way to treat each whole text as a single row?

Thanks
Maria

Answers

  • fischer Member Posts: 439 Maven
    Hi,

    I guess there is something wrong with your process setup. RM does not divide texts into lines normally. Can you post your process?

    Cheers,
    Simon
  • maria_godric Member Posts: 20 Maven
    Hi Simon,
    Thanks for your help.
    I am attaching the process XML here.

    <operator name="Root" class="Process" expanded="yes">
        <operator name="TextInput" class="TextInput" expanded="yes">
            <list key="texts">
              <parameter key="fcontact" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
              <parameter key="odiary" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
              <parameter key="diary" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
              <parameter key="clipr" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
              <parameter key="updated" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
              <parameter key="crts" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
              <parameter key="field" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
              <parameter key="aplan" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
              <parameter key="subrogation" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
              <parameter key="closing" value="C:\Documents and Settings\ADMIN\Desktop\case"/>
            </list>
            <list key="namespaces">
            </list>
            <operator name="StringTokenizer" class="StringTokenizer">
            </operator>
            <operator name="EnglishStopwordFilter" class="EnglishStopwordFilter">
            </operator>
            <operator name="TokenLengthFilter" class="TokenLengthFilter">
            </operator>
            <operator name="PorterStemmer" class="PorterStemmer">
            </operator>
        </operator>
        <operator name="KMedoids" class="KMedoids">
            <parameter key="k" value="3"/>
        </operator>
    </operator>

    Thanks
    Maria
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    RapidMiner doesn't do this inside this process. Please check if you simply pointed to the wrong files.

    Greetings,
      Sebastian
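
One way to act on the suggestion to check the file paths is a small script outside RapidMiner. The following is only a rough sketch in Python, not part of RapidMiner: the helper names `list_text_sources`, `shared_paths`, and `count_files` are hypothetical. It also rests on an assumption the thread does not confirm, namely that each entry in the "texts" list names a directory and that every file in that directory becomes one row. Under that assumption, several entries pointing at the same directory would each produce one row per file in it.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path
from typing import Dict, List, Optional, Tuple


def list_text_sources(process_xml: str) -> List[Tuple[str, str]]:
    """Return (key, value) pairs from every <list key="texts"> in the process XML."""
    root = ET.fromstring(process_xml)
    pairs = []
    for lst in root.iter("list"):
        if lst.get("key") == "texts":
            for param in lst.findall("parameter"):
                pairs.append((param.get("key"), param.get("value")))
    return pairs


def shared_paths(pairs: List[Tuple[str, str]]) -> Dict[str, int]:
    """Return each path that more than one entry points to, with its entry count."""
    counts = Counter(path for _, path in pairs)
    return {path: n for path, n in counts.items() if n > 1}


def count_files(path: str) -> Optional[int]:
    """Count regular files directly inside `path`; None if it is not a directory."""
    p = Path(path)
    if not p.is_dir():
        return None
    return sum(1 for child in p.iterdir() if child.is_file())
```

Run on the process as posted above, `shared_paths(list_text_sources(xml_text))` would report a single path shared by all 10 entries, which is worth ruling out before looking at the clustering step. `count_files` can then be called on that path on the machine where the files live to see how many files it actually contains.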