
How can I get a melting function in RapidMiner?

smmsamm Member Posts: 7 Learner III
edited 2018 30 in Help

I am a beginner in data mining.

I have a list of 10,000 rows and about 200 columns, like this:











First, I need to make a unique list of words:







Now I need to find lines with at least 3 (or n) similar values and generate a new list:













Please help me in any way you can.

Thank you.


  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    What is a melting function?
  • smmsamm Member Posts: 7 Learner III

    I searched the internet and someone said that Python's melt can help me, but I don't know how to do this in RapidMiner!

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist


    from the pandas doc for melt:

    “Unpivots” a DataFrame from wide format to long format, optionally leaving identifier variables set.

    I guess it maps to something along the lines of De-Pivot.  
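
    To make "wide to long" concrete, here is a minimal sketch in plain Python of what an unpivot (melt / De-Pivot) does. It is only an illustration, not the pandas or RapidMiner implementation: the `melt` helper and the sample words are hypothetical, the column names `att1` to `att4` are assumed from the thread, and skipping empty cells is an added assumption.

```python
def melt(rows, id_key):
    """Unpivot wide rows (dicts) into long (id, variable, value) records."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            # Skip the identifier column and empty cells (an assumption).
            if column == id_key or value in (None, ""):
                continue
            long_rows.append(
                {id_key: row[id_key], "variable": column, "value": value}
            )
    return long_rows


# Hypothetical sample data: one identifier column and three value columns.
wide = [
    {"att1": "row1", "att2": "apple", "att3": "pear", "att4": "plum"},
    {"att1": "row2", "att2": "pear", "att3": "apple", "att4": ""},
]

long_format = melt(wide, id_key="att1")
for record in long_format:
    print(record)
```

    Each wide row becomes one record per non-empty value column, so the identifier is repeated and the former column name moves into the `variable` field.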



    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn
    I guess I learned something new today!
  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    so that's a fun puzzle.  I would begin like this (you will need @land's Statistics Extension to run this process):


    <?xml version="1.0" encoding="UTF-8"?><process version="7.6.001">
      <operator activated="true" class="process" compatibility="7.6.001" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="retrieve" compatibility="7.6.001" expanded="true" height="68" name="Retrieve smmsamm" width="90" x="45" y="85">
            <parameter key="repository_entry" value="smmsamm"/>
          </operator>
          <operator activated="true" class="de_pivot" compatibility="7.6.001" expanded="true" height="82" name="De-Pivot" width="90" x="179" y="85">
            <list key="attribute_name">
              <parameter key="foo" value="att[2-9]"/>
            </list>
            <parameter key="index_attribute" value="bar"/>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="7.6.001" expanded="true" height="82" name="Select Attributes" width="90" x="313" y="85">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="bar"/>
            <parameter key="invert_selection" value="true"/>
          </operator>
          <operator activated="true" class="numerical_to_polynominal" compatibility="7.6.001" expanded="true" height="82" name="Numerical to Polynominal" width="90" x="447" y="85">
            <parameter key="attribute_filter_type" value="single"/>
            <parameter key="attribute" value="foo"/>
          </operator>
          <operator activated="true" class="rmx_stat:cross_table" compatibility="1.3.000" expanded="true" height="82" name="Extract Cross Table" width="90" x="581" y="85">
            <parameter key="group_attribute_a" value="att1"/>
            <parameter key="group_attribute_b" value="foo"/>
          </operator>
          <connect from_op="Retrieve smmsamm" from_port="output" to_op="De-Pivot" to_port="example set input"/>
          <connect from_op="De-Pivot" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Numerical to Polynominal" to_port="example set input"/>
          <connect from_op="Numerical to Polynominal" from_port="example set output" to_op="Extract Cross Table" to_port="example set input"/>
          <connect from_op="Extract Cross Table" from_port="cross table output" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>

    That said I am certain there is a cleverer way to do this!
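
    For readers without the extension, the De-Pivot plus cross table idea can be sketched in plain Python. This is only an illustration under assumptions, not what the Extract Cross Table operator does internally: the sample pairs are hypothetical, and `att1` and `foo` simply reuse the attribute names from the process above.

```python
from collections import Counter

# Long-format records as a de-pivot step would produce them: (att1, foo) pairs.
# The values are hypothetical sample data.
long_rows = [
    ("row1", "apple"), ("row1", "pear"), ("row1", "plum"),
    ("row2", "pear"), ("row2", "apple"),
    ("row3", "kiwi"),
]

# Count how often each (att1, foo) combination occurs.
counts = Counter(long_rows)
row_ids = sorted({row_id for row_id, _ in long_rows})
values = sorted({value for _, value in long_rows})

# One output row per att1 value, one column per distinct foo value.
cross_table = {
    row_id: [counts[(row_id, value)] for value in values]
    for row_id in row_ids
}

print("att1", *values)
for row_id, row in cross_table.items():
    print(row_id, *row)
```

    The table gets one column per distinct value, so its width grows with the vocabulary of the data.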



  • smmsamm Member Posts: 7 Learner III

    I updated my RapidMiner and installed the Statistics extension:


    but I get an error:

    and I cannot find the missing extension:


    Would you please help again?

    Thank you

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    hmm I'm not sure the extension in the marketplace is up-to-date (Sebastian?).  I would go directly to the website:



  • smmsamm Member Posts: 7 Learner III

    This is my CSV file.
    Would you please test with it?

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    so the process I posted was not intended to be a finished product - just something to get you in the right direction.  :)  If you take that csv file and put it in my process, you get the attached result.



  • smmsamm Member Posts: 7 Learner III

    Oh, thank you sir, you are the master,
    but this was only sample data for testing.
    My real data has about 100,000 different values. With this method, will I have about 100,000 columns?
    Is it possible to convert the list to the list I want?












  • smmsamm Member Posts: 7 Learner III



    I mean, can these columns be converted to rows with the header values?
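
    One way to avoid a table with one column per value is to stay in long format and index rows by value. The sketch below is plain Python, not a RapidMiner process, and it assumes the goal is to list pairs of rows that share at least n values, which is only one reading of the original question. The `rows_sharing_values` helper and the sample words are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def rows_sharing_values(rows, n):
    """Return {(row_a, row_b): shared_values} for pairs sharing >= n values."""
    # Inverted index: value -> set of row ids containing it (long format).
    index = defaultdict(set)
    for row_id, values in rows.items():
        for value in values:
            index[value].add(row_id)

    # Collect the values each pair of rows has in common.
    shared = defaultdict(set)
    for value, row_ids in index.items():
        for pair in combinations(sorted(row_ids), 2):
            shared[pair].add(value)

    # Keep only pairs with at least n shared values.
    return {pair: sorted(vals) for pair, vals in shared.items() if len(vals) >= n}


# Hypothetical sample data: row id -> words in that row.
rows = {
    "row1": ["apple", "pear", "plum", "kiwi"],
    "row2": ["pear", "apple", "plum"],
    "row3": ["kiwi", "apple"],
}

result = rows_sharing_values(rows, n=3)
print(result)
```

    Memory then scales with the number of non-empty cells rather than rows times distinct values, although a value that appears in very many rows still generates many pairs.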

  • sgenzer Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager

    Your flattery is noted and not deserved.  There are many here who are far more masterful than I.  That said, I think at this point I would recommend getting more knowledgeable with RapidMiner Studio before moving forward with large data sets like the one you describe - actions such as renaming attributes and so forth are the beginning of a long journey.  I would highly recommend starting with the "Getting Started with RapidMiner" YouTube playlist.  The whole beauty of RapidMiner is that you can learn to create your own processes and be a master yourself!



  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn

    Hi all,

    I just published the most recent version of our extensions on the marketplace. So if that was the problem, it should be gone now. At least I can use it with the most recent version of RM.



