DeepLearning4J Extension - (Early Release, Needs Feedback)

JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
edited November 2018 in Help
Hi guys,

We have created an extension to integrate DeepLearning4J into RapidMiner! 
Before we release it on the Marketplace, we need a small army of bug finders and volunteers willing to help fix the extension.

Download here & help us out. 
https://www.rapidminerchina.com/2016/02/rapidminer-china-announces-rapidminer-deeplearning4j-integration/
(Please note: this is still under development, so use it in production at your own risk.)

Answers

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    I've tested it and I have to say:
    Thank you for this! It is already great! (Even if I ran into some null pointers...) :)
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
    Although it needs to be pointed at a text file at the moment, I recommend trying out Word2Vec. It's a pretty interesting way to look at relationships between words, and the hierarchical cluster tree provides a fun way to navigate around them too.
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="7.0.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="7.0.000" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="dl4j_extension:word_2_vec" compatibility="1.0.000" expanded="true" height="82" name="Word2Vec" width="90" x="45" y="85">
            <parameter key="file_path" value="C:\Users\user\.RapidMiner\repositories\Local Repository\data\raw_sentences.txt"/>
            <enumeration key="stop_words"/>
          </operator>
          <operator activated="true" class="set_role" compatibility="7.0.000" expanded="true" height="82" name="Set Role" width="90" x="179" y="85">
            <parameter key="attribute_name" value="Word"/>
            <parameter key="target_role" value="id"/>
            <list key="set_additional_roles"/>
          </operator>
          <operator activated="true" class="multiply" compatibility="7.0.000" expanded="true" height="103" name="Multiply" width="90" x="313" y="85"/>
          <operator activated="true" class="agglomerative_clustering" compatibility="7.0.000" expanded="true" height="82" name="Clustering" width="90" x="447" y="85">
            <parameter key="mode" value="CompleteLink"/>
            <parameter key="measure_types" value="NumericalMeasures"/>
            <parameter key="numerical_measure" value="CosineSimilarity"/>
          </operator>
          <operator activated="true" class="data_to_similarity_data" compatibility="7.0.000" expanded="true" height="68" name="Data to Similarity Data" width="90" x="447" y="187">
            <parameter key="measure_types" value="NumericalMeasures"/>
            <parameter key="numerical_measure" value="CosineSimilarity"/>
          </operator>
          <connect from_op="Word2Vec" from_port="vector" to_op="Set Role" to_port="example set input"/>
          <connect from_op="Set Role" from_port="example set output" to_op="Multiply" to_port="input"/>
          <connect from_op="Multiply" from_port="output 1" to_op="Clustering" to_port="example set"/>
          <connect from_op="Multiply" from_port="output 2" to_op="Data to Similarity Data" to_port="example set"/>
          <connect from_op="Clustering" from_port="cluster model" to_port="result 1"/>
          <connect from_op="Clustering" from_port="example set" to_port="result 2"/>
          <connect from_op="Data to Similarity Data" from_port="similarity example set" to_port="result 3"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="42"/>
          <portSpacing port="sink_result 2" spacing="21"/>
          <portSpacing port="sink_result 3" spacing="42"/>
          <portSpacing port="sink_result 4" spacing="0"/>
        </process>
      </operator>
    </process>
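
The process above feeds the Word2Vec vectors into clustering and a similarity operator, both using cosine similarity. As a rough illustration of what that measure computes, here is a minimal Python sketch. The three-dimensional vectors and the word list are made up for the example and are not output of the extension; `cosine_similarity` and `similarity_table` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def similarity_table(vectors):
    """Pairwise cosine similarity for every unordered pair of words."""
    words = sorted(vectors)
    return {
        (w1, w2): cosine_similarity(vectors[w1], vectors[w2])
        for i, w1 in enumerate(words)
        for w2 in words[i + 1:]
    }


# Hypothetical toy vectors (assumption: real Word2Vec vectors are much longer).
vectors = {
    "day": [1.0, 0.0, 1.0],
    "night": [1.0, 0.0, 0.9],
    "money": [0.0, 1.0, 0.1],
}

pairs = similarity_table(vectors)
for (w1, w2), score in pairs.items():
    print(f"{w1:>6} ~ {w2:<6} {score:.3f}")
```

Words whose vectors point in nearly the same direction score close to 1, which is why they end up near each other in the cluster tree.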
     

  • NeuralMarket Member Posts: 13 Contributor II
    I must try this. Thanks for your hard work!
  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist
    Hey John,

    Quick question: how can I apply the model returned by Word2Vec?

    ~Martin
  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
    It turned out to be less than a quick question.  ::)

    I built a sample process based on the rules described here: http://deeplearning4j.org/word2vec (which is a really nice explanation of how it 'should' work). The process applies the similarity operators, aggregate, clustering & PCA to the result to try to demonstrate it.
    Trust me, it was a thing of beauty, using the aggregate, differencing, append & similarity operators in various combinations.

    Unfortunately, it doesn't work as I expected on the complete works of Shakespeare, so we're debugging the operator to see if something needs to be tweaked internally.  :D
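
For readers wondering what "applying" word vectors with differencing and similarity looks like, here is a minimal Python sketch of the familiar analogy-style lookup often used to illustrate Word2Vec: subtract one vector, add another, then pick the nearest remaining word by cosine similarity. This is not the RapidMiner process described above. The vectors are invented toy values, and `cosine` and `analogy` are hypothetical helper names.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def analogy(vectors, a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# Hypothetical toy vectors (assumption: hand-picked so the analogy works).
toy = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.5, 0.5],
}

result = analogy(toy, "king", "man", "woman")
print(result)
```

On real data the result depends heavily on corpus size and training parameters, so a lookup like this may well return unexpected words.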
  • kenedy Member Posts: 1 Learner III
    I really like what you have shared here.
  • kayman Member Posts: 662 Unicorn

    Hi, this seems to be no longer available. Is there any way to still get hold of it?

  • MartinLiebig Administrator, Moderator, Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,533 RM Data Scientist

    Hi,

    It can still be compiled from GitHub: https://github.com/LostSummer233/rapidminer-extension-dl4j-pack

    ~Martin
  • KPL RapidMiner Certified Analyst, Member Posts: 9 Contributor II

    It's been a while since your last post. Is RapidMiner going forward with this extension or dropping it? I'm looking for a recurrent NN plug-in for RM and this seemed to be it. Any other recommendations for RNN & RM?