
GELU Activation function missing in Deep Learning extension. How to implement?

Enes_A Member Posts: 2 Learner I
Hey, I hope you are doing well.

I want to implement a specific deep learning architecture in RapidMiner. For that I need, among other things, the GELU activation function in my fully connected layer, which is not available there. There are also other operations I have to implement in this architecture, such as skip-connections.

I tried to execute a custom Python script at the specific places, but that did not work for me, because a layer architecture cannot be passed as input to a script file. That's why RapidMiner throws an error there.

So I wanted to ask what my options are. I thought about implementing the model fully in Python, but I like RapidMiner, so I would prefer to keep using it.

I hope I could articulate my question clearly. Feel free to ask if something is unclear.

Best regards,

Enes

Best Answer

  • pschlunder Employee-RapidMiner, RapidMiner Certified Analyst, RapidMiner Certified Expert, RMResearcher, Member Posts: 96 RM Research
    Solution Accepted

    Thanks for reaching out. Regarding the activation function: I just checked, and the underlying library we're using (DL4J) includes GELU and other activations that we haven't exposed yet. I'm quite positive that we'll be able to update the list of supported activations in the next release to include GELU and others.
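
    For reference, GELU is defined as x · Φ(x), where Φ is the standard normal CDF; most frameworks implement a tanh approximation of it. A minimal NumPy/SciPy sketch of both forms (independent of DL4J and the extension):

    ```python
    # Reference implementation of GELU: exact form via the normal CDF,
    # plus the widely used tanh approximation.
    import numpy as np
    from scipy.stats import norm

    def gelu_exact(x):
        # GELU(x) = x * Phi(x), with Phi the standard normal CDF
        return x * norm.cdf(x)

    def gelu_tanh(x):
        # Tanh approximation used by most deep learning frameworks
        return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

    x = np.linspace(-3.0, 3.0, 7)
    print(gelu_exact(x))   # the two forms agree closely on this range
    print(gelu_tanh(x))
    ```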

    Regarding skip-connections:
    It would be great if you could describe the use case behind your need for skip-connections. We mostly try to add features in a use-case-driven way, and learning about your application would help with that.

    Until we've added these to the extension, one option is to create the model in Python and, for example, use it inside RapidMiner for its application. I highly recommend checking out our integrated notebooks, which tie in with the rest of the platform: you can access data directly from your project within Python and later push the results, e.g. the model, back into the project to use it inside other RapidMiner processes.
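
    If you go that route, here is a minimal sketch of a fully connected network with GELU activations and a skip-connection, assuming TensorFlow/Keras (layer sizes, names, and the loss are illustrative, not from this thread):

    ```python
    # Minimal sketch: fully connected layers with GELU and a skip-connection,
    # built with the Keras functional API (requires a recent TensorFlow).
    from tensorflow import keras
    from tensorflow.keras import layers

    inputs = keras.Input(shape=(32,))
    x = layers.Dense(64, activation="gelu")(inputs)  # FC layer with GELU
    h = layers.Dense(64, activation="gelu")(x)
    x = layers.Add()([x, h])                         # skip-connection around the second layer
    outputs = layers.Dense(1)(x)

    model = keras.Model(inputs, outputs, name="gelu_skip_sketch")
    model.compile(optimizer="adam", loss="mse")
    model.summary()
    ```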

    But I guess even more interesting for you could be the recently added mechanism for creating reusable operators in RapidMiner from Python code. You can basically build the architecture in Python and convert the code snippet into an operator with parameters that you can run in Studio/AI Hub for training. This mechanism lets you expose, e.g., hyperparameters of the network as parameters of a RapidMiner operator, and it also ensures that you can apply the model later on in Studio, since you can provide code for both training and application during operator creation.
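
    As a rough sketch, the training side of such an operator can follow the rm_main convention from RapidMiner's Python scripting integration; the "label" column, the build_model helper, and the file-based model hand-off below are illustrative assumptions, and the exact hand-off mechanism depends on the extension version:

    ```python
    # Rough sketch of training code in the rm_main style; RapidMiner passes
    # the example set in as a pandas DataFrame.
    import pandas as pd

    def rm_main(data: pd.DataFrame):
        X = data.drop(columns=["label"]).values  # assumes a column named "label"
        y = data["label"].values

        model = build_model()                    # hypothetical helper, e.g. the Keras sketch above
        model.fit(X, y, epochs=10, batch_size=32)

        # File-based hand-off (path and format are illustrative); depending on
        # the extension version, objects can also be returned to the process.
        model.save("gelu_model.keras")
        return data
    ```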

    Hope this helps.