Neural Network Activation Functions

  • یوسف مهرداد

Which activation function should you use for the hidden layers of your deep neural networks?

Although your mileage will vary, in general SELU > ELU > leaky ReLU (and its variants) > ReLU > tanh > logistic.
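For reference, the six functions in that ranking can be written out in a few lines. This is a minimal plain-Python sketch, not taken from the book: the function names are illustrative, the SELU constants are the commonly quoted values from the SELU paper, and the leaky ReLU default of 0.3 mirrors the Keras default mentioned further down.

```python
import math

# Commonly quoted SELU constants (assumed here, not stated in the post).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def logistic(z):
    """Sigmoid: squashes z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Hyperbolic tangent: squashes z into (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Zero for negative inputs, identity otherwise."""
    return max(0.0, z)


def leaky_relu(z, alpha=0.3):
    """Like ReLU, but with a small slope alpha for negative inputs."""
    return z if z > 0 else alpha * z


def elu(z, alpha=1.0):
    """Exponential curve for negative inputs, saturating at -alpha."""
    return z if z > 0 else alpha * (math.exp(z) - 1.0)


def selu(z):
    """ELU with fixed alpha, multiplied by a fixed scale."""
    return SELU_SCALE * elu(z, SELU_ALPHA)
```

All six agree closely for large positive inputs except the two saturating ones (logistic and tanh); the differences that drive the ranking are in how each treats negative inputs.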

If the network’s architecture prevents it from self-normalizing, then ELU may perform better than SELU (since SELU is not smooth at z = 0).
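The smoothness remark can be checked numerically by comparing the slope just to the left and just to the right of z = 0. The sketch below is illustrative only: the helper names are hypothetical, the SELU constants are the commonly quoted ones, and ELU is taken with α = 1, the setting for which its two one-sided slopes match.

```python
import math

# Commonly quoted SELU constants (assumed here, not stated in the post).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def elu_slope(z, alpha=1.0):
    """Derivative of ELU at any z other than 0."""
    return 1.0 if z > 0 else alpha * math.exp(z)


def selu_slope(z):
    """Derivative of SELU at any z other than 0."""
    return SELU_SCALE * elu_slope(z, SELU_ALPHA)


eps = 1e-9

# ELU with alpha = 1: both one-sided slopes approach 1, so there is no kink.
elu_gap = elu_slope(eps) - elu_slope(-eps)

# SELU: the right slope is the scale (about 1.05), while the left slope is
# scale * alpha (about 1.76), so the derivative jumps at z = 0.
selu_gap = selu_slope(eps) - selu_slope(-eps)

print(f"ELU slope gap at 0:  {elu_gap:.6f}")
print(f"SELU slope gap at 0: {selu_gap:.6f}")
```

The ELU gap is essentially zero, whereas the SELU gap is about -0.71, which is the kink the sentence above refers to.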

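If you care a lot about runtime latency, then you may prefer leaky ReLU, which needs only a comparison and a multiplication, whereas ELU and SELU must evaluate an exponential for negative inputs.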

If you don’t want to tweak yet another hyperparameter, you can keep the default α values used by Keras (e.g., 0.3 for the negative-side slope of leaky ReLU).

If you have spare time and computing power, you can use cross-validation to evaluate other activation functions, such as RReLU if your network is overfitting or PReLU if you have a huge training set.
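RReLU and PReLU are both leaky ReLU variants that differ only in how the negative-side slope α is chosen: RReLU draws it at random during training and fixes it to the average at test time, while PReLU treats it as a parameter learned by backpropagation. The sketch below illustrates that difference in plain Python. The function names and the sampling range (1/8 to 1/3) are assumptions made for the example, not values given in the post.

```python
import random


def rrelu(z, training, lower=0.125, upper=1.0 / 3.0, rng=random):
    """Randomized leaky ReLU.

    During training the negative slope is sampled uniformly from
    [lower, upper]; at test time it is fixed to the midpoint of the range.
    The range used here is an assumed example, not a value from the post.
    """
    if z > 0:
        return z
    alpha = rng.uniform(lower, upper) if training else (lower + upper) / 2.0
    return alpha * z


def prelu(z, alpha):
    """Parametric leaky ReLU: alpha is a trainable parameter."""
    return z if z > 0 else alpha * z


def prelu_grad_alpha(z):
    """Gradient of prelu(z, alpha) with respect to alpha.

    It is non-zero only for negative inputs, which is how backpropagation
    is able to adjust alpha.
    """
    return 0.0 if z > 0 else z
```

The random slope acts as a regularizer, which fits the overfitting case mentioned above, while the extra learned parameter in PReLU is the reason it tends to need a large training set to avoid overfitting.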

That said, because ReLU is the most used activation function (by far), many libraries and hardware accelerators provide ReLU-specific optimizations; therefore, if speed is your priority, ReLU might still be the best choice.

Reference: Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition, by Aurélien Géron, Sep 2019