
You're right that neural networks don't care too much about the shape of most activation functions. I assume that splicing together two decaying exponential functions at the origin would work just as well in practice.

However, tanh is a bit more special than just having the right symmetries. Sigmoid is the correct function to turn an additive value (a log-odds score) into a probability (range 0 to 1). Tanh is a scaled and shifted sigmoid which fulfills the same purpose for the -1 to +1 interval.
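
To make the "scaled sigmoid" remark concrete, here is a minimal sketch of the identity tanh(x) = 2 * sigmoid(2x) - 1. The helper names `sigmoid` and `tanh_via_sigmoid` are made up for illustration and are not from any library.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid: maps a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh_via_sigmoid(x: float) -> float:
    """tanh written as a rescaled, shifted sigmoid: 2 * sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0

# Compare against the standard library's tanh at a few sample points.
for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    assert math.isclose(tanh_via_sigmoid(x), math.tanh(x), abs_tol=1e-12)
```

The factor of 2 inside the sigmoid is what makes the slopes differ at the origin: sigmoid has slope 1/4 there, tanh has slope 1.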

I sometimes wonder if clamped linear or exponential functions would work better than tanh/sigmoid in places where they're currently used (like LSTM/GRU gates).
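
For readers unsure what "clamped linear" would look like, here is a rough sketch of such replacements. The names `hard_tanh` and `hard_sigmoid` and the slope of 1/4 (chosen here to match sigmoid's slope at the origin) are assumptions for illustration; libraries that ship similar functions pick their own slopes.

```python
def hard_tanh(x: float) -> float:
    """Clamped linear stand-in for tanh: identity on [-1, 1], flat outside."""
    return max(-1.0, min(1.0, x))

def hard_sigmoid(x: float) -> float:
    """Clamped linear stand-in for sigmoid.

    Assumption: slope 1/4 through (0, 0.5), so it matches the logistic
    sigmoid's value and slope at the origin and clamps at x = -2 and x = 2.
    """
    return max(0.0, min(1.0, 0.5 + 0.25 * x))

for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  hard_tanh={hard_tanh(x):+.2f}  hard_sigmoid={hard_sigmoid(x):.2f}")
```

Whether these actually work better as LSTM/GRU gates is an empirical question this sketch does not answer.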



Yeah, Wikipedia has a decent survey of sigmoid functions (the family, not the specific function ML people often refer to by that name) here: https://en.wikipedia.org/wiki/Sigmoid_function#/media/File:G...

Note that, when each function is normalized to have slope 1 at the origin, tanh saturates to ±1 faster than most of them, erf being the exception (its expansion at +infinity is 1 - 2e^{-2x} + O(e^{-4x}), while many of the other options approach 1 only polynomially, so they don't get close to 1 nearly as fast).
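
A quick way to see the saturation speeds is to print the gap 1 - f(x) for a few slope-1 curves. This is only a sketch: the choice of curves (erf, tanh, x/sqrt(1+x^2), arctan) and the helper names are assumptions for illustration, not a reproduction of the linked figure.

```python
import math

def erf_unit_slope(x: float) -> float:
    """erf rescaled so that its slope at the origin is 1."""
    return math.erf(math.sqrt(math.pi) / 2.0 * x)

def atan_unit_slope(x: float) -> float:
    """arctan rescaled to range (-1, 1) with slope 1 at the origin."""
    return (2.0 / math.pi) * math.atan(math.pi / 2.0 * x)

def algebraic(x: float) -> float:
    """x / sqrt(1 + x^2): already has slope 1 at the origin."""
    return x / math.sqrt(1.0 + x * x)

CURVES = {
    "erf": erf_unit_slope,
    "tanh": math.tanh,
    "algebraic": algebraic,
    "atan": atan_unit_slope,
}

def gap_to_one(f, x: float) -> float:
    """How far f(x) still is from its upper asymptote 1."""
    return 1.0 - f(x)

for x in (2.0, 4.0, 8.0):
    gaps = ", ".join(f"{name}={gap_to_one(f, x):.2e}" for name, f in CURVES.items())
    print(f"x={x}: {gaps}")
```

For tanh the gap tracks 2e^{-2x} closely once x is moderately large, which is the leading term of the expansion quoted above.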

I suspect some applications would in theory rather use erf, but erf is even worse to compute than tanh (on the other hand, erf's derivative is really nice, so who knows?)


> I assume that splicing together two decaying exponential functions at the origin would work just as well in practice.

Also known as tanh: https://en.wikipedia.org/wiki/Hyperbolic_functions

One "disadvantage" is that it doesn't saturate to [-1.0, 1.0] like appropriately scaled tanh.


By splicing together I mean a piecewise function which is `exp(x) - 1` on the left and `1 - exp(-x)` on the right, which should be similar enough to tanh for most purposes.
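
Spelled out as code, the piecewise function could look like the sketch below. The name `spliced_exp` is made up here, and using `math.expm1` instead of `exp(x) - 1` is just a numerical-accuracy choice, not something from the comment.

```python
import math

def spliced_exp(x: float) -> float:
    """exp(x) - 1 for x < 0 and 1 - exp(-x) for x >= 0.

    math.expm1(t) computes exp(t) - 1 accurately for small |t|.
    """
    if x < 0.0:
        return math.expm1(x)
    return -math.expm1(-x)

# Side-by-side with tanh at a few points.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(f"x={x:+.1f}  spliced={spliced_exp(x):+.4f}  tanh={math.tanh(x):+.4f}")
```

Like tanh, it is odd, passes through the origin with slope 1, and is bounded by -1 and +1.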


Sure, it has a continuous first derivative and the right slope at the origin, though its second derivative jumps from +1 to -1 there. It also doesn’t saturate to +/-1 as fast, which probably doesn’t matter.
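
The smoothness of the spliced function at the origin can be checked numerically with one-sided finite differences. This is a rough sketch; the helper names and the step size `h` are arbitrary choices for illustration.

```python
import math

def spliced_exp(x: float) -> float:
    """exp(x) - 1 for x < 0 and 1 - exp(-x) for x >= 0."""
    return math.expm1(x) if x < 0.0 else -math.expm1(-x)

def one_sided_slope(f, side: float, h: float = 1e-4) -> float:
    """First difference of f at 0 using a point on one side only.

    side is +1.0 for the right-hand limit and -1.0 for the left-hand limit.
    """
    return (f(side * h) - f(0.0)) / (side * h)

def one_sided_curvature(f, side: float, h: float = 1e-4) -> float:
    """Second difference of f at 0 using points on one side only."""
    return (f(2.0 * side * h) - 2.0 * f(side * h) + f(0.0)) / (h * h)

print("slope     left/right:", one_sided_slope(spliced_exp, -1.0), one_sided_slope(spliced_exp, 1.0))
print("curvature left/right:", one_sided_curvature(spliced_exp, -1.0), one_sided_curvature(spliced_exp, 1.0))
```

Both slopes come out close to 1, while the curvature is close to +1 from the left and -1 from the right, so the first derivative is continuous at the origin but the second is not.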



