
The sigmoid activation function was widely used in the early days of deep learning. It is a smooth function that is simple to differentiate and straightforward to implement.

“Sigmoidal” comes from the Greek letter sigma, and the resulting curve sweeps through an “S” shape along the Y-axis.

“Sigmoidal” refers to any function that keeps this “S” shape, and tanh(x) is one example. It follows a similar form, but instead of lying between 0 and 1 it lies between -1 and 1, whereas the traditional sigmoid function lies between 0 and 1. A sigmoid function is differentiable, which means we can readily determine the slope of the curve at any given point.
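The relationship between the two ranges can be checked numerically; a minimal sketch with NumPy, using the identity tanh(x) = 2·sigmoid(2x) − 1 (the sample points are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1 + np.exp(-z))

x = np.linspace(-4, 4, 9)

# sigmoid output lies in (0, 1); tanh output lies in (-1, 1)
assert np.all((sigmoid(x) > 0) & (sigmoid(x) < 1))
assert np.all((np.tanh(x) > -1) & (np.tanh(x) < 1))

# tanh is a rescaled, shifted sigmoid: tanh(x) = 2*sigmoid(2x) - 1
assert np.allclose(np.tanh(x), 2 * sigmoid(2 * x) - 1)
```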

Looking at the sigmoid function, the output lies in the open interval (0, 1). Although it is tempting to read this as a probability, it should not be treated as a probability in the strict sense. The sigmoid function once reigned supreme in popularity. One way to think of it is as the rate at which a neuron fires along its axon. The part of the cell most susceptible to stimulation is in the center, where the slope is steep; the inhibitory regions lie at the sides, where the slope is gentle.

**There are a few problems with the sigmoid function itself.**

1) The function’s gradient approaches zero as the input moves away from the origin. During backpropagation in neural networks, we use the chain rule of differentiation to compute the gradient of each weight w. After passing back through a sigmoid, this chain’s contribution becomes negligible; moreover, the gradient may pass through several sigmoid layers, so in the end the weight w has a negligible impact on the loss function. This is not a favorable environment for optimizing the weights. The problem is known as gradient saturation or gradient dispersion (the vanishing-gradient problem).

2) Because the function’s output is not centered on 0, weight updates become less efficient.

3) The sigmoid function involves exponential calculations, making computations more time-consuming for computers.
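The first problem above can be sketched numerically: the sigmoid derivative never exceeds 0.25, so the chain-rule product across many stacked sigmoid layers shrinks toward zero. A minimal illustration (the layer count of 10 is an arbitrary choice):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1 - s)

# the derivative peaks at z = 0 with value 0.25
print(sigmoid_prime(0.0))  # 0.25

# chain rule: gradients through stacked sigmoids multiply the derivatives;
# even in the best case (0.25 per layer), 10 layers give 0.25**10 ≈ 9.5e-7
grad = 1.0
for _ in range(10):
    grad *= sigmoid_prime(0.0)
print(grad)
```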

The following is a list of some of the benefits and drawbacks that are associated with Sigmoid functions:

**The Sigmoid Function Offers the Following Benefits: –**

It offers a smooth gradient, which helps us avoid sudden “jumps” in output values.

To normalize the output of each neuron, output values are constrained to fall between 0 and 1.

It makes clear predictions: for most inputs, the output is very close to 1 or 0, which helps improve the model’s performance.

**Sigmoid functions have the following disadvantages:**

It is especially prone to the problem of gradients disappearing.

The output of the function is not zero-centered.

The fact that power operations take a fair amount of time contributes to the overall complexity of the model.

**How do you build a sigmoid function in Python and how do you write its derivative?**

Formulating a sigmoid function and its derivative is not very difficult; we simply need to define a function for each formula.

**The sigmoid function**

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1 + np.exp(-z))
```

**Derivative of the sigmoid function**

```python
def sigmoid_prime(z):
    return sigmoid(z) * (1 - sigmoid(z))
```
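As a quick sanity check, here is a self-contained snippet (redefining both functions) that evaluates them at a few points; the values follow directly from the formulas:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1 + np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z) * (1 - sigmoid(z))

print(sigmoid(0))        # 0.5: the midpoint of the (0, 1) range
print(sigmoid_prime(0))  # 0.25: the derivative's maximum
print(sigmoid(10))       # ≈ 0.99995: saturating toward 1
```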

Python code demonstrating a straightforward implementation of the sigmoid activation function:

```python
# import libraries
import matplotlib.pyplot as plt
import numpy as np

# creating the sigmoid function (returns the value and its derivative)
def sigmoid(x):
    s = 1 / (1 + np.exp(-x))
    ds = s * (1 - s)
    return s, ds

a = np.arange(-6, 6, 0.01)

# establish axes that are centred
fig, ax = plt.subplots(figsize=(9, 5))
ax.spines['left'].set_position('center')
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.xaxis.set_ticks_position('bottom')
ax.yaxis.set_ticks_position('left')

# produce and display the figure
ax.plot(a, sigmoid(a)[0], color="#307EC7", linewidth=3, label="sigmoid")
ax.plot(a, sigmoid(a)[1], color="#9621E2", linewidth=3, label="derivative")
ax.legend(loc="upper right", frameon=False)
fig.show()
```

Output:

The above code graphs the sigmoid function and its derivative, as shown below.


## Summary

I hope that you had fun reading this post and that at the end of it, you have a better understanding of the Sigmoid Activation Function and how we can use Python to implement it.

Visit us at InsideAIML for further articles and courses similar to these on data science, machine learning, artificial intelligence, and other exciting new technologies.

Thank you so much for reading…

Happy Learning…