
In this tutorial, we will build an artificial neural network with Python using only the NumPy library.

We will build this neural network step by step.

You can use any other programming language to build this neural network as well.

 

Describe The Network Structure

The dataset for the artificial neural network that we will build consists of three inputs and eight rows.

However, we will use only six rows for training; the remaining rows will be test data.

We will build an artificial neural network that has one hidden layer and one output layer.

The hidden layer will consist of five neurons. The weights and biases of the neural network will be created randomly.
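
In terms of matrix shapes, this structure corresponds to the following (a summary of what the code below will create, assuming the six training rows described above):

# Shape summary for the network described above
# inputs : (6, 3)  -- six training rows with three input values each
# output : (6, 1)  -- one target value per row
# w1     : (3, 5)  -- weights from the three inputs to the five hidden neurons
# w2     : (5, 1)  -- weights from the five hidden neurons to the single output neuron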

 

 

Define the Variables

 

In this part, we will define the variables that we will use.

The variables are matrices, so we can define them easily with the Python NumPy library.

First, we need to install the NumPy library, which we can do with the following command.

pip install numpy

Defining the inputs and output:

#Defining inputs and output
import numpy as np
inputs = np.array([[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1]])
output = np.array([[0],[1],[0],[1],[0],[1]])
print("inputs : ")
print(inputs)
print(".....................")
print("output : ")
print(output)

"""
inputs : 
[[0 0 0]
 [0 0 1]
 [0 1 0]
 [0 1 1]
 [1 0 0]
 [1 0 1]]
.....................
output :
[[0]
 [1]
 [0]
 [1]
 [0]
 [1]]
"""

 

Defining the weights:

#Defining weights
w1 = np.random.randn(inputs.shape[1],5)
w2 = np.random.randn(5,output.shape[1])
print("w1")
print(w1)
print("..................")
print("w2")
print(w2)

"""
w1
[[-0.77428492  0.10287792  0.69741541 -0.95775929  0.50670826]
 [ 0.80597415 -1.80287916 -0.3028974  -0.67473998  0.56049986]
 [-0.81735613  2.33120242  1.07024909 -0.66092605  1.46534629]]
..................
w2
[[-2.42478319]
 [ 0.68046428]
 [ 0.33132679]
 [-0.16557226]
 [ 0.59223169]]
"""

 

Defining biases:

b1 = 1
b2 = 1
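
Note that b1 and b2 are single scalar values rather than one bias per neuron. When they are added to z1 and z2 later, NumPy broadcasting applies the same bias to every neuron in the layer. The following small check illustrates this (an illustrative sketch, not part of the tutorial's network code):

# Broadcasting check (illustrative): adding a scalar bias to a matrix adds it to every element
import numpy as np
z = np.dot(np.zeros((6, 3)), np.zeros((3, 5))) + 1
print(z.shape)    # (6, 5) -- the scalar bias 1 was applied to all hidden neurons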

 

Activation Function

An activation function squashes the output values of each layer into a bounded range (0 to 1 in the case of the sigmoid function).

If we did our operations without an activation function, the output values could grow very quickly after each layer, and training the artificial neural network would take a very long time.

There are many activation functions for this purpose: the sigmoid function, the tanh function, the ReLU function, etc.

We will use the sigmoid function in this tutorial.

#Defining Sigmoid Function
def sigmoid(x):
    return 1/(1 + np.exp(-x))
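
Backpropagation will also need the derivative of the sigmoid function. If a = sigmoid(x), then the derivative of a with respect to x is a * (1 - a). The code below writes this inline as a * (1 - a), but it could be wrapped in a small helper like this (a sketch, not used in the rest of the tutorial):

#Defining Sigmoid Derivative (sketch)
def sigmoid_derivative(a):
    # a is assumed to already be sigmoid(x)
    return a * (1 - a)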

 

Build the Artificial Neural Network

Artificial neural network training consists of two main parts.

 

- Calculating the predicted output ŷ, known as feedforward

- Updating the weights and biases, known as backpropagation

 

The following formulas show how the neural network is trained.

Feedforward

z1 = inputs * w1 + b1
a1 = sigmoid(z1) = 1 / (1 + e^(-z1))
z2 = a1 * w2 + b2
a2 = sigmoid(z2) = 1 / (1 + e^(-z2))
y_head = a2

error = loss function = (1/2) * sum((output - y_head)^2)

 

#Feed Forward
z1 = np.dot(inputs, w1) + b1
a1 = sigmoid(z1)
z2 = np.dot(a1, w2) + b2
a2 = sigmoid(z2)
error = np.sum((1/2)*(output - a2)**2)

print("z1 : ", z1.shape)    #z1 :  (6, 5)
print("a1 : ", a1.shape)    #a1 :  (6, 5)
print("z2 : ", z2.shape)    #z2 :  (6, 1)
print("a2 : ", a2.shape)    #a1 :  (6, 1
print("error : ", error)    #error :  1.3296527834297855

 

Backpropagation

 

So far, we have found the error value of our prediction.

But we still need to update our weight and bias values.

To do that, we need to know the derivative of the loss function with respect to the weights and biases.

We know that the derivative of a function is the slope of that function.

If we can calculate this derivative, we can update the weight and bias values simply by increasing or reducing them along the slope.

This is called gradient descent.
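
In its general form, the gradient descent update rule is:

w_new = w_old - learning_rate * dError/dw
b_new = b_old - learning_rate * dError/db

In this tutorial the learning rate is effectively 1, so the updates below are simply w = w - delta_w and b = b - delta_b.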

 

However, we can't directly calculate the derivative of the loss function with respect to the weights and biases, because the loss function is written in terms of the prediction, not the weights and biases themselves.

So we need to use the chain rule to calculate it.

Each factor of the chain rule is a partial derivative.
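
For example, using the notation of the feedforward formulas above, the chain rule for w2 and w1 reads:

dError/dw2 = dError/da2 * da2/dz2 * dz2/dw2
dError/dw1 = dError/da2 * da2/dz2 * dz2/da1 * da1/dz1 * dz1/dw1

The steps below compute each of these factors one by one.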

 

Step 1: Find the derivatives for the output layer (layer two).

#Step_1
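# derivative of the error with respect to the a2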
error_d_a2 = (a2 - output)
# derivative of the a2 with respect to the z2
a2_d_z2 = a2 * (1 - a2)
# derivative of the z2 with respect to the w2
z2_d_w2 = a1
# derivative of the z2 with respect to the b2
z2_d_b2 = b2

print("error_d_a2 : ", error_d_a2.shape)    #error_d_a2 :  (6, 1)
print("a2_d_z2 : ", a2_d_z2.shape)          #a2_d_z2 :  (6, 1)
print("z2_d_w2 : ", z2_d_w2.shape)          #z2_d_w2 :  (6, 5)
print("z2_d_b2 : ", z2_d_b2)                #z2_d_b2 :  1

 

Step 2: Calculate the derivative of the error with respect to the w2.

#Step_2
delta_w2 =  error_d_a2 * a2_d_z2
delta_w2 = np.dot(z2_d_w2.T, delta_w2)
print("delta_w2 : ", delta_w2.shape)        #delta_w2 :  (5, 1)

 

Step 3: Calculate the derivative of the error with respect to the b2.

#Step_3
delta_b2 =  error_d_a2 * a2_d_z2
delta_b2 = delta_b2 * z2_d_b2
delta_b2 = np.sum(delta_b2)
print("delta_b2 : ", delta_b2)              #delta_b2 :  -0.08251552717143652

 

Step 4: Update w2 and b2.

#Step_4
w2 = w2 - delta_w2
b2 = b2 - delta_b2
print("new w2 : ")
print(w2)
print(".....................")
print("new b2_w : ")
print(b2)

"""
new w2 :
[[-0.3107736 ]
 [-0.26243013]
 [-1.62597648]
 [-0.460874  ]
 [ 0.32223855]]
.....................
new b2 :
1.1307639633992455
"""

 

Step 5: Find the derivatives for the hidden layer (layer one).

#Step_5
# derivative of the z2 with respect to the a1
z2_d_a1 = w2
# derivative of the a1 with respect to the z1
a1_d_z1 = a1*(1 - a1)
# derivative of the z1 with respect to the w1
z1_d_w1 = inputs
# derivative of the z1 with respect to the b1
z1_d_b1_w = b1

print("z2_d_a1 : ", z2_d_a1.shape)      #z2_d_a1 :  (5, 1)
print("a1_d_z1 : ", a1_d_z1.shape)      #a1_d_z1 :  (6, 5)
print("z1_d_w1 : ", z1_d_w1.shape)      #z1_d_w1 :  (6, 3)
print("z1_d_b1_w : ", z1_d_b1_w)        #z1_d_b1_w :  1

 

Step 6: Calculate the derivative of the error with respect to the w1.

#Step_6
delta_w1 =  error_d_a2 * a2_d_z2
delta_w1 = np.dot(delta_w1,z2_d_a1.T)
delta_w1 = delta_w1 * a1_d_z1
delta_w1 = np.dot(inputs.T,delta_w1)
print("delta_w1 : ", delta_w1.shape)    #delta_w1 :  (3, 5)

 

Step 7: Calculate the derivative of the error with respect to the b1.

#Step_7
delta_b1 =  error_d_a2 * a2_d_z2
delta_b1 = np.dot(delta_b1,z2_d_a1.T)
delta_b1 = delta_b1 * a1_d_z1
delta_b1 = delta_b1 * z1_d_b1_w
delta_b1 = np.sum(delta_b1)
print("delta_b1: ", delta_b1)           #delta_b1:  0.046485859867826225

 

Step 8: Update w1 and b1.

#Step_8
w1 = w1 - delta_w1
b1 = b1 - delta_b1
print("new w1 : ")
print(w1)
print(".....................")
print("new b1 : ")
print(b1)

"""
new w1 :
[[ 0.23161175  1.36976182  1.61154306  0.84420228  2.84234378]
 [ 0.25536816  2.39003472 -0.59019456 -1.43970131 -1.50262859]
 [-0.58270371  1.48691414 -1.06496844  0.34346572 -0.69036972]]
.....................
new b1 :
1.0566873975726727
"""

 

Training the neural network

So far, we have calculated all the parameters for a single training pass. To train our neural network, we repeat the feedforward and backpropagation steps many times, recording the error after each iteration (error_list is initialized as an empty list before training, as in the full code below). Each iteration ends like this:

# Training the neural network (end of each iteration)
w1 = w1 - delta_w1
b1 = b1 - delta_b1
error_list.append(error)
print("error : ", error)    #error :  0.9380097470416288

 

Show the Training Error With the Matplotlib Library

We trained our artificial neural network for 100 iterations.

Now let's plot the error values recorded during training using the Python Matplotlib library.

 

import matplotlib.pyplot as plt
x = np.arange(len(error_list))
y = error_list
plt.figure(figsize=(10,8))
plt.plot(x,y)
plt.xlabel("iteration")
plt.ylabel("error")
plt.title("Artificial Neural Networks Training")
plt.show()

Full Code

# Defining all variables
import numpy as np
error_list = list()
inputs = np.array([[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1]])
output = np.array([[0],[1],[0],[1],[0],[1]])
# Weights
w1 = np.random.randn(inputs.shape[1],5)
w2 = np.random.randn(5,output.shape[1])
# Biases
b1 = 1
b2 = 1
# Sigmoid Function
def sigmoid(x):
    return 1/(1 + np.exp(-x))
# Update the weights times 100.
for i in range(100):
    # Feedforward
    z1 = np.dot(inputs,w1) + b1
    a1 = sigmoid(z1)
    z2 = np.dot(a1, w2) + b2
    a2 = sigmoid(z2)
    error = np.sum((1/2)*(output - a2)**2)
    # Backpropagation
    ## LAYER 2
    ### derivative of the error with respect to the a2
    error_d_a2 = (a2 - output)
    ### derivative of the a2 with respect to the z2
    a2_d_z2 = a2*(1 - a2)
    ### derivative of the z2 with respect to the w2
    z2_d_w2 = a1
    ### derivative of the z2 with respect to the b2
    z2_d_b2 = b2
    ### delta weights 2
    delta_w2 =  error_d_a2 * a2_d_z2
    delta_w2 = np.dot(z2_d_w2.T, delta_w2)
    ### delta biases
    delta_b2 =  error_d_a2 * a2_d_z2
    delta_b2 = delta_b2 * z2_d_b2
    delta_b2 = np.sum(delta_b2)
    ### Update weights and bias
    w2 = w2 - delta_w2
    b2 = b2 - delta_b2
    ## LAYER 1
    ### derivative of the z2 with respect to the a1
    z2_d_a1 = w2
    ### derivative of the a1 with respect to the z1
    a1_d_z1 = a1*(1 - a1)
    ### derivative of the z1 with respect to the w1
    z1_d_w1 = inputs
    ### derivative of the z1 with respect to the b1
    z1_d_b1_w = b1
    ### delta weights 1
    delta_w1 =  error_d_a2 * a2_d_z2
    delta_w1 = np.dot(delta_w1,z2_d_a1.T)
    delta_w1 = delta_w1 * a1_d_z1
    delta_w1 = np.dot(inputs.T,delta_w1)
    ### delta bias 1
    delta_b1 =  error_d_a2 * a2_d_z2
    delta_b1 = np.dot(delta_b1,z2_d_a1.T)
    delta_b1 = delta_b1 * a1_d_z1
    delta_b1 = delta_b1 * z1_d_b1_w
    delta_b1 = np.sum(delta_b1)
    ### update w1 and b1
    w1 = w1 - delta_w1
    b1 = b1 - delta_b1
    error_list.append(error)
    print("error : ", error)

 
