Deep Learning 61

Dropout, Gradient Clipping, Weight Initialization, Xavier, He, Batch Normalization, Internal Covariate Shift, Layer Normalization

Dropout - One way to avoid overfitting: during training, only a subset of neurons is used rather than all of them. - It has no learnable parameters and only one hyperparameter (the drop probability). - It does not behave the same way during training and testing. To understand this, let us consider a simple fully connected layer containing 10 neurons and a dropout probability of 0.5. Well, durin..
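
A minimal sketch of the train/test difference described above, assuming PyTorch; the 10-neuron layer and p = 0.5 match the example in the preview, while the 4-dimensional input is made up for illustration.

import torch
import torch.nn as nn

layer = nn.Sequential(nn.Linear(4, 10), nn.Dropout(p=0.5))
x = torch.randn(1, 4)

layer.train()          # training mode: roughly half of the 10 activations are zeroed,
out_train = layer(x)   # and the survivors are scaled by 1/(1-p) to preserve the expected value

layer.eval()           # test mode: dropout is a no-op, all 10 neurons are used as-is
out_test = layer(x)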

Deep Learning 2021.04.08

Forward Propagation, Batch Gradient, Stochastic Gradient Descent, SGD, Mini Batch Gradient Descent, Momentum, Adagrad, Rprop, RMSprop, Adam, Epoch, Batch size, Iteration

Forward Propagation Input layer --> hidden layer --> activation function --> output layer, in that order. The input data is fed in the forward direction through the network. Each hidden layer accepts the input data, processes it with its activation function, and passes it to the successive layer. In order to generate output, the input data should ..
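
A minimal NumPy sketch of one forward pass in that order; the layer sizes (3 -> 5 -> 2) and the sigmoid activation are arbitrary choices for illustration.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.random.randn(3)                         # input layer
W1, b1 = np.random.randn(5, 3), np.zeros(5)    # hidden layer parameters
W2, b2 = np.random.randn(2, 5), np.zeros(2)    # output layer parameters

h = sigmoid(W1 @ x + b1)   # hidden layer: weighted sum passed through the activation function
y = sigmoid(W2 @ h + b2)   # output layer: produced strictly in the forward direction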

Deep Learning 2021.04.06

Perceptron, Step function, Single-Layer Perceptron, Multi-Layer Perceptron, DNN

Perceptron It is a linear classifier, an algorithm for supervised learning of binary classifiers. Input (multiple x) --> Output (one y). x: input, W: weight, y: output. Each x has its own weight; the larger w, the more important x. Step function If ∑W * x >= threshold (θ), output (y) = 1; if ∑W * x < θ, output (y) = 0. The threshold (θ) can be expressed as a bias b such as .. Single-Layer Perceptron It can learn only linearly separable p..
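
A minimal sketch of the step-function perceptron above, assuming plain NumPy; the AND-gate weights and threshold are toy values chosen to show a linearly separable problem.

import numpy as np

def perceptron(x, w, theta):
    # output 1 if the weighted sum reaches the threshold, otherwise 0
    return 1 if np.dot(w, x) >= theta else 0

def perceptron_bias(x, w, b):
    # same rule with the threshold rewritten as a bias: w.x + b >= 0, where b = -theta
    return 1 if np.dot(w, x) + b >= 0 else 0

w, theta = np.array([0.5, 0.5]), 0.7   # these values implement an AND gate
print([perceptron(np.array(p), w, theta) for p in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 0, 0, 1]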

Deep Learning 2021.03.31