Deep Learning

How Deep Learning Learns

Naranjito 2022. 3. 17. 16:34
  • Loss function

In deep learning, we typically use a gradient-based optimization strategy to train a model f(x) using some loss function l(f(x_i), y_i), where (x_i, y_i) is an input-output pair. The loss function tells the model how "wrong" it is and, based on that "wrongness," lets it improve itself. It is a measure of error, and our goal throughout training is to minimize this error/loss.

https://wandb.ai/sauravmaheshkar/cross-entropy/reports/What-Is-Cross-Entropy-Loss-A-Tutorial-With-Code--VmlldzoxMDA5NTMx
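As a concrete example in the spirit of the cross-entropy tutorial linked above, here is a minimal NumPy sketch of cross-entropy loss for a single pair (x_i, y_i); the probabilities and the label are made-up values for illustration.

```python
import numpy as np

def cross_entropy(probs, label):
    """Cross-entropy loss for one sample:
    -log of the probability assigned to the true class."""
    return -np.log(probs[label])

# made-up model output f(x_i): probabilities over 3 classes
probs = np.array([0.1, 0.7, 0.2])
y_i = 1                            # made-up true class index

print(cross_entropy(probs, y_i))   # -log(0.7) ≈ 0.357
```

The closer the predicted probability of the true class is to 1, the closer the loss is to 0.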

 

  • Gradient Descent

An iterative method for reducing the value of the loss function: at every step, the parameters are moved a small step in the direction of the negative gradient.

Gradient Descent by batch size:

- Gradient Descent: trains on all the data at once; parameters are updated once per epoch.
- Stochastic Gradient Descent (SGD): trains on randomly drawn data; parameters are updated per batch (in its classic form, one sample at a time).
- Mini-batch Gradient Descent: trains on a designated chunk of the data; parameters are updated per batch.
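The only difference between the three variants is how much data goes into each parameter update. Here is a minimal NumPy sketch for a linear model with squared error; the data, learning rate, and batch size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
X, y = rng.normal(size=(3000, 256)), rng.normal(size=3000)  # toy data
w, lr = np.zeros(256), 0.01

def grad(w, X_b, y_b):
    """Gradient of mean squared error for a linear model X_b @ w."""
    return 2 * X_b.T @ (X_b @ w - y_b) / len(y_b)

# Gradient Descent: one update per epoch, computed on ALL 3,000 samples
w -= lr * grad(w, X, y)

# Stochastic Gradient Descent: one update per randomly drawn sample
i = rng.integers(len(y))
w -= lr * grad(w, X[i:i+1], y[i:i+1])

# Mini-batch Gradient Descent: one update per designated chunk (here 64)
w -= lr * grad(w, X[:64], y[:64])
```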

 

  • Optimizer

- Momentum: adds a fraction of the previous update to the current one, so the optimizer builds up speed in directions where the gradient is consistent.
- Adagrad: parameters with many changes get a small learning rate; parameters with few changes get a high learning rate.
- RMSprop: improves Adagrad by replacing the accumulated sum of squared gradients with a decaying average, so the learning rate does not shrink toward zero.
- Adam: combines RMSprop and momentum.
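All four are available out of the box in PyTorch; a minimal sketch (the toy model and the learning rates are arbitrary choices for illustration):

```python
import torch

model = torch.nn.Linear(256, 10)   # toy model, sizes are arbitrary

# SGD + Momentum: past updates accelerate the current one
opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Adagrad: per-parameter learning rates shrink as updates accumulate
opt = torch.optim.Adagrad(model.parameters(), lr=0.01)

# RMSprop: Adagrad with a decaying average, so rates don't vanish
opt = torch.optim.RMSprop(model.parameters(), lr=0.01)

# Adam: RMSprop + momentum
opt = torch.optim.Adam(model.parameters(), lr=0.001)
```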

  • Epochs

How many times the entire training dataset is passed through the model.
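In a training loop, the epoch count is simply the outer loop. A minimal PyTorch-style sketch; the model, loss, optimizer, and data are made-up placeholders just to make the loop runnable:

```python
import torch

# toy setup (made-up shapes and values, for illustration only)
model = torch.nn.Linear(256, 1)
loss_fn = torch.nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), lr=0.01)
X, y = torch.randn(3000, 256), torch.randn(3000, 1)
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(X, y), batch_size=64)

num_epochs = 10  # pass the whole dataset through the model 10 times

for epoch in range(num_epochs):
    for x_batch, y_batch in train_loader:        # one full pass = 1 epoch
        opt.zero_grad()                          # clear old gradients
        loss = loss_fn(model(x_batch), y_batch)  # measure "wrongness"
        loss.backward()                          # compute gradients
        opt.step()                               # update parameters
```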

 

  • Batch size

The unit of data the model processes in one step.

Let's say one sample has size 256. For instance, it consists of [3, 1, 2, 5, ...] and its length is 256.

In other words, one sample's size = vector dimension = 256.

If the number of samples is 3,000, the total data size is 3,000 × 256.

The computer processes the data in chunks rather than one by one.

If you take 64 of the 3,000 samples at a time, the batch size is 64.

Therefore, what the computer processes at once is (batch size × dim) = 64 × 256.

 

- One sample

[3, 1, 2, 5, ...]   ← length = 256

- Full dataset (3,000 samples)

[3, 1, 2, 5, ...]   ← length = 256
[3, 1, 2, 5, ...]   ← length = 256
            ...
[3, 1, 2, 5, ...]   ← length = 256

(3,000 rows in total)
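Putting those numbers into code: a minimal NumPy sketch that takes the 3,000 vectors of dimension 256 in chunks of 64, so each chunk the computer processes at once has shape (64, 256). The values are random placeholders.

```python
import numpy as np

data = np.random.rand(3000, 256)   # 3,000 samples, each a vector of dim 256
batch_size = 64

for start in range(0, len(data), batch_size):
    batch = data[start:start + batch_size]
    # each chunk processed at once: (batch size × dim) = (64, 256)
    print(batch.shape)

# note: the last chunk is smaller, since 3000 % 64 = 56
```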