The idea behind the attention mechanism is that the decoder refers to the entire input sequence from the encoder at every decoding step.
It focuses on the encoder words that are most relevant to the word the decoder is about to predict.
The result of the Softmax helps the decoder predict the output word. The size of each red rectangle represents how much the corresponding input word helps the prediction: the larger the rectangle, the more helpful it is. Once each input word is weighted in this way, the information is sent to the decoder as a single vector (the green triangle). As a result, the decoder has a higher probability of predicting the output word correctly.
Steps
1. Attention Score
To obtain the attention value, the attention scores must be computed first.
The attention score measures how similar each of the encoder's hidden states is to the decoder's hidden state at the current time step (st).
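A common choice for this score (an assumption here, since the post does not fix a particular scoring function) is the dot product between the decoder state st and each encoder hidden state hi: e_i = st · hi, which gives one scalar per input position.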
2. Attention Distribution
The attention distribution is obtained by applying Softmax to the attention scores over all time steps of the encoder, so each value lies between 0 and 1. The result is a probability distribution.
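Concretely, if e_1, ..., e_N are the scores from step 1, Softmax gives alpha_i = exp(e_i) / (exp(e_1) + ... + exp(e_N)), so the values are non-negative and sum to 1.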
3. Attention Weight
Each individual value of the attention distribution is called an attention weight.
4. Attention Value
The attention value is obtained by multiplying each encoder hidden state by its attention weight and then summing them all.
It is also called the context vector, because it contains the context of the encoder input.
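In symbols, the attention value is the weighted sum at = alpha_1·h_1 + ... + alpha_N·h_N, where h_i is the i-th encoder hidden state, so inputs with larger weights contribute more to the context.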
5. Concatenate
Concatenate the attention value with the decoder's hidden state at the current time step (st) to form a single vector.
6. Hyperbolic tangent function
After concatenating, multiply the result by a weight matrix and pass it through the hyperbolic tangent function to obtain a new vector.
ŷ is the final prediction vector.
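Putting the six steps together (this matches the Luong-style attention formulation; the weight-matrix names Wc and Wy below are my own, not from the post), the new vector is s̃t = tanh(Wc [at ; st]) and the final prediction is ŷ = softmax(Wy s̃t). A minimal NumPy sketch of one decoding step, assuming dot-product scores and toy dimensions:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax.
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def attention_step(enc_hidden, dec_hidden, W_c, W_y):
    """One decoding step with dot-product attention (a sketch, not the post's exact code).

    enc_hidden : (N, d)  encoder hidden states h_1..h_N
    dec_hidden : (d,)    decoder hidden state s_t
    W_c        : (d, 2d) weight matrix for the concatenated vector (assumed name)
    W_y        : (V, d)  output projection to vocabulary size V (assumed name)
    """
    # 1. Attention scores: similarity between s_t and every h_i (dot product).
    scores = enc_hidden @ dec_hidden                # (N,)
    # 2./3. Attention distribution / weights: softmax over the scores.
    weights = softmax(scores)                       # (N,), sums to 1
    # 4. Attention value (context vector): weighted sum of encoder states.
    context = weights @ enc_hidden                  # (d,)
    # 5. Concatenate the context vector with the decoder hidden state.
    concat = np.concatenate([context, dec_hidden])  # (2d,)
    # 6. Multiply by W_c and apply tanh to get the new vector.
    s_tilde = np.tanh(W_c @ concat)                 # (d,)
    # Final prediction: project to the vocabulary and apply softmax.
    y_hat = softmax(W_y @ s_tilde)                  # (V,)
    return y_hat, weights

# Example with random toy values.
N, d, V = 5, 8, 20
rng = np.random.default_rng(0)
y_hat, weights = attention_step(
    rng.standard_normal((N, d)), rng.standard_normal(d),
    rng.standard_normal((d, 2 * d)), rng.standard_normal((V, d)),
)
print(weights.sum())   # ~1.0
print(y_hat.argmax())  # index of the predicted word
```

Swapping the dot product for a small feed-forward scoring network would give Bahdanau-style attention; the remaining steps stay the same.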