3D Tensor(Typical Natural Language Processing)

Machine Learning

3D Tensor(Typical Natural Language Processing)

Naranjito 2022. 8. 2. 11:52

Step

1. Devide the sentences into words.

2. Vectorize each words.

3. Set up the batch size.

Let's say here is data.

[[나는 사과를 좋아해], [나는 바나나를 좋아해], [나는 사과를 싫어해], [나는 바나나를 싫어해]]

1. Devide the sentences into words.

[['나는', '사과를', '좋아해'], ['나는', '바나나를', '좋아해'], ['나는', '사과를', '싫어해'], ['나는', '바나나를', '싫어해']]

2. Vectorize each words.

'나는' = [0.1, 0.2, 0.9]
'사과를' = [0.3, 0.5, 0.1]
'바나나를' = [0.3, 0.5, 0.2]
'좋아해' = [0.7, 0.6, 0.5]
'싫어해' = [0.5, 0.6, 0.7]

↓

[[[0.1, 0.2, 0.9], [0.3, 0.5, 0.1], [0.7, 0.6, 0.5]],
 [[0.1, 0.2, 0.9], [0.3, 0.5, 0.2], [0.7, 0.6, 0.5]],
 [[0.1, 0.2, 0.9], [0.3, 0.5, 0.1], [0.5, 0.6, 0.7]],
 [[0.1, 0.2, 0.9], [0.3, 0.5, 0.2], [0.5, 0.6, 0.7]]]

3. Let's say, set the batch size as 2.

첫번째 배치 #1
[[[0.1, 0.2, 0.9], [0.3, 0.5, 0.1], [0.7, 0.6, 0.5]],
 [[0.1, 0.2, 0.9], [0.3, 0.5, 0.2], [0.7, 0.6, 0.5]]]

두번째 배치 #2
[[[0.1, 0.2, 0.9], [0.3, 0.5, 0.1], [0.5, 0.6, 0.7]],
 [[0.1, 0.2, 0.9], [0.3, 0.5, 0.2], [0.5, 0.6, 0.7]]]

4. Let's say, imagine it. The size of tensor is 2*3*3 (batch size, length of sentence, vector dimention)

vector dimension, [0.1, 0.2, 0.9] is 3.

저작자표시

'Machine Learning' 카테고리의 다른 글

model.parameters (0)	2023.01.17
Supervised vs Unsupervised Learning (0)	2022.12.12
tensorflow (0)	2022.03.16
Glance ML (0)	2022.03.11
Entropy, Cross-Entropy (0)	2021.03.31

현재글3D Tensor(Typical Natural Language Processing)

global variable, nvidia-smi, yield from, docker-compose, textdistance, randn, abstractmethod, Sigmoid function, classmethod, Regular Expression, axis, cross-entropy, zeros, Step Function, forward propagation, Filter, kafka, selectall, batch size, d3js,

일	월	화	수	목	금	토
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30	31

¡Hola, Mundo!