
LSTM

  • LSTM (Long Short-Term Memory)
-A model that complements the vanilla RNN.
-It can carry information across more than 1,000 time steps.
-It can capture long-term dependencies.

σ is the sigmoid function, which squashes values to the range (0, 1).

 

C (the cell state) is like a highway: information flows along it with only small, element-wise changes. h (the hidden state) is like a national road: it is recomputed at every time step.

 

Therefore, the vanishing gradient problem is much less likely to occur.
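
For reference, these are the standard LSTM update equations (⊙ denotes element-wise multiplication):

f_t = σ(W_f x_t + U_f h_{t-1} + b_f)        (forget gate)
i_t = σ(W_i x_t + U_i h_{t-1} + b_i)        (input gate)
o_t = σ(W_o x_t + U_o h_{t-1} + b_o)        (output gate)
C̃_t = tanh(W_C x_t + U_C h_{t-1} + b_C)     (candidate cell state)
C_t = f_t ⊙ C_{t-1} + i_t ⊙ C̃_t             (cell state: the "highway")
h_t = o_t ⊙ tanh(C_t)                        (hidden state: the "national road")

The cell state update is additive, which is why gradients can flow along C for many steps without vanishing.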

https://www.coursera.org/learn/cnns-and-rnns/lecture/RJpfH/new-video

 

1. Input layer that receives 10 inputs per sample

Input(shape=(10,))
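
A minimal sketch of what this line does (Input creates a symbolic tensor for the functional API; the print is for illustration):

from tensorflow.keras.layers import Input

inputs = Input(shape=(10,))  # each sample is a sequence/vector of length 10
print(inputs.shape)          # (None, 10); None is the batch dimension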

 

2. Embedding layer that maps words (integer indices) to dense vectors.

vocab_size = 7
embedding_dim = 2

Embedding(vocab_size, embedding_dim, input_length=5)
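
A minimal runnable sketch (the toy index sequence is an assumption; input_length is omitted here since the shape is inferred from the input):

import numpy as np
from tensorflow.keras.layers import Embedding

vocab_size = 7     # number of distinct words in the vocabulary
embedding_dim = 2  # size of each word vector
emb = Embedding(vocab_size, embedding_dim)

x = np.array([[0, 3, 1, 6, 2]])  # a batch of 1 sequence of 5 word indices (toy data)
print(emb(x).shape)              # (1, 5, 2): (batch, timesteps, embedding_dim)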

 

3. LSTM layer with a hidden size of 256

LSTM(units=256)
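
A quick sketch of the output shape (the random input is an assumption):

import numpy as np
from tensorflow.keras.layers import LSTM

x = np.random.rand(1, 5, 2).astype('float32')  # (batch, timesteps, features)
out = LSTM(units=256)(x)
print(out.shape)  # (1, 256): by default only the last timestep's hidden state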

 

4. All-timestep hidden states vs. the last-timestep hidden state

return_sequences=True  # return the hidden state at every timestep
return_state=True      # additionally return the last hidden state and last cell state
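
A small sketch of how the two flags change the outputs (toy shapes are assumptions):

import numpy as np
from tensorflow.keras.layers import LSTM

x = np.random.rand(1, 5, 2).astype('float32')
all_h, last_h, last_c = LSTM(3, return_sequences=True, return_state=True)(x)
print(all_h.shape)   # (1, 5, 3): hidden states for all 5 timesteps
print(last_h.shape)  # (1, 3): hidden state of the last timestep
print(last_c.shape)  # (1, 3): cell state of the last timestep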

 

5. Dense(units, activation) output layer

model.add(Dense(1, activation='sigmoid'))

Note: a softmax over a single unit always outputs 1, so a 1-unit head should use sigmoid; use softmax only with one unit per class. input_dim is unnecessary here because the layer receives the LSTM output.
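
For illustration, the two common output-head choices (num_classes is a hypothetical value):

from tensorflow.keras.layers import Dense

num_classes = 7  # hypothetical: one unit per class, e.g. vocab_size

binary_head = Dense(1, activation='sigmoid')               # binary classification
multiclass_head = Dense(num_classes, activation='softmax') # multi-class classification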

 

6. Model(inputs, outputs)

tf.keras.Model(inputs=inputs, outputs=outputs)

 

7. compile

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['acc'])

sparse_categorical_crossentropy expects integer class labels and a softmax output with one unit per class; a 1-unit sigmoid head (step 5) would pair with binary_crossentropy instead.

 

8. model.fit(x, y, batch_size, epochs)

model.fit(x, y, batch_size=64, epochs=40, validation_split=0.2)
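
Putting steps 1-8 together, a minimal end-to-end sketch (the toy data, seq_len=5, and the multi-class softmax head matching sparse_categorical_crossentropy are assumptions):

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense

vocab_size = 7
embedding_dim = 2
seq_len = 5
num_classes = vocab_size  # assumption: predicting one of vocab_size classes

inputs = Input(shape=(seq_len,))                        # step 1
x = Embedding(vocab_size, embedding_dim)(inputs)        # step 2
x = LSTM(units=256)(x)                                  # step 3
outputs = Dense(num_classes, activation='softmax')(x)   # step 5
model = tf.keras.Model(inputs=inputs, outputs=outputs)  # step 6

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['acc'])                          # step 7

x_train = np.random.randint(0, vocab_size, size=(100, seq_len))  # toy data
y_train = np.random.randint(0, num_classes, size=(100,))
model.fit(x_train, y_train, batch_size=64, epochs=2, validation_split=0.2)  # step 8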