
📌 Although it is computationally much more complex than the SimpleRNN covered in the previous section, it performs better and is therefore widely adopted for recurrent neural networks.

🔮 LSTM Architecture

LSTM stands for Long Short-Term Memory. As the name suggests, it was designed to hold on to short-term memories for a long time.

📍 How the hidden state is computed

The input and the previous timestep's hidden state are multiplied by weights and passed through an activation function to produce the next hidden state. Unlike the basic recurrent layer, a sigmoid function is used here, and the result is then multiplied by the cell state (passed through tanh) to form the new hidden state.

📍 How the cell state is computed

An LSTM carries two recurrent states: besides the hidden state there is another value called the cell state. The cell state is not passed on to the next layer; it only circulates inside the LSTM cell. To update it, the input and hidden state are multiplied by yet another set of weights; one small cell passes the result through a sigmoid function and another passes it through a tanh function. The sigmoid output is then multiplied with the previous timestep's cell state, and together with the tanh output this produces the new cell state.
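To make this concrete, here is a minimal NumPy sketch of one LSTM timestep. The gate names (f, i, g, o) and the weight layout are my illustrative conveniences, not Keras internals:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # one weight set per small cell: f(orget), i(nput), g (candidate), o(utput)
    f = sigmoid(x @ W['f'] + h_prev @ U['f'] + b['f'])  # forget gate
    i = sigmoid(x @ W['i'] + h_prev @ U['i'] + b['i'])  # input gate
    g = np.tanh(x @ W['g'] + h_prev @ U['g'] + b['g'])  # candidate cell state
    o = sigmoid(x @ W['o'] + h_prev @ U['o'] + b['o'])  # output gate
    c = f * c_prev + i * g  # new cell state: forget part of the old, add the new
    h = o * np.tanh(c)      # new hidden state: gated tanh of the cell state
    return h, c

# toy sizes matching the model below: 16-dim input, 8 units
rng = np.random.default_rng(42)
W = {k: rng.normal(0, 0.1, (16, 8)) for k in 'figo'}
U = {k: rng.normal(0, 0.1, (8, 8)) for k in 'figo'}
b = {k: np.zeros(8) for k in 'figo'}
h, c = lstm_step(rng.normal(size=16), np.zeros(8), np.zeros(8), W, U, b)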

1️⃣ Loading the data

from tensorflow.keras.datasets import imdb
from sklearn.model_selection import train_test_split

# keep only the 500 most frequent words in the vocabulary
(train_input, train_target), (test_input, test_target) = imdb.load_data(num_words=500)

# hold out 20% of the training set for validation
train_input, val_input, train_target, val_target = train_test_split(train_input, train_target, test_size=0.2, random_state=42)

from tensorflow.keras.preprocessing.sequence import pad_sequences

# pad (or truncate) every review to exactly 100 tokens
train_seq = pad_sequences(train_input, maxlen=100)
val_seq = pad_sequences(val_input, maxlen=100)
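As a quick sanity check (my addition, not part of the book's output), the shapes should reflect the 25,000 IMDB training reviews split 80/20 and padded to 100 tokens:

print(train_seq.shape, val_seq.shape)  # expected: (20000, 100) (5000, 100)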

2️⃣ Building the LSTM network

from tensorflow import keras

model = keras.Sequential()
model.add(keras.layers.Embedding(500, 16, input_length=100))  # 500-word vocab -> 16-dim dense vectors
model.add(keras.layers.LSTM(8))                               # LSTM layer with 8 units
model.add(keras.layers.Dense(1, activation='sigmoid'))        # binary (positive/negative) output
model.summary()

>>>
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 embedding_1 (Embedding)     (None, 100, 16)           8000

 lstm_1 (LSTM)               (None, 8)                 800

 dense_1 (Dense)             (None, 1)                 9

=================================================================
Total params: 8,809
Trainable params: 8,809
Non-trainable params: 0
_________________________________________________________________

The SimpleRNN class had just one small cell, so it had 200 model parameters; an LSTM cell contains four small cells, which is why we see 800 model parameters here (4 × 200).
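A quick check of that arithmetic (16-dim embedding, 8 units, 4 small cells):

# one small cell costs as much as an entire SimpleRNN layer of this size
per_cell = 16 * 8 + 8 * 8 + 8  # input weights + recurrent weights + bias = 200
print(per_cell * 4)            # 800 -> matches the lstm row in the summary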

3️⃣ Training the LSTM model

rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model.compile(optimizer=rmsprop, loss='binary_crossentropy', metrics=['accuracy'])

# save the best model and stop once val_loss fails to improve for 3 epochs
checkpoint_cb = keras.callbacks.ModelCheckpoint('best-lstm-model.h5', save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)

history = model.fit(train_seq, train_target, epochs=100, batch_size=64, validation_data=(val_seq, val_target), callbacks=[checkpoint_cb, early_stopping_cb])

>>>
Epoch 1/100
313/313 [==============================] - 18s 48ms/step - loss: 0.6925 - accuracy: 0.5555 - val_loss: 0.6919 - val_accuracy: 0.5940
Epoch 2/100
313/313 [==============================] - 13s 43ms/step - loss: 0.6908 - accuracy: 0.6041 - val_loss: 0.6898 - val_accuracy: 0.6162
Epoch 3/100
313/313 [==============================] - 16s 50ms/step - loss: 0.6875 - accuracy: 0.6321 - val_loss: 0.6852 - val_accuracy: 0.6356
Epoch 4/100
313/313 [==============================] - 13s 41ms/step - loss: 0.6803 - accuracy: 0.6495 - val_loss: 0.6753 - val_accuracy: 0.6576
Epoch 5/100
313/313 [==============================] - 12s 37ms/step - loss: 0.6637 - accuracy: 0.6744 - val_loss: 0.6500 - val_accuracy: 0.6912
Epoch 6/100
313/313 [==============================] - 12s 37ms/step - loss: 0.6107 - accuracy: 0.7127 - val_loss: 0.5773 - val_accuracy: 0.7266
Epoch 7/100
313/313 [==============================] - 14s 45ms/step - loss: 0.5573 - accuracy: 0.7405 - val_loss: 0.5515 - val_accuracy: 0.7436
Epoch 8/100
313/313 [==============================] - 14s 45ms/step - loss: 0.5347 - accuracy: 0.7538 - val_loss: 0.5326 - val_accuracy: 0.7508
Epoch 9/100
313/313 [==============================] - 14s 43ms/step - loss: 0.5163 - accuracy: 0.7645 - val_loss: 0.5173 - val_accuracy: 0.7650
Epoch 10/100
313/313 [==============================] - 13s 41ms/step - loss: 0.5004 - accuracy: 0.7741 - val_loss: 0.5026 - val_accuracy: 0.7676
Epoch 11/100
313/313 [==============================] - 12s 40ms/step - loss: 0.4862 - accuracy: 0.7837 - val_loss: 0.4893 - val_accuracy: 0.7810
Epoch 12/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4734 - accuracy: 0.7890 - val_loss: 0.4780 - val_accuracy: 0.7840
Epoch 13/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4621 - accuracy: 0.7944 - val_loss: 0.4696 - val_accuracy: 0.7868
Epoch 14/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4525 - accuracy: 0.7982 - val_loss: 0.4615 - val_accuracy: 0.7886
Epoch 15/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4440 - accuracy: 0.8022 - val_loss: 0.4529 - val_accuracy: 0.7924
Epoch 16/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4370 - accuracy: 0.8049 - val_loss: 0.4475 - val_accuracy: 0.7956
Epoch 17/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4311 - accuracy: 0.8080 - val_loss: 0.4440 - val_accuracy: 0.7948
Epoch 18/100
313/313 [==============================] - 12s 38ms/step - loss: 0.4263 - accuracy: 0.8101 - val_loss: 0.4443 - val_accuracy: 0.7918
Epoch 19/100
313/313 [==============================] - 11s 37ms/step - loss: 0.4227 - accuracy: 0.8106 - val_loss: 0.4380 - val_accuracy: 0.7982
Epoch 20/100
313/313 [==============================] - 11s 35ms/step - loss: 0.4198 - accuracy: 0.8116 - val_loss: 0.4369 - val_accuracy: 0.7978
Epoch 21/100
313/313 [==============================] - 12s 37ms/step - loss: 0.4176 - accuracy: 0.8135 - val_loss: 0.4357 - val_accuracy: 0.7984
Epoch 22/100
313/313 [==============================] - 11s 35ms/step - loss: 0.4153 - accuracy: 0.8141 - val_loss: 0.4368 - val_accuracy: 0.7994
Epoch 23/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4137 - accuracy: 0.8126 - val_loss: 0.4340 - val_accuracy: 0.7960
Epoch 24/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4122 - accuracy: 0.8140 - val_loss: 0.4334 - val_accuracy: 0.8018
Epoch 25/100
313/313 [==============================] - 11s 35ms/step - loss: 0.4109 - accuracy: 0.8137 - val_loss: 0.4331 - val_accuracy: 0.8032
Epoch 26/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4100 - accuracy: 0.8140 - val_loss: 0.4342 - val_accuracy: 0.8026
Epoch 27/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4091 - accuracy: 0.8140 - val_loss: 0.4323 - val_accuracy: 0.8000
Epoch 28/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4083 - accuracy: 0.8127 - val_loss: 0.4324 - val_accuracy: 0.8022
Epoch 29/100
313/313 [==============================] - 11s 36ms/step - loss: 0.4072 - accuracy: 0.8156 - val_loss: 0.4324 - val_accuracy: 0.7966
Epoch 30/100
313/313 [==============================] - 11s 37ms/step - loss: 0.4068 - accuracy: 0.8148 - val_loss: 0.4326 - val_accuracy: 0.7966
import matplotlib.pyplot as plt
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train','val'])
plt.show()

[Plot: training vs. validation loss for the LSTM model]

🔮 Applying Dropout to the Recurrent Layer

The dropout parameter applies dropout to the cell's inputs, while the recurrent_dropout parameter applies dropout to the recurrent hidden state.
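For example, both can be set on the same layer. One caveat, as far as I know: a nonzero recurrent_dropout keeps TensorFlow from using its fast cuDNN kernel, so GPU training slows down noticeably, which is presumably why only dropout is used in the model below.

# illustrative only: drops 30% of the inputs and 30% of the recurrent connections
layer = keras.layers.LSTM(8, dropout=0.3, recurrent_dropout=0.3)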

2️⃣ Building the network

Setting dropout to 30%:

model2 = keras.Sequential()

model2.add(keras.layers.Embedding(500, 16, input_length=100))
model2.add(keras.layers.LSTM(8, dropout=0.3))
model2.add(keras.layers.Dense(1, activation='sigmoid'))

3️⃣ Training the network

rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model2.compile(optimizer=rmsprop, loss='binary_crossentropy', metrics=['accuracy'])

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-dropout-model.h5', save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)

history = model2.fit(train_seq, train_target, epochs=100, batch_size=64, validation_data=(val_seq, val_target), callbacks=[checkpoint_cb, early_stopping_cb])

>>>
Epoch 1/100
313/313 [==============================] - 15s 40ms/step - loss: 0.6926 - accuracy: 0.5268 - val_loss: 0.6919 - val_accuracy: 0.5514
Epoch 2/100
313/313 [==============================] - 12s 38ms/step - loss: 0.6900 - accuracy: 0.5969 - val_loss: 0.6883 - val_accuracy: 0.5992
Epoch 3/100
313/313 [==============================] - 12s 37ms/step - loss: 0.6808 - accuracy: 0.6472 - val_loss: 0.6684 - val_accuracy: 0.6792
Epoch 4/100
313/313 [==============================] - 12s 38ms/step - loss: 0.6275 - accuracy: 0.7081 - val_loss: 0.5962 - val_accuracy: 0.7198
Epoch 5/100
313/313 [==============================] - 12s 38ms/step - loss: 0.5830 - accuracy: 0.7215 - val_loss: 0.5666 - val_accuracy: 0.7364
Epoch 6/100
313/313 [==============================] - 12s 39ms/step - loss: 0.5532 - accuracy: 0.7380 - val_loss: 0.5377 - val_accuracy: 0.7564
Epoch 7/100
313/313 [==============================] - 12s 39ms/step - loss: 0.5273 - accuracy: 0.7559 - val_loss: 0.5149 - val_accuracy: 0.7640
Epoch 8/100
313/313 [==============================] - 12s 39ms/step - loss: 0.5051 - accuracy: 0.7672 - val_loss: 0.4952 - val_accuracy: 0.7746
Epoch 9/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4889 - accuracy: 0.7750 - val_loss: 0.4792 - val_accuracy: 0.7848
Epoch 10/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4741 - accuracy: 0.7847 - val_loss: 0.4682 - val_accuracy: 0.7876
Epoch 11/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4645 - accuracy: 0.7895 - val_loss: 0.4612 - val_accuracy: 0.7894
Epoch 12/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4582 - accuracy: 0.7931 - val_loss: 0.4557 - val_accuracy: 0.7962
Epoch 13/100
313/313 [==============================] - 12s 38ms/step - loss: 0.4517 - accuracy: 0.7965 - val_loss: 0.4510 - val_accuracy: 0.7940
Epoch 14/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4476 - accuracy: 0.7961 - val_loss: 0.4488 - val_accuracy: 0.7970
Epoch 15/100
313/313 [==============================] - 12s 38ms/step - loss: 0.4448 - accuracy: 0.7992 - val_loss: 0.4492 - val_accuracy: 0.7928
Epoch 16/100
313/313 [==============================] - 12s 37ms/step - loss: 0.4406 - accuracy: 0.8019 - val_loss: 0.4451 - val_accuracy: 0.7936
Epoch 17/100
313/313 [==============================] - 12s 37ms/step - loss: 0.4370 - accuracy: 0.8034 - val_loss: 0.4412 - val_accuracy: 0.7984
Epoch 18/100
313/313 [==============================] - 12s 37ms/step - loss: 0.4344 - accuracy: 0.8027 - val_loss: 0.4414 - val_accuracy: 0.7970
Epoch 19/100
313/313 [==============================] - 13s 41ms/step - loss: 0.4329 - accuracy: 0.8033 - val_loss: 0.4375 - val_accuracy: 0.8006
Epoch 20/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4303 - accuracy: 0.8054 - val_loss: 0.4357 - val_accuracy: 0.8002
Epoch 21/100
313/313 [==============================] - 12s 38ms/step - loss: 0.4290 - accuracy: 0.8061 - val_loss: 0.4343 - val_accuracy: 0.8024
Epoch 22/100
313/313 [==============================] - 12s 38ms/step - loss: 0.4264 - accuracy: 0.8071 - val_loss: 0.4359 - val_accuracy: 0.8016
Epoch 23/100
313/313 [==============================] - 12s 38ms/step - loss: 0.4248 - accuracy: 0.8100 - val_loss: 0.4333 - val_accuracy: 0.8022
Epoch 24/100
313/313 [==============================] - 12s 38ms/step - loss: 0.4243 - accuracy: 0.8101 - val_loss: 0.4327 - val_accuracy: 0.8022
Epoch 25/100
313/313 [==============================] - 12s 38ms/step - loss: 0.4226 - accuracy: 0.8094 - val_loss: 0.4340 - val_accuracy: 0.8012
Epoch 26/100
313/313 [==============================] - 12s 37ms/step - loss: 0.4205 - accuracy: 0.8084 - val_loss: 0.4358 - val_accuracy: 0.8014
Epoch 27/100
313/313 [==============================] - 12s 37ms/step - loss: 0.4203 - accuracy: 0.8093 - val_loss: 0.4345 - val_accuracy: 0.7996
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train','val'])
plt.show()

[Plot: training vs. validation loss for the dropout model]

The dropout applied to the LSTM layer is doing its job: the gap between the training loss and the validation loss has narrowed.

🔮 Stacking Two Layers

2️⃣ Building the network

model3 = keras.Sequential()

model3.add(keras.layers.Embedding(500, 16, input_length=100))
model3.add(keras.layers.LSTM(8, dropout=0.3, return_sequences=True))
model3.add(keras.layers.LSTM(8, dropout=0.3))
model3.add(keras.layers.Dense(1, activation='sigmoid'))

model3.summary()

>>>
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 embedding_3 (Embedding)     (None, 100, 16)           8000

 lstm_3 (LSTM)               (None, 100, 8)            800

 lstm_4 (LSTM)               (None, 8)                 544

 dense_3 (Dense)             (None, 1)                 9

=================================================================
Total params: 9,353
Trainable params: 9,353
Non-trainable params: 0
_________________________________________________________________

To make the first LSTM layer output the hidden state for every timestep, not just the last one, set return_sequences=True.
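The 544 parameters of the second LSTM also follow from this: its input is now the 8-dimensional hidden-state sequence of the first LSTM rather than the 16-dimensional embedding:

# 4 small cells x (8x8 input weights + 8x8 recurrent weights + 8 biases)
print(4 * (8 * 8 + 8 * 8 + 8))  # 544 -> matches lstm_4 in the summary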

3️⃣ Training the network

rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model3.compile(optimizer=rmsprop, loss='binary_crossentropy', metrics=['accuracy'])

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-2rnn-model.h5', save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)

history = model3.fit(train_seq, train_target, epochs=100, batch_size=64, validation_data=(val_seq, val_target), callbacks=[checkpoint_cb, early_stopping_cb])

>>>
Epoch 23/100
313/313 [==============================] - 22s 72ms/step - loss: 0.4440 - accuracy: 0.7948 - val_loss: 0.4520 - val_accuracy: 0.7880
Epoch 24/100
313/313 [==============================] - 22s 71ms/step - loss: 0.4412 - accuracy: 0.7965 - val_loss: 0.4550 - val_accuracy: 0.7888
Epoch 25/100
313/313 [==============================] - 22s 71ms/step - loss: 0.4403 - accuracy: 0.7960 - val_loss: 0.4500 - val_accuracy: 0.7920
Epoch 26/100
313/313 [==============================] - 22s 72ms/step - loss: 0.4395 - accuracy: 0.7962 - val_loss: 0.4524 - val_accuracy: 0.7878
Epoch 27/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4400 - accuracy: 0.7962 - val_loss: 0.4493 - val_accuracy: 0.7930
Epoch 28/100
313/313 [==============================] - 23s 74ms/step - loss: 0.4383 - accuracy: 0.7974 - val_loss: 0.4509 - val_accuracy: 0.7900
Epoch 29/100
313/313 [==============================] - 24s 77ms/step - loss: 0.4375 - accuracy: 0.7985 - val_loss: 0.4532 - val_accuracy: 0.7870
Epoch 30/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4349 - accuracy: 0.8008 - val_loss: 0.4454 - val_accuracy: 0.7920
Epoch 31/100
313/313 [==============================] - 22s 72ms/step - loss: 0.4339 - accuracy: 0.8001 - val_loss: 0.4450 - val_accuracy: 0.7932
Epoch 32/100
313/313 [==============================] - 22s 71ms/step - loss: 0.4338 - accuracy: 0.8001 - val_loss: 0.4437 - val_accuracy: 0.7948
Epoch 33/100
313/313 [==============================] - 22s 70ms/step - loss: 0.4352 - accuracy: 0.7993 - val_loss: 0.4435 - val_accuracy: 0.7948
Epoch 34/100
313/313 [==============================] - 22s 70ms/step - loss: 0.4302 - accuracy: 0.8015 - val_loss: 0.4430 - val_accuracy: 0.7950
Epoch 35/100
313/313 [==============================] - 23s 72ms/step - loss: 0.4309 - accuracy: 0.8005 - val_loss: 0.4479 - val_accuracy: 0.7900
Epoch 36/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4269 - accuracy: 0.8017 - val_loss: 0.4403 - val_accuracy: 0.7966
Epoch 37/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4297 - accuracy: 0.8027 - val_loss: 0.4404 - val_accuracy: 0.7966
Epoch 38/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4263 - accuracy: 0.8019 - val_loss: 0.4395 - val_accuracy: 0.7964
Epoch 39/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4264 - accuracy: 0.8011 - val_loss: 0.4414 - val_accuracy: 0.7968
Epoch 40/100
313/313 [==============================] - 24s 77ms/step - loss: 0.4254 - accuracy: 0.8044 - val_loss: 0.4378 - val_accuracy: 0.7966
Epoch 41/100
313/313 [==============================] - 25s 79ms/step - loss: 0.4263 - accuracy: 0.8012 - val_loss: 0.4364 - val_accuracy: 0.7968
Epoch 42/100
313/313 [==============================] - 23s 75ms/step - loss: 0.4228 - accuracy: 0.8065 - val_loss: 0.4399 - val_accuracy: 0.7976
Epoch 43/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4210 - accuracy: 0.8074 - val_loss: 0.4357 - val_accuracy: 0.7980
Epoch 44/100
313/313 [==============================] - 23s 72ms/step - loss: 0.4217 - accuracy: 0.8053 - val_loss: 0.4348 - val_accuracy: 0.7978
Epoch 45/100
313/313 [==============================] - 23s 72ms/step - loss: 0.4209 - accuracy: 0.8063 - val_loss: 0.4432 - val_accuracy: 0.7932
Epoch 46/100
313/313 [==============================] - 22s 72ms/step - loss: 0.4196 - accuracy: 0.8061 - val_loss: 0.4333 - val_accuracy: 0.8008
Epoch 47/100
313/313 [==============================] - 23s 72ms/step - loss: 0.4187 - accuracy: 0.8080 - val_loss: 0.4352 - val_accuracy: 0.7996
Epoch 48/100
313/313 [==============================] - 23s 72ms/step - loss: 0.4197 - accuracy: 0.8066 - val_loss: 0.4319 - val_accuracy: 0.8010
Epoch 49/100
313/313 [==============================] - 23s 74ms/step - loss: 0.4183 - accuracy: 0.8051 - val_loss: 0.4343 - val_accuracy: 0.8012
Epoch 50/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4167 - accuracy: 0.8108 - val_loss: 0.4339 - val_accuracy: 0.8022
Epoch 51/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4175 - accuracy: 0.8084 - val_loss: 0.4337 - val_accuracy: 0.8010
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()

[Plot: training vs. validation loss for the 2-layer LSTM model]

Still, this seems to control overfitting fairly well => LSTM + dropout + 2 stacked layers

🔮 GRU Architecture

GRU stands for Gated Recurrent Unit and can be thought of as a streamlined version of LSTM. Unlike LSTM, it does not compute a cell state; it carries only a hidden state.

A GRU cell contains three small cells that multiply the hidden state and input by weights and add a bias. Two of them use the sigmoid activation function, and one uses tanh.
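Mirroring the LSTM sketch above, here is a minimal NumPy version of one GRU timestep (classic formulation; Keras's default reset_after=True variant handles the bias slightly differently, which affects the parameter count below):

def gru_step(x, h_prev, W, U, b):
    # reuses sigmoid from the LSTM sketch above
    z = sigmoid(x @ W['z'] + h_prev @ U['z'] + b['z'])        # update gate
    r = sigmoid(x @ W['r'] + h_prev @ U['r'] + b['r'])        # reset gate
    g = np.tanh(x @ W['g'] + (r * h_prev) @ U['g'] + b['g'])  # candidate state
    return z * h_prev + (1 - z) * g                           # new hidden state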

2️⃣ Building the GRU network

model4 = keras.Sequential()

model4.add(keras.layers.Embedding(500, 16, input_length=100))
model4.add(keras.layers.GRU(8))
model4.add(keras.layers.Dense(1, activation='sigmoid'))

model4.summary()

>>>
Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 embedding_4 (Embedding)     (None, 100, 16)           8000

 gru (GRU)                   (None, 8)                 624

 dense_4 (Dense)             (None, 1)                 9

=================================================================
Total params: 8,633
Trainable params: 8,633
Non-trainable params: 0
_________________________________________________________________
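The 624 parameters make sense once you account for Keras's GRU defaulting to reset_after=True, which gives each of the three small cells two bias vectors (one on the input side, one on the recurrent side):

# 3 cells x (16x8 input weights + 8x8 recurrent weights + 2 bias vectors of size 8)
print(3 * (16 * 8 + 8 * 8 + 8 + 8))  # 624 -> matches the gru row in the summary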

3️⃣ Training the GRU network

rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model4.compile(optimizer=rmsprop, loss='binary_crossentropy', metrics=['accuracy'])

checkpoint_cb = keras.callbacks.ModelCheckpoint('best-gru-model.h5', save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True)

history = model4.fit(train_seq, train_target, epochs=100, batch_size=64, validation_data=(val_seq, val_target), callbacks=[checkpoint_cb, early_stopping_cb])

>>>
Epoch 1/100
313/313 [==============================] - 15s 42ms/step - loss: 0.6925 - accuracy: 0.5242 - val_loss: 0.6922 - val_accuracy: 0.5374
Epoch 2/100
313/313 [==============================] - 13s 43ms/step - loss: 0.6906 - accuracy: 0.5932 - val_loss: 0.6901 - val_accuracy: 0.5876
Epoch 3/100
313/313 [==============================] - 13s 41ms/step - loss: 0.6874 - accuracy: 0.6143 - val_loss: 0.6860 - val_accuracy: 0.6146
Epoch 4/100
313/313 [==============================] - 13s 41ms/step - loss: 0.6817 - accuracy: 0.6309 - val_loss: 0.6791 - val_accuracy: 0.6220
Epoch 5/100
313/313 [==============================] - 12s 40ms/step - loss: 0.6713 - accuracy: 0.6457 - val_loss: 0.6664 - val_accuracy: 0.6390
Epoch 6/100
313/313 [==============================] - 12s 39ms/step - loss: 0.6528 - accuracy: 0.6611 - val_loss: 0.6435 - val_accuracy: 0.6670
Epoch 7/100
313/313 [==============================] - 12s 39ms/step - loss: 0.6172 - accuracy: 0.6867 - val_loss: 0.5956 - val_accuracy: 0.7072
Epoch 8/100
313/313 [==============================] - 12s 39ms/step - loss: 0.5499 - accuracy: 0.7315 - val_loss: 0.5349 - val_accuracy: 0.7432
Epoch 9/100
313/313 [==============================] - 12s 39ms/step - loss: 0.5137 - accuracy: 0.7531 - val_loss: 0.5166 - val_accuracy: 0.7530
Epoch 10/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4974 - accuracy: 0.7652 - val_loss: 0.5044 - val_accuracy: 0.7584
Epoch 11/100
313/313 [==============================] - 12s 40ms/step - loss: 0.4838 - accuracy: 0.7754 - val_loss: 0.4938 - val_accuracy: 0.7662
Epoch 12/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4742 - accuracy: 0.7823 - val_loss: 0.4858 - val_accuracy: 0.7706
Epoch 13/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4654 - accuracy: 0.7870 - val_loss: 0.4790 - val_accuracy: 0.7764
Epoch 14/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4588 - accuracy: 0.7914 - val_loss: 0.4736 - val_accuracy: 0.7772
Epoch 15/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4529 - accuracy: 0.7948 - val_loss: 0.4688 - val_accuracy: 0.7824
Epoch 16/100
313/313 [==============================] - 12s 40ms/step - loss: 0.4483 - accuracy: 0.7980 - val_loss: 0.4658 - val_accuracy: 0.7840
Epoch 17/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4440 - accuracy: 0.7999 - val_loss: 0.4629 - val_accuracy: 0.7868
Epoch 18/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4400 - accuracy: 0.8043 - val_loss: 0.4610 - val_accuracy: 0.7878
Epoch 19/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4367 - accuracy: 0.8043 - val_loss: 0.4597 - val_accuracy: 0.7890
Epoch 20/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4347 - accuracy: 0.8051 - val_loss: 0.4571 - val_accuracy: 0.7904
Epoch 21/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4321 - accuracy: 0.8071 - val_loss: 0.4555 - val_accuracy: 0.7896
Epoch 22/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4301 - accuracy: 0.8091 - val_loss: 0.4563 - val_accuracy: 0.7842
Epoch 23/100
313/313 [==============================] - 12s 40ms/step - loss: 0.4289 - accuracy: 0.8103 - val_loss: 0.4543 - val_accuracy: 0.7886
Epoch 24/100
313/313 [==============================] - 13s 43ms/step - loss: 0.4270 - accuracy: 0.8105 - val_loss: 0.4532 - val_accuracy: 0.7876
Epoch 25/100
313/313 [==============================] - 13s 40ms/step - loss: 0.4255 - accuracy: 0.8117 - val_loss: 0.4516 - val_accuracy: 0.7878
Epoch 26/100
313/313 [==============================] - 13s 41ms/step - loss: 0.4243 - accuracy: 0.8122 - val_loss: 0.4557 - val_accuracy: 0.7910
Epoch 27/100
313/313 [==============================] - 13s 40ms/step - loss: 0.4230 - accuracy: 0.8134 - val_loss: 0.4520 - val_accuracy: 0.7886
Epoch 28/100
313/313 [==============================] - 12s 40ms/step - loss: 0.4222 - accuracy: 0.8129 - val_loss: 0.4505 - val_accuracy: 0.7894
Epoch 29/100
313/313 [==============================] - 13s 41ms/step - loss: 0.4216 - accuracy: 0.8143 - val_loss: 0.4525 - val_accuracy: 0.7918
Epoch 30/100
313/313 [==============================] - 13s 41ms/step - loss: 0.4207 - accuracy: 0.8148 - val_loss: 0.4498 - val_accuracy: 0.7898
Epoch 31/100
313/313 [==============================] - 13s 40ms/step - loss: 0.4201 - accuracy: 0.8152 - val_loss: 0.4488 - val_accuracy: 0.7912
Epoch 32/100
313/313 [==============================] - 12s 39ms/step - loss: 0.4194 - accuracy: 0.8143 - val_loss: 0.4489 - val_accuracy: 0.7904
Epoch 33/100
313/313 [==============================] - 12s 40ms/step - loss: 0.4188 - accuracy: 0.8164 - val_loss: 0.4480 - val_accuracy: 0.7922
Epoch 34/100
313/313 [==============================] - 13s 40ms/step - loss: 0.4180 - accuracy: 0.8163 - val_loss: 0.4465 - val_accuracy: 0.7902
Epoch 35/100
313/313 [==============================] - 12s 40ms/step - loss: 0.4177 - accuracy: 0.8161 - val_loss: 0.4486 - val_accuracy: 0.7876
Epoch 36/100
313/313 [==============================] - 12s 40ms/step - loss: 0.4174 - accuracy: 0.8158 - val_loss: 0.4476 - val_accuracy: 0.7884
Epoch 37/100
313/313 [==============================] - 13s 40ms/step - loss: 0.4167 - accuracy: 0.8157 - val_loss: 0.4459 - val_accuracy: 0.7942
Epoch 38/100
313/313 [==============================] - 13s 40ms/step - loss: 0.4162 - accuracy: 0.8162 - val_loss: 0.4448 - val_accuracy: 0.7956
Epoch 39/100
313/313 [==============================] - 13s 41ms/step - loss: 0.4157 - accuracy: 0.8155 - val_loss: 0.4450 - val_accuracy: 0.7940
Epoch 40/100
313/313 [==============================] - 13s 41ms/step - loss: 0.4154 - accuracy: 0.8170 - val_loss: 0.4472 - val_accuracy: 0.7870
Epoch 41/100
313/313 [==============================] - 13s 41ms/step - loss: 0.4146 - accuracy: 0.8174 - val_loss: 0.4444 - val_accuracy: 0.7932
Epoch 42/100
313/313 [==============================] - 13s 41ms/step - loss: 0.4147 - accuracy: 0.8170 - val_loss: 0.4433 - val_accuracy: 0.7934
Epoch 43/100
313/313 [==============================] - 13s 42ms/step - loss: 0.4142 - accuracy: 0.8158 - val_loss: 0.4461 - val_accuracy: 0.7954
Epoch 44/100
313/313 [==============================] - 13s 43ms/step - loss: 0.4139 - accuracy: 0.8165 - val_loss: 0.4456 - val_accuracy: 0.7954
Epoch 45/100
313/313 [==============================] - 13s 42ms/step - loss: 0.4136 - accuracy: 0.8159 - val_loss: 0.4447 - val_accuracy: 0.7892
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()

[Plot: training vs. validation loss for the GRU model]

Since dropout was not used this time, the gap between training and validation loss is larger than before, but the model still converged nicely.
