
[PyTorch Project] Classifying MNIST Data with Softmax Regression

by 고뭉나무 2021. 10. 24.

🔊 This post is based on content from the Wikidocs course "PyTorch로 시작하는 딥러닝 입문" (Introduction to Deep Learning with PyTorch). The materials used in the explanations were, as far as possible, reorganized by me.

 

 

  • Framework used: PyTorch
  • Technique used: Softmax regression
  • Function used: nn.Linear()
  • Data used: MNIST (handwritten digits)

 

When building a model, you just need to remember and follow this broad four-step framework (a minimal skeleton follows the list):

1. Dataset setup
2. Model design
3. Cost function and optimizer setup
4. Training and back-propagation
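
As a quick reference, here is a minimal, generic sketch of these four steps (the toy tensors and sizes below are placeholders for illustration, not from this post):

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# 1. Dataset setup -- toy stand-in data, shaped like flattened MNIST
X_all = torch.randn(1000, 784)
Y_all = torch.randint(0, 10, (1000,))
loader = DataLoader(TensorDataset(X_all, Y_all), batch_size=100, shuffle=True)

# 2. Model design
model = nn.Linear(784, 10)

# 3. Cost function and optimizer setup
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# 4. Training and back-propagation
for epoch in range(3):
    for X, Y in loader:
        optimizer.zero_grad()
        cost = criterion(model(X), Y)
        cost.backward()
        optimizer.step()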

 

 

 

Modeling

import torch
import torchvision.datasets as dsets
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
import torch.nn as nn
import matplotlib.pyplot as plt
import random

USE_CUDA = torch.cuda.is_available() # Returns True if a GPU is available, False otherwise
device = torch.device("cuda" if USE_CUDA else "cpu") # Use the GPU if available, otherwise the CPU
print("Training on device:", device)

# For reproducibility
random.seed(777)
torch.manual_seed(777)
if USE_CUDA:  # comparing the torch.device object to the string 'cuda' never matched
    torch.cuda.manual_seed_all(777)

# Hyperparameters
training_epochs = 15    # Run training for 15 epochs
batch_size = 100        # Split the data into mini-batches of 100


# ----------------------------------## Dataset setup ##----------------------------------
# MNIST dataset
# 60,000 training images, 10,000 test images
mnist_train = dsets.MNIST(root='MNIST_data/',
                          train=True,
                          transform=transforms.ToTensor(),
                          download=True)

mnist_test = dsets.MNIST(root='MNIST_data/',
                         train=False,
                         transform=transforms.ToTensor(),
                         download=True)

#dataset loader
data_loader = DataLoader(dataset=mnist_train,
                         batch_size=batch_size, # batch size of 100
                         shuffle=True,
                         drop_last=True)
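# Note: drop_last=True discards the final incomplete batch, so every batch has
# exactly 100 samples; with 60,000 training images that is 60,000 / 100 = 600 batches per epoch.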



# ------------------------------------## Model design ##------------------------------------
# Create the model
# Each MNIST image has shape 28 * 28 = 784
model = nn.Linear(784, 10, bias=True).to(device)



# -----------------------------## Cost function & Optimizer ##--------------------------------
# Define the cost function and optimizer
# nn.CrossEntropyLoss() applies the softmax function internally, so the model outputs raw logits
criterion = nn.CrossEntropyLoss().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

print("#model.parameters()")
print(list(model.parameters()))



# -------------------------## Training & Back-propagation ##---------------------------

total_batch = len(data_loader)
print("#total_batch")
print(total_batch)
print("\n")

# training_epochs was set to 15 above
for epoch in range(training_epochs): 
    avg_cost = 0

    for X, Y in data_loader:
        # With a batch size of 100, X becomes a (100, 784) tensor after the view below
        X = X.view(-1, 28 * 28).to(device)
        # The labels are integers 0-9, not one-hot encoded
        Y = Y.to(device)

        # Gradients accumulate across backward passes, so reset them to zero before each batch
        optimizer.zero_grad()
        hypothesis = model(X)
        cost = criterion(hypothesis, Y)
        # Compute gradients
        cost.backward()
        # Update the weights with the new gradients and move on to the next batch
        optimizer.step()

        avg_cost += cost / total_batch

    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))

# ------------------------------------#############------------------------------------

print('Learning finished')

 

 

Cost results

Training on device: cuda
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to MNIST_data/MNIST/raw/train-images-idx3-ubyte.gz
9913344/? [00:00<00:00, 48208583.66it/s]
Extracting MNIST_data/MNIST/raw/train-images-idx3-ubyte.gz to MNIST_data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to MNIST_data/MNIST/raw/train-labels-idx1-ubyte.gz
29696/? [00:00<00:00, 871549.79it/s]
Extracting MNIST_data/MNIST/raw/train-labels-idx1-ubyte.gz to MNIST_data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to MNIST_data/MNIST/raw/t10k-images-idx3-ubyte.gz
1649664/? [00:00<00:00, 15366188.08it/s]
Extracting MNIST_data/MNIST/raw/t10k-images-idx3-ubyte.gz to MNIST_data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to MNIST_data/MNIST/raw/t10k-labels-idx1-ubyte.gz
5120/? [00:00<00:00, 91224.68it/s]
Extracting MNIST_data/MNIST/raw/t10k-labels-idx1-ubyte.gz to MNIST_data/MNIST/raw

/usr/local/lib/python3.7/dist-packages/torchvision/datasets/mnist.py:498: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at  /pytorch/torch/csrc/utils/tensor_numpy.cpp:180.)
  return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
#model.parameters()
[Parameter containing:
tensor([[-2.9863e-02, -6.3304e-04, -6.9076e-03,  ..., -1.8779e-02,
          3.0876e-02, -3.1737e-02],
        [ 6.7273e-03,  2.0005e-02, -2.4505e-02,  ..., -2.7296e-02,
         -3.8136e-03,  1.1463e-03],
        [-1.7847e-02, -2.9822e-02,  7.4012e-05,  ..., -3.2111e-02,
          2.6378e-02,  3.2280e-02],
        ...,
        [-1.3945e-02,  3.2514e-02,  2.6534e-03,  ..., -3.2780e-02,
         -1.6748e-02, -5.4733e-03],
        [ 1.0081e-02, -1.2429e-02, -1.0554e-02,  ..., -2.0057e-02,
         -9.3248e-04,  8.3561e-03],
        [-6.6728e-03,  7.1139e-03,  2.5636e-02,  ...,  1.9851e-02,
         -1.8774e-02, -3.2610e-02]], device='cuda:0', requires_grad=True), Parameter containing:
tensor([-0.0229, -0.0224,  0.0170, -0.0287, -0.0111,  0.0345,  0.0230,  0.0343,
         0.0262, -0.0135], device='cuda:0', requires_grad=True)]
#total_batch
600


Epoch: 0001 cost = 0.535150647
Epoch: 0002 cost = 0.359577745
Epoch: 0003 cost = 0.331264257
Epoch: 0004 cost = 0.316404670
Epoch: 0005 cost = 0.307106972
Epoch: 0006 cost = 0.300456583
Epoch: 0007 cost = 0.294933438
Epoch: 0008 cost = 0.290956229
Epoch: 0009 cost = 0.287074089
Epoch: 0010 cost = 0.284515619
Epoch: 0011 cost = 0.281914085
Epoch: 0012 cost = 0.279526889
Epoch: 0013 cost = 0.277636617
Epoch: 0014 cost = 0.275874794
Epoch: 0015 cost = 0.274422735
Learning finished
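
As the comment in the cost-function section notes, nn.CrossEntropyLoss applies the (log-)softmax internally, which is why the model here ends with a plain nn.Linear layer and outputs raw logits. A minimal sketch confirming this equivalence (the toy logits and labels are placeholders):

import torch
import torch.nn as nn

logits = torch.randn(4, 10)           # raw model outputs, no softmax applied
targets = torch.tensor([1, 5, 0, 9])  # integer class labels, not one-hot

ce = nn.CrossEntropyLoss()(logits, targets)
# Equivalent: log-softmax followed by negative log-likelihood
nll = nn.NLLLoss()(torch.log_softmax(logits, dim=1), targets)

print(torch.allclose(ce, nll))  # True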

 

 

Verification

Feed in real data and check the accuracy directly.

 

# Evaluate the model on the test data
with torch.no_grad(): # torch.no_grad() disables gradient computation
    # .data / .targets replace the deprecated test_data / test_labels attributes
    X_test = mnist_test.data.view(-1, 28 * 28).float().to(device)
    Y_test = mnist_test.targets.to(device)

    prediction = model(X_test)
    # argmax over the logits gives the predicted class (applying softmax would not change the argmax)
    correct_prediction = torch.argmax(prediction, 1) == Y_test
    accuracy = correct_prediction.float().mean()
    print('Accuracy:', accuracy.item())

    # Pick one random sample from the MNIST test data and run a prediction
    r = random.randint(0, len(mnist_test) - 1)
    X_single_data = mnist_test.data[r:r + 1].view(-1, 28 * 28).float().to(device)
    Y_single_data = mnist_test.targets[r:r + 1].to(device)

    print('Label: ', Y_single_data.item())
    single_prediction = model(X_single_data)
    print('Prediction: ', torch.argmax(single_prediction, 1).item())

    plt.imshow(mnist_test.data[r:r + 1].view(28, 28), cmap='Greys', interpolation='nearest')
    plt.show()

 

Results

Accuracy: 0.8883000016212463
Label:  8
Prediction:  3
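
One caveat worth noting (my observation, not from the original tutorial): transforms.ToTensor() scales pixels to [0, 1] during training, while mnist_test.data holds raw uint8 values in [0, 255], so the test inputs above are on a different scale than the inputs the model was trained on. A hedged sketch of an evaluation that matches the training preprocessing:

with torch.no_grad():
    # Dividing by 255 mirrors the [0, 1] scaling that transforms.ToTensor() applied during training
    X_test = mnist_test.data.view(-1, 28 * 28).float().to(device) / 255.0
    Y_test = mnist_test.targets.to(device)

    prediction = model(X_test)
    accuracy = (torch.argmax(prediction, 1) == Y_test).float().mean()
    print('Accuracy:', accuracy.item())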

 

 

In this post we classified the MNIST data using softmax regression, but the accuracy is only about 89%, so misclassifications like the example above are quite likely.

If you want to push the accuracy higher, the post below on classifying MNIST data with an MLP (multi-layer perceptron) is a good next step!

2021.10.24 - [AI | 딥러닝/Project] - [PyTorch Project] Classifying MNIST Data with an MLP (Multi-Layer Perceptron)

 

 

If this post helped you, please tap the heart below ↓

Thank you \( ˆoˆ )/
