13

I am trying to implement a simple autoencoder using PyTorch. My dataset consists of 256 x 256 x 3 images. I have built a torch.utils.data.dataloader.DataLoader object that holds the images as tensors. When I run the autoencoder, I get a runtime error:

size mismatch, m1: [76800 x 256], m2: [784 x 128] at /Users/soumith/minicondabuild3/conda-bld/pytorch_1518371252923/work/torch/lib/TH/generic/THTensorMath.c:1434

These are my hyperparameters:

batch_size=100,
learning_rate = 1e-3,
num_epochs = 100

Following is the architecture of my auto-encoder:

class autoencoder(nn.Module):
    def __init__(self):
        super(autoencoder, self).__init__()
        self.encoder = nn.Sequential(
            nn.Linear(3*256*256, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(True),
            nn.Linear(64, 12),
            nn.ReLU(True),
            nn.Linear(12, 3))

        self.decoder = nn.Sequential(
            nn.Linear(3, 12),
            nn.ReLU(True),
            nn.Linear(12, 64),
            nn.ReLU(True),
            nn.Linear(64, 128),
            nn.Linear(128, 3*256*256),
            nn.ReLU())

    def forward(self, x):
        x = self.encoder(x)
        #x = self.decoder(x)
        return x

This is the code I used to run the model:

for epoch in range(num_epochs):
    for data in dataloader:
        img = data['image']
        img = Variable(img)
        # ===================forward=====================
        output = model(img)
        loss = criterion(output, img)
        # ===================backward====================
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # ===================log========================
    print('epoch [{}/{}], loss:{:.4f}'
          .format(epoch+1, num_epochs, loss.data[0]))
    if epoch % 10 == 0:
        pic = show_img(output.cpu().data)
        save_image(pic, './dc_img/image_{}.jpg'.format(epoch))
5 Comments

  • In which line are you getting the error? What is the shape of x you are passing to the forward function? Is the first linear layer in the encoder, nn.Linear(3*256*256, 128), correct? Commented Apr 2, 2018 at 7:29
  • I am getting an error when I run output = model(input). As far as I know, the linear layer flattens the image and executes something like a "Y = Ax + B" operation. Since my input is a 256 x 256 x 3 image, the total number of elements would be the product of those dimensions. Commented Apr 2, 2018 at 7:32
  • I have added the code which I am using to train my model. Commented Apr 2, 2018 at 7:37
  • "As per my knowledge, the linear layer flattens the image". Did you test this assumption? It doesn't seem to be true. Commented Apr 2, 2018 at 7:39
  • The PyTorch documentation says so, or at least that is what I inferred from it: pytorch.org/docs/master/nn.html#linear-layers Commented Apr 2, 2018 at 7:44

3 Answers

33

Whenever you have:

RuntimeError: size mismatch, m1: [a x b], m2: [c x d]

all that matters is that b = c. Once the two match, the error goes away:

m1 is [a x b], which is [batch size x in features]

m2 is [c x d], which is [in features x out features]
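
The rule above can be sketched in plain Python, without PyTorch. The helper name linear_output_shape is hypothetical (it is not a PyTorch function); it only mimics the check that the matrix multiplication performs on the two shapes.

```python
def linear_output_shape(m1, m2):
    """Return the shape of m1 @ m2, or raise if the inner dimensions differ.

    m1 is (batch size, in features) and m2 is (in features, out features),
    following the naming used in the error message.
    """
    a, b = m1
    c, d = m2
    if b != c:
        raise ValueError(
            "size mismatch, m1: [{} x {}], m2: [{} x {}]".format(a, b, c, d)
        )
    return (a, d)


# A flattened batch of 100 images (3 x 256 x 256 each) against a layer
# with 3*256*256 input features and 128 output features.
print(linear_output_shape((100, 3 * 256 * 256), (3 * 256 * 256, 128)))
```

Passing the shapes from the question's error, (76800, 256) and (784, 128), makes the helper raise, because 256 != 784.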


2 Comments

How can you calculate the value of b? It seems the value of c is determined by ChannelIn multiplied by ChannelOut.
From my own experience I would like to add: if b cannot be explained by a sensible calculation (e.g. image height * image width * number of filters), the input dimensions of the images are most likely different from what you assumed. For example, I thought the input was 32x32 but it was 28x28. The model compiled up to the dense layer, but b was a strange number.
13

If your input is 3 x 256 x 256, you need to reshape it to B x N before passing it through the linear layer nn.Linear(3*256*256, 128), where B is the batch size and N is the linear layer's input size. If you are feeding one image at a time, you can convert your input tensor of shape 3 x 256 x 256 to 1 x (3*256*256) as follows:

img = img.view(1, -1) # converts [3 x 256 x 256] to 1 x 196608
output = model(img)
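
For a whole batch rather than a single image, the same idea might look like the sketch below. It assumes the DataLoader yields tensors of shape B x 3 x 256 x 256; the batch size of 4 and the all-zero tensor are placeholders chosen for illustration, not values from the question.

```python
import torch
import torch.nn as nn

# Hypothetical batch of 4 images, channels first: [4 x 3 x 256 x 256]
batch = torch.zeros(4, 3, 256, 256)

# Keep the batch dimension and flatten everything else: [4 x 196608]
flat = batch.view(batch.size(0), -1)

# First encoder layer from the question
layer = nn.Linear(3 * 256 * 256, 128)
out = layer(flat)

print(flat.shape, out.shape)
```

If the model reconstructs the flattened image, the target passed to the loss would presumably need the same flattened shape as the model output.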


1

Your error:

size mismatch, m1: [76800 x 256], m2: [784 x 128]

says that the previous layer's output shape does not match the next layer's input shape:

[76800 x 256], m2: [784 x 128] # Incorrect!
[76800 x 256], m2: [256 x 128] # Correct!
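
One plausible reading of where 76800 x 256 comes from, assuming the batch reaches the layer unflattened as 100 x 3 x 256 x 256 (batch_size=100 is from the question, the channels-first layout is an assumption): a linear layer acts only on the last dimension, so all leading dimensions are collapsed into rows. The helper name below is hypothetical and only does the arithmetic.

```python
def m1_shape_without_flatten(input_shape):
    """Collapse all leading dimensions into rows; keep the last as columns."""
    rows = 1
    for dim in input_shape[:-1]:
        rows *= dim
    return (rows, input_shape[-1])


# Unflattened batch: 100 images, 3 channels, 256 x 256 pixels
print(m1_shape_without_flatten((100, 3, 256, 256)))

# The same batch flattened to [batch size x in features] first
print(m1_shape_without_flatten((100, 3 * 256 * 256)))
```

The first call gives 76800 rows (100 * 3 * 256) and 256 columns, matching m1 in the error. The 784 in m2 equals 28 * 28, which may mean the model that actually ran had a first layer sized for 28 x 28 inputs; that is an inference from the numbers, not something the question states.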

