This section of code is taken from one of the PyTorch tutorials; I have just removed the non-essential parts so it doesn't error out and added some print statements. Why do the two print statements I added give slightly different results? Is this a tuple with nothing in the second half of it? I am confused by the comma with nothing after it before the assignment operator.

import torch

class MyReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        print("ctx ", ctx.saved_tensors)
        print("inputs ", input)
        grad_input = grad_output.clone()
        grad_input[input < 0] = 0
        return grad_input

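# Assumed setup (not shown in the original post): small random tensors so the snippet runs.
x, y = torch.randn(4, 6), torch.randn(4, 2)
w1 = torch.randn(6, 5, requires_grad=True)
w2 = torch.randn(5, 2, requires_grad=True)

relu = MyReLU.apply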
y_pred = relu(x.mm(w1)).mm(w2)
loss = (y_pred - y).pow(2).sum()
loss.backward()

Output

ctx  (tensor([[-34.2381,  18.6334,   8.8368,  ...,  13.7337, -31.5657, -11.8838],
        [-25.5597,  -6.2847,   9.9412,  ..., -75.0621,   5.0451, -32.9348],
        [-56.6591, -40.0830,   2.4311,  ...,  -2.8988, -18.9742, -74.0132],
        ...,
        [ -6.4023, -30.3526, -73.9649,  ...,   1.8587, -23.9617, -11.6951],
        [ -3.6425,  34.5828,  27.7200,  ..., -34.3878, -19.7250,  11.1960],
        [ 16.0137, -24.0628,  14.4008,  ...,  -5.4443,   9.9499, -18.1259]],
       grad_fn=<MmBackward>),)
inputs  tensor([[-34.2381,  18.6334,   8.8368,  ...,  13.7337, -31.5657, -11.8838],
        [-25.5597,  -6.2847,   9.9412,  ..., -75.0621,   5.0451, -32.9348],
        [-56.6591, -40.0830,   2.4311,  ...,  -2.8988, -18.9742, -74.0132],
        ...,
        [ -6.4023, -30.3526, -73.9649,  ...,   1.8587, -23.9617, -11.6951],
        [ -3.6425,  34.5828,  27.7200,  ..., -34.3878, -19.7250,  11.1960],
        [ 16.0137, -24.0628,  14.4008,  ...,  -5.4443,   9.9499, -18.1259]],
       grad_fn=<MmBackward>)
  • The , is necessary because (x) is just x. Commented Aug 5, 2019 at 11:48
  • @tobias_k What is printed seems to be (x,), which is different from x. Commented Aug 5, 2019 at 11:53
  • There is no "second half". A tuple is an immutable container that contains 0 or more elements. You create a tuple using the comma operator, with parentheses only necessary to distinguish a tuple-creating comma from other commas (e.g., the ones that separate function arguments: f(2,3,4) vs f((2,3), 4)) and as a special case the empty tuple which is just parentheses (since empty parentheses aren't otherwise a valid expression). Commented Aug 5, 2019 at 11:56
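The point made in the last comment can be checked directly. The following is a minimal sketch in plain Python; `count_args` is a hypothetical helper introduced only for illustration.

```python
# The comma, not the parentheses, is what builds a tuple.
pair = 1, 2
single = 1,
empty = ()  # the one case where parentheses alone make a tuple


def count_args(*args):
    """Hypothetical helper: report how many positional arguments arrived."""
    return len(args)


three = count_args(2, 3, 4)    # commas separate three arguments
two = count_args((2, 3), 4)    # inner parentheses mark a tuple argument

print(type(pair).__name__, type(single).__name__, type(empty).__name__)
print(three, two)
```

Here the parentheses in `count_args((2, 3), 4)` only tell Python that the inner comma builds a tuple instead of separating arguments.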

2 Answers

It's just an edge-case of unpacking a single-element list or tuple.

a, = [1]
print(type(a), a)
# <class 'int'> 1

Without the comma, a would have been assigned the entire list:

a = [1]
print(type(a), a)
# <class 'list'> [1]

And the same goes for a tuple:

a, = (1,)  # have to use , with literal single-tuples, because (1) is just 1
print(type(a), a)
# <class 'int'> 1

a = (1,)  # no comma on the left, so a is bound to the whole tuple
print(type(a), a)
# <class 'tuple'> (1,)
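
To tie this back to the question's two print statements, here is a rough sketch that needs no PyTorch. `FakeCtx` is a hypothetical stand-in written for this example and is not how PyTorch implements `ctx`; it only assumes that whatever is saved comes back as a tuple, which is what the printed `(tensor(...),)` in the question suggests.

```python
class FakeCtx:
    """Hypothetical stand-in for ctx; NOT PyTorch's implementation."""

    def save_for_backward(self, *tensors):
        # *tensors always arrives as a tuple, even for a single argument.
        self.saved_tensors = tensors


ctx = FakeCtx()
ctx.save_for_backward([1.0, -2.0])  # a plain list stands in for a tensor

print("ctx ", ctx.saved_tensors)    # the one-element tuple
value, = ctx.saved_tensors          # unpack its only element
print("inputs ", value)             # the element itself
```

The first print shows the container with its trailing comma, the second shows the single item inside it, which matches the difference seen in the question's output.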

7 Comments

I would use (1,) to better match the example. I also would not call this an edge case. It's just "a" case: quite normal and expected behavior.
@kabanus ok, updated. But I still think it's an edge-case, as the trailing comma is not needed for anything else except single-element tuples
I see where you are coming from. What I meant is that if printing a single element tuple, this is what I would expect - the output to "look" like a tuple, a one-element one in this case.
@kabanus I agree about the output, but I believe that the premise of the question is the , on the assignment line, not in the output
Fair enough, that was just my perspective. You already have my vote :)

(a, b) is a two-tuple, (a, b, c) is a three-tuple, (a, b, c, d) is a four-tuple.

Going the other way, (a) would be a one-tuple. But that conflicts with e.g. (1 + 2) / 3, because you can't divide a tuple. Since one-tuples are rare and parentheses in math expressions are common, ( <expr> ) is not a tuple, and an extra trailing , is required, as in (a, ).

Note: (a, b, ) and (a, b, c, ) work too.
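
A short sketch of the difference, assuming Python 3 (where / is true division):

```python
# Parentheses alone only group an expression.
grouped = (1 + 2) / 3        # ordinary arithmetic on the number 3

# A trailing comma turns the same expression into a one-tuple.
one_tuple = (1 + 2,)

try:
    one_tuple / 3            # tuples do not support division
    divided = True
except TypeError:
    divided = False

# A trailing comma on longer tuples changes nothing.
same = (1, 2,) == (1, 2)

print(grouped, one_tuple, divided, same)
```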

Same goes for unpacking tuples:

a, = some_tuple

unpacks the tuple and sets a to the first (and only) item.
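
As an illustration, the sketch below shows a few equivalent spellings of the same unpacking and what happens when the right-hand side does not hold exactly one item. The variable names are made up for this example.

```python
values = (42,)

a, = values       # bare trailing comma
(b,) = values     # same thing, with parentheses for visibility
[c] = values      # list-style target, also unpacks one element
d = values[0]     # indexing: works, but does not check the length

try:
    e, = (1, 2)   # two items cannot unpack into one target
    unpacked = True
except ValueError:
    unpacked = False

first = (1, 2)[0]  # indexing silently ignores the extra item

print(a, b, c, d, unpacked, first)
```

One practical difference: the unpacking forms fail loudly if the sequence has more or fewer than one element, while indexing with [0] does not.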
