I am using Python 3. The following example illustrates my question.
# python3
Python 3.6.8 (default, Sep 26 2019, 11:57:09)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getdefaultencoding()
'utf-8'
>>> help(str)
| str(object='') -> str
| str(bytes_or_buffer[, encoding[, errors]]) -> str
|
| Create a new string object from the given object. If encoding or
| errors is specified, then the object must expose a data buffer
| that will be decoded using the given encoding and error handler.
| Otherwise, returns the result of object.__str__() (if defined)
| or repr(object).
| encoding defaults to sys.getdefaultencoding().
| errors defaults to 'strict'.
>>> d = b'abcd'
>>> type(d)
<class 'bytes'>
>>> print(d)
b'abcd'
>>> len(d)
4
>>> m = str(d)
>>> type(m)
<class 'str'>
>>> print(m)
b'abcd'
>>> len(m)
7
>>> m.encode()
b"b'abcd'"
>>>
>>> m = str(d, encoding='utf-8')
>>> type(m)
<class 'str'>
>>> print(m)
abcd
>>> len(m)
4
>>>
help(str) says "encoding defaults to sys.getdefaultencoding()", yet str(d) produces a string that still contains the b'' wrapper. Note that the length of the string is now 7. My questions are:
- Why does the default encoding need to be specified explicitly to get a correct string from bytes?
- How do I get back to the original bytes once I have the str returned by str(d)? (Calling encode() on that string keeps the extra b'' characters.)
- Is there a way for pylint to catch or warn about this problem?
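
For reference, here is a minimal script that reproduces the difference outside the interactive session. It uses only the standard library. The variable names (as_repr, as_text, also_text) are my own, and ast.literal_eval is just one way I can think of to undo str(d), not necessarily the recommended one:

```python
import ast

d = b'abcd'

# No encoding given: per the help text this falls back to
# object.__str__() / repr(), so the b'' wrapper ends up in the string.
as_repr = str(d)

# Encoding given: the bytes are actually decoded.
as_text = str(d, 'utf-8')

# bytes.decode() also decodes, and its encoding defaults to UTF-8.
also_text = d.decode()

print(len(as_repr), len(as_text))       # 7 4
print(as_text == also_text)             # True

# Decoded text round-trips cleanly back to the original bytes.
print(as_text.encode() == d)            # True

# The repr-style string only gets back to bytes by parsing the literal.
print(ast.literal_eval(as_repr) == d)   # True
```

So the round trip works when the bytes are decoded, but str(d) without an encoding needs the literal to be parsed to recover the original value.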