TL,DR; How can JSON containing a regex with escaped backslahes, be loaded using Python's JSON decoder?
Detail; The regular expression \\[0-9]\\
will match (for example):
\2\
The same regular expression could be encoded as a JSON value:
{
"pattern": "\\[0-9]\\"
}
And in turn, the JSON value could be encoded as a string in Python (note the single quotes):
'{"pattern": "\\[0-9]\\"}'
When loading the JSON in Python, a JSONDecodeError is raised:
import json
json.loads('{"pattern": "\\[0-9]\\"}')
The problem is caused by the regular expression escaping the blackslashes:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid \escape: line 1 column 14 (char 13)
>>> json.loads('{"pattern": "\\[0-9]\\"}')
This surprised me since each step seems reasonable (i.e. valid regex, valid JSON, and valid Python).
How can JSON containing a regex with escaped backslahes, be loaded using Python's JSON decoder?
json.loads(r'{"pattern": "\\[0-9]\\"}')