I have the following code:

 import re
 a = "32<2>fdssa</2>ffdsa32"
 re.sub(r'<(\d+)>|</(\d+)>', "item", a)

The result I get:

32itemfdssaitemffdsa32

I want the result:

32<item>fdssa</item>ffdsa32

2 Answers

You need to capture the opening < or </ and put it back in the replacement.

re.sub(r'(</?)\d+>',r"\1item>",a)

Since the / is optional, (</?) captures either < or </, and \1 reinserts it in the replacement.

Example:

>>> a="32<2>fdssa</2>ffdsa32"
>>> re.sub(r'(</?)\d+>',r"\1item>",a)
'32<item>fdssa</item>ffdsa32'
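To see what the group actually captures for each tag, here is a minimal sketch using re.finditer on the same input (the variable name captured is only for illustration):

```python
import re

a = "32<2>fdssa</2>ffdsa32"

# Pair each full match with what group 1 captured for it.
captured = [(m.group(0), m.group(1)) for m in re.finditer(r'(</?)\d+>', a)]
print(captured)  # [('<2>', '<'), ('</2>', '</')]
```

The opening tag yields < and the closing tag yields </, which is why \1item> rebuilds both forms correctly.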
Alternatively, a lookahead can check for the > without consuming it:

>>> re.sub(r'(</?)\d+(?=>)', r'\1item', a)
'32<item>fdssa</item>ffdsa32'
  • (</?) matches < or </ and captures it as \1.

  • \d+ matches one or more digits.

  • (?=>) is a positive lookahead: it checks that the digits are followed by >, but does not consume it, so the > stays in the string and need not be repeated in the replacement.
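As a rough illustration of the difference between consuming the > and only looking ahead for it, the sketch below compares the two patterns on the same input (the names consuming and lookahead are made up for this example):

```python
import re

a = "32<2>fdssa</2>ffdsa32"

consuming = re.compile(r'(</?)\d+>')      # '>' is part of the match
lookahead = re.compile(r'(</?)\d+(?=>)')  # '>' is only checked, not matched

first_consuming = consuming.search(a).group(0)
first_lookahead = lookahead.search(a).group(0)
print(first_consuming)  # <2>
print(first_lookahead)  # <2

# The replacement must re-add '>' only when the pattern consumed it.
result_consuming = consuming.sub(r'\1item>', a)
result_lookahead = lookahead.sub(r'\1item', a)
print(result_consuming == result_lookahead)  # True
```

Both variants give the same result here; which one to use is mostly a matter of taste.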
