1

For the string "//div[@id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings']", I want to find substrings of the form "@..'...'", such as "@id~'objectnavigator-card-list'" or "@class~'outbound-alert-settings'". But when I use the regex ((@.+)\~(\'.*?\')), it finds "@id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings'" as a single match. How should I modify the regex so that it matches each of these substrings separately?

1
  • Please format question properly. Commented May 15, 2017 at 3:13

3 Answers

3

Use non-capturing groups for the inner brackets and, instead of the greedy .+, match any character that is not the terminating one, e.g.:

 re.findall(r"((?:@[^\~]+)\~(?:\'[^\]]*?\'))", test)

On your test string returns:

 ["@id~'objectnavigator-card-list'", "@class~'outbound-alert-settings'"]
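To see why the original pattern over-matches, here is a minimal sketch comparing it with a bounded version. It assumes the test string from the question; the variable names (test, greedy, bounded) are only illustrative, and the capture groups are dropped since they do not affect what is matched.

```python
import re

test = "//div[@id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings']"

# Original idea: the greedy `.+` runs to the end of the string and backtracks
# only as far as the LAST `~`, so one match spans both attributes.
greedy = re.search(r"@.+~'.*?'", test).group(0)

# Negated character classes stop each part at its own delimiter:
# `[^~]+` cannot cross a `~`, and `[^\]]*?` cannot cross a `]`.
bounded = re.findall(r"@[^~]+~'[^\]]*?'", test)

print(greedy)
# @id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings'
print(bounded)
# ["@id~'objectnavigator-card-list'", "@class~'outbound-alert-settings'"]
```

Making only the quoted part lazy (.*?) is not enough here, because the greedy .+ in front of it has already consumed everything up to the last ~.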

1

Restrict the characters matched between the quotes so that they cannot include the quote itself:

>>> re.findall(r'@[a-z]+~\'[-a-z]*\'', x)

I find it's much easier to look for only the characters I know are going to be in a matching section rather than omitting characters from more permissive matches.
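As a self-contained sketch of this whitelist approach, assuming x holds the string from the question (the second input, with an underscore and a digit, is a hypothetical example added only to show the trade-off):

```python
import re

x = "//div[@id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings']"

# Only lowercase letters are allowed in the attribute name, and only
# lowercase letters and hyphens between the quotes.
matches = re.findall(r"@[a-z]+~'[-a-z]*'", x)
print(matches)
# ["@id~'objectnavigator-card-list'", "@class~'outbound-alert-settings'"]

# Hypothetical input: the value contains `_` and a digit, which are not
# in the whitelist, so nothing is matched.
other = re.findall(r"@[a-z]+~'[-a-z]*'", "//div[@id~'card_list2']")
print(other)
# []
```

The whitelist is strict by design: if values may contain digits, underscores, or uppercase letters, the character classes need to be widened accordingly.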

1

For your current test string, you can try this pattern:

import re 

a = "//div[@id~'objectnavigator-card-list']//li[@class~'outbound-alert-settings']"
# match everything that begins with '@', up to (but not including) the next ']'
regex = re.compile(r'(@[^\]]+)')
strings = re.findall(regex, a)
# Or simply:
# strings = re.findall('(@[^\\]]+)', a)

print(strings)

Output:

["@id~'objectnavigator-card-list'", "@class~'outbound-alert-settings'"]
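Note that this pattern keys only on '@' and ']', so it may also pick up predicates that lack the ~'...' part. A small sketch of that caveat, using a hypothetical input string (sample) that is not from the question, alongside a stricter variant that also requires the ~ and the quoted value:

```python
import re

# Hypothetical input: the first predicate has no ~'...' part.
sample = "//a[@href]//div[@id~'main']"

# Everything from '@' up to the next ']', whatever it contains.
permissive = re.findall(r"(@[^\]]+)", sample)
print(permissive)
# ['@href', "@id~'main'"]

# Stricter variant: require `~` followed by a single-quoted value.
strict = re.findall(r"@[^~\]]+~'[^']*'", sample)
print(strict)
# ["@id~'main'"]
```

For the test string in the question both patterns return the same two matches, so the simpler one is enough if the input always has that shape.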
