I wrote a script to gather information from an XML file. The file defines several entities with <!ENTITY ...> declarations, and I need a regex to extract the value of a given entity.

<!ENTITY ABC         "123"> 
<!ENTITY BCD         "234"> 
<!ENTITY CDE         "345">

First, I open the XML file and read its lines into a variable.

xml = open("file.xml", "r")
lines = xml.readlines()

Then I loop over the lines:

result = "ABC"
var_search_result_list = []

var_searcher = "ENTITY\s" + result + '.*"[^"]*"\>'

for line in lines:
    var_search_result = re.match(var_searcher, line)

    if var_search_result != None:
        var_search_result_list += list(var_search_result.groups())

print(var_search_result_list)

I want var_search_result_list to contain the value 123. Instead, I get an empty list every time I run this. Does anybody have a solution?

Thanks in Advance - Toki

  • Do you know about xmltodict?
    – Ghost Ops
    Commented Sep 22, 2021 at 11:03
  • I can't use any community modules or Python 3.x on this project.
    – toki
    Commented Sep 22, 2021 at 11:05
  • Share the XML and explain which attributes or elements you are looking for.
    – balderman
    Commented Sep 22, 2021 at 15:53

1 Answer

There are a few issues in the code.

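  • You are using re.match, which only matches at the start of the string. Your pattern, ENTITY\sABC.*"[^"]*"\>, cannot match there because each example line starts with <!, so re.match returns None. Use re.search instead, which scans the whole line.
  • To collect only 123, wrap that part of the pattern in a capture group and append var_search_result.group(1) to the result list. Without a capture group, groups() returns an empty tuple, so nothing would be added even after a successful match.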

For example:

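import re

# Read all lines; the with block closes the file afterwards.
with open("file.xml", "r") as xml:
    lines = xml.readlines()

result = "ABC"
var_search_result_list = []
# Raw strings keep \s intact; the parentheses capture the quoted value.
var_searcher = r"ENTITY\s" + result + r'.*"([^"]*)"\>'
for line in lines:
    # re.search scans the whole line instead of anchoring at its start.
    var_search_result = re.search(var_searcher, line)
    if var_search_result:
        var_search_result_list.append(var_search_result.group(1))
print(var_search_result_list)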

Output

['123']
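
To see why the original code produced an empty list, the difference between re.match and re.search can be checked on a single line. This is only a small sketch: the sample line is copied from the question, and the variable names line and pattern are chosen here for illustration.

```python
import re

line = '<!ENTITY ABC         "123"> '
pattern = r'ENTITY\sABC.*"([^"]*)">'

# re.match anchors at position 0, where the line has "<!", so it fails.
print(re.match(pattern, line))            # None
# re.search scans the whole line and finds the declaration.
print(re.search(pattern, line).group(1))  # 123
```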

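A slightly more precise pattern, which also anchors on the opening <! and allows only whitespace between the name and the quoted value, could be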

<!ENTITY\sABC\s+"([^"]*)"\>
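
If the entity name varies, the stricter pattern could be wrapped in a small helper along these lines. This is a sketch, not part of the original answer: the function name find_entity_value is hypothetical, re.escape is added on the assumption that names might contain regex metacharacters, and \s+ after ENTITY plus \s* before > are small relaxations of the pattern above.

```python
import re

def find_entity_value(text, name):
    """Return the quoted value declared for entity `name`, or None."""
    # re.escape guards against names containing regex metacharacters.
    pattern = r'<!ENTITY\s+' + re.escape(name) + r'\s+"([^"]*)"\s*>'
    found = re.search(pattern, text)
    return found.group(1) if found else None

# Sample declarations copied from the question.
sample = '''<!ENTITY ABC         "123">
<!ENTITY BCD         "234">
<!ENTITY CDE         "345">'''

print(find_entity_value(sample, "BCD"))  # 234
print(find_entity_value(sample, "XYZ"))  # None
```

Because the helper searches the whole text at once, it can be given the result of xml.read() rather than looping over readlines().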
