I am trying to replace the class names in XML annotation files based on the columns of a CSV file.
This is the format of the XML:
<annotation>
    <folder>./test_xmls</folder>
    <filename>000048_Panorama.jpg</filename>
    <path>./images000048_Panorama.jpg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>4000</width>
        <height>2000</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>AAAA</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
My CSV contains an "original" column and a "change to" column.
The format is:
| original | change to |
|----------|-----------|
| AAAA     | class_A   |
| ...      | ...       |
The CSV has more than 20,000 rows, which cover every <name> value (such as AAAA) that appears in the 80,000 XML files.
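Since every lookup is by exact class name, a dictionary keyed on the "original" column seems like a natural fit for a table of this size. Below is a minimal sketch, assuming the CSV has a header row whose columns are literally named original and change to (the real header is not shown, so adjust the names as needed). The helper name load_mapping is hypothetical.

```python
import csv
import io

def load_mapping(csv_file, src_col="original", dst_col="change to"):
    """Build {original_name: new_name} from an open CSV file object.

    Assumes a header row containing src_col and dst_col (hypothetical names).
    """
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    mapping = {}
    for row in reader:
        # Strip stray whitespace so "AAAA " still matches "AAAA".
        mapping[row[src_col].strip()] = row[dst_col].strip()
    return mapping

# Small in-memory stand-in for the real CSV file.
sample = io.StringIO("original,change to\nAAAA,class_A\nBBBB,class_B\n")
mapping = load_mapping(sample)
print(mapping.get("AAAA"))  # class_A
```

For the real file, the same function could be called on a handle opened with open('table.csv', newline='').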
I want to match each XML name, like AAAA, against the CSV. If it exists in the "original" column, then I want to replace it with the corresponding value from the "change to" column, for example AAAA to class_A.
I tried to write Python code, but it doesn't work. Here is my code:
import xml.etree.ElementTree as ET
import os
import pandas as pd
from collections import defaultdict
import csv
from csv import reader

with open('table.csv', mode='r') as inp:
    reader = csv.reader(inp)
    dict_from_csv = {rows[0]:rows[2] for rows in reader}
#print(dict_from_csv)

root_path = "./xmls"
xml_list = sorted(os.listdir(root_path))
for xml_file in xml_list:
    xml_path = os.path.join(root_path,xml_file)
    # parse xml file
    tree = ET.parse(xml_path)
    # get root node
    root = tree.getroot()
    for member in root.findall('object'):
        sub_child = member[0].text
        print(sub_child)
        for key, value in dict_from_csv.items():
            if sub_child in key:
                sub_child = sub_child.replace(sub_child, value)
                #print(xml)
                xml_file.write(sub_child)
    print("Classes are changed : " + xml_path)
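For comparison, here is a hedged sketch of what the loop above seems to be aiming for. It differs in three ways: it looks the name up by exact dictionary key instead of a substring test over all items, it assigns to the <name> element's text rather than to a local string, and it saves with tree.write, since xml_file is only a filename string and has no write method. The function name rename_classes and the assumption that <name> is a direct child of each <object> are illustrative only.

```python
import os
import xml.etree.ElementTree as ET

def rename_classes(root_path, mapping):
    """Rewrite <object>/<name> text in every .xml file under root_path.

    mapping is {original_name: new_name}; returns the number of files changed.
    """
    changed_files = 0
    for xml_file in sorted(os.listdir(root_path)):
        if not xml_file.endswith(".xml"):
            continue  # skip anything that is not an annotation file
        xml_path = os.path.join(root_path, xml_file)
        tree = ET.parse(xml_path)
        changed = False
        for obj in tree.getroot().findall("object"):
            name = obj.find("name")  # assumes <name> is a direct child of <object>
            if name is None or name.text is None:
                continue
            new_name = mapping.get(name.text.strip())  # exact-match lookup
            if new_name is not None and new_name != name.text:
                name.text = new_name  # mutate the element, not a copy of its text
                changed = True
        if changed:
            tree.write(xml_path)  # overwrite the file in place
            changed_files += 1
    return changed_files
```

Calling rename_classes("./xmls", dict_from_csv) would overwrite the annotation files in place, so trying it on a copy of the directory first seems prudent.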
Any help would be appreciated.
Thank you