(my solution below)
I have a python script that uses a REST API to get XML files from a tool. This is the heart of it:
def get_output(self, url, output_filename):
    xml_response = self.client.get_request(url)
    if not xml_response:
        self.logger.error(f"no xml_response:{output_filename}")
        return False
    else:
        self.logger.info(f"{output_filename}")
        output_file = open(output_filename, 'w', encoding='utf-8')
        output_file.write(xml_response.text)
        output_file.close()
        return True
The write call causes a DEBUG log entry:
Encoding detection: utf_8 is most likely the one.
Is there a way to explicitly tell the write to use utf8 in order to not get that log entry?
Thanks!
Edit 1:
self.client.get_request(url) is essentially:
import requests
self.headers = {'Accept': 'application/rdf+xml'}
self.session = requests.Session()
response = self.session.get(url, allow_redirects=True, headers=self.headers)
This is a sanitized log entry:
2025-09-08T22:00:52+0000;INFO;XXX:./__auto_temp/2025-09-08_2200(+0000)___XXX.xml
2025-09-08T22:01:36+0000;DEBUG;https://XXX "GET /XXX HTTP/1.1" 200 None
2025-09-08T22:01:36+0000;INFO;./__auto_temp/2025-09-08_2200(+0000)___XXX.xml
2025-09-08T22:01:36+0000;INFO;get_output 1
2025-09-08T22:01:36+0000;INFO;get_output 2
2025-09-08T22:01:36+0000;DEBUG;Encoding detection: utf_8 is most likely the one.
2025-09-08T22:01:36+0000;INFO;get_output 3
2025-09-08T22:01:36+0000;INFO;get_output 4
Unless log entries are coming in out of order, my experiment, which sprinkled logger calls among the open/write/close calls, shows that the write line is what triggers the entry:
self.logger.info(f"{output_filename}")
self.logger.info('get_output 1')
output_file = open(output_filename, 'w', encoding='utf-8')
self.logger.info('get_output 2')
output_file.write(xml_response.text)
self.logger.info('get_output 3')
output_file.close()
self.logger.info('get_output 4')
return True
Edit 2:
My solution, based on deceze's answer, was to add xml_response.encoding = 'utf-8' before the write; the log message no longer appears. I changed the title of the post to reflect the actual problem. I did not realize text was a property with some code behind it; I thought it was a simple get to the buffer.
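For anyone landing here, this is a minimal sketch of that fix. It assumes the response object behaves like a requests.Response, meaning that text decodes the body using encoding and falls back to guessing when encoding is None. The helper name write_xml is hypothetical and only for illustration; forcing 'utf-8' is only appropriate when the server is known to send UTF-8.

```python
def write_xml(response, output_filename):
    """Write the response body to a file as UTF-8.

    Assumption: `response` behaves like requests.Response, i.e. `.text`
    decodes the body with `.encoding` and guesses when it is None.
    """
    # Set the encoding before touching .text so no guessing is needed.
    response.encoding = "utf-8"
    # The context manager closes the file even if the write fails.
    with open(output_filename, "w", encoding="utf-8") as output_file:
        output_file.write(response.text)
```

In my script this would be called as write_xml(xml_response, output_filename) in place of the open/write/close lines.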
f"{output_filename}" is pretty redundant. Just self.logger.info(output_filename) will do the same thing.
[charset_normalizer] DEBUG: Encoding detection: utf_8 is most likely the one. that come after HTTP requests. Perhaps the API you call didn't specify a Content-Type, so whatever client is tried to detect the encoding?
client and get_request? Does the package they came from use charset-normalizer? And what are the full log entries?
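If the goal is only to hide the message rather than avoid the detection, a possible alternative is to raise the level of the logger that emits it. This is a sketch that assumes the message comes from the logger named charset_normalizer, as the bracketed prefix in the comment above suggests; if the installed stack uses a different detector, the logger name would differ. Unlike setting the encoding, this leaves the detection itself running.

```python
import logging

# Assumption: the DEBUG line is emitted by the "charset_normalizer" logger.
# Raising its level to INFO drops its DEBUG records without touching
# the level of any other logger in the application.
logging.getLogger("charset_normalizer").setLevel(logging.INFO)
```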