5

I want to insert bytes into my PostgreSQL (9.5.7) database column with the type bytea, using the Psycopg2 (2.7.1) copy_from() method.

I can insert my bytes with the following code :

psycopg2_cursor.copy_from(
    StringIO("\x30\x40\x50"),
    "my_table",
)

By executing a SELECT into my table after the insertion, I get the expected value from the bytea column:

\x304050

Now, I want to prepend my bytes with the byte 0:

psycopg2_cursor.copy_from(
    StringIO("\x00\x30\x40\x50"),
    "my_table",
)

I get the error : psycopg2.DataError: invalid byte sequence for encoding "UTF-8": 0x00. From my understanding, this error should only be triggered when inserting a null byte into a text field, but should work as expected into a bytea field. Am I missing something ? Is there any simple way to insert a null byte into a bytea column ?

Thanks!

2
  • what's your standard_conforming_strings setting? Commented Jun 12, 2017 at 7:42
  • did you try StringIO('\x30\x40\x50') instead?.. Commented Jun 12, 2017 at 7:57

2 Answers 2

2

https://www.postgresql.org/docs/current/static/sql-copy.html

the following characters must be preceded by a backslash if they appear as part of a column value: backslash itself, newline, carriage return, and the current delimiter character.

just realized you are using COPY, so you have to escape backslash:

t=# copy b from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> \\x00
>> \.
COPY 1
t=# copy b from stdin;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> \x00
>> \.
ERROR:  invalid byte sequence for encoding "UTF8": 0x00
CONTEXT:  COPY b, line 1: "\x00"

this should do the trick:

psycopg2_cursor.copy_from(
    StringIO("\\x00\\x30\\x40\\x50"),
    "my_table",
)
Sign up to request clarification or add additional context in comments.

6 Comments

I added the following parameter to my psycopg2 connection : options="-c standard_conforming_strings=on", but I still get the same error. I tried StringIO('\x30\x40\x50') and it works correctly, but I really need to insert this null byte. I tried to use a BytesIO instead of a StringIO, but I still get the same result.
try that? select decode('00203040','hex') - is it producing your wanted result?..
Yes, it produces the wanted result when using it in PSQL (INSERT INTO my_table VALUES(decode('001060', 'hex'));), but I need to insert my bytes with the copy_from() method from my python code, not directly with a SQL query.
last attempt: psycopg2_cursor.copy_from( StringIO("\\x00\\x30\\x40\\x50"), "my_table", ) ?..
Still the same (invalid byte sequence for encoding "UTF8": 0x00). The good news is that I can insert bytes from my python code when using cursor.execute("INSERT INTO my_table VALUES(decode('005566', 'hex'));") but for performances issues (details on my github: github.com/jean553/massive-insert-postgresql-tornado), I would prefer to insert with copy_from().
|
1

To insert a binary with copy it is necessary to use the binary format which is not what you want. Use the extras.execute_values method

from psycopg2.extensions import Binary

binaries = [[Binary('\x00\x20')], [Binary('\x00\x30')]]

insert_query = 'insert into t (b) values %s'
psycopg2.extras.execute_values (
    cursor, insert_query, binaries, page_size=100
)

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.