I've added a JSON column, metadata_json, to a database table, digilearning_support_files, and am trying to add an index on a value that will be stored in it.

In the MySQL client terminal, the following works fine:

mysql> ALTER TABLE digilearning_support_files ADD INDEX metadata_json_category (( CAST(metadata_json->>"$.category" as CHAR(255)) COLLATE utf8mb4_bin )) USING BTREE;
Query OK, 0 rows affected (0.19 sec)
Records: 0  Duplicates: 0  Warnings: 0

and I can see that it will be used, with an EXPLAIN query:

mysql> EXPLAIN SELECT * FROM digilearning_support_files WHERE metadata_json->>"$.category" = 'Lesson Plan'\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: digilearning_support_files
   partitions: NULL
         type: ref
possible_keys: metadata_json_category
          key: metadata_json_category
      key_len: 1023
          ref: const
         rows: 1
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

So, the EXPLAIN is showing that it will use that index to test the value of the 'category' key in the stored JSON. So far so good.

I wrote a helper method to run in Rails migrations, to set up an index in this way. But, when it runs in Rails, it complains about the collation:

>> ActiveRecord::Base.connection.execute("ALTER TABLE digilearning_support_files ADD INDEX metadata_json_category (( CAST(metadata_json->>\"$.category\" as CHAR(255)) COLLATE utf8mb4_bin )) USING BTREE")

ActiveRecord::StatementInvalid: Mysql::Error: COLLATION 'utf8mb4_bin' is not valid for CHARACTER SET 'utf8mb3': ALTER TABLE digilearning_support_files ADD INDEX metadata_json_category (( CAST(metadata_json->>"$.category" as CHAR(255)) COLLATE utf8mb4_bin )) USING BTREE

Why is the MySQL client happy with this but not when the exact same SQL query runs in Rails?

As an experiment I changed the Rails query to use utf8mb3_bin instead. That let me add the index, but EXPLAIN says it won't use it, and I'm pretty sure that is the wrong value to use for JSON anyway.

This is the full definition for the metadata_json column (note Collation: NULL):

     Field: metadata_json
      Type: json
 Collation: NULL
      Null: YES
       Key:
   Default: NULL
     Extra:
Privileges: select,insert,update,references

However, some other columns in the table do have utf8mb3:

     Field: exclude_locales
      Type: varchar(255)
 Collation: utf8mb3_general_ci
      Null: YES
       Key:
   Default: NULL
     Extra:
Privileges: select,insert,update,references

This is the database default:

mysql> SELECT DEFAULT_COLLATION_NAME FROM information_schema.SCHEMATA WHERE SCHEMA_NAME = 'e_learning_resource_development' LIMIT 1;
+------------------------+
| DEFAULT_COLLATION_NAME |
+------------------------+
| utf8mb4_0900_ai_ci     |
+------------------------+
1 row in set (0.00 sec)

I'm wondering if the NULL value for the collation means it can be set to different things depending on the environment, which would explain why the mysql terminal behaves differently from Rails for the same query.
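
One way to test that idea might be to compare the session's character set variables in both environments. The following is only a minimal sketch: session_charset_settings is a hypothetical helper name, and it assumes the connection object responds to select_rows, as ActiveRecord connections do.

```ruby
# Hypothetical helper (not part of Rails): returns the session's character
# set and collation variables as a Hash of name => value.
CHARSET_VARIABLES_SQL =
  "SHOW VARIABLES WHERE Variable_name IN " \
  "('character_set_client', 'character_set_connection', 'collation_connection')"

def session_charset_settings(connection)
  # select_rows is expected to return an array of [name, value] pairs.
  Hash[connection.select_rows(CHARSET_VARIABLES_SQL)]
end
```

In a Rails console this could be called as session_charset_settings(ActiveRecord::Base.connection), and the result compared with the output of the same SHOW VARIABLES statement in the mysql client.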

  • What's the column collation? The error says COLLATION 'utf8mb4_bin' is not valid for CHARACTER SET 'utf8mb3'. This doesn't specify just the collation but the encoding of the text as well. Is the column's collation utf8 or utf8mb3 perhaps? utf8mb3 is deprecated and uses up to 3 bytes to encode characters. utf8mb4 uses up to 4. Commented Jan 23 at 13:33
  • It says NULL (see second edit). Should I create the column again and specify the collation at that point? I thought JSON columns were always utf8mb4. Commented Jan 23 at 14:37
  • I think I fixed it, see below. Thanks. Commented Jan 23 at 15:15

1 Answer

This seems to have been caused by my Rails database.yml file, which had this:

defaults: &defaults
  adapter: mysql
  encoding: utf8
  collation: utf8_general_ci
  database: e_learning_resource_development
  username: root
  host: 127.0.0.1
  port: 3306
  pool: 32

development:
  <<: *defaults

production:
  <<: *defaults

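As far as I can tell, encoding: utf8 gives the Rails connection the utf8mb3 character set, and a CAST(... AS CHAR) without an explicit character set produces a string in the connection's character set, so the utf8mb4_bin collation was rejected. The mysql client was presumably connecting with utf8mb4. I changed the following two lines, and it seems to work now: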

encoding: utf8mb4
collation: utf8mb4_bin
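
An alternative that might avoid depending on the connection settings at all is to name the character set inside the CAST. This is only a sketch: json_key_index_sql is a hypothetical helper name, the index naming scheme is just an illustration, and I have not confirmed with EXPLAIN that the optimizer uses an index built this way.

```ruby
# Hypothetical helper: builds the ALTER TABLE statement for a functional index
# on one key of a JSON column. Arguments are interpolated directly, so they
# must be trusted identifiers (e.g. hard-coded in a migration).
def json_key_index_sql(table, column, key, length: 255)
  index_name = "#{column}_#{key}"
  # Naming the character set in the CAST means the expression no longer
  # depends on the connection's character_set_connection setting.
  expression = "CAST(#{column}->>\"$.#{key}\" AS CHAR(#{length}) CHARACTER SET utf8mb4) COLLATE utf8mb4_bin"
  "ALTER TABLE #{table} ADD INDEX #{index_name} ((#{expression})) USING BTREE"
end

puts json_key_index_sql("digilearning_support_files", "metadata_json", "category")
```

In a migration, the returned string could be passed to execute, the same way as the hand-written statement in the question.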