5

I am trying to add an Amazon RDS (Postgres 15) connection in AWS Glue:

enter image description here

My JDBC URL is

jdbc:postgresql://xxx.xxx.us-west-2.rds.amazonaws.com:5432/my_db

which I can successfully connect in pgAdmin 4.

However, when I test this connection by

enter image description here

enter image description here

I got "FAILED".

enter image description here

The IAM role I am using even has AmazonRDSFullAccess:

enter image description here

When I click "View log" button, the error log is not really helpful. It seems AWS has issue parsing the errors:

ERROR StatusLogger Unrecognized format specifier [d]
ERROR StatusLogger Unrecognized conversion specifier [thread] starting at position 25 in conversion pattern.
ERROR StatusLogger Unrecognized format specifier [thread]

enter image description here

And here is my Security Group config. Basically both inbound and outbound rules allow all traffic:

Security Group - Inbound rules:

enter image description here

Security Group - Outbound rules:

enter image description here

I am wondering what else could cause this connection issue? Thanks!

3 Answers 3

7

After more research, I found the issue is because AWS Glue is using an old JDBC driver. So I have to upload the latest driver manually to use.

Step 1. Download latest JDBC driver and upload to S3

Download latest JDBC driver at https://jdbc.postgresql.org/download/ As of today, I downloaded latest JDBC driver 42.6.0 Java 8 version which filename is postgresql-42.6.0.jar.

Then I uploaded at s3://my-bucket/aws-glue/postgresql-jdbc-driver/postgresql-42.6.0.jar

Step 2. Update the JDBC connection

Update your JDBC connection in AWS Glue -> Connectors with

  • JDBC Driver Class name: org.postgresql.Driver
  • JDBC Driver S3 Path: s3://my-bucket/aws-glue/postgresql-jdbc-driver/postgresql-42.6.0.jar

enter image description here

Step 3. Add new access for the AWS Glue IAM role that is being used

In order to use my own JDBC driver, I need add the following permissions to the IAM role:

  • Grant permissions for the following job actions: CreateJob, DeleteJob, GetJob, GetJobRun, StartJobRun.
  • Grant permissions for IAM actions: iam:PassRole.
  • Grant permissions for Amazon S3 actions: s3:DeleteObjects, s3:GetObject, s3:ListBucket, s3:PutObject.
  • Grant service principal access to bucketin the IAM policy and your JDBC driver S3 location.

The default AWSGlueServiceRole policy has CreateJob, DeleteJob, GetJob, GetJobRun, StartJobRun. So I just need provide the rest of access to a new policy AWSGlueServiceRole-MyPolicy I created:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket/aws-glue/postgresql-jdbc-driver/postgresql-42.6.0.jar",
                "arn:aws:s3:::my-bucket"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "iam:PassRole"
            ],
            "Resource": [
                "arn:aws:iam::xxxxxxxxxxxx:role/service-role/AWSGlueServiceRole-xxx"
            ]
        }
    ]
}

Now the IAM role have

  • The default AWSGlueServiceRole policy
  • My new policy AWSGlueServiceRole-MyPolicy

(Note you do NOT need default AmazonRDSFullAccess policy)

enter image description here

Note the "Test Connection" will still fail with error log

Caused by: com.amazonaws.services.glue.exceptions.InvalidInputException: Testing connections with custom drivers is not currently supported. Caused by: com.amazonaws.services.glue.exceptions.InvalidInputException: Testing connections with custom drivers is not currently supported.

However, the crawler should succeed!

enter image description here

Extra Notes

Based on AWS Glue doc, Glue 4.0 is using PostgreSQL JDBC driver 42.3.6 which is quite new. However, I am not sure why the connection still failed.

Note my Amazon RDS is using Postgres 15 this case, and my show password_encryption; returns scram-sha-256. In my case, I do not need set password_encryption = 'md5'; like some users did.

Reference

Sign up to request clarification or add additional context in comments.

Comments

3

For me the core of the issue turned out to be password encryption purely. Nothing was working with a custom driver. I reverted to the default driver and simply updated my dynamic user creation to use set password_encryption = 'md5';

It may be that they have recently resolved the default driver issue. I'd recommend just trying with a user created with md5 pass encryption before anything else and move forward from there.

Comments

0

Yes password encryption type is an issue. AWS support gave me a similar answer. they said scram-sha-256 is not supported. you can find out about your password encryption type with the SCRAM command either in the query editor or with CLI, see reference [reference] https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL_Password_Encryption_configuration.html#PostgreSQL_Password_Encryption_configuration.getting-ready

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.