
In boto 2, you can write to an S3 object using these methods:

  • Key.set_contents_from_string()
  • Key.set_contents_from_file()
  • Key.set_contents_from_filename()
  • Key.set_contents_from_stream()

Is there a boto 3 equivalent? What is the boto3 method for saving data to an object stored on S3?

8 Answers


In boto 3, the 'Key.set_contents_from_' methods were replaced by:

  • Object.put()
  • Client.put_object()
For example:

import boto3

some_binary_data = b'Here we have some data'
more_binary_data = b'Here we have some more data'

# Method 1: Object.put()
s3 = boto3.resource('s3')
object = s3.Object('my_bucket_name', 'my/key/including/filename.txt')
object.put(Body=some_binary_data)

# Method 2: Client.put_object()
client = boto3.client('s3')
client.put_object(Body=more_binary_data, Bucket='my_bucket_name', Key='my/key/including/anotherfilename.txt')

Alternatively, the binary data can come from reading a file, as described in the official docs comparing boto 2 and boto 3:

Storing Data

Storing data from a file, stream, or string is easy:

# Boto 2.x
from boto.s3.key import Key
key = Key('hello.txt')
key.set_contents_from_file('/tmp/hello.txt')

# Boto 3
s3.Object('mybucket', 'hello.txt').put(Body=open('/tmp/hello.txt', 'rb'))
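
As written, put(Body=open(...)) leaves closing the file handle to garbage collection. If you prefer it closed deterministically, the same upload works inside a context manager (a minimal sketch, reusing the paths above):

# Boto 3, closing the file handle explicitly
import boto3

s3 = boto3.resource('s3')
with open('/tmp/hello.txt', 'rb') as data:
    s3.Object('mybucket', 'hello.txt').put(Body=data)
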
  • botocore.exceptions.NoCredentialsError: Unable to locate credentials. How to fix this? Commented Oct 16, 2017 at 7:10
  • @deepakmurthy I'm not sure why you're getting that error... You'd need to ask a new Stack Overflow question and provide more details about the issue.
    – jkdev
    Commented Oct 16, 2017 at 16:48
  • When I try s3.Object().put() I end up with an object with zero content-length. For me put() only accepts string data, but put(str(binarydata)) seems to have some sort of encoding issues. I end up with an object roughly 3 times the size of the original data, which makes it useless for me. (See the note after these comments.) Commented Feb 28, 2018 at 16:05
  • @user1129682 I'm not sure why that is. Could you please ask a new question and provide more details?
    – jkdev
    Commented Feb 28, 2018 at 17:28
  • @jkdev It'd be great if you could take a look. Commented Feb 28, 2018 at 20:33
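
A likely cause of the size blow-up described above: calling str() on bytes yields the textual representation ("b'...'"), not the original data. Encoding text explicitly avoids this (a minimal sketch; bucket and key are placeholders):

import boto3

s3 = boto3.resource('s3')
obj = s3.Object('my_bucket_name', 'my/key/including/filename.txt')

text = 'Here we have some data'
obj.put(Body=text.encode('utf-8'))  # str -> bytes, round-trips cleanly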

boto3 also has a method for uploading a file directly:

s3 = boto3.resource('s3')
s3.Bucket('bucketname').upload_file('/local/file/here.txt', 'folder/sub/path/to/s3key')

http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Bucket.upload_file

  • This is good, but it doesn't allow for data currently in memory to be stored.
    – Reid
    Commented Feb 16, 2019 at 23:58
  • @Reid: for in-memory files you can use the s3.Bucket(...).upload_fileobj() method instead (see the sketch after these comments).
    – svohara
    Commented Mar 26, 2019 at 23:04
  • How does writing from in-memory perform vs. uploading to s3 from a locally written file?
    – cdabel
    Commented May 6, 2021 at 22:34
  • @cdabel according to this answer, not much. Both will be roughly the same. Both written in python. Bottleneck is either disk-io or network-io.
    – codeananda
    Commented Mar 8, 2024 at 12:30
  • Note: s3key means the name you want your file to have in s3. If you pass a path, the relevant folders will be created. If you pass here.txt it will be uploaded to root. The current answer would make more sense if it was 'folder/sub/path/to/s3key.txt'.
    – codeananda
    Commented Mar 8, 2024 at 12:31

You no longer have to convert the contents to binary before writing to the file in S3. The following example creates a new text file (called newfile.txt) in an S3 bucket with string contents:

import boto3

s3 = boto3.resource(
    's3',
    region_name='us-east-1',
    aws_access_key_id=KEY_ID,
    aws_secret_access_key=ACCESS_KEY
)
content="String content to write to a new S3 file"
s3.Object('my-bucket-name', 'newfile.txt').put(Body=content)
  • I have no idea why my 'put' action has no access. I created this bucket and put my canonical id under the access list.
    – Drake .C
    Commented Mar 5, 2019 at 0:49
  • How do you give a prefix in this case? Meaning, what if you want to store the file in my-bucket-name/subfolder/?
    – kev
    Commented Apr 9, 2019 at 22:47
  • @kev you can specify that along with the filename: 'subfolder/newfile.txt' instead of 'newfile.txt' (see the sketch after these comments). Commented Apr 10, 2019 at 10:08
  • Re "You no longer have to convert the contents to binary before writing to the file in S3.", is this documented somewhere? I was looking at boto3.amazonaws.com/v1/documentation/api/latest/reference/…, and thought it only accepted bytes. I'm not sure what exactly constitutes a "seekable file-like object", but didn't think that included strings.
    – Emma
    Commented Jun 11, 2020 at 20:49
  • I may have been comparing this with download_fileobj(), which is for large multipart file uploads. The upload methods require seekable file objects, but put() lets you write strings directly to a file in the bucket, which is handy for lambda functions to dynamically create and write files to an S3 bucket.
    – Franke
    Commented Jun 13, 2020 at 13:11
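
As the comments note, a key prefix acts as the folder path (a minimal sketch; names are placeholders):

import boto3

s3 = boto3.resource('s3')
content = "String content to write to a new S3 file"
s3.Object('my-bucket-name', 'subfolder/newfile.txt').put(Body=content)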

Here's a nice trick to read JSON from s3:

import json, boto3
s3 = boto3.resource("s3").Bucket("bucket")
json.load_s3 = lambda f: json.load(s3.Object(key=f).get()["Body"])
json.dump_s3 = lambda obj, f: s3.Object(key=f).put(Body=json.dumps(obj))

Now you can use json.load_s3 and json.dump_s3 with the same API as json.load and json.dump:

data = {"test":0}
json.dump_s3(data, "key") # saves json to s3://bucket/key
data = json.load_s3("key") # read json from s3://bucket/key
  • Excellent. To get it to work, I added this extra bit: ...["Body"].read().decode('utf-8').
    – sedeh
    Commented Nov 19, 2018 at 15:05
  • Great idea. Anyway, it provides some space for naming improvements. Commented Jun 14, 2020 at 9:46
  • Proposed rewrite of this nice idea: gist.github.com/vlcinsky/bbeda4321208aa98745afc29b58e90ac Commented Jun 14, 2020 at 10:47
  • Hmm. It doesn't seem like a good idea to monkeypatch core Python library modules. Too surprising. Better to use plain functions or your own module, then call json without monkeypatching, e.g. def load_s3(key): json.load(s3.Object(key=key).get()["Body"]) and so forth (expanded in the sketch after these comments).
    – ggorlen
    Commented Mar 28, 2023 at 4:56
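
Following the last comment, a variant without monkeypatching might look like this (a minimal sketch; the bucket name is a placeholder):

import json

import boto3

bucket = boto3.resource("s3").Bucket("bucket")

def load_s3(key):
    # Read and parse s3://bucket/<key> as JSON
    return json.load(bucket.Object(key=key).get()["Body"])

def dump_s3(obj, key):
    # Serialize obj and write it to s3://bucket/<key>
    return bucket.Object(key=key).put(Body=json.dumps(obj))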

A cleaner and more concise version, which I use to upload files on the fly to a given S3 bucket and sub-folder:

import boto3

BUCKET_NAME = 'sample_bucket_name'
PREFIX = 'sub-folder/'

s3 = boto3.resource('s3')

# Creating an empty file called "_DONE" and putting it in the S3 bucket
s3.Object(BUCKET_NAME, PREFIX + '_DONE').put(Body="")

Note: You should ALWAYS put your AWS credentials (aws_access_key_id and aws_secret_access_key) in a separate file, for example ~/.aws/credentials
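
For reference, ~/.aws/credentials is a plain INI-style file (the values below are placeholders):

[default]
aws_access_key_id = YOUR_ACCESS_KEY_ID
aws_secret_access_key = YOUR_SECRET_ACCESS_KEY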

  • What's the Windows equivalent location for the AWS credentials file, since Windows won't support ~? Commented Mar 30, 2020 at 16:17
  • @HammanSamuel you may store it like C:\Users\username\.aws\credentials
    – kev
    Commented Apr 2, 2020 at 22:13
  • Better to store it in environment variables of lambda.
    – nats
    Commented Jul 5, 2021 at 3:29

After some research, I found that this can be achieved with a simple csv writer: it writes a list of dictionaries as CSV directly to an S3 bucket.

e.g. data_dict = [{"Key1": "value1", "Key2": "value2"}, {"Key1": "value4", "Key2": "value3"}], assuming that the keys in all the dictionaries are uniform.

import csv
from io import StringIO

import boto3

# Sample input dictionary
data_dict = [{"Key1": "value1", "Key2": "value2"}, {"Key1": "value4", "Key2": "value3"}]
data_dict_keys = data_dict[0].keys()

# creating a file buffer
file_buff = StringIO()
# writing csv data to file buffer
writer = csv.DictWriter(file_buff, fieldnames=data_dict_keys)
writer.writeheader()
for data in data_dict:
    writer.writerow(data)
# creating s3 client connection
client = boto3.client('s3')
# placing file to S3, file_buff.getvalue() is the CSV body for the file
client.put_object(Body=file_buff.getvalue(), Bucket='my_bucket_name', Key='my/key/including/anotherfilename.txt')
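
Continuing the example above: if you prefer the upload_fileobj() API shown in an earlier answer, the same text buffer can be wrapped in bytes first (a sketch; bucket and key are placeholders):

import io

client.upload_fileobj(
    io.BytesIO(file_buff.getvalue().encode('utf-8')),
    'my_bucket_name',
    'my/key/including/anotherfilename.csv',
)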

It is worth mentioning smart-open, which uses boto3 as a back-end.

smart-open is a drop-in replacement for Python's open that can open files from s3, as well as ftp, http and many other protocols.

For example:

from smart_open import open
import json
with open("s3://your_bucket/your_key.json", 'r') as f:
    data = json.load(f)

The AWS credentials are loaded via boto3, usually from a file in the ~/.aws/ dir or an environment variable.
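
Since the question is about writing, note that smart_open writes the same way, just with mode 'w' (a sketch; bucket and key are placeholders):

from smart_open import open
import json

data = {"test": 0}
with open("s3://your_bucket/your_key.json", 'w') as f:
    json.dump(data, f)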

  • While this response is informative, it doesn't adhere to answering the original question, which is: what are the boto3 equivalents of certain boto methods? Commented Sep 15, 2019 at 1:26
  • Smart open uses boto3.
    – Uri Goren
    Commented Sep 15, 2019 at 6:38
  • @UriGoren can you share an example to ftp to s3 using smart-open?
    – kms
    Commented Aug 24, 2021 at 23:28

You may use the code below to write, for example, an image to S3. To be able to connect to S3, you will have to install the AWS CLI using the command pip install awscli, then enter a few credentials using the command aws configure:

import urllib3
import uuid
from pathlib import Path
from io import BytesIO

import boto3
from errors import custom_exceptions as cex  # project-specific exceptions module

BUCKET_NAME = "xxx.yyy.zzz"
POSTERS_BASE_PATH = "assets/wallcontent"
CLOUDFRONT_BASE_URL = "https://xxx.cloudfront.net/"


class S3(object):
    def __init__(self):
        self.client = boto3.client('s3')
        self.bucket_name = BUCKET_NAME
        self.posters_base_path = POSTERS_BASE_PATH

    def __download_image(self, url):
        manager = urllib3.PoolManager()
        try:
            res = manager.request('GET', url)
        except Exception:
            print("Could not download the image from URL: ", url)
            raise cex.ImageDownloadFailed
        return BytesIO(res.data)  # any file-like object that implements read()

    def upload_image(self, url):
        try:
            image_file = self.__download_image(url)
        except cex.ImageDownloadFailed:
            raise cex.ImageUploadFailed

        extension = Path(url).suffix
        id = uuid.uuid1().hex + extension
        final_path = self.posters_base_path + "/" + id
        try:
            self.client.upload_fileobj(image_file,
                                       self.bucket_name,
                                       final_path
                                       )
        except Exception:
            print("Image Upload Error for URL: ", url)
            raise cex.ImageUploadFailed

        return CLOUDFRONT_BASE_URL + id
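
Hypothetical usage (the URL is a placeholder):

s3 = S3()
cdn_url = s3.upload_image("https://example.com/poster.jpg")
print(cdn_url)  # e.g. https://xxx.cloudfront.net/<uuid>.jpg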
