How to migrate an AWS S3 Bucket to another account or service – CloudSavvy IT

AWS’s Simple Storage Service (S3) is great for storing large numbers of objects, and its API is also supported by many competing services. If you want to move away from AWS, or just to another account, transferring an S3 bucket is easy.

How does it work?

If both the source and destination services are S3 compatible, you can simply use a utility such as rclone, configured to access each service, to transfer all the objects. For example, you can transfer from S3 to DigitalOcean’s S3-compatible Spaces service, or from an S3 bucket on one account to a bucket on another.

To handle the transfer, rclone will read from the source bucket, find all the files to be transferred, and handle cloning them into the destination bucket. rclone can also handle file updates, which can be useful if the source bucket is written to while it is being transferred.

As for transfer times, it will probably take a while, depending on the size of the bucket. The number of files also matters, because rclone adds overhead for each file it transfers. If you have millions of files or multiple terabytes of data, you should be prepared for hours of transfer time.
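As a rough back-of-the-envelope check, you can estimate the transfer time from the bucket size and the throughput you expect to sustain. The sketch below is only an illustration: the function name estimate_hours and the example figures (2 TB at 500 Mbit/s) are assumptions, and it ignores the per-file overhead mentioned above, so treat the result as a lower bound.

```shell
# Estimate the hours needed to move $1 gigabytes at a sustained $2 Mbit/s.
# Ignores per-object overhead, so the real transfer will take longer.
estimate_hours() {
  awk -v gb="$1" -v mbps="$2" \
    'BEGIN { printf "%.1f\n", gb * 8 * 1000 / mbps / 3600 }'
}

estimate_hours 2000 500   # 2 TB at a sustained 500 Mbit/s
```

With these example figures the function prints 8.9, meaning roughly nine hours before any per-file overhead is counted.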

Fortunately, you can perform the large initial transfer while still actively writing to the bucket. You will probably need some downtime to make sure the buckets are in sync before you finally switch. If this is a problem, there are other tools available for seamless transfer, including commercial tools like NetApp Cloud Sync that can sync multiple buckets together.

Some cloud services, like Google Cloud Platform, have their own services that can handle the transfer. If you’re moving to a platform that supports this, you’ll probably use their service instead.

Setting up rclone

The simplest method is to set up rclone on your own server to handle the transfer operation. You should run it in the background or inside a tmux session so you can disconnect during long transfers.
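One way to do this with tmux is sketched below. The helper names start_transfer_session and attach_transfer_session, and the session name used in the usage note, are made up for illustration; the tmux subcommands themselves (new-session -d -s and attach-session -t) are standard.

```shell
# Start a command inside a detached tmux session so it survives an SSH disconnect.
# $1 is the session name, $2 is the command line to run inside it.
start_transfer_session() {
  # -d creates the session without attaching to it.
  tmux new-session -d -s "$1" "$2"
}

# Reattach later to check on progress (detach again with Ctrl-b d).
attach_transfer_session() {
  tmux attach-session -t "$1"
}
```

For example, start_transfer_session rclone-transfer 'rclone sync s3:source-bucket spaces:destination-bucket -P' starts the transfer, and attach_transfer_session rclone-transfer brings it back to your terminal after you reconnect.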

rclone is available from most package managers:

apt install rclone -y

rclone does not ship with any remotes configured, so it needs some configuration before it can handle transfers between S3 services. The configuration file is located at:

~/.config/rclone/rclone.conf

Add a new block with the following configuration that links it to your AWS account (not a specific bucket):

[s3]
type = s3
env_auth = false
acl = private
access_key_id = ACCESS_KEY
secret_access_key = SECRET_KEY
region = REGION
location_constraint = LOCATION_CONSTRAINT

Fill in your access key and secret, and enter the region your buckets are in. You can find a list of regions in the AWS documentation.

You need a second block for the service you are transferring to. If you are moving between AWS accounts, you will need a separate key that can access the other account. If you are moving to a service like DO Spaces, define another block with its endpoint configured:

[spaces]
type = s3
env_auth = false
acl = private
access_key_id = ACCESS_KEY
secret_access_key = SECRET_KEY
endpoint = nyc3.digitaloceanspaces.com

Either way, give the second block a different name in its title, because these are two separate remotes.
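For the case of moving between two AWS accounts, the second block could look like the first with different credentials. The sketch below appends such a block to a config file. The remote name s3-dest, the helper name add_dest_remote, and the DEST_ placeholders are assumptions for illustration; replace the placeholders with the destination account's real values.

```shell
# Append a second AWS remote, here called "s3-dest" (a hypothetical name),
# to the rclone config file given as the first argument.
add_dest_remote() {
  conf="$1"
  mkdir -p "$(dirname "$conf")"
  cat >> "$conf" <<'EOF'

[s3-dest]
type = s3
env_auth = false
acl = private
access_key_id = DEST_ACCESS_KEY
secret_access_key = DEST_SECRET_KEY
region = DEST_REGION
EOF
  # The file holds credentials, so keep it readable by the owner only.
  chmod 600 "$conf"
}
```

Calling add_dest_remote ~/.config/rclone/rclone.conf adds the block to the default config file, after which you can edit the placeholder values by hand.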

Running the transfer

Once configured, you will be able to list all of your remotes:

rclone listremotes

s3:
spaces:

You can confirm each remote's type by adding the --long flag to the rclone listremotes command.

You can see the contents of a bucket using the remote name followed by a colon and the bucket name:

rclone tree s3:source-bucket

Then you can run the synchronization with some extra flags for optimal performance:

rclone sync s3:source-bucket spaces:destination-bucket \
  -P -v --log-file /var/log/rclone/rclone-1.log \
  --create-empty-src-dirs --s3-chunk-size 20M \
  --s3-upload-concurrency 64 --checksum

The -P flag shows progress interactively in your terminal, including an estimate of how long the transfer will take.
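Because rclone sync makes the destination match the source, including deleting destination files that are not in the source, it can be worth previewing the run first. The sketch below wraps the preview and the real run in one helper. The function name sync_with_preview and the confirmation prompt are assumptions for illustration; --dry-run is a standard rclone flag.

```shell
# Preview a sync, then run it for real only after confirmation.
# $1 and $2 are rclone paths in remote:bucket form.
sync_with_preview() {
  src="$1"
  dst="$2"
  # --dry-run reports what would be copied or deleted without changing anything.
  rclone sync "$src" "$dst" --dry-run --checksum -v || return 1
  printf 'Proceed with the real sync? [y/N] '
  read -r answer
  [ "$answer" = "y" ] || return 0
  rclone sync "$src" "$dst" --checksum -P
}
```

A call such as sync_with_preview s3:source-bucket spaces:destination-bucket prints the planned changes and waits for you to type y before transferring anything.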

rclone sync will simply scan the source bucket and update the destination bucket to match. You can keep writing to the source bucket while the transfer runs. Once it finishes, you can run the sync again to pick up anything that changed in the meantime and bring the buckets back in sync.
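For the final pass before switching over, one possible approach is to stop writes to the source, sync once more, and then compare the two buckets with rclone check, which exits with a non-zero status if it finds differences. The helper name final_sync_and_verify and its messages are assumptions for illustration, not part of rclone.

```shell
# Final pass before cutover: stop writers first, then sync once more and verify.
# $1 and $2 are rclone paths in remote:bucket form.
final_sync_and_verify() {
  src="$1"
  dst="$2"
  rclone sync "$src" "$dst" --checksum -P || return 1
  # rclone check compares sizes and hashes on both sides.
  if rclone check "$src" "$dst"; then
    echo "Buckets match; safe to switch over."
  else
    echo "Differences found; do not switch yet." >&2
    return 1
  fi
}
```

For example, final_sync_and_verify s3:source-bucket spaces:destination-bucket only reports success when both the sync and the comparison complete without errors.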
