"aws s3 cp" vs "aws s3 sync": Key Differences

This post explains the differences between the AWS CLI commands s3 cp and s3 sync.

Introduction

When I wanted to copy objects from one S3 bucket to another, two commands came to mind: s3 cp and s3 sync.

I thought I understood them intuitively, but I realized that I couldn't clearly explain how they differ.

So, I decided to research and organize the differences between these two commands.

Note: This article was translated from my original post.

Key Differences Between cp and sync

The most significant difference between the two can be summed up as follows:

  • cp is a file copy command.
  • sync is a directory synchronization command: it copies only new and updated files.

Comparing Descriptions

First, let's compare the descriptions from the AWS CLI documentation.

cp — AWS CLI 2.24.10 Command Reference
Description:

Copies a local file or S3 object to another location locally or in S3.


sync — AWS CLI 2.24.10 Command Reference
Description:

Syncs directories and S3 prefixes. Recursively copies new and updated files from the source directory to the destination. Only creates folders in the destination if they contain one or more files.

The cp command simply copies objects, whereas sync recursively copies newly created or updated files within a specified directory. Additionally, sync creates only those folders that contain at least one file.

This difference in descriptions aligns with the intuitive understanding that cp is used to copy files, while sync is used to synchronize directories (copying only what is new or has changed).
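To make "new and updated" concrete, here is a minimal local sketch of the default rule as I understand it from the sync documentation: a file is copied when it is missing at the destination, when the sizes differ, or when the source was modified more recently. The helpers size_of and needs_copy are hypothetical and work on local files only. They are not part of the AWS CLI, and the real command compares S3 object metadata rather than local files.

```shell
# Illustrative only: approximates sync's default "new or updated" check
# on local files. size_of and needs_copy are hypothetical helpers,
# not AWS CLI commands.

# Print the size of a file in bytes (tr strips padding that some wc builds add).
size_of() {
    wc -c < "$1" | tr -d ' '
}

# Exit status 0 means "copy this file", 1 means "skip it".
needs_copy() {
    local src=$1 dst=$2
    # New: nothing exists at the destination yet.
    [ -e "$dst" ] || return 0
    # Updated: the sizes differ.
    [ "$(size_of "$src")" -eq "$(size_of "$dst")" ] || return 0
    # Updated: the source was modified more recently than the destination.
    if [ "$src" -nt "$dst" ]; then
        return 0
    fi
    return 1
}
```

For example, needs_copy aaa.txt backup/aaa.txt && cp aaa.txt backup/aaa.txt would copy the file only when one of the three conditions holds.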

Comparing Command Options

Next, let's compare how these commands are used by looking at their available options.

Most of the cp and sync options are the same, but some are unique to each.

[Figure: Comparison of cp and sync command options. Gray: options available for both commands. Orange: options unique to each command.]

The following options are unique to the sync command:

  • --size-only
  • --exact-timestamps
  • --delete

With the --size-only option, sync compares only file sizes when deciding what to copy.

Makes the size of each key the only criteria used to decide whether to sync from source to destination.
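One consequence worth seeing in practice: if size is the only criterion, an edit that leaves the byte count unchanged would not be detected. The sketch below reproduces that on local files. needs_copy_size_only is a hypothetical helper written for this illustration, not an AWS CLI command.

```shell
# Illustrative only: a size-only comparison on local files.
# needs_copy_size_only is a hypothetical helper, not an AWS CLI command.
needs_copy_size_only() {
    local src=$1 dst=$2
    # A file missing at the destination is always copied.
    [ -e "$dst" ] || return 0
    # Size is the only criterion; timestamps and content are never checked.
    [ "$(wc -c < "$src" | tr -d ' ')" -ne "$(wc -c < "$dst" | tr -d ' ')" ]
}

# Demo: an edit that keeps the byte count unchanged goes unnoticed.
demo_dir=$(mktemp -d)
printf 'v1' > "$demo_dir/old.txt"
printf 'v2' > "$demo_dir/new.txt"
if needs_copy_size_only "$demo_dir/new.txt" "$demo_dir/old.txt"; then
    echo "copy"
else
    echo "skip"   # this branch runs: both files are 2 bytes
fi
rm -rf "$demo_dir"
```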


The --exact-timestamps option applies when syncing from S3 to local: same-sized files are skipped only if their timestamps match exactly.

When syncing from S3 to local, same-sized items will be ignored only when the timestamps match exactly. The default behavior is to ignore same-sized items unless the local version is newer than the S3 version.
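The difference between the default and --exact-timestamps is easier to follow as a small decision function. This sketch is a literal reading of the description quoted above, modeled with plain integers. s3_to_local_needs_copy is a hypothetical helper, and the real CLI may handle edge cases that this does not.

```shell
# Illustrative only: models the S3-to-local rule with plain integers.
# s3_to_local_needs_copy is a hypothetical helper; sizes are in bytes and
# mtimes are Unix epoch seconds. Pass "exact" as the fifth argument to
# model --exact-timestamps, anything else for the default.
s3_to_local_needs_copy() {
    local s3_size=$1 local_size=$2 s3_mtime=$3 local_mtime=$4 mode=$5
    # Different sizes always trigger a copy.
    [ "$s3_size" -ne "$local_size" ] && return 0
    if [ "$mode" = "exact" ]; then
        # --exact-timestamps: skip only when the timestamps are identical.
        [ "$s3_mtime" -ne "$local_mtime" ]
    else
        # Default: skip unless the local version is newer than the S3 version.
        [ "$local_mtime" -gt "$s3_mtime" ]
    fi
}
```

Under this reading, a same-sized S3 object that is newer than the local file (for example s3_to_local_needs_copy 10 10 200 100 default) is skipped by default, while the same call with exact reports that a copy is needed.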


The --delete option removes files in the destination that do not exist in the source.

Files that exist in the destination but not in the source are deleted during sync.
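Because --delete removes data, it can be worth previewing the result first. A minimal sketch, assuming you combine it with the --dryrun option, which displays the operations without performing them. preview_mirror is a hypothetical wrapper name, and the bucket names in the example are the ones used later in this post.

```shell
# Sketch: preview what a mirror-style sync would do before running it.
# preview_mirror is a hypothetical wrapper, not an AWS CLI command.
preview_mirror() {
    local src=$1 dst=$2
    # --delete removes destination files that are missing from the source;
    # --dryrun only displays the operations without performing them.
    aws s3 sync "$src" "$dst" --delete --dryrun
}

# Example (needs AWS credentials, so it is left commented out):
# preview_mirror s3://cp-sync-source s3://sync-destination
```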


All three of these sync-specific options reinforce the idea that sync is designed to copy only the differences within a directory.


Now, let's look at options unique to the cp command:

  • --expected-size <value>
  • --recursive

The --expected-size <value> option is needed only when uploading a stream larger than 50GB to S3. Without it, the upload may fail because it ends up with too many parts.

This argument specifies the expected size of a stream in terms of bytes. Note that this argument is needed only when a stream is being uploaded to s3 and the size is larger than 50GB. Failure to include this argument under these conditions may result in a failed upload due to too many parts in upload.
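As a rough illustration of the stream case, cp can read from standard input when the source is given as "-", and the size hint is passed alongside it. upload_stream is a hypothetical wrapper, and the bucket, key, and byte count in the example are placeholders. My understanding is that the hint matters because S3 limits a multipart upload to 10,000 parts, so the CLI presumably needs the total size to choose a large enough part size.

```shell
# Sketch: upload a stream from standard input with a size hint.
# upload_stream is a hypothetical wrapper; the destination and the byte
# count are supplied by the caller.
upload_stream() {
    local dest=$1 expected_bytes=$2
    # "-" tells cp to read the data from standard input.
    aws s3 cp - "$dest" --expected-size "$expected_bytes"
}

# Example (placeholder names; needs AWS credentials, so left commented out):
# tar -cf - ./data | upload_stream s3://my-bucket/data.tar 60000000000
```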


The --recursive option applies the copy to every file or object under the specified directory or prefix.

Command is performed on all files or objects under the specified directory or prefix.

Unlike sync, which is designed to handle directory-wide operations, cp requires the --recursive option to copy an entire directory.

This clearly illustrates the fundamental difference between cp and sync—one focuses on file copying, while the other focuses on directory synchronization.

Comparing Command Behaviors

Finally, let's look at how these commands behave in practice by running them.

Simple S3 copy

Copying files between S3 buckets using sync and cp

First, let's copy multiple files between S3 buckets.

Using sync, this can be done with the following command:

$ aws s3 sync s3://cp-sync-source s3://sync-destination
copy: s3://cp-sync-source/aaa.txt to s3://sync-destination/aaa.txt
copy: s3://cp-sync-source/bbb.txt to s3://sync-destination/bbb.txt


With cp, copying a directory requires the --recursive option:

$ aws s3 cp s3://cp-sync-source s3://cp-destination --recursive
copy: s3://cp-sync-source/bbb.txt to s3://cp-destination/bbb.txt
copy: s3://cp-sync-source/aaa.txt to s3://cp-destination/aaa.txt

Copying updated S3 files

Next, let's compare how the commands handle files that were updated in S3.

[Figure: Differences in sync and cp after directory updates. Red: updated files.]

After updating bbb.txt and adding a new file ccc.txt in cp-sync-source, I ran sync again:

$ aws s3 sync s3://cp-sync-source s3://sync-destination
copy: s3://cp-sync-source/bbb.txt to s3://sync-destination/bbb.txt
copy: s3://cp-sync-source/ccc.txt to s3://sync-destination/ccc.txt

Only the updated bbb.txt and new ccc.txt are copied. The unchanged aaa.txt is not copied again.

On the other hand, using cp:

$ aws s3 cp s3://cp-sync-source s3://cp-destination --recursive
copy: s3://cp-sync-source/bbb.txt to s3://cp-destination/bbb.txt
copy: s3://cp-sync-source/aaa.txt to s3://cp-destination/aaa.txt
copy: s3://cp-sync-source/ccc.txt to s3://cp-destination/ccc.txt

All files, including aaa.txt, are copied again, regardless of whether they were modified.
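To reproduce this comparison without actually copying anything, both commands accept the --dryrun option. The following is a small sketch. preview_both is a hypothetical wrapper name, and it assumes the buckets already exist and that AWS credentials are configured.

```shell
# Sketch: show what each command would do, without performing the copies.
# preview_both is a hypothetical wrapper, not an AWS CLI command.
preview_both() {
    local src=$1 dst=$2
    echo "== sync would do =="
    aws s3 sync "$src" "$dst" --dryrun
    echo "== cp --recursive would do =="
    aws s3 cp "$src" "$dst" --recursive --dryrun
}

# Example (needs AWS credentials, so it is left commented out):
# preview_both s3://cp-sync-source s3://sync-destination
```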


Thus, the key differences are:

  • cp copies files.
  • sync synchronizes directories: it copies only new and updated files.

Conclusion

In this article, we explored the differences between the AWS CLI cp and sync commands.

At a high level, the difference is intuitive—copying vs. synchronizing. However, comparing their options helped clarify their intended use cases.

Even when you think you understand a command, revisiting its documentation often leads to new insights.
