This post explains the differences between the AWS CLI commands s3 cp and s3 sync.
Introduction
When I wanted to copy objects from one S3 bucket to another, two commands came to mind: s3 cp and s3 sync.
I thought I understood them intuitively, but I realized that I couldn't clearly explain how they differ.
So I decided to research the two commands and organize their differences.
Note: This article was translated from my original post.
Key Differences Between cp and sync
The most significant difference between the two can be summed up as follows:
cp is a file copy command.
sync is a directory synchronization command (it copies only new or updated files).
Comparing Descriptions
First, let's compare the descriptions from the AWS CLI documentation.
cp — AWS CLI 2.24.10 Command Reference
Description:
Copies a local file or S3 object to another location locally or in S3.
sync — AWS CLI 2.24.10 Command Reference
Description:
Syncs directories and S3 prefixes. Recursively copies new and updated files from the source directory to the destination. Only creates folders in the destination if they contain one or more files.
The cp command simply copies objects, whereas sync recursively copies new or updated files within a specified directory. Additionally, sync creates only those folders that contain at least one file.
This difference in descriptions aligns with the intuitive understanding that cp is used to copy files, while sync is used to synchronize directories (copying only what has changed).
Comparing Command Options
Next, compare how these commands are used by looking at their available options.
Most of the cp and sync options are the same, but some are unique to each.
The following options are unique to the sync command:
--size-only
--exact-timestamps
--delete
The --size-only option makes sync compare file sizes only, ignoring timestamps, when deciding whether to copy a file.
Makes the size of each key the only criteria used to decide whether to sync from source to destination.
The --exact-timestamps option applies when syncing from S3 to local: same-sized files are skipped only if their timestamps match exactly.
When syncing from S3 to local, same-sized items will be ignored only when the timestamps match exactly. The default behavior is to ignore same-sized items unless the local version is newer than the S3 version.
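The S3-to-local rule quoted above can be written out as a small decision function. This is only a sketch of the documented behavior, not the AWS CLI's actual code; the function name should_download and its parameters are hypothetical and exist purely for illustration.

```python
def should_download(s3_size, s3_mtime, local_size, local_mtime,
                    exact_timestamps=False):
    """Model of the documented S3 -> local sync rule (illustrative only)."""
    # A size difference always triggers a download.
    if s3_size != local_size:
        return True
    if exact_timestamps:
        # With --exact-timestamps, same-sized items are skipped only when
        # the timestamps match exactly.
        return s3_mtime != local_mtime
    # Default: same-sized items are skipped unless the local copy is newer.
    return local_mtime > s3_mtime


# Same size, but the S3 object was modified after the local file.
print(should_download(10, 300.0, 10, 200.0))
print(should_download(10, 300.0, 10, 200.0, exact_timestamps=True))
```

In this model, a same-sized object that changed on the S3 side is skipped by default and picked up only with exact_timestamps=True, which is the case the option is there for.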
The --delete option removes files in the destination that do not exist in the source.
Files that exist in the destination but not in the source are deleted during sync.
All three of these sync-specific options reinforce the idea that sync is designed to copy only what has changed within a directory.
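To make the effect of --size-only and --delete concrete, here is a minimal Python model of how a bucket-to-bucket sync might decide what to do. It assumes the rule from the sync documentation: an object is copied when it is missing from the destination, when the sizes differ, or when the source is newer. The names Obj and plan_sync are hypothetical, and the real CLI implementation may differ in its details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    size: int     # object size in bytes
    mtime: float  # last-modified time, seconds since the epoch


def plan_sync(source, dest, size_only=False, delete=False):
    """Return (keys to copy, keys to delete) for a simplified sync."""
    to_copy = []
    for key, src in source.items():
        dst = dest.get(key)
        if dst is None:
            to_copy.append(key)   # new file
        elif src.size != dst.size:
            to_copy.append(key)   # size changed
        elif not size_only and src.mtime > dst.mtime:
            to_copy.append(key)   # same size, but source is newer
    # --delete: remove destination keys that are absent from the source.
    to_delete = [k for k in dest if k not in source] if delete else []
    return sorted(to_copy), sorted(to_delete)


source = {"aaa.txt": Obj(10, 100.0), "bbb.txt": Obj(10, 300.0),
          "ccc.txt": Obj(7, 300.0)}
dest = {"aaa.txt": Obj(10, 200.0), "bbb.txt": Obj(10, 200.0),
        "old.txt": Obj(1, 200.0)}

print(plan_sync(source, dest))
print(plan_sync(source, dest, size_only=True))
print(plan_sync(source, dest, delete=True))
```

In this model, bbb.txt (same size, newer timestamp) is copied by default but skipped with size_only=True, and old.txt is removed only when delete=True.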
Now, let's look at options unique to the cp command:
--expected-size <value>
--recursive
The --expected-size <value> option is needed when streaming more than 50GB to S3; without it, the multipart upload may fail because it ends up with too many parts.
This argument specifies the expected size of a stream in terms of bytes. Note that this argument is needed only when a stream is being uploaded to s3 and the size is larger than 50GB. Failure to include this argument under these conditions may result in a failed upload due to too many parts in upload.
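The "too many parts" failure is easier to see with some arithmetic. S3 allows at most 10,000 parts per multipart upload, with a minimum part size of 5 MiB (except for the last part). The sketch below assumes a stream of unknown size is cut into fixed-size parts; the part size the CLI actually picks for streams is an assumption here, and the helper names are hypothetical.

```python
MAX_PARTS = 10_000               # S3 limit on parts per multipart upload
MIN_PART_SIZE = 5 * 1024 ** 2    # S3 minimum part size (5 MiB)
GIB = 1024 ** 3


def max_upload_size(part_size):
    """Largest upload (in bytes) that fits in MAX_PARTS parts of part_size."""
    return part_size * MAX_PARTS


def min_part_size_for(expected_size):
    """Smallest part size (in bytes) that keeps an upload within MAX_PARTS."""
    needed = (expected_size + MAX_PARTS - 1) // MAX_PARTS  # integer ceiling
    return max(MIN_PART_SIZE, needed)


# With 5 MiB parts, the ceiling is a little under 50 GiB.
print(max_upload_size(MIN_PART_SIZE) / GIB)

# A 100 GiB stream needs parts of roughly 10.24 MiB or larger.
print(min_part_size_for(100 * GIB))
```

Under these assumptions, knowing the size up front is what lets the uploader choose a part size large enough to stay within the part limit.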
The --recursive option copies all files recursively from the specified directory.
Command is performed on all files or objects under the specified directory or prefix.
Unlike sync, which is designed to handle directory-wide operations, cp requires the --recursive option to copy an entire directory.
This clearly illustrates the fundamental difference between cp and sync: one focuses on file copying, while the other focuses on directory synchronization.
Comparing Command Behaviors
Finally, look at how these commands behave in practice by executing them.
Simple S3 copy
First, let's copy multiple files between S3 buckets.
Using sync, this can be done with the following command:
$ aws s3 sync s3://cp-sync-source s3://sync-destination
copy: s3://cp-sync-source/aaa.txt to s3://sync-destination/aaa.txt
copy: s3://cp-sync-source/bbb.txt to s3://sync-destination/bbb.txt
With cp, copying a directory requires the --recursive option:
$ aws s3 cp s3://cp-sync-source s3://cp-destination --recursive
copy: s3://cp-sync-source/bbb.txt to s3://cp-destination/bbb.txt
copy: s3://cp-sync-source/aaa.txt to s3://cp-destination/aaa.txt
Copying updated S3 objects
Next, compare how the commands handle updated S3 files.
After updating bbb.txt and adding a new file ccc.txt in cp-sync-source, running sync again:
$ aws s3 sync s3://cp-sync-source s3://sync-destination
copy: s3://cp-sync-source/bbb.txt to s3://sync-destination/bbb.txt
copy: s3://cp-sync-source/ccc.txt to s3://sync-destination/ccc.txt
Only the updated bbb.txt and new ccc.txt are copied. The unchanged aaa.txt is not copied again.
On the other hand, using cp:
$ aws s3 cp s3://cp-sync-source s3://cp-destination --recursive
copy: s3://cp-sync-source/bbb.txt to s3://cp-destination/bbb.txt
copy: s3://cp-sync-source/aaa.txt to s3://cp-destination/aaa.txt
copy: s3://cp-sync-source/ccc.txt to s3://cp-destination/ccc.txt
All files, including aaa.txt, are copied again, regardless of whether they were modified.
Thus, the key differences are:
cp copies files.
sync copies only new or updated files (synchronizes directories).
Conclusion
In this article, we explored the differences between the AWS CLI cp and sync commands.
At a high level, the difference is intuitive—copying vs. synchronizing. However, comparing their options helped clarify their intended use cases.
Even when you think you understand a command, revisiting its documentation often leads to new insights.