Moving datasets is half the job. Do it once, do it right. Here’s a clean, repeatable way to get large data into and out of your GPU computing service with integrity checks and resumable transfers.
What this covers
- Install and configure rclone on a CUDA‑ready template
- Copy to/from S3‑compatible storage and SSH/SFTP servers
- Make and verify SHA‑256 manifests
- Resume safely after disconnects
- Pick chunk sizes, parallelism, and compression that matter
Opinion: use rclone for cloud/object storage; use rsync only for LAN/SSH copies when both ends are POSIX and you need hard links/permissions.
1) Install rclone (once per template)
Inside your running container:
curl -fsSL https://rclone.org/install.sh | sudo bash
rclone version
Keep rclone in your custom template so you don’t repeat this.
2) Configure a remote (S3 or SSH)
Start the interactive config:
rclone config
Add a remote:
- S3 (AWS, MinIO, Wasabi, etc.): choose s3, then set provider, region, and access keys.
- SFTP/SSH: choose sftp, then set host, port, and key path.
Don’t bake secrets into images. Store access keys in the rclone config, or set env vars at runtime.
Env‑only (no interactive config) — S3 example
# note: the remote name inside the variable must be UPPERCASE; reference it as myremote: in commands
export RCLONE_CONFIG_MYREMOTE_TYPE=s3
export RCLONE_CONFIG_MYREMOTE_PROVIDER=AWS
export RCLONE_CONFIG_MYREMOTE_ACCESS_KEY_ID=XXXX
export RCLONE_CONFIG_MYREMOTE_SECRET_ACCESS_KEY=YYYY
# optional: custom endpoint
# export RCLONE_CONFIG_MYREMOTE_ENDPOINT=https://s3.my-org.example
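The same env-only pattern works for an SFTP remote; the host, user, and key path below are placeholders:
# env-only SFTP remote; reference it as sftpremote: in commands, as in the SSH example further down
export RCLONE_CONFIG_SFTPREMOTE_TYPE=sftp
export RCLONE_CONFIG_SFTPREMOTE_HOST=storage.example.org
export RCLONE_CONFIG_SFTPREMOTE_USER=ubuntu
export RCLONE_CONFIG_SFTPREMOTE_KEY_FILE=~/.ssh/id_ed25519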
3) Copy data in (and resume if it breaks)
To instance (S3 → NVMe)
# pull a dataset down to local /data
mkdir -p /data
rclone copy myremote:datasets/projectA /data \
--progress --transfers 16 --checkers 8 --fast-list \
--s3-chunk-size 64M --s3-upload-concurrency 6
From instance (NVMe → S3)
rclone copy /data/results myremote:results/projectA \
--progress --transfers 16 --checkers 8 --fast-list \
--s3-chunk-size 64M --s3-upload-concurrency 6
- Resumable: if a transfer is interrupted, re-run the same command; rclone skips files that already match on the destination and re-copies only what is missing or partial.
- Tuning: start with the settings above; raise --transfers gently until bandwidth or IOPS saturate. Large objects like .tar.zst prefer a larger --s3-chunk-size (128M+); see the example below.
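For instance, pushing one big archive might look like the following; the filename and values are only illustrative starting points:
# single large object: chunk size and upload concurrency matter more than --transfers
rclone copy /data/run123.tar.zst myremote:archives/ \
  --progress --s3-chunk-size 128M --s3-upload-concurrency 8
With a single file, --transfers has little effect; --s3-upload-concurrency controls how many multipart chunks upload in parallel.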
SSH/SFTP example
rclone copy /data/results sftpremote:/srv/results/projectA \
--progress --transfers 8 --checkers 4
4) Integrity: SHA‑256 manifests you can trust
Make a manifest on the source, copy data and manifest, then verify on the destination.
Create manifest at source
cd /data/results
# exclude the manifest itself so it doesn't end up hashing its own (still empty) output file
rclone hashsum sha256 . --exclude SHA256SUMS.txt > SHA256SUMS.txt
Copy data + manifest
rclone copy /data/results myremote:results/projectA --progress
rclone copy /data/results/SHA256SUMS.txt myremote:results/projectA
Verify at destination (downloaded)
# Option A: verify after download back on another machine
rclone copy myremote:results/projectA ./projectA
cd projectA && sha256sum -c SHA256SUMS.txt
Verify in place (remote hash listing)
# If your remote exposes SHA-256/MD5, list remote hashes and compare
rclone hashsum sha256 myremote:results/projectA --exclude SHA256SUMS.txt > REMOTE_SHA256.txt
# diff REMOTE_SHA256.txt with your local manifest (paths must match)
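One quick way to compare the two listings, assuming both were generated relative to the same root so the paths line up, is to sort and diff them:
# run where both listings are available
sort -k2 SHA256SUMS.txt > local.sorted
sort -k2 REMOTE_SHA256.txt > remote.sorted
diff local.sorted remote.sorted && echo "manifests match"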
If the object store doesn’t expose strong hashes per part (common with S3 multipart), trust the manifest workflow: recompute locally after download and compare.
5) Sync vs copy, and delete safety
copy only adds/updates files on the destination. sync makes the destination match the source, including deletes. Use it with care:
rclone sync /data/results myremote:results/projectA --progress --delete-before
Add --dry-run first to preview what would be deleted.
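For example, a preview pass on the same paths reports what sync would copy and delete without touching the destination:
# nothing is transferred or deleted while --dry-run is set
rclone sync /data/results myremote:results/projectA --progress --delete-before --dry-run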
6) Fewer files = faster transfers (bundle smartly)
Millions of tiny files stall on metadata. Bundle logically, then compress.
# bundle and compress (multi-core)
cd /data/run123
tar -I 'zstd -T0 -19' -cf run123.tar.zst .
# upload the single archive + a tiny MANIFEST file listing contents
rclone copy run123.tar.zst myremote:runs/ --progress
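One way to produce that content listing before uploading; MANIFEST.txt is just a naming convention here:
# list the archive's contents into a small manifest and upload it alongside
tar -I zstd -tf run123.tar.zst > MANIFEST.txt
rclone copy MANIFEST.txt myremote:runs/ --progress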
Prefer zstd for speed; use pigz for gzip compatibility. Keep bundles below a few tens of GB if you need easy partial re-runs.
7) Move data between buckets or projects
You can copy remote→remote without staging data on the instance's NVMe (same-provider copies can be server-side; cross-provider copies still stream through the instance's network):
rclone copy awsA:bucketA/prefix gsB:bucketB/prefix --progress --transfers 32 --checkers 16
Works across providers as long as both remotes are configured.
8) Bandwidth and reliability knobs
- --bwlimit 100M caps bandwidth if you share a link.
- --retries 8 --low-level-retries 20 helps on flaky paths.
- --timeout 2m --contimeout 10s tunes slow endpoints.
- --checksum makes rclone compare hashes instead of size/modtime when the remote supports them (a combined example follows).
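Put together, a conservative long-haul copy might look like this; the values are starting points, not universal recommendations:
rclone copy /data/results myremote:results/projectA \
  --progress --transfers 8 --checkers 8 --checksum \
  --retries 8 --low-level-retries 20 \
  --timeout 2m --contimeout 10s --bwlimit 100M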
Log the exact command in your run card.
9) rsync when both ends are POSIX
For SSH on LAN or a well‑peered WAN, rsync
is great:
# trailing slash on the source copies its contents, matching the rclone layout above
rsync -avhP --delete --partial --partial-dir=.rsync-partial \
  /data/results/ user@host:/srv/results/projectA
--partial keeps partially transferred files so an interrupted run can resume where it left off. Still write a SHA-256 manifest and verify.
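If the manifest was copied along with the data, a quick check on the receiving host might look like this; user, host, and path are the same placeholders as above:
ssh user@host 'cd /srv/results/projectA && sha256sum -c --quiet SHA256SUMS.txt'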
10) Security basics
- Keep access keys in rclone config or env vars, not in images.
- Mount secrets at runtime; don’t commit them (one pattern is shown below).
- Prefer VPN/SSH to open buckets. If public, restrict by IP and expire presigned URLs quickly.
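One concrete pattern, assuming your platform can mount files into the container at runtime (the /run/secrets path is just an example): point rclone at an injected config file instead of baking keys into the image.
# read credentials from a config mounted read-only at runtime
rclone --config /run/secrets/rclone.conf lsd myremote: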
Methods snippet (copy‑paste)
transfers:
  tool: "rclone 1.xx"
  source:
    type: "local | s3 | sftp | gcs | azure | minio"
    url: "<path or remote:bucket/prefix>"
  destination:
    type: "local | s3 | sftp | gcs | azure | minio"
    url: "<path or remote:bucket/prefix>"
  command: |
    rclone copy <src> <dst> --transfers 16 --checkers 8 --s3-chunk-size 64M --progress
  manifest:
    algo: "SHA-256"
    file: "SHA256SUMS.txt"
    verified: "yes | no"
  notes: "row group size, compression, any retries/timeouts"
Try Compute today
Start a GPU instance with a CUDA-ready template (e.g., Ubuntu 24.04 LTS / CUDA 12.6) or your own GROMACS image. Enjoy flexible per-second billing with custom templates and the ability to start, stop, and resume your sessions at any time. Unsure about FP64 requirements? Contact support to help you select the ideal hardware profile for your computational needs.