Running AWS at the Edge - Storage Backup to S3

Your edge device is generating data, but what happens when that Raspberry Pi fails? In this article, we’ll configure rclone to automatically sync your SeaweedFS storage with AWS S3, creating a robust backup strategy that works even on limited bandwidth.

We’ll set up rclone for both SeaweedFS and AWS S3, then explore the copy and sync commands to manage your backups effectively.

Prerequisites

  • SeaweedFS running and accessible (see the previous article for setup)
  • AWS Account with S3 access
  • AWS CLI configured with appropriate credentials
  • rclone installed on your system
  • Tailscale installed on your edge device (optional, for remote access)

Tailscale Funnel: Exposing SeaweedFS to the Internet

If you need to access your SeaweedFS instance remotely or from cloud services, Tailscale Funnel provides a secure way to expose your local service to the internet without port forwarding.

What is Tailscale Funnel?

Tailscale Funnel creates a public HTTPS endpoint for services running on your Tailscale network. Unlike Tailscale Serve (which only works within your tailnet), Funnel makes your service accessible to anyone on the internet. The traffic flows through Tailscale’s infrastructure and gets routed to your device over your existing Tailscale connection.
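
For a concrete sense of the difference, here is a minimal, hedged comparison using the same port this article exposes later (syntax as in recent Tailscale CLI versions; adjust for yours):

# Tailnet-only: reachable just from devices on your tailnet
sudo tailscale serve --bg 8333

# Public internet: reachable by anyone, fronted by Tailscale's infrastructure
sudo tailscale funnel --bg 8333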

Key Benefits:

  • No port forwarding or firewall configuration needed
  • Automatic HTTPS with valid TLS certificates
  • Built-in DDoS protection from Tailscale
  • Easy to enable and disable
  • Works behind NAT and restrictive firewalls

Enabling Tailscale Funnel for SeaweedFS

First, ensure your Tailscale node is up to date and Funnel is enabled for your tailnet:

  1. Expose the SeaweedFS S3 port over HTTPS. Since the SeaweedFS S3 API runs on port 8333, we’ll expose it via Funnel:
sudo tailscale funnel --bg 8333

This creates a public HTTPS endpoint (https://your-device.tail-scale.ts.net) that routes to port 8333 with automatic TLS certificates.

  2. Verify the Funnel is active:
tailscale funnel status

You should see output like:

https://raspberry-pi.tail-scale.ts.net (Funnel on)
|-- / proxy http://127.0.0.1:8333

Testing the Tailscale Funnel Endpoint

Test your SeaweedFS access through the Funnel endpoint:

# Create a bucket using the Funnel endpoint
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 mb s3://<BUCKET_NAME>

# List buckets
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 ls

If these commands work, your Funnel is properly configured.
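
As an optional extra check, you can round-trip a small test object through the Funnel endpoint (a quick sketch reusing the same placeholders; the file name is arbitrary):

# Upload and list a test object via the Funnel endpoint
echo "hello from the edge" > /tmp/funnel-test.txt
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 cp /tmp/funnel-test.txt s3://<BUCKET_NAME>/funnel-test.txt
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 ls s3://<BUCKET_NAME>/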

Rclone Configuration

Configure rclone to work with both SeaweedFS and AWS S3. We’ll create a configuration file that defines both storage endpoints.

Finding the Configuration Location

Find where rclone expects its configuration file:

rclone config file

You’ll see the location (typically ~/.config/rclone/rclone.conf). If the file doesn’t exist, create it:

mkdir -p ~/.config/rclone
touch ~/.config/rclone/rclone.conf

Configuration File Setup

Add the following configuration to your rclone.conf file:

[seaweedfs]
type = s3
provider = SeaweedFS
region = us-east-1
endpoint = <TAILSCALE_FUNNEL_ENDPOINT>
acl = private

[aws]
type = s3
provider = AWS
env_auth = true
region = us-east-1

Configuration Breakdown

SeaweedFS Remote ([seaweedfs]):

  • type = s3 - Tells rclone this is S3-compatible storage
  • provider = SeaweedFS - Specifies the S3 variant
  • endpoint - Your SeaweedFS S3 API endpoint
    • Use http://localhost:8333 if rclone runs on the same device as SeaweedFS
    • Use https://your-device.tail-scale.ts.net if accessing via Tailscale Funnel
  • acl = private - Sets default access control for uploads

AWS Remote ([aws]):

  • type = s3 - Standard S3 configuration
  • provider = AWS - Official AWS S3
  • env_auth = true - Use AWS credentials from environment/CLI config
  • region = us-east-1 - Your AWS region

Testing the Configuration

Test your configuration:

# List configured remotes
rclone listremotes

# List buckets in both locations
rclone lsd aws:
rclone lsd seaweedfs:

If you see your buckets listed, you’re all set—the configuration is working correctly.

Using Rclone Copy and Sync

With rclone configured, you have two main transfer options: copy and sync.

Rclone Copy

The copy command adds files to your destination without touching existing data—perfect for incremental backups.

Basic Copy Examples

Copy a single file:

# Backup today's log file
rclone copy seaweedfs:logs/app-2025-12-27.log aws:daily-backups/

Copy an entire bucket:

# Backup entire documents folder
rclone copy seaweedfs:documents aws:document-backups

Copy with progress display:

# Monitor transfer progress
rclone copy seaweedfs:documents aws:document-backups --progress

When to Use Copy

  • Adding files to an existing backup without removing anything
  • One-time transfers
  • When you want to preserve existing files at the destination

Rclone Sync

The sync command makes the destination identical to the source by copying new/changed files and deleting files that don’t exist in the source.

Basic Sync Examples

Sync entire buckets:

rclone sync seaweedfs:source-bucket aws:backup-bucket

Sync with bandwidth limiting:

rclone sync seaweedfs:source-bucket aws:backup-bucket --bwlimit 10M

When to Use Sync

  • Creating exact backups
  • Regular backup jobs where you want the destination to mirror the source
  • When you need to remove deleted files from backups

⚠️ Warning: Sync deletes files at the destination that don’t exist in the source. Always test with --dry-run first.

Useful Flags for Edge Devices

Performance Optimization

# Limit bandwidth to 10MB/s
--bwlimit 10M

# Reduce concurrent transfers for limited resources
--transfers 2

# Reduce file checking threads
--checkers 2

# Retry failed transfers
--retries 3

Testing and Monitoring

# Show what would be transferred without doing it
--dry-run

# Display transfer progress
--progress

# Verify file integrity with checksums
--checksum

# Log to file
--log-file /var/log/rclone.log

Filtering

# Only sync files newer than 2 days
--max-age 2d

# Exclude temporary files
--exclude "*.tmp"

# Include only specific file types
--include "*.pdf"

Practical Examples

Daily Backup Sync

rclone sync seaweedfs:important-data aws:daily-backup \
    --bwlimit 5M \
    --transfers 2 \
    --retries 3 \
    --log-file /var/log/daily-backup.log

Incremental Copy (Add Only)

rclone copy seaweedfs:documents aws:document-archive \
    --max-age 1d \
    --progress

Test Before Sync

# Always test first
rclone sync seaweedfs:source aws:backup --dry-run

# If the output looks correct, run without --dry-run
rclone sync seaweedfs:source aws:backup

Copy vs Sync: Choosing the Right Command

Use Case          | Command | Why
Initial backup    | copy    | Safe, won’t delete anything
Regular backups   | sync    | Keeps destination identical to source
Archive old files | copy    | Preserves historical data
Mirror production | sync    | Exact replica including deletions
One-time transfer | copy    | Simple and safe

Wrapping Up

You now have rclone configured to transfer files between SeaweedFS and AWS S3. The copy command is perfect for safe, additive transfers, while sync creates exact mirrors of your data.

Key takeaways:

  • Use copy when you want to preserve existing destination files
  • Use sync when you want an exact backup that mirrors the source
  • Always test with --dry-run before running sync operations
  • Optimize transfers for your edge device with bandwidth and concurrency limits

This backup strategy gives you the best of both worlds: fast local access through SeaweedFS and reliable cloud storage through S3. Your edge devices can now fail without losing critical data, and you maintain full control over the transfer process.

FAQ

Question: Why would I need to copy from a SeaweedFS bucket to AWS S3 when I could just use rclone sync locally or aws s3 cp?

This is a valid question! Here are the key reasons why using rclone to transfer from SeaweedFS to AWS S3 makes sense:

1. Unified Interface

  • rclone provides a consistent command syntax for both SeaweedFS and AWS S3
  • You don’t need to remember different command formats (aws s3 cp vs rclone copy)
  • Same flags and options work across both storage systems

2. Advanced Transfer Features

  • Bandwidth limiting (--bwlimit) - crucial for edge devices with limited internet
  • Concurrent transfer control (--transfers, --checkers) - optimize for your hardware
  • Retry logic (--retries) - handles intermittent connectivity automatically
  • Progress monitoring (--progress) - see real-time transfer status
  • Dry-run testing (--dry-run) - preview changes before executing

3. Intelligent Sync Capabilities

  • rclone can compare files between SeaweedFS and S3 efficiently
  • Only transfers changed/new files (incremental backups)
  • Built-in checksum verification for data integrity
  • Handles large datasets better than basic copy commands
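
One way to see this comparison in action is rclone’s check command, which verifies that source and destination match without transferring anything (a quick sketch using the bucket names from the earlier examples):

# Compare SeaweedFS and S3 copies; report missing or mismatched files
rclone check seaweedfs:documents aws:document-backups --one-way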

4. Edge Device Optimization

# This level of control isn't available with aws s3 cp
rclone sync seaweedfs:data aws:backup \
    --bwlimit 5M \
    --transfers 2 \
    --checkers 2 \
    --retries 5 \
    --checksum

5. Object Store to Object Store Transfer

When we copy files from SeaweedFS to AWS S3, we’re performing an object-to-object transfer between two S3-compatible systems. This is different from traditional file-to-object transfers and offers several advantages:

  • Native S3 API compatibility - Both systems speak the same protocol
  • Metadata preservation - Object metadata, tags, and properties are maintained
  • Efficient transfer protocols - Optimized for object storage operations
  • Consistent object naming - No file path translation issues
  • Atomic operations - Objects are transferred as complete units

This is more efficient than copying files from a local filesystem to S3 because:

  • No file-to-object conversion overhead
  • Better handling of large objects and multipart uploads
  • Preserved S3-specific features like storage classes and lifecycle policies

Question: Can rclone work with other Cloud Service Providers?

Yes, it sure can! In fact, for a multi-cloud strategy where the edge device needs to back up to object storage across several different CSPs, rclone is the best tool for the job: it provides consistent tooling across different storage backends.

We would simply add the other CSPs to the rclone config file and reference each one as a separate remote. For example, if I have an object store in Azure and want to copy/sync data from SeaweedFS to Azure, I would just run: rclone <copy/sync> seaweedfs:<BUCKET> azure:<BUCKET>
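
As a hedged sketch of what that could look like, the extra remote is just another stanza in rclone.conf (the account/key values and names below are placeholders, and rclone’s Azure Blob backend supports other auth methods too):

[azure]
type = azureblob
account = <STORAGE_ACCOUNT_NAME>
key = <STORAGE_ACCOUNT_KEY>

Then the same transfer pattern applies (Azure calls its buckets “containers”):

rclone sync seaweedfs:<BUCKET> azure:<CONTAINER> --bwlimit 5M --dry-run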

Question: What are some strategies for using rclone?

The best strategy is to create a scheduled backup using systemd timers. The design of the script would follow these steps:

  • Say you have a log file that needs to be copied to a SeaweedFS bucket. We can run the rclone copy command to copy the log file into the SeaweedFS bucket
  • Then we can run rclone sync to AWS S3 with the edge-device flags we saw in this article
  • Attach those commands to a systemd unit file, then create a timer that runs every 2-2.5 hours (see the sketch below)
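
Here is a minimal sketch of that setup, assuming hypothetical unit names, bucket names, and log paths (adjust to your own layout):

# /etc/systemd/system/edge-backup.service
[Unit]
Description=Copy local logs into SeaweedFS, then sync SeaweedFS to AWS S3

[Service]
Type=oneshot
ExecStart=/usr/bin/rclone copy /var/log/myapp/app.log seaweedfs:logs
ExecStart=/usr/bin/rclone sync seaweedfs:logs aws:log-backup --bwlimit 5M --transfers 2 --retries 3

# /etc/systemd/system/edge-backup.timer
[Unit]
Description=Run the edge backup every 2 hours

[Timer]
OnBootSec=10min
OnUnitActiveSec=2h

[Install]
WantedBy=timers.target

Enable it with: sudo systemctl enable --now edge-backup.timer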

To make things even cooler, have the script loop through each of your Cloud Service Providers. Conceptually, you would do the following (a sketch follows the list):

  1. Add your CSPs’ credentials to rclone
  2. Create a bash array of your CSPs’ rclone remote names
  3. Loop through the array, testing the connection to each CSP
  4. If the connection succeeds, perform the backup; otherwise, skip that CSP and move on to the next one in the array
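
A hedged sketch of that loop might look like this (the remote names in the array are placeholders for whatever you configured in rclone.conf):

#!/usr/bin/env bash
# Remotes defined in rclone.conf (hypothetical names)
PROVIDERS=(aws azure gcs)

for remote in "${PROVIDERS[@]}"; do
    # Test the connection by listing buckets; skip the CSP if it fails
    if rclone lsd "${remote}:" > /dev/null 2>&1; then
        echo "Backing up to ${remote}..."
        rclone sync seaweedfs:important-data "${remote}:daily-backup" \
            --bwlimit 5M --transfers 2 --retries 3
    else
        echo "Skipping ${remote}: connection test failed"
    fi
done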