Running AWS at the Edge - Storage Backup to S3

Your edge device is generating data, but what happens when that Raspberry Pi fails? In this article, we’ll configure rclone to automatically sync your SeaweedFS storage with AWS S3, creating a robust backup strategy that works even on limited bandwidth.

We’ll set up rclone for both SeaweedFS and AWS S3, then explore the copy and sync commands to manage your backups effectively.

Prerequisites

  • SeaweedFS running and accessible (see the previous article for setup)
  • AWS Account with S3 access
  • AWS CLI configured with appropriate credentials
  • rclone installed on your system
  • Tailscale installed on your edge device (optional, for remote access)

Tailscale Funnel: Exposing SeaweedFS to the Internet

If you need to access your SeaweedFS instance remotely or from cloud services, Tailscale Funnel provides a secure way to expose your local service to the internet without port forwarding.

What is Tailscale Funnel?

Tailscale Funnel creates a public HTTPS endpoint for services running on your Tailscale network. Unlike Tailscale Serve (which only works within your tailnet), Funnel makes your service accessible to anyone on the internet. The traffic flows through Tailscale’s infrastructure and gets routed to your device over your existing Tailscale connection.
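
For a concrete sense of the difference, here is a minimal, hedged comparison using the same port this article exposes later (syntax as in recent Tailscale CLI versions; adjust for yours):

# Tailnet-only: reachable just from devices on your tailnet
sudo tailscale serve --bg 8333

# Public internet: reachable by anyone, fronted by Tailscale's infrastructure
sudo tailscale funnel --bg 8333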

Key Benefits:

  • No port forwarding or firewall configuration needed
  • Automatic HTTPS with valid TLS certificates
  • Built-in DDoS protection from Tailscale
  • Easy to enable and disable
  • Works behind NAT and restrictive firewalls

Enabling Tailscale Funnel for SeaweedFS

First, ensure your Tailscale node is up to date and Funnel is enabled for your tailnet:

  1. Expose the SeaweedFS S3 port over HTTPS. Since the SeaweedFS S3 API runs on port 8333, we’ll expose it via Funnel:
sudo tailscale funnel --bg 8333

This creates a public HTTPS endpoint (https://your-device.tail-scale.ts.net) that routes to port 8333 with automatic TLS certificates.

  2. Verify the Funnel is active:
tailscale funnel status

You should see output like:

https://raspberry-pi.tail-scale.ts.net (Funnel on)
|-- / proxy http://127.0.0.1:8333

Testing the Tailscale Funnel Endpoint

Test your SeaweedFS access through the Funnel endpoint:

# Create a bucket using the Funnel endpoint
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 mb s3://<BUCKET_NAME>

# List buckets
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 ls

If these commands work, your Funnel is properly configured.
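
As an optional extra check, you can round-trip a small test object through the Funnel endpoint (a quick sketch reusing the same placeholders; the file name is arbitrary):

# Upload and list a test object via the Funnel endpoint
echo "hello from the edge" > /tmp/funnel-test.txt
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 cp /tmp/funnel-test.txt s3://<BUCKET_NAME>/funnel-test.txt
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 ls s3://<BUCKET_NAME>/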

Rclone Configuration

Configure rclone to work with both SeaweedFS and AWS S3. We’ll create a configuration file that defines both storage endpoints.

Finding the Configuration Location

Find where rclone expects its configuration file:

rclone config file

You’ll see the location (typically ~/.config/rclone/rclone.conf). If the file doesn’t exist, create it:

mkdir -p ~/.config/rclone
touch ~/.config/rclone/rclone.conf

Configuration File Setup

Add the following configuration to your rclone.conf file:

[seaweedfs]
type = s3
provider = SeaweedFS
region = us-east-1
endpoint = <TAILSCALE_FUNNEL_ENDPOINT>
acl = private

[aws]
type = s3
provider = AWS
env_auth = true
region = us-east-1

Configuration Breakdown

SeaweedFS Remote ([seaweedfs]):

  • type = s3 - Tells rclone this is S3-compatible storage
  • provider = SeaweedFS - Specifies the S3 variant
  • endpoint - Your SeaweedFS S3 API endpoint
    • Use http://localhost:8333 if rclone runs on the same device as SeaweedFS
    • Use https://your-device.tail-scale.ts.net if accessing via Tailscale Funnel
  • acl = private - Sets default access control for uploads

AWS Remote ([aws]):

  • type = s3 - Standard S3 configuration
  • provider = AWS - Official AWS S3
  • env_auth = true - Use AWS credentials from environment/CLI config
  • region = us-east-1 - Your AWS region

Testing the Configuration

Test your configuration:

# List configured remotes
rclone listremotes

# List buckets in both locations
rclone lsd aws:
rclone lsd seaweedfs:

If you see your buckets listed, you’re all set—the configuration is working correctly.

Using Rclone Copy and Sync

With rclone configured, you have two main transfer options: copy and sync.

Rclone Copy

The copy command adds files to your destination without touching existing data—perfect for incremental backups.

Basic Copy Examples

Copy a single file:

# Backup today's log file
rclone copy seaweedfs:logs/app-2025-12-27.log aws:daily-backups/

Copy an entire bucket:

# Backup entire documents folder
rclone copy seaweedfs:documents aws:document-backups

Copy with progress display:

# Monitor transfer progress
rclone copy seaweedfs:documents aws:document-backups --progress

When to Use Copy

  • Adding files to an existing backup without removing anything
  • One-time transfers
  • When you want to preserve existing files at the destination

Rclone Sync

The sync command makes the destination identical to the source by copying new/changed files and deleting files that don’t exist in the source.

Basic Sync Examples

Sync entire buckets:

rclone sync seaweedfs:source-bucket aws:backup-bucket

Sync with bandwidth limiting:

rclone sync seaweedfs:source-bucket aws:backup-bucket --bwlimit 10M

When to Use Sync

  • Creating exact backups
  • Regular backup jobs where you want the destination to mirror the source
  • When you need to remove deleted files from backups

⚠️ Warning: Sync deletes files at the destination that don’t exist in the source. Always test with --dry-run first.

Useful Flags for Edge Devices

Performance Optimization

# Limit bandwidth to 10MB/s
--bwlimit 10M

# Reduce concurrent transfers for limited resources
--transfers 2

# Reduce file checking threads
--checkers 2

# Retry failed transfers
--retries 3

Testing and Monitoring

# Show what would be transferred without doing it
--dry-run

# Display transfer progress
--progress

# Verify file integrity with checksums
--checksum

# Log to file
--log-file /var/log/rclone.log

Filtering

# Only sync files newer than 2 days
--max-age 2d

# Exclude temporary files
--exclude "*.tmp"

# Include only specific file types
--include "*.pdf"

Practical Examples

Daily Backup Sync

rclone sync seaweedfs:important-data aws:daily-backup \
    --bwlimit 5M \
    --transfers 2 \
    --retries 3 \
    --log-file /var/log/daily-backup.log

Incremental Copy (Add Only)

rclone copy seaweedfs:documents aws:document-archive \
    --max-age 1d \
    --progress

Test Before Sync

# Always test first
rclone sync seaweedfs:source aws:backup --dry-run

# If the output looks correct, run without --dry-run
rclone sync seaweedfs:source aws:backup

Copy vs Sync: Choosing the Right Command

Use Case          | Command | Why
Initial backup    | copy    | Safe, won’t delete anything
Regular backups   | sync    | Keeps destination identical to source
Archive old files | copy    | Preserves historical data
Mirror production | sync    | Exact replica including deletions
One-time transfer | copy    | Simple and safe

Wrapping Up

You now have rclone configured to transfer files between SeaweedFS and AWS S3. The copy command is perfect for safe, additive transfers, while sync creates exact mirrors of your data.

Key takeaways:

  • Use copy when you want to preserve existing destination files
  • Use sync when you want an exact backup that mirrors the source
  • Always test with --dry-run before running sync operations
  • Optimize transfers for your edge device with bandwidth and concurrency limits

This backup strategy gives you the best of both worlds: fast local access through SeaweedFS and reliable cloud storage through S3. Your edge devices can now fail without losing critical data, and you maintain full control over the transfer process.

FAQ

Question: Why would I need to copy from a SeaweedFS bucket to AWS S3 when I could just use rclone sync locally or aws s3 cp?

This is a valid question! Here are the key reasons why using rclone to transfer from SeaweedFS to AWS S3 makes sense:

1. Unified Interface

  • rclone provides a consistent command syntax for both SeaweedFS and AWS S3
  • You don’t need to remember different command formats (aws s3 cp vs rclone copy)
  • Same flags and options work across both storage systems

2. Advanced Transfer Features

  • Bandwidth limiting (--bwlimit) - crucial for edge devices with limited internet
  • Concurrent transfer control (--transfers, --checkers) - optimize for your hardware
  • Retry logic (--retries) - handles intermittent connectivity automatically
  • Progress monitoring (--progress) - see real-time transfer status
  • Dry-run testing (--dry-run) - preview changes before executing

3. Intelligent Sync Capabilities

  • rclone can compare files between SeaweedFS and S3 efficiently
  • Only transfers changed/new files (incremental backups)
  • Built-in checksum verification for data integrity
  • Handles large datasets better than basic copy commands
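
One way to see this comparison in action is rclone’s check command, which verifies that source and destination match without transferring anything (a quick sketch using the bucket names from the earlier examples):

# Compare SeaweedFS and S3 copies; report missing or mismatched files
rclone check seaweedfs:documents aws:document-backups --one-way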

4. Edge Device Optimization

# This level of control isn't available with aws s3 cp
rclone sync seaweedfs:data aws:backup \
    --bwlimit 5M \
    --transfers 2 \
    --checkers 2 \
    --retries 5 \
    --checksum

5. Object Store to Object Store Transfer

When we copy files from SeaweedFS to AWS S3, we’re performing an object-to-object transfer between two S3-compatible systems. This is different from traditional file-to-object transfers and offers several advantages:

  • Native S3 API compatibility - Both systems speak the same protocol
  • Metadata preservation - Object metadata, tags, and properties are maintained
  • Efficient transfer protocols - Optimized for object storage operations
  • Consistent object naming - No file path translation issues
  • Atomic operations - Objects are transferred as complete units

This is more efficient than copying files from a local filesystem to S3 because:

  • No file-to-object conversion overhead
  • Better handling of large objects and multipart uploads
  • Preserved S3-specific features like storage classes and lifecycle policies

Question: Can rclone work with other Cloud Service Providers?

Yes, it sure can! In fact, for a multi-cloud strategy where the edge device needs to back up to object storage across several different CSPs, rclone is the best tool for the job: it provides consistent tooling across different storage backends.

We would simply add the other CSPs to the rclone config file and reference each one as a separate remote. For example, if I have an object store in Azure and want to copy/sync data from SeaweedFS to Azure, I would just run: rclone <copy/sync> seaweedfs:<BUCKET> azure:<BUCKET>
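
As a hedged sketch of what that could look like, the extra remote is just another stanza in rclone.conf (the account/key values and names below are placeholders, and rclone’s Azure Blob backend supports other auth methods too):

[azure]
type = azureblob
account = <STORAGE_ACCOUNT_NAME>
key = <STORAGE_ACCOUNT_KEY>

Then the same transfer pattern applies (Azure calls its buckets “containers”):

rclone sync seaweedfs:<BUCKET> azure:<CONTAINER> --bwlimit 5M --dry-run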

Question: What are some strategies for using rclone?

The best strategy is to create a scheduled backup using systemd timers. The design of the script would follow these steps:

  • Say you have a log file that needs to be copied to a SeaweedFS bucket. We can run the rclone copy command to copy the log file into the SeaweedFS bucket
  • Then we can run rclone sync to AWS S3 with the edge-device flags we saw in this article
  • Attach those commands to a systemd unit file, then create a timer that runs every 2-2.5 hours (see the sketch below)
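
Here is a minimal sketch of that setup, assuming hypothetical unit names, bucket names, and log paths (adjust to your own layout):

# /etc/systemd/system/edge-backup.service
[Unit]
Description=Copy local logs into SeaweedFS, then sync SeaweedFS to AWS S3

[Service]
Type=oneshot
ExecStart=/usr/bin/rclone copy /var/log/myapp/app.log seaweedfs:logs
ExecStart=/usr/bin/rclone sync seaweedfs:logs aws:log-backup --bwlimit 5M --transfers 2 --retries 3

# /etc/systemd/system/edge-backup.timer
[Unit]
Description=Run the edge backup every 2 hours

[Timer]
OnBootSec=10min
OnUnitActiveSec=2h

[Install]
WantedBy=timers.target

Enable it with: sudo systemctl enable --now edge-backup.timer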

To make things even cooler, have the script loop through each of your Cloud Service Providers. Conceptually, you would do the following (a sketch follows the list):

  1. Add your CSPs’ credentials to rclone
  2. Create a bash array of your CSPs’ rclone remote names
  3. Loop through the array, testing the connection to each CSP
  4. If the connection succeeds, perform the backup; otherwise, skip that CSP and move on to the next one in the array
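
A hedged sketch of that loop might look like this (the remote names in the array are placeholders for whatever you configured in rclone.conf):

#!/usr/bin/env bash
# Remotes defined in rclone.conf (hypothetical names)
PROVIDERS=(aws azure gcs)

for remote in "${PROVIDERS[@]}"; do
    # Test the connection by listing buckets; skip the CSP if it fails
    if rclone lsd "${remote}:" > /dev/null 2>&1; then
        echo "Backing up to ${remote}..."
        rclone sync seaweedfs:important-data "${remote}:daily-backup" \
            --bwlimit 5M --transfers 2 --retries 3
    else
        echo "Skipping ${remote}: connection test failed"
    fi
done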