Running AWS at the Edge - Storage Backup to S3
Your edge device is generating data, but what happens when that Raspberry Pi fails? In this article, we’ll configure rclone to automatically sync your SeaweedFS storage with AWS S3, creating a robust backup strategy that works even on limited bandwidth.
We’ll set up rclone for both SeaweedFS and AWS S3, then explore the copy and sync commands to manage your backups effectively.
Prerequisites
- SeaweedFS running and accessible (see the previous article for setup)
- AWS Account with S3 access
- AWS CLI configured with appropriate credentials
- rclone installed on your system
- Tailscale installed on your edge device (optional, for remote access)
Tailscale Funnel: Exposing SeaweedFS to the Internet
If you need to access your SeaweedFS instance remotely or from cloud services, Tailscale Funnel provides a secure way to expose your local service to the internet without port forwarding.
What is Tailscale Funnel?
Tailscale Funnel creates a public HTTPS endpoint for services running on your Tailscale network. Unlike Tailscale Serve (which only works within your tailnet), Funnel makes your service accessible to anyone on the internet. The traffic flows through Tailscale’s infrastructure and gets routed to your device over your existing Tailscale connection.
Key Benefits:
- No port forwarding or firewall configuration needed
- Automatic HTTPS with valid TLS certificates
- Built-in DDoS protection from Tailscale
- Easy to enable and disable
- Works behind NAT and restrictive firewalls
Enabling Tailscale Funnel for SeaweedFS
First, ensure your Tailscale node is up to date and Funnel is enabled for your tailnet:
- Expose your SeaweedFS S3 port over HTTPS. Since the SeaweedFS S3 API runs on port 8333, we’ll expose that port via Funnel:
sudo tailscale funnel --bg 8333
This creates a public HTTPS endpoint (https://your-device.tail-scale.ts.net) that routes to port 8333 with automatic TLS certificates.
- Verify the Funnel is active:
tailscale funnel status
You should see output like:
https://raspberry-pi.tail-scale.ts.net (Funnel on)
|-- / proxy http://127.0.0.1:8333
Testing the Tailscale Funnel Endpoint
Test your SeaweedFS access through the Funnel endpoint:
# Create a bucket using the Funnel endpoint
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 mb s3://<BUCKET_NAME>
# List buckets
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 ls
If these commands work, your Funnel is properly configured.
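As an optional follow-up check, you can round-trip a small test object through the Funnel; the file name and bucket below are just placeholders:
# Upload a small test object through the Funnel endpoint
echo "funnel test" > /tmp/funnel-test.txt
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 cp /tmp/funnel-test.txt s3://<BUCKET_NAME>/funnel-test.txt
# Read it back to stdout to confirm the round trip
aws --profile <SEAWEED_PROFILE> --endpoint-url <TAILSCALE_FUNNEL_ENDPOINT> s3 cp s3://<BUCKET_NAME>/funnel-test.txt -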
Rclone Configuration
Configure rclone to work with both SeaweedFS and AWS S3. We’ll create a configuration file that defines both storage endpoints.
Finding the Configuration Location
Find where rclone expects its configuration file:
rclone config file
You’ll see the location (typically ~/.config/rclone/rclone.conf). If the file doesn’t exist, create it:
mkdir -p ~/.config/rclone
touch ~/.config/rclone/rclone.conf
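Since this file may end up holding access keys, it’s worth restricting its permissions (a standard precaution, not something rclone requires):
# Make the config readable only by your user
chmod 600 ~/.config/rclone/rclone.conf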
Configuration File Setup
Add the following configuration to your rclone.conf file:
[seaweedfs]
type = s3
provider = SeaweedFS
region = us-east-1
endpoint = <TAILSCALE_FUNNEL_ENDPOINT>
acl = private
[aws]
type = s3
provider = AWS
env_auth = true
region = us-east-1
Configuration Breakdown
SeaweedFS Remote ([seaweedfs]):
- `type = s3` - Tells rclone this is S3-compatible storage
- `provider = SeaweedFS` - Specifies the S3 variant
- `endpoint = http://localhost:8333` - Your local SeaweedFS S3 API endpoint
  - Use `https://your-device.tail-scale.ts.net` instead if accessing via Tailscale Funnel
- `acl = private` - Sets default access control for uploads
AWS Remote ([aws]):
- `type = s3` - Standard S3 configuration
- `provider = AWS` - Official AWS S3
- `env_auth = true` - Use AWS credentials from environment/CLI config
- `region = us-east-1` - Your AWS region
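One note on credentials: the [seaweedfs] remote above carries no keys, which works when the S3 gateway runs without authentication. If you started the SeaweedFS S3 gateway with authentication enabled, add a key pair to that section; this is a sketch with placeholder values:
[seaweedfs]
type = s3
provider = SeaweedFS
region = us-east-1
endpoint = <TAILSCALE_FUNNEL_ENDPOINT>
acl = private
# Only needed when the S3 gateway enforces authentication
access_key_id = <SEAWEED_ACCESS_KEY>
secret_access_key = <SEAWEED_SECRET_KEY>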
Testing the Configuration
Test your configuration:
# List configured remotes
rclone listremotes
# List buckets in both locations
rclone lsd aws:
rclone lsd seaweedfs:
If you see your buckets listed, you’re all set—the configuration is working correctly.
Using Rclone Copy and Sync
With rclone configured, you have two main transfer options: copy and sync.
Rclone Copy
The copy command adds files to your destination without touching existing data—perfect for incremental backups.
Basic Copy Examples
Copy a single file:
# Backup today's log file
rclone copy seaweedfs:logs/app-2025-12-27.log aws:daily-backups/
Copy an entire bucket:
# Backup entire documents folder
rclone copy seaweedfs:documents aws:document-backups
Copy with progress display:
# Monitor transfer progress
rclone copy seaweedfs:documents aws:document-backups --progress
When to Use Copy
- Adding files to an existing backup without removing anything
- One-time transfers
- When you want to preserve existing files at the destination
Rclone Sync
The sync command makes the destination identical to the source by copying new/changed files and deleting files that don’t exist in the source.
Basic Sync Examples
Sync entire buckets:
rclone sync seaweedfs:source-bucket aws:backup-bucket
Sync with bandwidth limiting:
rclone sync seaweedfs:source-bucket aws:backup-bucket --bwlimit 10M
When to Use Sync
- Creating exact backups
- Regular backup jobs where you want the destination to mirror the source
- When you need to remove deleted files from backups
⚠️ Warning: Sync deletes files at the destination that don’t exist in the source. Always test with --dry-run first.
Useful Flags for Edge Devices
Performance Optimization
# Limit bandwidth to 10MB/s
--bwlimit 10M
# Reduce concurrent transfers for limited resources
--transfers 2
# Reduce file checking threads
--checkers 2
# Retry failed transfers
--retries 3
Testing and Monitoring
# Show what would be transferred without doing it
--dry-run
# Display transfer progress
--progress
# Verify file integrity with checksums
--checksum
# Log to file
--log-file /var/log/rclone.log
Filtering
# Only sync files newer than 2 days
--max-age 2d
# Exclude temporary files
--exclude "*.tmp"
# Include only specific file types
--include "*.pdf"
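As an example, a hypothetical archive job could combine these filters to copy only recent PDFs (the bucket names are placeholders; keep --dry-run on the first pass to preview):
# Copy only PDFs newer than 7 days into an archive bucket
rclone copy seaweedfs:reports aws:report-archive \
  --include "*.pdf" \
  --max-age 7d \
  --dry-run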
Practical Examples
Daily Backup Sync
rclone sync seaweedfs:important-data aws:daily-backup \
--bwlimit 5M \
--transfers 2 \
--retries 3 \
--log-file /var/log/daily-backup.log
Incremental Copy (Add Only)
rclone copy seaweedfs:documents aws:document-archive \
--max-age 1d \
--progress
Test Before Sync
# Always test first
rclone sync seaweedfs:source aws:backup --dry-run
# If the output looks correct, run without --dry-run
rclone sync seaweedfs:source aws:backup
Copy vs Sync: Choosing the Right Command
| Use Case | Command | Why |
|---|---|---|
| Initial backup | copy | Safe, won’t delete anything |
| Regular backups | sync | Keeps destination identical to source |
| Archive old files | copy | Preserves historical data |
| Mirror production | sync | Exact replica including deletions |
| One-time transfer | copy | Simple and safe |
Wrapping Up
You now have rclone configured to transfer files between SeaweedFS and AWS S3. The copy command is perfect for safe, additive transfers, while sync creates exact mirrors of your data.
Key takeaways:
- Use `copy` when you want to preserve existing destination files
- Use `sync` when you want an exact backup that mirrors the source
- Always test with `--dry-run` before running sync operations
- Optimize transfers for your edge device with bandwidth and concurrency limits
This backup strategy gives you the best of both worlds: fast local access through SeaweedFS and reliable cloud storage through S3. Your edge devices can now fail without losing critical data, and you maintain full control over the transfer process.
FAQ
Question: Why would I need to copy from a SeaweedFS bucket to AWS S3 when I could either use rclone sync locally or use aws s3 cp?
This is a valid question! Here are the key reasons why using rclone to transfer from SeaweedFS to AWS S3 makes sense:
1. Unified Interface
- rclone provides a consistent command syntax for both SeaweedFS and AWS S3
- You don’t need to remember different command formats (`aws s3 cp` vs `rclone copy`)
- Same flags and options work across both storage systems
2. Advanced Transfer Features
- Bandwidth limiting (`--bwlimit`) - crucial for edge devices with limited internet
- Concurrent transfer control (`--transfers`, `--checkers`) - optimize for your hardware
- Retry logic (`--retries`) - handles intermittent connectivity automatically
- Progress monitoring (`--progress`) - see real-time transfer status
- Dry-run testing (`--dry-run`) - preview changes before executing
3. Intelligent Sync Capabilities
- rclone can compare files between SeaweedFS and S3 efficiently
- Only transfers changed/new files (incremental backups)
- Built-in checksum verification for data integrity
- Handles large datasets better than basic copy commands
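You can see this comparison logic directly with rclone check, which compares sizes and hashes between the two remotes without transferring anything (the bucket names below are placeholders):
# Verify that the backup matches the source, object by object
rclone check seaweedfs:documents aws:document-backups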
4. Edge Device Optimization
# This level of control isn't available with aws s3 cp
rclone sync seaweedfs:data aws:backup \
--bwlimit 5M \
--transfers 2 \
--checkers 2 \
--retries 5 \
--checksum
5. Object Store to Object Store Transfer
When we copy files from SeaweedFS to AWS S3, we’re performing an object-to-object transfer between two S3-compatible systems. This is different from traditional file-to-object transfers and offers several advantages:
- Native S3 API compatibility - Both systems speak the same protocol
- Metadata preservation - Object metadata, tags, and properties are maintained
- Efficient transfer protocols - Optimized for object storage operations
- Consistent object naming - No file path translation issues
- Atomic operations - Objects are transferred as complete units
This is more efficient than copying files from a local filesystem to S3 because:
- No file-to-object conversion overhead
- Better handling of large objects and multipart uploads
- Preserved S3-specific features like storage classes and lifecycle policies
Question: Can rclone work with other Cloud Service Providers?
Yes, it sure can! In fact, if we are talking about a multi-cloud strategy where the edge device needs to back up to object storage in several different CSPs, rclone is the best tool for the job: it provides consistent tooling across different storage backends.
We would simply add the other CSP as another remote in the rclone config file and reference it like any other object store. For example, if I have an object store in Azure and I want to copy/sync data from SeaweedFS to Azure, I would just run:
rclone <copy/sync> seaweedfs:<BUCKET> azure:<BUCKET>
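For that command to work, the Azure remote has to be defined in rclone.conf first. Here is a minimal sketch using rclone’s azureblob backend with placeholder credentials (note that with Azure, the <BUCKET> above refers to a blob container):
[azure]
type = azureblob
account = <STORAGE_ACCOUNT_NAME>
key = <STORAGE_ACCOUNT_KEY>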
Question: What are some strategies for using rclone?
The best strategy is to create a scheduled backup using systemd timers. The design of the script would follow these steps:
- Say you have a log file that needs to be copied to a SeaweedFS bucket. We can run the `rclone copy` command to copy the log file into the SeaweedFS bucket
- Then we can run `rclone sync` with the appropriate edge-device flags we saw in this article
- Attach those commands to a systemd unit file and then create a timer that runs every 2-2.5 hours (a sketch of the unit and timer files follows this list)
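A minimal sketch of what those units could look like, assuming a hypothetical wrapper script at /usr/local/bin/edge-backup.sh that runs the copy and sync commands:
# /etc/systemd/system/edge-backup.service
[Unit]
Description=Copy logs into SeaweedFS and sync to AWS S3

[Service]
Type=oneshot
ExecStart=/usr/local/bin/edge-backup.sh

# /etc/systemd/system/edge-backup.timer
[Unit]
Description=Run the edge backup every 2 hours

[Timer]
OnBootSec=15min
OnUnitActiveSec=2h

[Install]
WantedBy=timers.target
Enable it with sudo systemctl enable --now edge-backup.timer, and check the schedule with systemctl list-timers.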
To make things even cooler, you can have a loop cycle through each of your Cloud Service Providers. Conceptually, you would do the following (see the sketch after this list):
- Add your CSPs’ credentials to rclone
- Create a bash array of your CSPs’ rclone remote names
- Loop through the array, testing the connection to each CSP
- If the connection goes through, perform the backup; otherwise, skip that CSP and move on to the next one in the array
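A minimal bash sketch of that loop, assuming the remotes are already defined in rclone.conf (the remote names, source, and destination bucket are placeholders):
#!/usr/bin/env bash
set -u

# rclone remote names, one per CSP, as defined in rclone.conf (placeholders)
CSPS=("aws" "azure" "gcs")
SOURCE="seaweedfs:important-data"

for csp in "${CSPS[@]}"; do
  # Connection test: listing the remote's buckets/containers is cheap
  if rclone lsd "${csp}:" > /dev/null 2>&1; then
    echo "Backing up to ${csp}..."
    rclone sync "${SOURCE}" "${csp}:edge-backup" \
      --bwlimit 5M --transfers 2 --retries 3 || echo "Backup to ${csp} failed" >&2
  else
    echo "Skipping ${csp}: connection test failed" >&2
  fi
done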