Speeding up many small file transfers to a UniFi NAS

# April 23, 2025

Moving some of my old local ML projects to a NAS was taking an unbearably long time. A folder with 25k small files was estimated to take a whole day to transfer. The whole thing was syncing kilobytes at a time.

There's a lot of overhead to moving small files, especially over the network: each file gets its own protocol round trips, metadata, and checksum verification. Not to mention there seem to be some Mac-specific gotchas with the default SMB config that can exacerbate the issue.
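
For what it's worth, one commonly cited client-side tweak on macOS is disabling SMB packet signing in /etc/nsmb.conf. Treat this as an optional sketch rather than part of the workflow below; whether it helps at all depends on your setup.

# Optional macOS client-side tweak (assumption: your Mac is the SMB client).
# Packet signing adds per-packet overhead; turning it off trades integrity checks for speed.
printf '[default]\nsigning_required=no\n' | sudo tee /etc/nsmb.conf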

When bundled into a single tarfile, the transfer finished in 2 minutes flat. First, build a new uncompressed tarball. If you're not limited by your connection and have enough free space to hold a duplicate of the folder, this is much faster than doing the round-trip of compression -> transfer -> decompression.

tar -cvf myfolder.tar myfolder/
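
If you're not sure you have the headroom for that duplicate copy, a quick check before building the archive (using the same example folder name):

du -sh myfolder/    # size of the folder you're about to tar
df -h .             # free space where the tar file will land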

Then SSH in with the admin account you configured in the UniFi panel. I have my local SSH key added as a trusted key on the NAS, which keeps me from digging up the password every time.

ssh root@<unas-ip>
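
If you'd rather not remember the IP, an optional ~/.ssh/config entry shortens the command. The alias and placeholder here are mine, not anything UniFi-specific:

# Replace <unas-ip> with your NAS address
Host unas
    HostName <unas-ip>
    User root

After that, ssh unas gets you the same shell.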

When you drop into the UNAS shell, you're inside the root filesystem of UniFi OS, not the data pool. The drives you created in UniFi Drive are mounted under the volume1 ZFS dataset, inside a hidden service directory.

cd /volume1/.srv/.unifi-drive
ls -1          # every shared drive you made is listed here
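
If you want to confirm you've landed on the data pool rather than the small OS partition, a quick sanity check (assuming the usual df behavior on UniFi OS):

df -h /volume1      # should report the big data pool, not the root filesystem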

Each drive has two sub-directories:

  • .data – the actual payload if the drive is not encrypted
  • .unencrypted / .encrypted – shown instead of .data when you enabled encryption for that drive

So a plain, unencrypted drive called Media lives at:

cd /volume1/.srv/.unifi-drive/Media/.data

If you encrypted it:

cd /volume1/.srv/.unifi-drive/Media/.unencrypted   # filenames are still clear

(Those same paths are what the NAS exports over NFS; showmount -e <unas-ip> will list them exactly.)
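
As a hedged aside, you could also mount one of those exports directly from the Mac instead of SSHing in; the mount point and options below are assumptions about a typical macOS NFS client, not something from the UniFi docs:

# Hypothetical macOS-side NFS mount of the Media example above
mkdir -p ~/unas-media
sudo mount -t nfs -o resvport <unas-ip>:/volume1/.srv/.unifi-drive/Media/.data ~/unas-media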

When you're in the right place, extract the archive on the remote:

tar -xvf <archive_name>.tar

After doing this dance a few times, I got tired of manually running the three-step process. You can pretty easily add a bash function that handles the whole workflow: tar locally, transfer over SSH, then extract remotely.

The core building blocks are straightforward:

Creating the archive locally:

tar -cvf "$tar_name" "$local_folder"

Transferring it:

scp "$tar_name" "$remote_host:$remote_path"

Extracting it on the remote (and removing the remote copy):

ssh "$remote_host" "cd '$remote_path' && tar -xvf '$tar_name' && rm '$tar_name'"

Here's the full implementation with error handling and cleanup:

scpzip() {
    if [ $# -lt 3 ]; then
        echo "Usage: scpzip <local_folder> <remote_host> <remote_path>"
        echo "Example: scpzip ./myproject root@<unas-ip> /volume1/.srv/.unifi-drive/Media/.data/"
        return 1
    fi

    local local_folder="$1"
    local remote_host="$2"
    local remote_path="$3"

    # Name the archive after the folder itself, and tar from its parent
    # directory so the archive doesn't embed the full local path.
    local folder_name tar_name
    folder_name=$(basename "$local_folder")
    tar_name="${folder_name}.tar"

    echo "📦 Creating tar archive: $tar_name"
    tar -cvf "$tar_name" -C "$(dirname "$local_folder")" "$folder_name"

    if [ $? -ne 0 ]; then
        echo "❌ Failed to create tar archive"
        return 1
    fi

    echo "🚀 Transferring $tar_name to $remote_host"
    scp "$tar_name" "$remote_host:$remote_path"

    if [ $? -ne 0 ]; then
        echo "❌ Failed to transfer archive"
        rm "$tar_name"
        return 1
    fi

    echo "📂 Extracting archive on remote host"
    ssh "$remote_host" "cd '$remote_path' && tar -xvf '$tar_name' && rm '$tar_name'"

    if [ $? -eq 0 ]; then
        echo "✅ Successfully transferred and extracted $folder_name"
        echo "🧹 Cleaning up local tar file"
        rm "$tar_name"
    else
        echo "❌ Failed to extract archive on remote host"
        return 1
    fi
}
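
To make it stick, paste the function into your shell startup file and reload. Which file depends on your shell; the paths below are the usual suspects, not a requirement:

# zsh (the macOS default): add scpzip to ~/.zshrc, then
source ~/.zshrc
# bash: add it to ~/.bashrc, then
source ~/.bashrc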

Now instead of the manual three-step dance, you just run:

scpzip ./my-ml-project root@<unas-ip> /volume1/.srv/.unifi-drive/Media/.data/

The function handles creating the tar archive, transferring it via SCP, SSHing into the destination to extract it, and cleaning up both the local and remote tar files when done. Those 25,000 small files still get bundled and transferred in minutes instead of hours.

One note: if you're working with really large, text-based datasets, you might want to add compression with tar -czvf and tar -xzvf. In most cases, though, the bandwidth saved by compressing isn't worth the CPU overhead when you're already getting massive speed improvements from bundling alone.
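
If you do go that route, the workflow is identical with the z flag added everywhere. Using the Media example paths from above:

tar -czvf myfolder.tar.gz myfolder/
scp myfolder.tar.gz root@<unas-ip>:/volume1/.srv/.unifi-drive/Media/.data/
ssh root@<unas-ip> "cd /volume1/.srv/.unifi-drive/Media/.data && tar -xzvf myfolder.tar.gz && rm myfolder.tar.gz"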

And just like that. The full 25GB safely stored on the network in a few minutes instead of an entire day. Small files, man. They'll get you.
