Sparse Files – Commands Overview

I was just checking which commands support sparse files. Below is a short overview of what I found (working on Linux).

  • Create a sparse file of 20 GiB (plus the single byte that dd writes after seeking):
    dd if=/dev/zero of=foo bs=1 count=1 seek=20G

  • Check that a file is sparse:
    ls -alsh
    Compare the first versus second size column (the first one is the space taken on disk).

  • Copy a sparse file:
    cp foo bar
    cp already detects sparse files and handles them correctly (GNU cp defaults to --sparse=auto).

  • Make a non-sparse file sparse (works only if it contains blocks filled with zeros):
    cp --sparse=always foo bar

  • Expand a sparse file to a non-sparse one:
    cp --sparse=never foo bar

  • Copy a sparse file to a remote host: scp does not support sparse files and will expand them to non-sparse ones. rsync does support them: just use the -S or --sparse option. Example:
    rsync -vS foo root@someserver:/some/path/
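
The local commands above can be tried safely on a small file. The following is only a sketch, assuming GNU coreutils on Linux: it uses truncate instead of dd to create the hole (an alternative not mentioned above), works in a temporary directory, and uses the illustrative file names full and resparse next to foo.

```shell
#!/bin/sh
# Sketch: create, expand and re-sparsify a small sparse file.
# Assumes GNU coreutils (truncate, stat -c, cp --sparse) on Linux.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Create a 16 MiB file that consists only of a hole (no data is written).
truncate -s 16M foo

# Ask cp to write the zeros out instead of keeping holes.
cp --sparse=never foo full

# Ask cp to turn runs of zeros back into holes.
cp --sparse=always full resparse

# %s = apparent size in bytes, %b = allocated blocks, %B = bytes per block.
for f in foo full resparse; do
    stat -c '%n: apparent=%s bytes, allocated=%b blocks of %B bytes' "$f"
done

echo "Files are in $workdir (remove the directory when done)."
```

On most filesystems foo and resparse should show few or no allocated blocks, while full should show roughly 16 MiB allocated. The exact numbers depend on the filesystem.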

How to Know if a File on Linux is Sparse?

A sparse file is a file whose empty regions (runs of zero bytes, called holes) are not allocated on disk, so it takes less space on disk than its apparent size. Such files are commonly used to store a partition image, for instance with a virtualization solution like Xen.

It’s super easy to tell whether a file is sparse or not. Just use the ‘s’ option of ls.

ls -alsh

will yield:

4.0K drwxr-xr-x 9 root root 4.0K 2010-03-23 18:22 .
4.0K drwxr-xr-x 23 root root 4.0K 2009-01-09 19:47 ..
12G -rw-r----- 1 root root 24G 2007-01-10 19:55 dompat.data
3.7G -rw-r--r-- 1 root root 7.1G 2009-01-06 21:09 dompat-hardy.sys
501M -rw-r----- 1 root root 501M 2007-01-07 16:40 dompat.swap

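The first size column is the space actually taken on disk, while the second size column is the apparent (maximum) size of the file. We see that dompat.data is sparse, since its apparent size is 24G while it takes only 12G on disk. The same holds for dompat-hardy.sys (3.7G on disk for an apparent size of 7.1G), whereas dompat.swap is not sparse.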
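
The same comparison can be scripted. Here is a minimal sketch, assuming GNU stat on Linux; is_sparse is a hypothetical helper name, not a standard command. It is only a heuristic: a filesystem with transparent compression can also report fewer allocated bytes than the apparent size.

```shell
#!/bin/sh
# Heuristic: treat a file as sparse when it occupies fewer bytes on disk
# than its apparent size. Assumes GNU stat (coreutils) on Linux.
# is_sparse is a hypothetical helper name, not a standard command.
is_sparse() {
    apparent=$(stat -c '%s' -- "$1") || return 2   # apparent size in bytes
    blocks=$(stat -c '%b' -- "$1") || return 2     # number of allocated blocks
    blocksize=$(stat -c '%B' -- "$1") || return 2  # bytes per reported block
    [ $((blocks * blocksize)) -lt "$apparent" ]
}

# Report on every file given on the command line.
for f in "$@"; do
    if is_sparse "$f"; then
        echo "$f: sparse"
    else
        echo "$f: not sparse"
    fi
done
```

Saved under a name of your choice, the script can be run with the image files as arguments, for example dompat.data and dompat.swap from the listing above.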


