Ubuntu: Can I copy large files faster without using the file cache?



Question:

After adding the preload package, my applications seem to speed up, but if I copy a large file, the file cache grows by more than double the size of the file.

Transferring a single 3-4 GB VirtualBox image or video file to an external drive fills this huge cache and seems to evict all the preloaded applications from memory, leading to increased load times and general performance drops.

Is there a way to copy large, multi-gigabyte files without caching them (i.e. bypassing the file cache)? Or a way to whitelist or blacklist specific folders from being cached?


Solution:1

There is the nocache utility, which can be prepended to a command just like ionice and nice. It works by preloading a library that adds a posix_fadvise call with the POSIX_FADV_DONTNEED flag to any open calls.

In simple terms, it advises the kernel that caching is not needed for that particular file; the kernel will then normally not cache the file. See here for the technical details.
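You can get a feel for the same advisory mechanism without installing anything: GNU dd (coreutils 8.11 and later, so any recent Ubuntu) exposes it through its nocache I/O flags. A minimal sketch using a scratch file in /tmp:

```shell
# Create a 16 MB scratch file, then copy it while asking the kernel to
# drop the involved pages from the page cache. The nocache flags make dd
# issue the same posix_fadvise(POSIX_FADV_DONTNEED) advice that the
# nocache utility injects around open().
dd if=/dev/zero of=/tmp/nocache-src bs=1M count=16 2>/dev/null
dd if=/tmp/nocache-src of=/tmp/nocache-dst bs=1M \
   iflag=nocache oflag=nocache 2>/dev/null
```

Note that this only advises the kernel per invocation; the nocache utility is still the convenient choice for wrapping arbitrary commands like cp.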

It does wonders for any huge copy jobs, e.g. if you want to back up a multi-terabyte disk in the background with the least possible impact on your running system, you can do something along the lines of nice -n19 ionice -c3 nocache cp -a /vol /vol2.

A package is available in Ubuntu 13.10 and up. If you are on a previous release, you can either install the 13.10 package or opt for this 12.04 backport by François Marier.


Solution:2

For single large files, use dd with direct I/O to bypass the file cache:

If you want to transfer one (or a few) large multi-gigabyte files, it's easy to do with dd:

dd if=/path/to/source of=/path/to/destination bs=4M iflag=direct oflag=direct  
  • The direct flags tell dd to use the kernel's direct I/O option (O_DIRECT) while reading and writing, thus completely bypassing the file cache.
  • The bs blocksize option must be set to a reasonably large value to minimize the number of physical disk operations dd performs, since reads/writes are no longer cached and too many small direct operations can result in a serious slowdown.
    • Feel free to experiment with values from 1 to 32 MB; the setting above is 4 MB (4M).
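Direct I/O is not supported everywhere; tmpfs and some network filesystems reject O_DIRECT outright. A small wrapper can retry with dd's softer advisory nocache flags when that happens (a sketch: the dcp name and the fallback logic are my own additions, not part of dd):

```shell
# dcp SRC DST -- copy one file with O_DIRECT, fully bypassing the page
# cache; if the filesystem refuses direct I/O, fall back to advisory
# cache-dropping via dd's nocache flags (posix_fadvise).
dcp() {
    dd if="$1" of="$2" bs=4M iflag=direct oflag=direct 2>/dev/null ||
    dd if="$1" of="$2" bs=4M iflag=nocache oflag=nocache 2>/dev/null
}
```

Usage is the same as a plain two-argument copy, e.g. dcp /path/to/source /path/to/destination.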

For multiple files or recursive directory copies there are, unfortunately, no easily available tools; the usual cp, etc. do not support direct I/O.



Solution:3

You can copy a directory recursively with dd by combining it with find and mkdir.

We need to workaround two problems:

  1. dd doesn't know what to do with directories
  2. dd can only copy one file at a time

First let's define input and output directories:

SOURCE="/media/source-dir"
TARGET="/media/target-dir"

Now let's cd into the source directory so that find reports relative paths we can easily manipulate:

cd "$SOURCE"  

Duplicate the directory tree from $SOURCE to $TARGET:

find . -type d -exec mkdir -p "$TARGET/{}" \;

Duplicate files from $SOURCE to $TARGET, omitting the write cache (but utilising the read cache!):

find . -type f -exec dd if={} of="$TARGET/{}" bs=8M oflag=direct \;

Please note that this won't preserve file modification times, ownership, or other attributes.
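The steps above can be wrapped into a single function (a sketch: the function name is mine, absolute paths are assumed, a fallback to dd's advisory nocache flag is added for filesystems that reject O_DIRECT, and a final touch -r pass approximately restores modification times; ownership is still not preserved):

```shell
# recursive_dd_copy SRC DST -- recreate SRC's directory tree under DST,
# copying each file with dd while bypassing the write cache.
recursive_dd_copy() (
    cd "$1" || exit 1
    DST=$2
    # 1. Duplicate the directory tree
    find . -type d -exec mkdir -p "$DST/{}" \;
    # 2. Copy every file; fall back to advisory cache-dropping where
    #    the filesystem refuses direct I/O (e.g. tmpfs)
    find . -type f -exec sh -c '
        dd if="$1" of="$2/$1" bs=8M oflag=direct 2>/dev/null ||
        dd if="$1" of="$2/$1" bs=8M oflag=nocache 2>/dev/null
    ' _ {} "$DST" \;
    # 3. Restore modification times, children before parents so that
    #    touching a directory is not undone by later writes inside it
    find . -depth -exec touch -r "{}" "$DST/{}" \;
)
```

The body runs in a subshell, so the cd does not leak into the caller's working directory.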

