How to deal with white space in file names in UNIX/Linux
More and more often these days, I find that I need to deal with something ‘foreign’ on the Unix/Linux servers I manage professionally. One thing in particular is the growing number of files whose names contain white space. Most Unix/Linux utilities treat any white space (tab, newline, space) as their default delimiter, hence the problem.
These files are either legal immigrants from other operating systems (uploads to a CMS/wiki/blog, for example), or come from native programs or utilities with “foreign” roots (VMware Server).
The common way to deal with this is to use ASCII NUL as the delimiter when invoking command-line utilities. This capability is built into more Unix/Linux utilities than I knew of when I started researching a problem I had.
- grep -Z
- find . -print0
- xargs -0
- cut --output-delimiter="\0"
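To make the difference concrete, here is a minimal sketch of the find/xargs pairing from the list above. It assumes a scratch directory created with mktemp and two sample files, "junk 2" and "junk3"; the variable names are made up for illustration. It counts how many arguments xargs hands to the command in each case.

```shell
#!/bin/sh
# Scratch directory so the demo does not touch real files (illustrative setup).
demo_dir=$(mktemp -d)
touch "$demo_dir/junk 2" "$demo_dir/junk3"

# Default behaviour: xargs splits on any white space, so "junk 2"
# is broken into two separate arguments.
naive_count=$(find "$demo_dir" -type f | xargs printf '%s\n' | wc -l | tr -d ' ')

# NUL-delimited: find emits each name followed by NUL, and xargs -0
# splits only on NUL, so each file name stays one argument.
safe_count=$(find "$demo_dir" -type f -print0 | xargs -0 printf '%s\n' | wc -l | tr -d ' ')

echo "white-space split: $naive_count arguments"
echo "NUL split:         $safe_count arguments"

rm -rf "$demo_dir"
```

With two files on disk, the first pipeline should see one argument more than the second, because the space inside "junk 2" is treated as a separator.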
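"find . -print0 | xargs -0 ls -ltr" works well, while 'svn status | cut -b8 --output-delimiter="\0" | xargs -0 ls -ltr' doesn't. It puzzled me for a bit, before I realized that the list from svn status still has "\n" separating each record: --output-delimiter only changes the separator between the pieces cut selects within a line, not the newline that ends each line.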
The solution is really simple: swap each "\n" for "\0", so that each record is delimited by "\0". With that, you can keep the familiar 'xargs -0' syntax. Alternatively, you can tell xargs that the delimiter is "\n" rather than any white space (tab, space, etc.), which GNU xargs supports through its -d option.
$ /bin/ls -ltr /tmp | grep junk | cut -b45- | xargs -d "\n" ls -ltr
-rw-rw-r-- 1 experts8 www 43 05-19 21:14 junk 2
-rw-rw-r-- 1 experts8 www 43 05-19 21:17 junk3
$ /bin/ls -ltr /tmp | grep junk | cut -b45- | tr "\n" "\0" | xargs -0 ls -ltr
-rw-rw-r-- 1 experts8 www 43 05-19 21:14 junk 2
-rw-rw-r-- 1 experts8 www 43 05-19 21:17 junk3
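The tr variant can be tried in isolation with a small sketch like the following. It assumes a throwaway directory and a hand-built newline-separated list standing in for the output of a tool such as svn status; the variable names are hypothetical.

```shell
#!/bin/sh
# Throwaway directory with one name that contains a space (illustrative setup).
demo_dir=$(mktemp -d)
touch "$demo_dir/junk 2" "$demo_dir/junk3"

# A newline-separated list of paths, one record per line.
list=$(printf '%s\n' "$demo_dir/junk 2" "$demo_dir/junk3")

# Translate every newline to NUL so that xargs -0 sees one record per file.
listed=$(printf '%s\n' "$list" | tr '\n' '\0' | xargs -0 ls -1 | wc -l | tr -d ' ')

echo "ls listed $listed files"

rm -rf "$demo_dir"
```

Note that this only helps when newline really is the record separator: a file name that itself contains a newline would still be split in two.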










