Perl Language Reference Manual
by Larry Wall and others
28.1.3 Files and Filesystems
Most platforms these days structure files in a hierarchical fashion. So, it is reasonably safe to assume that all platforms support the notion of a "path" to uniquely identify a file on the system. How that path is really written, though, differs considerably.
Although similar, file path specifications differ between Unix, Windows, Mac OS, OS/2, VMS, VOS, RISC OS, and probably others. Unix, for example, is one of the few OSes that has the elegant idea of a single root directory.
DOS, OS/2, VMS, VOS, and Windows can work similarly to Unix with / as path separator, or in their own idiosyncratic ways (such as having several root directories and various "unrooted" device files such as NIL: and LPT:).
Mac OS 9 and earlier used : as a path separator instead of /.
The filesystem may support neither hard links (link) nor symbolic links (symlink, readlink, lstat).
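As a minimal sketch, a program can probe at run time whether the link functions are implemented at all, using the eval idiom commonly shown for symlink. The variable names are illustrative assumptions. The probe only shows that the function exists in this build of perl; a particular filesystem may still refuse to create links.

```perl
use strict;
use warnings;

# On platforms where symlink() is unimplemented the call dies, so it is
# wrapped in eval. The empty arguments make the call fail harmlessly
# where the function does exist.
my $have_symlink = eval { symlink('', ''); 1 } ? 1 : 0;

# The same probe for hard links via link().
my $have_link = eval { link('', ''); 1 } ? 1 : 0;

print "symlink() implemented: ", ($have_symlink ? "yes" : "no"), "\n";
print "link() implemented:    ", ($have_link    ? "yes" : "no"), "\n";
```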
The filesystem may support neither an access timestamp nor a change timestamp (meaning that about the only portable timestamp is the modification timestamp), and it may not offer one-second granularity for any timestamp (e.g. the FAT filesystem limits the time granularity to two seconds).
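One way to act on this, sketched under the assumption that only the modification time is trusted, is to compare timestamps with a small tolerance instead of testing for exact equality. The helper names mtime_of and same_mtime are hypothetical, and the default tolerance of two seconds simply mirrors the FAT example.

```perl
use strict;
use warnings;

# Return the modification time (seconds since the epoch) of a file,
# or undef if stat() fails.
sub mtime_of {
    my ($path) = @_;
    my @st = stat($path);
    return @st ? $st[9] : undef;
}

# Treat two timestamps as equal if they differ by no more than
# $tolerance seconds (default 2, to allow for coarse granularity).
sub same_mtime {
    my ($t1, $t2, $tolerance) = @_;
    $tolerance = 2 unless defined $tolerance;
    return abs($t1 - $t2) <= $tolerance ? 1 : 0;
}
```

A caller would typically write something like same_mtime(mtime_of($source), mtime_of($copy)) after checking that both values are defined.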
The "inode change timestamp" (the -C filetest) may really be the "creation timestamp" (which it is not in Unix).
VOS perl can emulate Unix filenames with / as path separator. The native pathname characters greater-than, less-than, number-sign, and percent-sign are always accepted.
RISC OS perl can emulate Unix filenames with / as path separator, or go native and use . for path separator and : to signal filesystems and disk names.
Don't assume Unix filesystem access semantics: that read, write, and execute are all the permissions there are, and even if they exist, that their semantics (for example what do r, w, and x mean on a directory) are the Unix ones. The various Unix/POSIX compatibility layers usually try to make interfaces like chmod() work, but sometimes there simply is no good mapping.
If all this is intimidating, have no (well, maybe only a little) fear. There are modules that can help. The File::Spec modules provide methods to do the Right Thing on whatever platform happens to be running the program.
    use File::Spec::Functions;
    chdir(updir());        # go up one directory
    my $file = catfile(curdir(), 'temp', 'file.txt');
    # on Unix and Win32, './temp/file.txt'
    # on Mac OS Classic, ':temp:file.txt'
    # on VMS, '[.temp]file.txt'
File::Spec is available in the standard distribution as of version 5.004_05. File::Spec::Functions is only in File::Spec 0.7 and later, and some versions of perl come with version 0.6. If File::Spec is not updated to 0.7 or later, you must use the object-oriented interface from File::Spec (or upgrade File::Spec).
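For installations limited to the object-oriented interface, the same operations can be written as class-method calls. This is a small sketch, not the only way to structure it; the exact string returned depends on the platform and on the File::Spec version, so no output is shown.

```perl
use strict;
use warnings;
use File::Spec;

# Class-method equivalents of updir(), curdir() and catfile().
my $parent = File::Spec->updir();
my $file   = File::Spec->catfile(File::Spec->curdir(), 'temp', 'file.txt');

print "parent directory is written as: $parent\n";
print "file path is written as:        $file\n";
```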
In general, production code should not have file paths hardcoded. Making them user-supplied or read from a configuration file is better, keeping in mind that file path syntax varies on different machines.
This is especially noticeable in scripts like Makefiles and test suites, which often assume / as a path separator for subdirectories.
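A sketch of one way to avoid hardcoded paths in a test script follows. It prefers a user-supplied directory and otherwise builds a default from path components. The helper name data_file, the environment variable MYAPP_DATA_DIR, and the t/data layout are all assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: use the directory named in MYAPP_DATA_DIR if the
# user set it, otherwise build a default from components rather than
# from a literal string containing '/'.
sub data_file {
    my ($name) = @_;
    my $dir = (defined $ENV{MYAPP_DATA_DIR} && length $ENV{MYAPP_DATA_DIR})
            ? $ENV{MYAPP_DATA_DIR}
            : File::Spec->catdir(File::Spec->curdir(), 't', 'data');
    return File::Spec->catfile($dir, $name);
}

print data_file('input.txt'), "\n";
```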
Also of use is File::Basename from the standard distribution, which splits a pathname into pieces (base filename, full path to directory, and file suffix).
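As a brief illustration, fileparse from File::Basename returns those three pieces in one call. The sample path and the suffix pattern are assumptions chosen for the example.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Spec;

# Build the sample path portably instead of writing 'lib/My/Module.pm'.
my $path = File::Spec->catfile('lib', 'My', 'Module.pm');

# fileparse returns the base name, the directory part, and the suffix
# that matched one of the supplied patterns.
my ($base, $dir, $suffix) = fileparse($path, qr/\.[^.]*/);

print "base=$base dir=$dir suffix=$suffix\n";
```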
Even when on a single platform (if you can call Unix a single platform), remember not to count on the existence or the contents of particular system-specific files or directories, like /etc/passwd, /etc/sendmail.conf, /etc/resolv.conf, or even /tmp/. For example, /etc/passwd may exist but not contain the encrypted passwords, because the system is using some form of enhanced security. Or it may not contain all the accounts, because the system is using NIS. If code does need to rely on such a file, include a description of the file and its format in the code's documentation, then make it easy for the user to override the default location of the file.
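For temporary files in particular, a program does not have to name /tmp/ at all. The following sketch asks File::Spec for the platform's temporary directory and lets File::Temp create a uniquely named file there. The template name scratchXXXXXX is an arbitrary choice.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);

# Ask File::Spec where temporary files belong on this platform.
my $tmpdir = File::Spec->tmpdir();

# Create a uniquely named scratch file there; UNLINK => 1 asks for it
# to be removed when the program exits.
my ($fh, $filename) = tempfile('scratchXXXXXX', DIR => $tmpdir, UNLINK => 1);

print {$fh} "temporary data\n";
close($fh) or die "Cannot close $filename: $!";

print "wrote scratch file $filename\n";
```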
Don't assume a text file will end with a newline. It should, but people forget.
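A reading loop that tolerates a missing final newline can be sketched as follows. The helper name read_lines is hypothetical, and the demonstration uses an in-memory filehandle so that it does not depend on any file on disk.

```perl
use strict;
use warnings;

# Read every line from a filehandle. chomp removes a trailing newline
# only if one is present, so an unterminated last line is kept intact.
sub read_lines {
    my ($fh) = @_;
    my @lines;
    while (defined(my $line = <$fh>)) {
        chomp $line;
        push @lines, $line;
    }
    return @lines;
}

# Demonstration: the last line of this text has no newline.
my $text = "first\nsecond";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
my @lines = read_lines($fh);
close($fh);

print scalar(@lines), " lines read\n";
```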
Do not have two files or directories of the same name with different case, like test.pl and Test.pl, as many platforms have case-insensitive (or at least case-forgiving) filenames. Also, try not to have non-word characters (except for .) in the names, and keep them to the 8.3 convention, for maximum portability, onerous a burden though this may appear.
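A distribution could check for such clashes before release. The sketch below groups names by their lowercased form and reports any group with more than one member. The function name case_collisions is an assumption for illustration.

```perl
use strict;
use warnings;

# Return a list of array references, one per group of names that would
# clash on a case-insensitive filesystem.
sub case_collisions {
    my @names = @_;
    my %by_folded;
    push @{ $by_folded{ lc $_ } }, $_ for @names;

    my @groups;
    for my $key (sort keys %by_folded) {
        push @groups, $by_folded{$key} if @{ $by_folded{$key} } > 1;
    }
    return @groups;
}

my @clashes = case_collisions(qw(test.pl Test.pl README Makefile));
print join(' vs ', @$_), "\n" for @clashes;
```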
Likewise, when using the AutoSplit module, try to keep your functions to 8.3 naming and case-insensitive conventions; or, at the least, make it so the resulting files have a unique (case-insensitively) first 8 characters.
Whitespace in filenames is tolerated on most systems, but not all, and even on systems where it might be tolerated, some utilities might become confused by such whitespace.
Many systems (DOS, VMS ODS-2) cannot have more than one . in their filenames.
Don't assume > won't be the first character of a filename. Always use < explicitly to open a file for reading, or even better, use the three-arg version of open, unless you want the user to be able to specify a pipe open.
    open(my $fh, '<', $existing_file) or die $!;
If filenames might use strange characters, it is safest to open them with sysopen instead of open. open is magic and can translate characters like >, <, and |, which may be the wrong thing to do. (Sometimes, though, it's the right thing.) Three-arg open can also help protect against this translation in cases where it is undesirable.
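A minimal sketch of reading a file through sysopen follows. The helper name slurp_literal is hypothetical. Because sysopen takes the mode as a separate numeric flag, the filename is used exactly as given. The demonstration creates an ordinary temporary file so that it runs wherever File::Temp does.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use File::Temp qw(tempfile);

# Open a file for reading with sysopen. The name is taken literally:
# a leading '<' or '>' or a trailing '|' is never treated as a mode.
sub slurp_literal {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDONLY)
        or die "Cannot open '$path': $!";
    local $/;                      # slurp mode: read the whole file
    my $content = <$fh>;
    close($fh) or die "Cannot close '$path': $!";
    return $content;
}

# Demonstration with a scratch file.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "hello\n";
close($tmp_fh) or die "Cannot close '$tmp_name': $!";

print slurp_literal($tmp_name);
```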
Don't use : as a part of a filename since many systems use that for their own semantics (Mac OS Classic for separating pathname components, many networking schemes and utilities for separating the nodename and the pathname, and so on). For the same reasons, avoid @, ; and |.
Don't assume that in pathnames you can collapse two leading slashes // into one: some networking and clustering filesystems have special semantics for that. Let the operating system sort it out.
The portable filename characters as defined by ANSI C are
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    0 1 2 3 4 5 6 7 8 9
    . _ -
and the "-" shouldn't be the first character. If you want to be hypercorrect, stay case-insensitive and within the 8.3 naming convention (all the files and directories have to be unique within one directory if their names are lowercased and truncated to eight characters before the ., if any, and to three characters after the ., if any). (And do not use .s in directory names.)
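The per-name part of these rules can be sketched as a small checker. The function name is_portable_83 is hypothetical, and the checker takes a strict reading: portable characters only, no leading "-", at most one ., and at most eight characters before it and three after it. It looks at one name at a time, so uniqueness within a directory still has to be checked separately.

```perl
use strict;
use warnings;

# Check a single file name (not a path) against a strict reading of the
# portable-character and 8.3 rules. Returns 1 if it passes, 0 otherwise.
sub is_portable_83 {
    my ($name) = @_;
    return 0 unless defined $name && length $name;
    return 0 if $name =~ /[^A-Za-z0-9._-]/;    # non-portable character
    return 0 if $name =~ /^-/;                 # leading hyphen
    return 0 if ($name =~ tr/.//) > 1;         # more than one dot

    my ($base, $ext) = split /\./, $name, 2;
    return 0 unless length($base) >= 1 && length($base) <= 8;
    return 0 if defined $ext && length($ext) > 3;
    return 1;
}

for my $name (qw(test.pl Makefile longfilename.txt a.b.c)) {
    printf "%-18s %s\n", $name, is_portable_83($name) ? 'ok' : 'not portable';
}
```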
ISBN 9781906966027