The .NET Framework doesn't change the structure of the file system, nor does it build a new layer on top of it.

More simply, but also more effectively for developers, it supplies a new object model for file system-related operations. A managed application can work with files and directories using high-level methods rather than low level understanding of the file system. This article provides an overview of methods and classes contained in the System.IO namespace.

The .NET Framework reworks, rationalizes, and simplifies key portions of the Win32 API. Within the .NET Framework?with very few exceptions?Microsoft has redesigned the whole Win32 API and made it available to programmers in an object-oriented fashion. In this article, you will see how to manage paths as a special data type with ad-hoc methods and properties; how to work to retrieve as much information as possible about files and directories, and how to read and write files.

Managing Files

The .NET Framework uses System.IO as the main namespace to work with file systems. Within this namespace, you can identify three groups of related classes that accomplish the following tasks:

  • Retrieve information and perform basic operations on files and directories
  • Perform string-based manipulation on paths
  • Read and write operations on data streams and files

The .NET Framework provides I/O functionality through a few global static classes, such as File, Directory, and Path. You declare these classes as static (or shared in Visual Basic .NET) and in order to use them, you don't need to create specific instances of the classes. File, Directory, and Path are just the repository of global, type-specific functions that you call to create, copy, delete, move, and open files and directories. All of these functions requires a file or a directory name to operate. To write or read files, you also have specific classes to manage streams and bytes at your disposal.

If you're going to work with files within a .NET managed application, chances are good that you have to use the methods from the File class. So let's start by taking a look at the methods exposed by this class (see Table 1).

The path parameter that all methods require can indicate a relative or absolute path. A relative path is interpreted as relative to the current working directory. To obtain the current working directory, you use the GetCurrentDirectory method provided by the Directory class. Any methods above that perform write operations that will create the specified file if it does not exist. If the file does exist, it will be overwritten as long as it is not marked read-only.

Each time an application invokes a method on the File class, a security check is performed on the involved file system elements. The check verifies that the current user has the permission to perform the specified operation. If you use the same file or directory several times, this embedded security check might result in a slight performance hit. For this and other reasons, the .NET Framework defines an instance-specific type to wrap the functionality of files called the FileInfo class. If you need to access a file in a repeated fashion, you can use the FileInfo class to perform the security check only once. Should you always use FileInfo and disregard the File class? Well, consider that, in general, the methods of the global classes have an internal implementation that results in more direct code. For this reason, global objects are preferable for one-shot calls.

If you look at the overall set of functionality both provide, the FileInfo class looks very similar to the static File class. However, the internal implementation and the programming interface is slightly different. The FileInfo class works on a particular file and requires that you instantiate the class before you access its methods and properties.

FileInfo fi = new FileInfo("mydoc.txt");

When you create an instance of the FileInfo class, you specify a filename, either fully or partially qualified. The filename you indicate is only checked for the name consistency and not for existence. If the filename you indicate through the class constructor is unacceptable an exception is thrown. Common pitfalls are colons in the middle of the string, invalid characters, blank names, or names longer than 256 characters. Table 2 lists the properties of the FileInfo class.

The methods available for the FileInfo class are summarized in Table 3. As you can see, you can group methods into two categories: methods to perform simple stream-based operations on the contents of the file, and methods to copy or delete the file itself.

The FileInfo class represents a logical wrapper for a system element that is continuously subject to concurrent changes. Can you be sure that the information returned by the FileInfo object is always up to date? Properties such as Exists, Length, Attributes, and LastAccessTime can easily contain inconsistent values if other users may make changes concurrently.

When you create an instance of FileInfo, no information is actually read from the file system. As soon as you attempt to read the value of one of the aforementioned critical properties, the class invokes the Refresh method, reads the current state of the file, and caches that information. For performance reasons, though, the FileInfo class doesn't automatically refresh the state of the object each time properties are read. It does that only the first time that it reads one of the properties.

To force this built-in behavior, you should call Refresh whenever you need to read up-to-date information about the attributes or the length of a file. Whether or not you need to refresh this data depends greatly on the needs of your application. Under the hood, the Refresh method makes a call to the Win32 FindFirstFile function and uses the information contained in the returned WIN32_FIND_DATA structure to populate the properties of the FileInfo class. You need to consider whether or not the application needs the overhead of calling this API function.

Copying and Deleting Files

To make a copy of the current file, you can use the CopyTo method, which comes with two overloads. Both overloads copy the file to another file but the first overload just disallows overwriting, while the other gives you a chance to control overwriting through a Boolean parameter.

FileInfo fi = fi.CopyTo("NewFile.txt", true);

Notice that both methods require that the first argument be a filename. It can't be the name of a directory where you want the file to be copied. If you use a directory name, that will be the name of the output file.

The Delete method permanently deletes the file from disk. Using this method, there is no way to programmatically send the deleted file to the recycle bin. To put a file in the recycle bin you must resort to creating a .NET wrapper for the Win32 API function that does that. The API function you need is named SHFileOperation.

The Attributes property indicates the file system attributes of the given file. In order to set or read an attribute, the file must already exist and the application must have access to it. To write an attribute value to a file, you must also have a write permission, otherwise the FileIOPermissionAccess exception is raised. The attributes of a file are expressed using the FileAttributes type. (See Table 4.)

The values in the table correspond to those defined in the Win32 SDK. Notice that not all attributes are applicable to both files and directories. You set attributes on a file using code as in the code snippet below.

// Make the file read-only and hidden
FileInfo fi = new FileInfo("mydoc.txt")
fi.Attributes = FileAttributes.ReadOnly |
                FileAttributes.Hidden;

Note you cannot set all of the attributes listed in Table 4 through the Attributes property. For example, the system assigns the Encrypted and the Compressed attributes only if the file is contained in an encrypted or compressed folder. Likewise, you can give a file a reparse point or you can mark is as a sparse file only through specific API functions and only on NTFS volumes.

Working with Directories

To manage a directory as an object you use the Directory global object or the DirectoryInfo class. The global Directory class exposes static methods for creating, copying, and moving directories and for enumerating their files and subdirectories. Table 5 lists the methods on the Directory class.

Note that the Delete method has two overloads. By default, it deletes only empty directories and throws an IOException exception if the directory is not empty or marked read-only. The second overload includes a Boolean argument that, if set to true, enables the method to recursively delete the entire directory tree.

// Clear a directory tree
Directory.Delete(dirName, true);

The DirectoryInfo class represents the instance-based counterpart of the Directory class and works on a particular directory.

DirectoryInfo di = new DirectoryInfo(@"c:\");

To create an instance of the DirectoryInfo class, you specify a fully qualified path. Just as for FileInfo, the path is checked for consistency but not for existence. Note that the path can also be a filename or a Universal Naming Convention (UNC) name. If you create a DirectoryInfo object passing a filename, the class will use the directory that contains the specified file. Table 6 shows the properties available with the DirectoryInfo class.

The Name property of the file and directory classes is read-only and you cannot use it to rename the corresponding file system's element. The methods you can use on the DirectoryInfo class are listed in Table 7.

The GetFileSystemInfos method returns an array of objects, each of which points to a file or a subdirectory contained in the directory bound to the current DirectoryInfo object. Unlike GetDirectories and GetFiles methods which simply return the names of subdirectories and files as plain strings, GetFileSystemInfos returns a strongly-typed object for each entry?either DirectoryInfo or FileInfo. The return type of the method is an array of FileSystemInfo objects.

public FileSystemInfo[] GetFileSystemInfos()

FileSystemInfo is the base class for both FileInfo and DirectoryInfo. GetFileSystemInfos has an overloaded version that can accept a string with search criteria.

Let's see how to use the file and directory classes to build a simple console application that lists the contents of a directory. The full source code is presented in Listing 1.

GetFileSystemInfos accepts a filter string that you can use to set some criteria. The filter string can contain wild card characters such as ? and *. The ? character is a placeholder for any individual character, while * represents any string of one or more characters. A bit more problematic is selecting all files that belong to one group or another. Likewise, there's no direct way to obtain all directories plus all the files that match certain criteria. In similar cases, you must query each result set individually and then combine them together in a single array of FileSystemInfo objects. The following code snippet shows how to select all the subdirectories and all the aspx pages in a given folder.

FileSystemInfo fsiDirs = (FileSystemInfo[])
        di.GetSubdirectories();
FileSystemInfo fsiAspx = (FileSystemInfo[])
        di.GetFiles(".aspx");

You can fuse the two arrays together using the methods of the Array class.

Working with Paths

Although paths are nothing more than strings, it's a common feeling that they deserve a tailor-made set of functions to makes paths easier to manipulate. The Path type provides programmers with the unprecedented ability to perform operations on instances of a string class that contain file or directory path information. Path is a single-instance class that contains only static methods. A path can contain either absolute or relative location information for a given file or folder. If the information about the location is incomplete and partial, then the class completes it using the current location, if applicable.

The members of the Path class let you perform everyday operations such as: determining whether a given filename has a certain extension, changing the extension of a filename leaving all the remainder of the path intact, combining partial path strings into one valid path, and more. The Path class doesn't work in conjunction with the operating system and should be simply viewed as a highly specialized string manipulation class.

The members of the Path class never interact with the file system to verify the correctness of a filename. Even though you can combine two strings to get a valid directory name, that would not be sufficient to actually create that new directory. On the other hand, the members of the Path class are smart enough to throw an exception if they detect that a path string contains invalid characters. Table 8 lists the methods of the Path class.

It's interesting to notice that any call to GetTempFileName promptly results in creating a zero-length file on disk and specifically in the system's temporary folder (i.e., C:\Windows\Temp). This is the only case in which the Path class happens to interact with the operating system.

I/O with Files

In the .NET Framework, the atomic element to read from, or write to, is the stream. A stream abstracts the contents of a variety of potential data stores, including local and network disk files, memory, and databases. You can read or write a Stream object using a couple of tailor-made tools?the reader and the writer.

A reader reads one chunk of information at a time. The structure of the data read depends on the particular reader and the underlying stream. For example, a text reader will read rows of text recognizing the carriage return/linefeed pair as the separator between chunks. Likewise, the binary reader will process every single byte in the stream as the XML reader moves from one node to the next. The reader operates in a read-only, forward-only way. You can't move back to an already processed, or skipped, chunk of data; nor can you edit the current data the pointer references.

In the .NET Framework, you find available quite a few specialized readers including TextReader, BinaryReader, XmlReader, and database-specific readers such as SqlDataReader and OracleDataReader. Although all of these reader classes have a common subset of functions, and an overall similar way of working, they don't derive from the same base class. Reader classes work on top of streams. Depending on the implementation of each individual class, the stream may be passed explicitly as a constructor argument or through its file name or URL.

The Stream class supports three basic operations: reading, writing, and seeking. Reading and writing operations entail transferring data from a stream into a data structure and vice versa. Seeking consists of querying and modifying the current position within the stream of data.

The .NET Framework provides a number of predefined Stream classes including FileStream, MemoryStream, and the fairly interesting CryptoStream, which automatically encrypts and decrypts data as you write or read. Each different storage implements its own stream by deriving from the base Stream class. The StreamReader class is a generic reader class for any type of stream. Finally, the StringReader class lets you read a string of text using the same programming interface as readers that operate on data stores.

You transform the contents of a file into a stream using the FileStream class. The following code shows how to open a file that you want to read:

FileStream fs = new FileStream(filename,
FileMode.Open, FileAccess.Read);

Streams supply a rather low-level programming interface which, although functionally effective, is not always apt for classes that need to perform more high-level operations such as reading the whole content of a file or a single line.

To manipulate the contents of a file as a binary stream, you just pass the FileStream object down to a specialized reader object that knows how to handle it.

BinaryReader bin = new BinaryReader(fs);

If you want to process the file's contents in a text-based way, then you can use the StreamReader class, as shown below.

StreamReader reader;
reader = new StreamReader(fileName);
reader.BaseStream.Seek(0, SeekOrigin.Begin);
string text = reader.ReadToEnd();
reader.Close();

To write files, you often use the StreamWriter class and access its underlying stream, which can also be an encrypted stream. The following code snippet shows how to create a file.

StreamWriter writer = new StreamWriter(file);
writer.WriteLine(text);
writer.Close();

Creating binary files that contain images or raw data doesn't happen along different guidelines. You just use BinaryWriter (or BinaryReader for reading) as the writer object and its ad hoc set of methods.

All reader classes have a writer counter class. So you have a StreamWriter class acting as a generic writer for streams and more specific classes such as TextWriter, XmlWriter, BinaryWriter, and StringWriter. Curiously, the .NET Framework does not have a sort of SqlDataWriter class which would configure a server cursor. Server cursors are not supported as of version 1.1 of the .NET Framework.

Summary

Although the substance of the underlying file system is not something that changed with .NET, the platform that determines the way in which you work with the constituent elements of a file system?files and directories, changed quite a bit.

The introduction of streams as programmable objects is a key step in the sense that it unifies the API necessary to perform similar operations on conceptually similar storage media. Another key enhancement is the introduction of reader and writer objects. They provide a kind of logical API by means of which you read and write any piece of information in nearly identical ways. The .NET Framework also provides a lot of facilities to perform the basic management operations with files and directories, including path functions and common-use methods. In just one slogan, with .NET way of working with the file system is easier and more effective. Just do it.™