The .NET File System Object Model

The .NET Framework doesn't change the structure of the file system, nor does it build a new layer on top of it. More simply, but also more effectively for developers, it supplies a new object model for file system-related operations. A managed application can work with files and directories using high-level methods rather than a low-level understanding of the file system. This article provides an overview of the methods and classes contained in the System.IO namespace.

The .NET Framework reworks, rationalizes, and simplifies key portions of the Win32 API. Within the .NET Framework, with very few exceptions, Microsoft has redesigned the whole Win32 API and made it available to programmers in an object-oriented fashion. In this article, you will see how to manage paths as a special data type with ad hoc methods and properties, how to retrieve as much information as possible about files and directories, and how to read and write files.

Managing Files

The .NET Framework uses System.IO as the main namespace to work with file systems. Within this namespace, you can identify three groups of related classes that accomplish the following tasks:

  • Retrieve information and perform basic operations on files and directories
  • Perform string-based manipulation on paths
  • Perform read and write operations on data streams and files

The .NET Framework provides I/O functionality through a few global static classes, such as File, Directory, and Path. These classes expose only static members (shared, in Visual Basic .NET), so you don't need to create instances in order to use them. File, Directory, and Path are just repositories of global, type-specific functions that you call to create, copy, delete, move, and open files and directories. All of these functions require a file or a directory name to operate. To write or read files, you also have specific classes at your disposal to manage streams and bytes.

If you're going to work with files within a .NET managed application, chances are good that you have to use the methods from the File class. So let's start by taking a look at the methods exposed by this class (see Table 1).

The path parameter that all methods require can indicate a relative or absolute path. A relative path is interpreted as relative to the current working directory. To obtain the current working directory, you use the GetCurrentDirectory method provided by the Directory class. Any of the methods above that perform write operations will create the specified file if it does not exist. If the file does exist, it will be overwritten as long as it is not marked read-only.

Each time an application invokes a method on the File class, a security check is performed on the involved file system elements. The check verifies that the current user has the permission to perform the specified operation. If you use the same file or directory several times, this embedded security check might result in a slight performance hit. For this and other reasons, the .NET Framework defines an instance-specific type to wrap the functionality of files called the FileInfo class. If you need to access a file in a repeated fashion, you can use the FileInfo class to perform the security check only once. Should you always use FileInfo and disregard the File class? Well, consider that, in general, the methods of the global classes have an internal implementation that results in more direct code. For this reason, global objects are preferable for one-shot calls.
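
The trade-off between the two styles can be sketched as follows. This is only an illustration: the class name FileVersusFileInfo, its two helper methods, and the sample.txt file are hypothetical names introduced here, not part of the framework.

```csharp
using System;
using System.IO;

public static class FileVersusFileInfo
{
    // One-shot query: the static File class is the most direct route.
    public static bool ExistsOnce(string path)
    {
        return File.Exists(path);
    }

    // Several reads on the same file: create one FileInfo and reuse it.
    public static string Describe(string path)
    {
        FileInfo fi = new FileInfo(path);
        if (!fi.Exists)
            return fi.Name + " (missing)";
        return fi.Name + ", " + fi.Length + " bytes, last written " + fi.LastWriteTime;
    }

    public static void Main()
    {
        // Hypothetical sample file placed in the temporary folder
        string path = Path.Combine(Path.GetTempPath(), "sample.txt");
        File.WriteAllText(path, "hello");

        Console.WriteLine(ExistsOnce(path));
        Console.WriteLine(Describe(path));

        File.Delete(path);
    }
}
```

Describe reads three properties from a single FileInfo instance, which is the repeated-access scenario the instance class is meant for.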

If you look at the overall set of functionality both provide, the FileInfo class looks very similar to the static File class. However, the internal implementation and the programming interface are slightly different. The FileInfo class works on a particular file and requires that you instantiate the class before you access its methods and properties.

FileInfo fi = new FileInfo("mydoc.txt");

When you create an instance of the FileInfo class, you specify a filename, either fully or partially qualified. The filename you indicate is checked only for name consistency and not for existence. If the filename you indicate through the class constructor is unacceptable, an exception is thrown. Common pitfalls are colons in the middle of the string, invalid characters, blank names, or names longer than the system-defined maximum (on Windows, MAX_PATH is 260 characters). Table 2 lists the properties of the FileInfo class.
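
A minimal sketch of that behavior follows. The helper IsAcceptedName and the file name are hypothetical, and exactly which names the constructor rejects depends on the runtime version, so treat the set of caught exceptions as an assumption rather than a complete list.

```csharp
using System;
using System.IO;

public static class FileInfoNameCheck
{
    // Returns true if the FileInfo constructor accepts the name.
    // Only the shape of the name is checked; the file need not exist.
    public static bool IsAcceptedName(string name)
    {
        try
        {
            new FileInfo(name);
            return true;
        }
        catch (ArgumentException)       // blank names, invalid characters
        {
            return false;
        }
        catch (PathTooLongException)    // name exceeds the system limit
        {
            return false;
        }
        catch (NotSupportedException)   // for example, a misplaced colon
        {
            return false;
        }
    }

    public static void Main()
    {
        // A well-formed name is accepted even though no such file exists.
        Console.WriteLine(IsAcceptedName("no-such-file-12345.txt"));

        // A blank name is rejected by the constructor.
        Console.WriteLine(IsAcceptedName(""));
    }
}
```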

The methods available for the FileInfo class are summarized in Table 3. As you can see, you can group methods into two categories: methods to perform simple stream-based operations on the contents of the file, and methods to copy or delete the file itself.

The FileInfo class represents a logical wrapper for a system element that is continuously subject to concurrent changes. Can you be sure that the information returned by the FileInfo object is always up to date? Properties such as Exists, Length, Attributes, and LastAccessTime can easily contain inconsistent values if other users may make changes concurrently.

When you create an instance of FileInfo, no information is actually read from the file system. As soon as you attempt to read the value of one of the aforementioned critical properties, the class invokes the Refresh method, reads the current state of the file, and caches that information. For performance reasons, though, the FileInfo class doesn't automatically refresh the state of the object each time properties are read. It does that only the first time that it reads one of the properties.

To work around this caching behavior, you should call Refresh whenever you need to read up-to-date information about the attributes or the length of a file. Whether or not you need to refresh this data depends greatly on the needs of your application. Under the hood, the Refresh method makes a call to the Win32 FindFirstFile function and uses the information contained in the returned WIN32_FIND_DATA structure to populate the properties of the FileInfo class. You need to consider whether or not the application needs the overhead of calling this API function.
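
The effect of the cache can be observed with a short experiment. The sketch below is illustrative only; the class name, the helper method, and the refresh-demo.txt file are hypothetical.

```csharp
using System;
using System.IO;

public static class RefreshDemo
{
    // Returns the length seen on first read, the length seen after the
    // file grows (without a refresh), and the length seen after Refresh.
    public static long[] ReadLengths(string path)
    {
        File.WriteAllText(path, "12345");

        FileInfo fi = new FileInfo(path);
        long first = fi.Length;              // first read loads and caches the state

        File.AppendAllText(path, "67890");   // the file changes behind the object's back
        long beforeRefresh = fi.Length;      // served from the cached state

        fi.Refresh();                        // re-read the state from the file system
        long afterRefresh = fi.Length;

        return new long[] { first, beforeRefresh, afterRefresh };
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "refresh-demo.txt");
        long[] lengths = ReadLengths(path);
        Console.WriteLine("First read:     " + lengths[0]);
        Console.WriteLine("Before Refresh: " + lengths[1]);
        Console.WriteLine("After Refresh:  " + lengths[2]);
        File.Delete(path);
    }
}
```

Only the value read after Refresh is guaranteed to reflect the appended text.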

Copying and Deleting Files

To make a copy of the current file, you can use the CopyTo method, which comes with two overloads. Both overloads copy the file to another file but the first overload just disallows overwriting, while the other gives you a chance to control overwriting through a Boolean parameter.

FileInfo copy = fi.CopyTo("NewFile.txt", true);

Notice that both methods require that the first argument be a filename. It can't be the name of a directory where you want the file to be copied. If you use a directory name, that will be the name of the output file.
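
If the goal is to drop the file into a directory under its original name, one option is to build the destination filename yourself. The following sketch assumes a hypothetical helper, CopyToDirectory, which is not a framework method.

```csharp
using System;
using System.IO;

public static class CopyIntoDirectory
{
    // Builds the destination filename from the target directory and the
    // source file's own name, then performs the copy.
    public static FileInfo CopyToDirectory(FileInfo source, string targetDir, bool overwrite)
    {
        string destination = Path.Combine(targetDir, source.Name);
        return source.CopyTo(destination, overwrite);
    }

    public static void Main()
    {
        // Hypothetical source file and target directory under the temp folder
        string sourcePath = Path.GetTempFileName();
        string targetDir = Path.Combine(Path.GetTempPath(), "copy-demo");
        Directory.CreateDirectory(targetDir);

        FileInfo copy = CopyToDirectory(new FileInfo(sourcePath), targetDir, true);
        Console.WriteLine(copy.FullName);

        // Clean up
        File.Delete(sourcePath);
        Directory.Delete(targetDir, true);
    }
}
```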

The Delete method permanently deletes the file from disk. Using this method, there is no way to programmatically send the deleted file to the recycle bin. To put a file in the recycle bin you must resort to creating a .NET wrapper for the Win32 API function that does that. The API function you need is named SHFileOperation.

The Attributes property indicates the file system attributes of the given file. In order to set or read an attribute, the file must already exist and the application must have access to it. To write an attribute value to a file, you must also have write permission; otherwise, a SecurityException is raised because the caller lacks the FileIOPermissionAccess.Write permission. The attributes of a file are expressed using the FileAttributes type. (See Table 4.)

The values in the table correspond to those defined in the Win32 SDK. Notice that not all attributes are applicable to both files and directories. You set attributes on a file using code as in the code snippet below.

// Make the file read-only and hidden
FileInfo fi = new FileInfo("mydoc.txt");
fi.Attributes = FileAttributes.ReadOnly | 
                FileAttributes.Hidden;

Note that you cannot set all of the attributes listed in Table 4 through the Attributes property. For example, the system assigns the Encrypted and the Compressed attributes only if the file is contained in an encrypted or compressed folder. Likewise, you can give a file a reparse point or mark it as a sparse file only through specific API functions and only on NTFS volumes.
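
Because FileAttributes is a set of bit flags, testing or clearing a single attribute takes a bitwise operation rather than a plain comparison. The sketch below works on attribute values only and does not touch any file; the class and method names are hypothetical.

```csharp
using System;
using System.IO;

public static class AttributeFlags
{
    // True if the ReadOnly flag is set, whatever other flags are present.
    public static bool IsReadOnly(FileAttributes attrs)
    {
        return (attrs & FileAttributes.ReadOnly) == FileAttributes.ReadOnly;
    }

    // Returns the same set of flags with ReadOnly cleared.
    public static FileAttributes WithoutReadOnly(FileAttributes attrs)
    {
        return attrs & ~FileAttributes.ReadOnly;
    }

    public static void Main()
    {
        FileAttributes attrs = FileAttributes.ReadOnly | FileAttributes.Hidden;

        Console.WriteLine(IsReadOnly(attrs));        // prints "True"
        Console.WriteLine(WithoutReadOnly(attrs));   // prints "Hidden"
    }
}
```

To apply the result to a real file, you would assign the returned value back to the Attributes property of a FileInfo object.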

Working with Directories

To manage a directory as an object, you use the Directory global object or the DirectoryInfo class. The global Directory class exposes static methods for creating, deleting, and moving directories and for enumerating their files and subdirectories. Table 5 lists the methods on the Directory class.

Note that the Delete method has two overloads. By default, it deletes only empty directories and throws an IOException if the directory is not empty or is marked read-only. The second overload includes a Boolean argument that, if set to true, enables the method to recursively delete the entire directory tree.

// Clear a directory tree
Directory.Delete(dirName, true);

The DirectoryInfo class represents the instance-based counterpart of the Directory class and works on a particular directory.

DirectoryInfo di = new DirectoryInfo(@"c:\");

To create an instance of the DirectoryInfo class, you specify a fully qualified path. Just as for FileInfo, the path is checked for consistency but not for existence. Note that the path can also be a filename or a Universal Naming Convention (UNC) name. If you create a DirectoryInfo object passing a filename, the class will use the directory that contains the specified file. Table 6 shows the properties available with the DirectoryInfo class.

The Name property of the file and directory classes is read-only, so you cannot use it to rename the corresponding file system element; use the MoveTo method for that. The methods you can use on the DirectoryInfo class are listed in Table 7.

The GetFileSystemInfos method returns an array of objects, each of which points to a file or a subdirectory contained in the directory bound to the current DirectoryInfo object. Unlike the GetDirectories and GetFiles methods of the static Directory class, which simply return the names of subdirectories and files as plain strings, GetFileSystemInfos returns a strongly typed object for each entry: either DirectoryInfo or FileInfo. The return type of the method is an array of FileSystemInfo objects.

public FileSystemInfo[] GetFileSystemInfos()

FileSystemInfo is the base class for both FileInfo and DirectoryInfo. GetFileSystemInfos has an overloaded version that can accept a string with search criteria.

Let's see how to use the file and directory classes to build a simple console application that lists the contents of a directory. The full source code is presented in Listing 1.

GetFileSystemInfos accepts a filter string that you can use to set some criteria. The filter string can contain wildcard characters such as ? and *. The ? character is a placeholder for any individual character, while * represents any string of zero or more characters. A bit more problematic is selecting all files that belong to one group or another. Likewise, there's no direct way to obtain all directories plus all the files that match certain criteria. In similar cases, you must query each result set individually and then combine them in a single array of FileSystemInfo objects. The following code snippet shows how to select all the subdirectories and all the aspx pages in a given folder.

FileSystemInfo[] fsiDirs = (FileSystemInfo[]) 
        di.GetDirectories();
FileSystemInfo[] fsiAspx = (FileSystemInfo[]) 
        di.GetFiles("*.aspx");

You can fuse the two arrays together using the methods of the Array class.
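
One possible way to do the merge is with Array.Copy, as in this sketch. The helper GetDirsAndFiles is a hypothetical name, and the folder in Main is simply the current directory.

```csharp
using System;
using System.IO;

public static class DirsAndPages
{
    // Returns all subdirectories of di followed by the files matching filePattern.
    public static FileSystemInfo[] GetDirsAndFiles(DirectoryInfo di, string filePattern)
    {
        FileSystemInfo[] dirs = di.GetDirectories();
        FileSystemInfo[] files = di.GetFiles(filePattern);

        // Allocate one array large enough for both result sets, then copy
        // each set into its own slice.
        FileSystemInfo[] all = new FileSystemInfo[dirs.Length + files.Length];
        Array.Copy(dirs, 0, all, 0, dirs.Length);
        Array.Copy(files, 0, all, dirs.Length, files.Length);
        return all;
    }

    public static void Main()
    {
        DirectoryInfo di = new DirectoryInfo(Directory.GetCurrentDirectory());
        foreach (FileSystemInfo fsi in GetDirsAndFiles(di, "*.aspx"))
            Console.WriteLine(fsi.Name);
    }
}
```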

Working with Paths

Although paths are nothing more than strings, it's a common feeling that they deserve a tailor-made set of functions to make them easier to manipulate. The Path type gives programmers the ability to perform operations on strings that contain file or directory path information. Path is a static class: you never create an instance of it, and it contains only static methods. A path can contain either absolute or relative location information for a given file or folder. If the location information is incomplete, the class completes it using the current location, if applicable.

The members of the Path class let you perform everyday operations such as: determining whether a given filename has a certain extension, changing the extension of a filename leaving all the remainder of the path intact, combining partial path strings into one valid path, and more. The Path class doesn't work in conjunction with the operating system and should be simply viewed as a highly specialized string manipulation class.

The members of the Path class never interact with the file system to verify the correctness of a filename. Even though you can combine two strings to get a valid directory name, that would not be sufficient to actually create that new directory. On the other hand, the members of the Path class are smart enough to throw an exception if they detect that a path string contains invalid characters. Table 8 lists the methods of the Path class.
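
As a rough illustration of these string-only operations, consider the sketch below. The reports folder and the summary.txt file are hypothetical and are never created; BackupName is a helper introduced here for the example.

```csharp
using System;
using System.IO;

public static class PathDemo
{
    // Swaps the extension of a path for ".bak", leaving the rest intact.
    public static string BackupName(string path)
    {
        return Path.ChangeExtension(path, ".bak");
    }

    public static void Main()
    {
        // Combine partial paths into one path string
        string dir = Path.Combine("reports", "2003");
        string file = Path.Combine(dir, "summary.txt");

        Console.WriteLine(Path.GetFileName(file));                  // summary.txt
        Console.WriteLine(Path.GetExtension(file));                 // .txt
        Console.WriteLine(Path.GetFileNameWithoutExtension(file));  // summary
        Console.WriteLine(Path.HasExtension(file));                 // True
        Console.WriteLine(BackupName(file));

        // Combining strings does not create anything on disk.
        Console.WriteLine(Directory.Exists(dir));
    }
}
```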

It's interesting to notice that any call to GetTempFileName promptly results in creating a zero-length file on disk, specifically in the system's temporary folder (for example, C:\Windows\Temp). This is the only case in which the Path class happens to interact with the operating system.
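
You can verify this side effect with a few lines of code. The sketch below uses a hypothetical helper, CreateAndMeasure, which also deletes the temporary file when it is done.

```csharp
using System;
using System.IO;

public static class TempFileDemo
{
    // Asks for a temporary filename, confirms the file was really created,
    // and returns its length before removing it.
    public static long CreateAndMeasure()
    {
        string tempFile = Path.GetTempFileName();
        FileInfo fi = new FileInfo(tempFile);
        if (!fi.Exists)
            throw new FileNotFoundException("Temporary file was not created.", tempFile);

        long length = fi.Length;
        fi.Delete();
        return length;
    }

    public static void Main()
    {
        Console.WriteLine(Path.GetTempPath());
        Console.WriteLine(CreateAndMeasure());   // prints 0
    }
}
```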

I/O with Files

In the .NET Framework, the atomic element to read from, or write to, is the stream. A stream abstracts the contents of a variety of potential data stores, including local and network disk files, memory, and databases. You can read or write a Stream object using a couple of tailor-made tools: the reader and the writer.

A reader reads one chunk of information at a time. The structure of the data read depends on the particular reader and the underlying stream. For example, a text reader reads rows of text, recognizing the carriage return/linefeed pair as the separator between chunks. Likewise, a binary reader processes every single byte in the stream, while an XML reader moves from one node to the next. The reader operates in a read-only, forward-only way. You can't move back to an already processed, or skipped, chunk of data; nor can you edit the current data the pointer references.

In the .NET Framework, you will find quite a few specialized readers, including TextReader, BinaryReader, XmlReader, and database-specific readers such as SqlDataReader and OracleDataReader. Although all of these reader classes have a common subset of functions, and an overall similar way of working, they don't derive from the same base class. Reader classes work on top of streams. Depending on the implementation of each individual class, the stream may be passed explicitly as a constructor argument or through its file name or URL.

The Stream class supports three basic operations: reading, writing, and seeking. Reading and writing operations entail transferring data from a stream into a data structure and vice versa. Seeking consists of querying and modifying the current position within the stream of data.
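
The three operations can be sketched on a MemoryStream, which needs no file on disk. The helper WriteSeekRead is a hypothetical name introduced for the example.

```csharp
using System;
using System.IO;

public static class StreamBasics
{
    // Writes data to a stream, moves to the given offset, and reads the rest back.
    public static byte[] WriteSeekRead(byte[] data, int offset)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // Writing: transfer bytes from an array into the stream
            ms.Write(data, 0, data.Length);

            // Seeking: move the current position to the requested offset
            ms.Seek(offset, SeekOrigin.Begin);

            // Reading: transfer the remaining bytes from the stream into an array
            byte[] rest = new byte[data.Length - offset];
            int total = 0;
            while (total < rest.Length)
            {
                int read = ms.Read(rest, total, rest.Length - total);
                if (read == 0)
                    break;      // end of stream
                total += read;
            }
            return rest;
        }
    }

    public static void Main()
    {
        byte[] result = WriteSeekRead(new byte[] { 1, 2, 3, 4, 5 }, 2);
        Console.WriteLine(BitConverter.ToString(result));   // prints "03-04-05"
    }
}
```

The read loop is there because Stream.Read is allowed to return fewer bytes than requested, which matters for streams other than MemoryStream.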

The .NET Framework provides a number of predefined Stream classes including FileStream, MemoryStream, and the fairly interesting CryptoStream, which automatically encrypts and decrypts data as you write or read. Each different storage implements its own stream by deriving from the base Stream class. The StreamReader class is a generic reader class for any type of stream. Finally, the StringReader class lets you read a string of text using the same programming interface as readers that operate on data stores.

You transform the contents of a file into a stream using the FileStream class. The following code shows how to open a file that you want to read:

FileStream fs = new FileStream(filename, 
FileMode.Open, FileAccess.Read);

Streams supply a rather low-level programming interface which, although functionally effective, is not always apt for classes that need to perform more high-level operations such as reading the whole content of a file or a single line.

To manipulate the contents of a file as a binary stream, you just pass the FileStream object down to a specialized reader object that knows how to handle it.

BinaryReader bin = new BinaryReader(fs);

If you want to process the file's contents in a text-based way, then you can use the StreamReader class, as shown below.

StreamReader reader;
reader = new StreamReader(fileName);
reader.BaseStream.Seek(0, SeekOrigin.Begin);
string text = reader.ReadToEnd();
reader.Close();

To write files, you often use the StreamWriter class and access its underlying stream, which can also be an encrypted stream. The following code snippet shows how to create a file.

StreamWriter writer = new StreamWriter(file);
writer.WriteLine(text);
writer.Close();
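
A variation on the two snippets above wraps the writer and the reader in using blocks, so that the file is closed even if an exception occurs. The class name, the helper methods, and the lines.txt file are hypothetical.

```csharp
using System;
using System.IO;

public static class TextRoundTrip
{
    // Creates (or overwrites) a text file with one entry per line.
    public static void WriteLines(string path, string[] lines)
    {
        using (StreamWriter writer = new StreamWriter(path))
        {
            foreach (string line in lines)
                writer.WriteLine(line);
        }
    }

    // Reads the file one line at a time and counts the lines.
    public static int CountLines(string path)
    {
        int count = 0;
        using (StreamReader reader = new StreamReader(path))
        {
            while (reader.ReadLine() != null)
                count++;
        }
        return count;
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "lines.txt");
        WriteLines(path, new string[] { "first", "second", "third" });
        Console.WriteLine(CountLines(path));   // prints 3
        File.Delete(path);
    }
}
```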

Creating binary files that contain images or raw data follows the same guidelines. You just use BinaryWriter as the writer object (or BinaryReader for reading) along with its ad hoc set of methods.

Every reader class has a writer counterpart. So you have a StreamWriter class acting as a generic writer for streams and more specific classes such as TextWriter, XmlWriter, BinaryWriter, and StringWriter. Curiously, the .NET Framework does not have a sort of SqlDataWriter class which would configure a server cursor. Server cursors are not supported as of version 1.1 of the .NET Framework.
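
As an illustration, the sketch below writes three typed values to a binary file and reads them back in the same order. The record layout (an ID, a price, and a name), the record.bin file, and the class name are all hypothetical.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class BinaryRoundTrip
{
    // Writes an int, a double, and a string to a binary file.
    public static void Save(string path, int id, double price, string name)
    {
        using (FileStream fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        using (BinaryWriter writer = new BinaryWriter(fs))
        {
            writer.Write(id);
            writer.Write(price);
            writer.Write(name);
        }
    }

    // Reads the values back; the order of the reads must match the order of the writes.
    public static string Load(string path)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (BinaryReader reader = new BinaryReader(fs))
        {
            int id = reader.ReadInt32();
            double price = reader.ReadDouble();
            string name = reader.ReadString();
            return id + "|" + price.ToString(CultureInfo.InvariantCulture) + "|" + name;
        }
    }

    public static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "record.bin");
        Save(path, 42, 9.5, "widget");
        Console.WriteLine(Load(path));   // prints "42|9.5|widget"
        File.Delete(path);
    }
}
```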

Summary

Although the substance of the underlying file system did not change with .NET, the way in which you work with its constituent elements (files and directories) changed quite a bit.

The introduction of streams as programmable objects is a key step in the sense that it unifies the API necessary to perform similar operations on conceptually similar storage media. Another key enhancement is the introduction of reader and writer objects. They provide a kind of logical API by means of which you read and write any piece of information in nearly identical ways. The .NET Framework also provides a lot of facilities to perform the basic management operations with files and directories, including path functions and common-use methods. In just one slogan: with .NET, working with the file system is easier and more effective. Just do it.™

Leonardo Esposito

dinoesp@hotmail.com

Dino Esposito is a mentor at Solid Quality Mentors where he manages the ASP.NET, workflow, and AJAX courseware. A speaker at many industry events including Microsoft TechEd, Basta, DevWeek, and DevConnections, Dino is the author of the two volumes of Programming Microsoft ASP.NET 2.0 Applications, for Microsoft Press. You can find late breaking news at http://weblogs.asp.net/despos.

leesposi@libero.it

Fast Facts

The .NET file system object model supplies three groups of related functions: information about files and directories, ad hoc methods for manipulating paths, and tools to create and manage files of any type. The ability to manage files comes from the System.IO namespace.


File Attributes Description

The FileAttributes enumeration type, and specifically its ToString method, has an extremely handy feature. When you call ToString, the type returns a string with a description of the attributes. The returned text consists of a comma-separated string in which each attribute is automatically translated into its name. For example, if you call the method on the attributes of a read-only and hidden file, the output that you get is "ReadOnly, Hidden".
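
A two-line experiment confirms this. The class and method names in the sketch are hypothetical; no file is needed because the example works directly on an attribute value.

```csharp
using System;
using System.IO;

public static class AttributeText
{
    // Returns the comma-separated names of the flags that are set.
    public static string Describe(FileAttributes attrs)
    {
        return attrs.ToString();
    }

    public static void Main()
    {
        FileAttributes attrs = FileAttributes.ReadOnly | FileAttributes.Hidden;
        Console.WriteLine(Describe(attrs));   // prints "ReadOnly, Hidden"
    }
}
```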


Names of Paths in C#

Often paths contain the backslash (\) character, which has a special meaning in C-based languages such as C#. The typical workaround, in use for years, consists of using a double backslash (\\). However, C# provides a more elegant workaround: prefixing the string with the @ symbol. The @ character tells the C# compiler to consider the following string as literal text and process it verbatim.
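
The two notations produce exactly the same string, as this small sketch shows. The path c:\temp\mydoc.txt and the class name are hypothetical.

```csharp
using System;

public static class VerbatimPaths
{
    // Compares an escaped path literal with its verbatim equivalent.
    public static bool SameString()
    {
        string escaped = "c:\\temp\\mydoc.txt";   // each backslash is doubled
        string verbatim = @"c:\temp\mydoc.txt";   // backslashes are taken literally
        return escaped == verbatim;
    }

    public static void Main()
    {
        Console.WriteLine(SameString());   // prints "True"
    }
}
```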



Table 1: Methods exposed by the File class.
Method Name        Description
AppendText         Creates and returns a stream object for the specified file. The stream allows you to append UTF-8 encoded text.
Copy               Copies an existing file to a new file. The destination cannot be a directory name or an existing file.
Create             Creates a new file.
CreateText         Creates a new file (or opens one if a file already exists) for writing UTF-8 text.
Delete             Deletes the file specified.
Exists             Determines whether the specified file exists.
GetAttributes      Gets the attributes of the file.
GetCreationTime    Returns the creation date and time of the specified file.
GetLastAccessTime  Returns the last access date and time for the specified file.
GetLastWriteTime   Returns the last write date and time for the specified file.
Move               Moves a specified file to a new location. Also provides the option to specify a new filename.
Open               Opens a file on the specified path.
OpenRead           Opens an existing file for reading.
OpenText           Opens an existing UTF-8 encoded text file for reading.
OpenWrite          Opens an existing file for writing.
SetAttributes      Sets the specified attributes for the given file.
SetCreationTime    Sets the date and time the file was created.
SetLastAccessTime  Sets the date and time the specified file was last accessed.
SetLastWriteTime   Sets the date and time that the specified file was last written.


Table 2: Properties of the FileInfo class.
Property Name   Description
Attributes      Gets or sets the attributes of the current file.
CreationTime    Gets or sets the time when the current file was created.
Directory       Returns a DirectoryInfo object representing the parent directory.
DirectoryName   Gets a string representing the directory's full path.
Exists          Indicates whether a file with the current name exists.
Extension       Gets the string representing the extension of the filename, including the period (.).
FullName        Returns the full path of the current file.
LastAccessTime  Gets or sets the time when the current file was last accessed.
LastWriteTime   Gets or sets the time when the current file was last written.
Length          Returns the size in bytes of the current file.
Name            Returns the name of the file.


Table 3: Methods of the FileInfo class.
Method      Description
AppendText  Creates and returns a stream object for the current file. The stream allows you to append UTF-8 encoded text.
CopyTo      Copies the current file to a new file.
Create      Creates a file. It's a simple wrapper for the File.Create method.
CreateText  Creates a file and returns a StreamWriter object to write text.
Delete      Permanently deletes the current file. Fails if the file is open.
MoveTo      Moves the current file to a new location, providing the option to specify a new filename.
Open        Opens the file with various read/write and sharing privileges.
OpenRead    Creates and returns a read-only stream for the file.
OpenText    Creates and returns a StreamReader object to read text from the file.
OpenWrite   Creates and returns a write-only Stream object that you can use to write to the file.
Refresh     Refreshes the information that the class holds about the file.
ToString    Returns a string that represents the fully qualified path of the file.


Table 4: The FileAttributes enumeration.
Attribute          Description
Archive            Indicates that the file is an archive.
Compressed         The file is compressed.
Device             Not currently used. Reserved for future use.
Directory          The file is a directory.
Encrypted          The file or directory is encrypted. For a file, this means that all data in the file is encrypted. For a directory, this means that encryption is the default for newly created files and directories but not necessarily that all current files are encrypted.
Hidden             The file is hidden and doesn't show up in directory listings.
Normal             The file has no other attributes set. Note that this attribute is valid only if used alone.
NotContentIndexed  The file should not be indexed by the system indexing service.
Offline            The file is offline and its data is not immediately available.
ReadOnly           The file is read-only.
ReparsePoint       The file contains a reparse point, which is a block of user-defined data associated with a file or a directory. Requires an NTFS file system.
SparseFile         The file is a sparse file. Sparse files are typically large files whose data are mostly zeros. Requires an NTFS file system.
System             The file is a system file, part of the operating system or used exclusively by the operating system.
Temporary          The file is temporary and can be deleted by the application any time soon.


Table 5: Methods on the Directory class.
Method Name           Description
CreateDirectory       Creates all the directories and subdirectories in the specified path that don't already exist.
Delete                Deletes a directory and, optionally, all of its contents.
Exists                Determines whether the given directory exists.
GetCreationTime       Gets the creation date and time of the specified directory.
GetCurrentDirectory   Gets the current working directory of the application.
GetDirectories        Returns an array of strings filled with the names of the child subdirectories of the specified directory.
GetDirectoryRoot      Gets volume and root information for the specified path.
GetFiles              Returns the names of files in the specified directory.
GetFileSystemEntries  Returns an array of strings filled with the names of all files and subdirectories contained in the specified directory.
GetLastAccessTime     Returns the date and time the specified directory was last accessed.
GetLastWriteTime      Returns the date and time the specified directory was last written.
GetLogicalDrives      Returns an array of strings filled with the names of the logical drives found on the computer. Strings have the form "<drive letter>:\".
GetParent             Retrieves the parent directory of the specified path. The directory is returned as a DirectoryInfo object.
Move                  Moves a directory and its contents to a new location. An exception is thrown if you move the directory to another volume or if a directory with the same name exists.
SetCreationTime       Sets the creation date and time for the specified directory.
SetCurrentDirectory   Sets the application's current working directory.
SetLastAccessTime     Sets the date and time the specified file or directory was last accessed.
SetLastWriteTime      Sets the date and time a directory was last written to.


Table 6: Properties of the DirectoryInfo class.
Property Name   Description
Attributes      Gets or sets the attributes of the current directory.
CreationTime    Gets or sets the creation time of the current directory.
Exists          Determines whether the directory exists.
Extension       Returns the extension (if any) in the directory name.
FullName        Returns the full path of the directory.
LastAccessTime  Gets or sets the time when the current directory was last accessed.
LastWriteTime   Gets or sets the time when the current directory was last written.
Name            Returns the name of the directory bound to this object.
Parent          Returns the parent of the directory bound to this object.
Root            Returns the root portion of the directory path.


Table 7: Methods of the DirectoryInfo class.
Method Name         Description
Create              Creates a directory. It's a simple wrapper for the Directory.CreateDirectory method.
CreateSubdirectory  Creates a subdirectory on the specified path. The path can be relative to this instance of the DirectoryInfo class.
Delete              Deletes the directory.
GetDirectories      Returns an array of DirectoryInfo objects, each pointing to a subdirectory of the current directory.
GetFiles            Returns an array of FileInfo objects, each pointing to a file contained in the current directory.
GetFileSystemInfos  Retrieves an array of FileSystemInfo objects representing all the files and subdirectories in the current directory.
MoveTo              Moves a directory and all of its contents to a new path.
Refresh             Refreshes the state of the DirectoryInfo object.


Table 8: Methods of the Path class.
Method Name                  Description
ChangeExtension              Changes the extension of the specified path string.
Combine                      Concatenates two path strings together.
GetDirectoryName             Extracts and returns the directory information for the specified path string.
GetExtension                 Returns the extension of the specified path string.
GetFileName                  Returns filename and extension of the specified path string.
GetFileNameWithoutExtension  Returns the filename of the specified path string without the extension.
GetFullPath                  Returns the absolute path for the specified path string.
GetPathRoot                  Returns the root directory for the specified path.
GetTempFileName              Returns a unique temporary filename and creates a zero-byte file by that name on disk.
GetTempPath                  Returns the path of the temporary folder.
HasExtension                 Determines whether the specified path string includes an extension.
IsPathRooted                 Returns a value that indicates whether the specified path string contains an absolute path.


Listing 1: A directory listing utility
using System;
using System.IO;

public class TextDirs
{
  public static void Main(string[] args)
  {
    // Directory to list; defaults to the root of drive C
    string dirName = @"c:\";
    if (args.Length > 0)
      dirName = args[0];

    DirectoryInfo di = new DirectoryInfo(dirName);
    foreach (FileSystemInfo fsi in 
                          di.GetFileSystemInfos("*.*"))
    {
      string text = "";

      // Creation time
      text += fsi.CreationTime.ToString() + '\t';

      // Type and size
      if (fsi is DirectoryInfo)
      {
        text += "<DIR>" + '\t';
        text += "     " + '\t';
      }
      else
      {
        text += "     " + '\t';
        FileInfo fi = (FileInfo) fsi;
        text += fi.Length.ToString() + '\t';
      }

      // Name
      text += fsi.Name;
      Console.WriteLine(text);
    }
  }
}