Each database system, such as SQL Server or Oracle, maintains a set of system tables that contain the metadata about the database objects contained therein. These system tables hold table and column names, relationship data, check constraints, and much more data. The SQL to retrieve this metadata is very different on each database server. For example, on SQL Server, you select all tables using the sys.tables table or the Information_Schema.Tables view. To retrieve all tables on Oracle, you use the all_tables table. In PostgreSQL, you can use the Information_Schema.Tables view. The problem is that if you want code that's consistent between different databases, you must write different SQL for each database system, then create an API for developers to call.

In this article, you'll learn how to use the GetSchema() method on the DbConnection class to retrieve tables, views, columns, index, stored procedures, and more from any database system. This method is implemented by each data provider to retrieve schema information in a generic fashion. What do you do with this information? You can present column names to your user to let them select columns to filter on for a report. You can use it to build your own code generator. You can even use it to create a SQL comparison tool. Follow along with this article to see how easy it is to use GetShema() to accomplish these various tasks.

Introducing the GetSchema() Method

GetSchema() is a virtual method on the DbConnection class in the .NET Framework. This method is overridden by the SQL Server, Oracle, ODBC, and OLE DB, and other data providers. Each provider uses the appropriate metadata tables in their respective servers to retrieve the metadata for each call you make to the GetSchema() method. The following list is an example of some of the metadata that you may be able to return from your database server. How many of these items you can retrieve is dependent on the data provider you're using.

  • Columns
  • Databases
  • Data types
  • Foreign keys
  • Index columns
  • Indexes
  • Procedure parameters
  • Procedures
  • Reserved words
  • Tables
  • Users
  • User-defined types
  • Views
  • View columns

To use the GetSchema() method, create an instance of the SqlConnection or OracleConnection, or your own provider's implementation of the DbConnection class, and pass a valid connection string to the constructor. Open the connection and invoke the GetSchema() method passing in a string of the collection you wish to return, as shown in the following code snippet.

using SqlConnection cn = new("CONNECTION STRING");

cn.Open();

DataTable dt = cn.GetSchema("COLLECTION NAME");

The “COLLECTION NAME” you pass to the GetSchema() determines which data you get back. For example, if you pass “Tables”, you get the list of tables within the database specified in your connection string. If you pass “Columns”, you get a list of all the columns within the database specified in your connection string.

The Same but Different

Although you can retrieve columns, tables, indexes, etc., from each provider, be aware that they don't always return the same column names for each call. For example, when you request all tables from SQL Server, GetSchema() returns the columns TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, and TABLE_TYPE to describe the tables. In Oracle, the columns OWNER, TABLE_NAME, and TYPE are used to describe the tables in the database. The ODBC and OLE DB providers provide different column names. Not all providers return the same data. For a complete list of the SQL Server, Oracle, ODBC, and OLE DB, check out the Microsoft documentation at https://bit.ly/3V3CMZv.

I wish Microsoft had provided a specification for what was returned by each call, as that would have made it easier if you need to target different database systems at your work. I guess Microsoft thinks most developers typically only work with a single system. For those of us who must work with multiple database systems, I recommend mapping the data returned from each provider to your own schema classes. You can then create your own collection of schema classes for each type of metadata you need to retrieve. I'll show you how to do this later in this article.

Adding Restrictions

Most of the calls to GetSchema() support a second parameter called restrictions. The restrictions are a string array with anywhere from one to five elements. For example, let's say you wish to restrict the columns to retrieve from a single table, declare a four-element array, and fill in the second and third elements with the schema and table name, as shown in the following code.

string?[] restrictions = new string?[4];
restrictions[1] = "SalesLT";
restrictions[2] = "Product";
DataTable dt = cn.GetSchema("Columns", restrictions);

Each collection has its own restrictions array, but most of the arrays use the catalog (database) name as the first element, the owner (schema) as the second element, and a database object name as the third element. You should note that there's no way to specify a wildcard search using the GetSchema() method. This means you can't have it return all tables that start with “Prod”, for example.

Get All Tables and Views

If you want to follow along with this article, create a new console application in VS 2022 or in VS Code named SchemaInfo. I'm using .NET 6/7 for these samples, but the GetSchema() method is also available in the later versions of the .NET Framework, so the code should work with some minor changes. After creating a console application, go into the NuGet Package Manager and add the System.Data.SqlClient package if you're using SQL Server as I'm going to be using in this article, or select the data provider for your database server. Open the Program.cs file and replace the contents with the code shown in Listing 1.

Listing 1: Display all tables in a database using the GetSchema() method

using System.Data;
using System.Data.SqlClient;

string conn = "Data Source=Localhost;
  Initial Catalog=AdventureWorksLT;
  Integrated Security=True;";

using SqlConnection cn = new(conn);

cn.Open();

// Get All Tables in the Database
DataTable dt = cn.GetSchema("Tables");

// Display Column Names
string format = "{0,-20}{1,-10}{2,-35}{3,-15}";
Console.WriteLine(format, "Catalog", "Schema", "Name", "Type");

// Display Data
foreach (DataRow row in dt.Rows) {
  Console.WriteLine(format,
    row["TABLE_CATALOG"], row["TABLE_SCHEMA"],
    row["TABLE_NAME"], row["TABLE_TYPE"]);
}

Add two using statements to bring in the namespaces for System.Data and System.Data.SqlClient. The GetSchema() method returns an instance of the DataTable class that's contained in the System.Data namespace. The SqlConnection class is contained within the System.Data.SqlClient namespace. Create a string variable named conn and set the connection string to one that references one of your SQL Server (or your specific database server) instances and a database within that instance. I'm using the AdventureWorksLT sample database located on my local computer.

Next, create an instance of a SqlConnection class passing in the connection string within a using block. Open the SqlConnection object and call the GetSchema() method passing in the string “Tables”. The GetSchema() method goes out to the database and uses system tables and/or views to retrieve the list of tables in that database.

Each .NET data provider calls different system tables/views to retrieve the list of tables. In SQL Server, for instance, the INFORMATION_SCHEMA.TABLES view is used to retrieve the list of tables. If you're using Oracle, ODBC, or OLE DB, those providers use different methods to retrieve the list of tables. Be aware that each provider can return different column names and may not return the same number or names of columns.

When returning the list of tables from SQL Server, four columns are returned: the catalog/database, the owner/schema, the table, and the type of object (BASE_TABLE or VIEW). When you pass “Tables” to the GetSchema() method, it returns both tables and views because a view is really just a read-only table. If you want to retrieve views only, you can pass “Views” to the GetSchema() method. If you want to retrieve only tables, you need to pass a second parameter to the GetSchema() method, as you will see in the next section.

The last few lines of code in Listing 1 set up a format string with how much space to allow for each column in the DataTable object. Within curly braces, you specify the column number and how much space to allow. Use a minus sign (-) to specify that you want the data in the column to be left-aligned. Iterate over each row and display each column on the console. To figure out how many and which columns are returned from each call to GetSchema(), check the documentation online, or call each one and view the columns that are returned.

Try It Out

Create the console application, add the System.Data.SqlClient package (or the applicable package for your database server), and add the code shown in Listing 1 to the Program.cs file. Modify the connection string to point to your server and a database on that server. Run the application and you should see a list of tables displayed like Figure 1. The list of tables you get back will be different from mine as it will be based on which database you are connected to.

Figure 1: GetSchema() can return a list of all tables and views in a database.
Figure 1: GetSchema() can return a list of all tables and views in a database.

Get Tables within a Specific Schema

As mentioned previously, you can pass in an array of values to the second parameter of the GetSchema() method. The array you pass is different for each method but is always a string array with one to five elements in it. For most calls, the first element is the catalog/database name, and the second parameter is the owner/schema name. After that, the elements you pass are dependent on what you're calling. Leave an element null if you don't wish to restrict the return values based on that item. For example, if you don't care to restrict the list of tables by the owner/schema, set that value to null.

In Listing 2, you see an example of a restriction array that's used to filter the tables returned. There are four restrictions you can specify for tables: catalog/database, owner/schema, table name, and a string with either “BASE TABLE” or “VIEW” within it. As I mentioned previously, the one thing missing from this interface is the ability to use a wildcard to retrieve all tables that start with “CU”, for example. After creating the four element array, pass it as the second argument to the GetSchema() method. In Listing 2, you only get those tables back that are in the SalesLT schema and are base tables, not views.

Listing 2: Display tables within a schema by using the restrictions array

using System.Data;
using System.Data.SqlClient;

string conn = "Data Source=Localhost;
  Initial Catalog=AdventureWorksLT;
  Integrated Security=True;";

using SqlConnection cn = new(conn);

cn.Open();

// restrictions[0] = Catalog/Database
// restrictions[1] = Owner/Schema
// restrictions[2] = Table Name
// restrictions[3] = Table Type
string?[] restrictions = new string?[4];
restrictions[0] = null;
restrictions[1] = "SalesLT";
restrictions[2] = null;
// Use 'BASE TABLE' or you get views too
restrictions[3] = "BASE TABLE";

// Get All Tables in the Database
DataTable dt = cn.GetSchema("Tables", restrictions);

// Display Column Names
string format = "{0,-20}{1,-10}{2,-35}{3,-15}";
Console.WriteLine(format, "Catalog", "Schema", "Name", "Type");

// Display Data
foreach (DataRow row in dt.Rows) {
  Console.WriteLine(format, 
    row["TABLE_CATALOG"], row["TABLE_SCHEMA"],
    row["TABLE_NAME"], row["TABLE_TYPE"]);
}

Try It Out

Add the restrictions array shown in Listing 2 to the code you wrote in the Program.cs and run the application to see that you now only return tables within the SalesLT schema. If you're using a different database from AdventureWorksLT, adjust the schema/owner name to what is applicable to your database.

Get Views Only

There are two ways to retrieve views within a database. You can set the fourth element to “VIEW” in Listing 2, or you can call GetSchema() method using “Views” as the first argument, as shown in Listing 3. Be aware that the columns returned when you use “Views” as the first argument are different than when you use “Tables”. You still get the catalog/database, schema, and view name, but you also get CHECK_OPTION and IS_UPDATABLE columns in SQL Server. In Oracle, you get many other columns as well, and you also get the SQL text for the view. That's something I wish Microsoft had included in this call to GetSchema() for SQL Server.

Listing 3: Pass “Views” to the GetSchema() method to retrieve all views in the database.

using System.Data;
using System.Data.SqlClient;

string conn = "Data Source=Localhost;
  Initial Catalog=AdventureWorksLT;
  Integrated Security=True;";

using SqlConnection cn = new(conn);

cn.Open();

// Get All Views in the Database
DataTable dt = cn.GetSchema("Views");

// Display Column Names
string format = "{0,-20}{1,-10}{2,-35}{3,-15}";
Console.WriteLine(format, "Catalog", "Schema",
  "Name", "Check Option");

// Display Data
foreach (DataRow row in dt.Rows) {
  Console.WriteLine(format,
    row["TABLE_CATALOG"], row["TABLE_SCHEMA"],
    row["TABLE_NAME"], row["CHECK_OPTION"]);
}

Try It Out

Change the code you wrote in the Program.cs to match the code shown in Listing 3 and run the application to see just the views from your database.

Get Columns

When you request column information, 21 discreet pieces of information about the column are returned when using SQL Server, but the Oracle provider only returns nine pieces of information. The ODBC and OLE DB providers return 21 and 33 data points for a column, respectively. The data points in common among all providers are the catalog/database, owner/schema, column name, data type, length, precision, scale, and nullable. Listing 4 shows the code you write to retrieve column data from a SQL Server.

Listing 4: Retrieve column data using GetSchema() to retrieve column names and data types.

using System.Data;
using System.Data.SqlClient;

string conn = "Data Source=Localhost;
  Initial Catalog=AdventureWorksLT;
  Integrated Security=True;";

using SqlConnection cn = new(conn);

cn.Open();

// Get All Columns in the Database
DataTable dt = cn.GetSchema("Columns");

string format = "{0,-20}{1,-30}{2,-20}{3,-20}";

// Display Column Names
Console.WriteLine(format, "Table Schema",
  "Table Name", "Column Name", "Data Type");

// Display Data
foreach (DataRow row in dt.Rows) {
  Console.WriteLine(format, row["TABLE_SCHEMA"],
    row["TABLE_NAME"], row["COLUMN_NAME"],
    row["DATA_TYPE"]);
}

Try It Out

Change the code you wrote in the Program.cs to match the code shown in Listing 4 and run the application to see all the columns in your database.

Get Columns for a Single Table

The code in Listing 5 shows the restrictions you use to filter the column data. The first element in the restrictions array is the catalog/database name. The second element is the schema name. The third element is the table name. In the code, I'm only setting the schema and table name to just list those columns that make up that one table. The fourth element can be a specific column name if you only need the data for a single column.

Listing 5: The column restrictions include catalog, owner, table and column name.

using System.Data;
using System.Data.SqlClient;

string conn = "Data Source=Localhost;
  Initial Catalog=AdventureWorksLT;
  Integrated Security=True;";

using SqlConnection cn = new(conn);

cn.Open();

// restrictions[0] = Catalog/Database
// restrictions[1] = Owner/Schema
// restrictions[2] = Table/View Name
// restrictions[3] = Column Name
string?[] restrictions = new string?[4];
restrictions[0] = null;
restrictions[1] = "SalesLT";
restrictions[2] = "Product";
restrictions[3] = null;

// Get Columns for a Specific Table
DataTable dt = cn.GetSchema("Columns", restrictions);

string format = "{0,-20}{1,-30}{2,-20}{3,-20}";

// Display Column Names
Console.WriteLine(format, "Table Schema",
  "Table Name", "Column Name", "Data Type");

// Display Data
foreach (DataRow row in dt.Rows) {
  Console.WriteLine(format, row["TABLE_SCHEMA"],
    row["TABLE_NAME"], row["COLUMN_NAME"],
    row["DATA_TYPE"]);
}

Try It Out

Change the code you wrote in the Program.cs to match the code shown in Listing 5 and run the application to see the columns in the Product table. If you're using a different database, just substitute a different schema and table name into the second and third elements of the restrictions array.

Get Stored Procedures and Functions

The call to GetSchema() to return stored procedures is similar to the call to return tables. It returns stored procedures and functions just like the tables call returns both tables and views. The code in Listing 6 is just like the previous functions, except you pass “Procedures” to the GetSchema() method. The columns returned include the catalog/database, owner/schema, routine name, routine type (procedure or function), the date created, and the date the routine was last altered. I'm only displaying a few of these columns in the code provided in Listing 6.

Listing 6: The call to get procedures returns both stored procedures and functions.

using System.Data;
using System.Data.SqlClient;

string conn = "Data Source=Localhost;
  Initial Catalog=AdventureWorksLT;
  Integrated Security=True;";

using SqlConnection cn = new(conn);

cn.Open();

// Get the Stored Procedures and
// Functions in the Database
DataTable dt = cn.GetSchema("Procedures");

// Display Column Names
string format = "{0,-12}{1,-40}{2,-12}{3,-30}";
Console.WriteLine(format, "Schema",
  "Procedure Name", "Type", "Created");

// Display Data
foreach (DataRow row in dt.Rows) {
  Console.WriteLine(format,
    row["ROUTINE_SCHEMA"], row["ROUTINE_NAME"],
    row["ROUTINE_TYPE"], row["CREATED"]);
}

Try It Out

Change the code you wrote in the Program.cs to match the code shown in Listing 6 and run the application to see all the stored procedures and functions in your database.

Get Stored Procedures Only

The problem with the above code is that it gives you both stored procedures and functions. You need to pass the second parameter to this call and specify “PROCEDURE” or “FUNCTION” as the fourth element of the restrictions array. Add the code shown below prior to calling the GetSchema() method, and add this restrictions array as the second parameter in the call to GetSchema().

string?[] restrictions = new string?[4];
restrictions[0] = null;
restrictions[1] = null;
restrictions[2] = null;
restrictions[3] = "PROCEDURE";
// Get Stored Procedures Only
DataTable dt = cn.GetSchema("Procedures", restrictions);  

Try It Out

Add the code shown above into the appropriate location in the Program.cs file and run the application to see just the stored procedures within your database.

Get a List of Databases on SQL Server

In SQL Server, there's the concept of a database within a server. In Oracle, the database is the instance of the Oracle server you connect to. So, the only database provider you can pass the string “Databases” to the GetSchema() method is the SQL Client provider. Listing 7 shows the code to get the list of databases within a SQL Server instance. Notice that the connection string does not require the Initial Catalog key/value pair; you just need to connect to the server with valid credentials to retrieve the list of databases. Only three columns are returns from this call: database_name, dbid, and create_date.

Listing 7: Only when using SQL Server can you get the complete list of databases from your server instance.

using System.Data;
using System.Data.SqlClient;

string conn = "Data Source=Localhost;
  Integrated Security=True;";

using SqlConnection cn = new(conn);

cn.Open();

// Get the Databases in SQL Server
DataTable dt = cn.GetSchema("Databases");

// Display Column Names
string format = "{0,-25}{1,-10}{2,-25}";
Console.WriteLine(format, "Database Name", "DB ID", "Created");

// Display Data
foreach (DataRow row in dt.Rows) {
  Console.WriteLine(format,
    row["database_name"], row["dbid"],
    row["create_date"]);
}

Try It Out

If you're using SQL Server, modify the code you wrote in the Program.cs to match the code shown in Listing 7 and run the application to see all the databases in your SQL Server instance.

Get Users

When using SQL Server or Oracle, you can retrieve the list of users by passing “Users” to the GetSchema() method. On each system, you get the user name, the user ID, and the creation date. On SQL Server, you also get the last time the user information was modified. The code in Listing 8 shows you how to retrieve all users within the database AdventureWorksLT. If you eliminate the Initial Catalog key/value pair, you can retrieve all users in the SQL Server instance.

Listing 8: You can retrieve user data on both SQL Server and Oracle.

using System.Data;
using System.Data.SqlClient;

string conn = "Data Source=Localhost;
  Initial Catalog=AdventureWorksLT;
  Integrated Security=True;";

using SqlConnection cn = new(conn);

cn.Open();

// Get the Users
DataTable dt = cn.GetSchema("Users");

// Display Column Names
string format = "{0,-25}{1,-10}{2,-25}";
Console.WriteLine(format, "User Name", "User Id", "Created");

// Display Data
foreach (DataRow row in dt.Rows) {
  Console.WriteLine(format,
    row["user_name"], row["uid"],
    row["createdate"]);
}

Try It Out

Change the code you wrote in the Program.cs to match the code shown in Listing 8 and run the application to see all the users in your database. Try running the code with and without the Initial Catalog to see the differences.

Get Metadata about GetSchema() Method

There are a lot more objects you can retrieve from the database other than tables, views, procedures, and columns. But I think you see the pattern on how to retrieve the other items I listed at the beginning of this article. Let's now look at getting information about the GetSchema() method itself.

Because you're passing a string as the first parameter to the GetSchema() method, you may be wondering how I knew what string value to pass to get each collection. It turns out that if you call GetSchema() with no parameters (or with the parameter “MetaDataCollections”), it returns a list of all of the collection names you may use, how many restrictions there are for each collection, and how many identifier parts are used.

In the code shown in Listing 9, I'm using the SQL Server provider, so the list of collections that are returned (shown in Figure 2) are different if you are using an Oracle, ODBC, or OLE DB provider.

Listing 9: Call GetSchema() with no parameters to retrieve information about each collection name you can retrieve.

using System.Data;
using System.Data.SqlClient;

string conn = "Data Source=Localhost;
  Integrated Security=True;";

using SqlConnection cn = new(conn);

cn.Open();

// Get the Metadata Collections
DataTable dt = cn.GetSchema();
//DataTable dt = 
//  cn.GetSchema("MetaDataCollections");

// Display Column Names
string format = "{0,-25}{1,-20}{2,-20}";
Console.WriteLine(format, "Collection Name",
  "# of Restrictions", "# of Identifier Parts");

// Display Data
foreach (DataRow row in dt.Rows) {
  Console.WriteLine(format,
    row["CollectionName"],
    row["NumberOfRestrictions"],
    row["NumberOfIdentifierParts"]);
}

Try It Out

Change the code you wrote in the Program.cs to match the code shown in Listing 9 and run the application to see all the collections you can pass to the GetSchema() method. When running SQL Server, you should see a list shown in Figure 2.

Figure 2: A list of all the collections that GetSchema() can retrieve when using the SQL Server provider
Figure 2: A list of all the collections that GetSchema() can retrieve when using the SQL Server provider

Identifier Part

The last column shown in Figure 2 is called Identifier Part. An identifier part refers to how many composite parts are needed to fully qualify a database object. For example, in SQL Server, to identify a table, it takes three parts in the form of DatabaseName.SchemaName.TableName. For Oracle, it only takes two parts: OwnerName.TableName. So the numbers you see in Figure 2 may be different if you're running a provider other than SQL Server.

Get Metadata about Restrictions

Other questions you may have include how you know how many elements and what elements you can set in the restrictions array. Once again, you pass an argument to the GetSchema() method, the string “Restrictions”. Calling GetSchema() with this argument returns each collection name and any corresponding values you can set, as shown in Figure 3. Methods that are self-documenting like this are a great idea and I wish Microsoft would do this with more of their APIs.

Figure 3: Retrieve a list of all restrictions for each collection name using the GetSchema() method.
Figure 3: Retrieve a list of all restrictions for each collection name using the GetSchema() method.

In the code shown in Listing 10, you get five columns returned, as shown in Figure 3. The first column is the collection name, the second column is the restriction name that tells you what you would put into each element of the array. The third and fourth column aren't that important, but the fifth column tells you into which element number you place the data you want to filter upon.

Listing 10: The GetSchema() method tells you which restrictions are available for each collection.

using System.Data;
using System.Data.SqlClient;

string conn = "Data Source=Localhost;
  Integrated Security=True;";

using SqlConnection cn = new(conn);

cn.Open();

// Get All Restrictions
DataTable dt = cn.GetSchema("Restrictions");

// Display Column Names
string format = "{0,-25}{1,-20}
  {2,-20}{3,-25}{4,-10}";
Console.WriteLine(format, "Collection Name",
  "Restriction Name", "Parameter Name", 
  "Default", "Number");

// Display Data
foreach (DataRow row in dt.Rows) {
  Console.WriteLine(format,
    row["CollectionName"], 
    row["RestrictionName"],
    row["ParameterName"], 
    row["RestrictionDefault"],
    row["RestrictionNumber"]);
}

Try It Out

Change the code you wrote in the Program.cs to match the code shown in Listing 10 and run the application to see all the restrictions that are applicable for your data provider. Depending on what provider you're running, you might see different values from those shown in Figure 3.

SQL Compare Utility

Now that you've seen how to retrieve metadata about your databases such as tables and columns, what can you do with this information? As I mentioned at the beginning of the article, you can present a list of fields to filter upon for reports or build a code generator. Another idea is to build a tool that checks to see what tables, columns, indexes, etc. are missing between a development, a QA, and production databases.

Think about a typical software development lifecycle. As you develop your application, you make database changes to a development database. When you're ready to move your code to the QA team for them to check, you not only need to create a build of your application, but you also need to update the QA database so it matches your development database. Once your QA process is complete, you need to determine the differences between your QA database and the production database. There are a few tools on the market that you can purchase to accomplish this task. However, you have almost all the tools you need from the code shown in this article to create your own SQL compare utility. You just need a few classes and some LINQ queries to build this compare utility.

Create Schema Entity Classes

Because each data provider returns different columns in their DataTable objects, it's a good idea to create some classes to hold the table and column information. You could also create classes to hold index, procedure, function, and view information as well, but let's just build a couple so you can see the design pattern. You can then add additional classes as you need them.

Create a class named SchemaBase that holds common properties for different schema items, as shown in the following code snippet.

#nullable disable
namespace SchemaInfo;
public class SchemaBase
{
  public string Catalog { get; set; }
  public string Schema { get; set; }
  public string TableName { get; set; }
  public override string ToString()
  {
    return $"{Schema}.{TableName}";
  }
}

I'm not going to worry about initializing every non-nullable string, so I'm adding the #nullable disable directive at the top of the file. This class is going into the namespace SchemaInfo. Add three properties to hold the catalog/database: the owner/schema, and the table name. I like overriding the ToString() method to make looking at classes easier when in a breakpoint in Visual Studio.

Next, create a class named TableSchema that inherits from SchemaBase. The only other property you need for a table, other than the three in the SchemaBase class, is the TableType property. Put this class into the SchemaInfo namespace as well.

#nullable disable
namespace SchemaInfo;
public class TableSchema : SchemaBase
{
  public string TableType { get; set; }
}

Finally, create a class named ColumnSchema that also inherits from the SchemaBase class and adds three more properties: ColumnName, OrdinalPosition, and DataType. Once again, override the ToString() method to show information about the column when using the data viewer in Visual Studio or when passing an instance of this class to the Console.WriteLine() method.

#nullable disable
namespace SchemaInfo;
public class ColumnSchema : SchemaBase
{
  public string ColumnName { get; set; }
  public int OrdinalPosition { get; set; }
  public string DataType { get; set; }
  public override string ToString()
  {
    return $"{base.ToString()}
      .{ColumnName} ({OrdinalPosition})";
  }
}

Create Compare Helper Class

Create a static class named SqlServerCompareHelper in your console application. This class is going to have all static methods to perform the comparisons. Make the new class file look like the following code snippet.

#nullable disable
using System.Data;
namespace SchemaInfo;
public static class SqlServerCompareHelper
{
}

Add a method named GetData() to which you pass a connection string and the collection name to retrieve. This method creates a new connection and opens that connection to the database. It then calls the GetSchema() method, passing in the collection name. The DataTable returned from GetSchema() is returned from this method.

public static DataTable GetData(
  string conn, string name)
{
  using SqlConnection cn = new(conn);
  cn.Open();
  // Get the Meta Data
  return cn.GetSchema(name);
}

Map DataTable Columns to Schema Classes

Add a TableSchemaToList() method (Listing 11) to the SqlServerCompareHelper class. This method takes the DataTable from the call to the GetSchema("Tables") method and builds a generic list of TableSchema objects. Each iteration through the rows of the DataTable creates an instance of a TableSchema class and maps the columns from the DataTable to each property in the TableSchema class. If you're using a data provider other than SQL Server, modify the columns I'm using in the DataTable to the ones returned from the GetSchema() method of your provider.

Listing 11: Map the columns from the DataTable to create a list of TableSchema objects

private static List<TableSchema> 
  TableSchemaToList ( DataTable dt )
{
  List<TableSchema> ret = new();

  foreach (DataRow row in dt.Rows) {
    TableSchema entity = new()
    {
      Catalog = row["TABLE_CATALOG"].ToString(),
      Schema = row["TABLE_SCHEMA"].ToString(),
      TableName = row["TABLE_NAME"].ToString(),
      TableType = row["TABLE_TYPE"].ToString()
    };

    ret.Add(entity);
  }

  return ret;
}

Add a method named ColumnSchemaToList() (Listing 12) to the SqlServerCompareHelper class. In this method, you take the results from the call to the GetSchema("Columns") method and turn each row of the DataTable into a generic list of TableColumn objects. If you're using a data provider other than SQL Server, modify the column names in the DataTable. I'm using the ones returned from the GetSchema() method of your provider.

Listing 12: Map the columns from the DataTable to create a list of TableColumn objects

private   static  List<ColumnSchema> 
   ColumnSchemaToList ( DataTable dt )
{
  List<ColumnSchema> ret = new();

  foreach (DataRow row in dt.Rows) {
    ColumnSchema entity = new()
    {
       Catalog = row["TABLE_CATALOG"].ToString(),
       Schema = row["TABLE_SCHEMA"].ToString(),
       TableName = row["TABLE_NAME"].ToString(),
       ColumnName = row["COLUMN_NAME"].ToString(),

       OrdinalPosition = Convert.ToInt32(row["ORDINAL_POSITION"]),
       DataType = row["DATA_TYPE"].ToString(),
    };

    ret.Add(entity);
  }

  return ret;
}

Create Table Compare Method

Add a method named CompareTables() (Listing 13) to perform the comparison of the tables between two different databases on two different servers. In this method, you pass in two different connection strings. The first one is to the database with the changes made during development, the second one is to the database you wish to check to see if the changes have been made or not.

Listing 13: Use LINQ to compare two collections to determine what tables are missing from a database.

public static List<TableSchema> 
  CompareTables(string connSource, string connTarget)
{
  List<TableSchema> ret = new();
  List<TableSchema> sourceList;
  List<TableSchema> targetList;

  sourceList = TableSchemaToList(GetData(connSource, "Tables"));
  targetList = TableSchemaToList(GetData(connTarget, "Tables"));

  // Simulate missing tables
  targetList.RemoveRange(2, 3);

  // Use ExceptBy to locate differences 
  // between the two lists
  ret = sourceList.ExceptBy<TableSchema,object>(
    targetList.Select(row => new {row.Catalog, 
      row.Schema,row.TableName,row.TableType}),
    row => new {row.Catalog, row.Schema, 
      row.TableName, row.TableType}).ToList();

  return ret;
}

Pass in two different connection strings to this method. Create three generics lists of TableSchema classes. One is used as the return value, the other two hold the list of TableSchema classes returned after building the DataTables from each database by calling GetSchema("Tables") and turning those into the generic lists by calling the TableSchemaToList() method.

Instead of using two different servers and two different databases, I'm going to use the same database, but I'm going to simulate that the target database is missing a few items. I do this by removing a few items using the RemoveRange() method on the targetList variable. The data within the sourceList and the targetList collections are now different.

To retrieve the list of what items are missing from the target database, you can employ the LINQ ExceptBy() method. This method tells you which items are missing from one collection to another. The ExceptBy() method is a generic method so the first generic type you supply is the type of data to return, in this case that's TableSchema. The second generic type to pass is the key type you're going to be using to compare the data between the source and the target lists.

The ExceptBy() method needs two lists of data, so two arguments are passed to it. Create the first argument by selecting all rows from the targetList collection and turning each row into an anonymous object with all four properties. The second argument is a lambda expression created by turning each row from the sourceList into an anonymous object with four properties that match the first list. Remember that when using comparison methods in LINQ, unless you've implemented an EqualityComparer class for the TableSchema class, the method uses reference comparison to see if one object reference is equal to the other. That won't work, so you must create anonymous objects to compare all four properties from one collection to the other.

Create Column Compare Method

Now that you've seen how to perform comparisons using the LINQ ExceptBy() method, create other comparison methods to find the differences between all the objects between one database and another. In Listing 14, you see a method named CompareColumns() that compares all the columns from one database to another. I'm just going to write these two comparison methods for you. You now have the design pattern you can follow to perform comparisons between other database objects such as indexes, stored procedures, views, etc.

Listing 14: Use LINQ to compare two collections to determine what columns are missing from a database.

public static List<ColumnSchema> 
  CompareColumns(string connSource, string connTarget)
{
  List<ColumnSchema> ret = new();
  List<ColumnSchema> sourceList;
  List<ColumnSchema> targetList;

  sourceList = ColumnSchemaToList(GetData(connSource, "Columns"));
  targetList = ColumnSchemaToList(GetData(connTarget, "Columns"));

  // Simulate missing columns
  targetList.RemoveRange(1, 5);

  // Use ExceptBy to locate 
  // differences between the two lists
  ret = sourceList.ExceptBy<ColumnSchema, object>(
      targetList.Select(row => new {
          row.Catalog, row.Schema, 
          row.TableName, row.ColumnName,
          row.OrdinalPosition, row.DataType }),
      row => new { row.Catalog, row.Schema,
          row.TableName, row.ColumnName, 
          row.OrdinalPosition, row.DataType }).ToList();

  return ret;
}

Try It Out

Modify the Program.cs file to match the code shown below and run the application to see the simulated differences between tables from one database to another.

using SchemaInfo;

string conn = "Data Source=Localhost;
  Initial Catalog=AdventureWorksLT;
    Integrated Security=True;";

List<TableSchema> tables = 
    SqlServerCompareHelper.CompareTables(conn, conn);
foreach (TableSchema item in tables) {Console.WriteLine(item);}

Now, change the last few lines of code to retrieve the differences between columns and run the application to see the differences.

List<ColumnSchema> columns = 
    SqlServerCompareHelper.CompareColumns(conn, conn);
foreach (ColumnSchema item in columns) {Console.WriteLine(item);}

Query Collections Using T-SQL

Although the GetSchema() method is a very useful tool, it doesn't provide you with all of the metadata you might wish to retrieve. Almost all database servers provide you with much more information than is returned from GetSchema(). For example, in SQL Server, if you select all columns from the sys.tables system table, you get a ton of information about the table that GetSchema() doesn't return. For example, you can get the creation date, last modified date, the maximum IDENTITY value generated, and many more items. If you wish to retrieve check constraints, column privileges, or if you wish to join two or more of the system tables to retrieve a table and all of its related tables, there's no way to accomplish these tasks using the GetSchema() method. As I also mentioned previously, there's no way to use wildcards to retrieve all tables or columns that match a specific criterion.

Instead of using the GetSchema() method to return metadata, let's look at using T-SQL and ADO.NET to retrieve the same metadata, but solve some of the issues I just mentioned. Add a new class named TSqlHelper to your console application as shown in the code snippet below.

using System.Data;
using System.Data.SqlClient;
namespace TSqlSchemaInfo;
public static class TSqlHelper 
{
  public static void DisplayTables(string conn,
  string? schema, string? table) 
  {
  }
}

Get Tables

Fill in the DisplayTables() method with the code shown in Listing 15. This method accepts three parameters: a connection string, an optional schema name, and an optional table name. For this sample, I'm using the Information_Schame.Tables view, but I could have used the sys.tables system table as well. Check the schema parameter to see if it's null, and if it isn't, add the AND statement to check for where the TABLE_SCHEMA column is like the value passed in. Add the percent sign (%) after the schema variable so it can find all schemas that match the beginning of the schema value. Feel free to modify this wildcard to work as you see fit.

Listing 15: When using T-SQL you can add wildcard capability to your code to display tables.

public static void DisplayTables(string conn, 
  string? schema, string? table)
{
  DataTable dt = new();
  string sql = "SELECT * FROM INFORMATION_SCHEMA.TABLES";
  sql += " WHERE TABLE_TYPE = 'BASE TABLE'";
  if (schema != null) {
    sql += $" AND TABLE_SCHEMA LIKE '{schema}%'";
  }
  if (table != null) {
    sql += $" AND TABLE_NAME LIKE '{table}%'";
  }

  using SqlDataAdapter da = new(sql, conn);

  da.Fill(dt);

  // Display Column Names
  string format = "{0,-10}{1,-35}{2,-15}";
  Console.WriteLine(format, "Schema", "Name", "Type");

  // Display Data
  foreach (DataRow row in dt.Rows) {
    Console.WriteLine(format,
      row["TABLE_SCHEMA"], 
      row["TABLE_NAME"], 
      row["TABLE_TYPE"]);
  }
}

Next, check the table parameter to see if it's null, and if it isn't, add the AND statement to check where the TABLE_NAME is like the value passed in. You now have very flexible code to check for both a schema name and table name, just a schema name, just a table name, or don't check for either and have all tables in all schemas displayed from this method. This is a much more flexible approach than the one used by the GetSchema() method. The rest of the code in this method is standard ADO.NET. Use the SqlDataAdapter to submit the SQL and retrieve a DataTable filled with the metadata. Display the column headings, then iterate over the rows in the DataTable and display the table information found.

Try It Out

Now that you have this method created in the TSqlHelper class, open the Program.cs file and modify it to look like the following:

using TSqlSchemaInfo;
string conn = "Data Source=Localhost;
  Initial Catalog=AdventureWorksLT;
  Integrated Security=True;";
// Get all Tables
TSqlHelper.DisplayTables(conn, null, null);

Run the console application and you should see a list of all tables in your database. You can also pass in a schema name as the second parameter to display all tables just within that one schema.

// Get Tables in 'dbo' Schema
TSqlHelper.DisplayTables(conn, "dbo", null);

Another option is to pass in a partial table name as the last parameter and display all tables that start with that prefix as shown in the following code snippet.

// Get any Tables That Starts with 'Cust'
TSqlHelper.DisplayTables(conn, null, "Cust");

The last option is to pass both a schema name and a table name (or partial name) to only display tables that match the name passed within the schema “dbo”.

// Get any Tables in the "dbo" Schema 
// That Start with "Cust"
TSqlHelper.DisplayTables(conn, "dbo", "Cust");

Get Check Constraints

Check constraints is metadata that you can't retrieve from the GetSchema() method. Add another method to the TSqlHelper class named DisplayCheckConstraints(), as shown in Listing 16. In this method, use the Information_Schema.Check_Constraints view to retrieve all check constraints. You may also optionally pass in a schema name to only display those constraints within that schema. If you want, feel free to add a third parameter to filter on a constraint name similar to what you did in the DisplayTables() method.

Listing 16: Check constraints cannot be retrieved using GetSchema(), so write your own code to retrieve this metadata.

public static void DisplayCheckConstraints(
  string conn, string? schema)
{
  DataTable dt = new();
  string sql = "SELECT * FROM 
    INFORMATION_SCHEMA.CHECK_CONSTRAINTS";
  if (schema != null) {
    sql += $" WHERE CONSTRAINT_SCHEMA 
      LIKE '{schema}%'";
  }

  using SqlDataAdapter da = new(sql, conn);

  da.Fill(dt);

  // Display Column Names
  string format = "{0,-10}{1,-40}{2,-40}";
  Console.WriteLine(format, "Schema", "Name", "Clause");

  // Display Data
  foreach (DataRow row in dt.Rows) {
    Console.WriteLine(format, 
      row["CONSTRAINT_SCHEMA"],
      row["CONSTRAINT_NAME"], 
      row["CHECK_CLAUSE"]);
  }
}

Try It Out

Open the Program.cs file and call this new method to display all the check constraints in your database.

// Get all Check Constraints
TSqlHelper.DisplayCheckConstraints(conn, null);

Next try out passing in a specific schema to only retrieve those check constraints within a specific schema.

// Get all Columns in 'SalesLT' Schema
TSqlHelper.DisplayCheckConstraints(conn, "SalesLT");

Get Foreign Keys with Columns

Another set of data that's impossible to retrieve using the GetSchema() method is to retrieve the foreign keys with the column(s) that make up those foreign keys. Yes, you can get the foreign key names and the tables to which they belong, and you can get the index columns, but there's no way to get this data back in a single call. However, using the system tables in SQL Server, you can create a JOIN to retrieve this information, as shown in Figure 4.

Figure 4: Display the foreign key name, table name, and column used in the foreign key
Figure 4: Display the foreign key name, table name, and column used in the foreign key

Add a new method to the TSqlHelper class named DisplayForeignKeys(), as shown in Listing 17. Create a SQL JOIN to retrieve the key name, the parent table and column, and the referenced table and column for each foreign key in your database. You can also pass in a table name to just retrieve the foreign keys for that specific table. This SQL is specific to SQL Server, but you should be able to create similar SQL to retrieve the same information from any database server.

Listing 17: Get a list of foreign key table and column information using a custom SQL query.

public static void DisplayForeignKeys(
  string conn, string? table)
{
  DataTable dt = new();
  string whereSQL = string.Empty;
  if (table != null) {
    whereSQL += $" WHERE parentTab.name LIKE '{table}%' ";
  }

  string sql =
    @$"SELECT
    obj.name AS KeyName,
    sch.name AS SchemaName,
    parentTab.name AS TableName,
    parentCol.name AS ColumnName,
    refTable.name AS ReferencedTableName,
    refCol.name AS ReferencedColumnName
  FROM sys.foreign_key_columns fkc
  INNER JOIN sys.objects obj
    ON obj.object_id = fkc.constraint_object_id
  INNER JOIN sys.tables parentTab
    ON parentTab.object_id = fkc.parent_object_id
  INNER JOIN sys.schemas sch
    ON parentTab.schema_id = sch.schema_id
  INNER JOIN sys.columns parentCol
    ON parentCol.column_id = parent_column_id
    AND parentCol.object_id = parentTab.object_id
  INNER JOIN sys.tables refTable
    ON refTable.object_id = fkc.referenced_object_id
  INNER JOIN sys.columns refCol
    ON refCol.column_id = referenced_column_id
    AND refCol.object_id = refTable.object_id
  {whereSQL}
  ORDER BY TableName";

  using SqlDataAdapter da = new(sql, conn);

  da.Fill(dt);

  // Display Column Names
  string format = "{0,-48}{1,-33}{2,-25}";
  Console.WriteLine(format, "Key Name",
    "Table Name", "Column Name");

  // Display Column Data
  foreach (DataRow row in dt.Rows) {
    string kName = row["KeyName"].ToString()
      ?? string.Empty;
    string tName = row["TableName"].ToString()
      ?? string.Empty;
    Console.WriteLine(format,
      kName[0..Math.Min(45, kName.Length)],
      tName[0..Math.Min(30, tName.Length)],
      row["ColumnName"]);
  }
}

Try It Out

Now that you have the foreign key method created in the TSqlHelper class, open the Program.cs file, and make the call to this method as follows:

// Get all Foreign Keys
TSqlHelper.DisplayForeignKeys(conn, null);

To view just a single table's foreign keys, pass in the table name to the second parameter.

// Get Foreign Keys for a Table
TSqlHelper.DisplayForeignKeys(conn, "SalesOrderDetail");

As you can see, you have much more flexibility when you use the system tables/views in your database system as compared to using the GetSchema() method.

Free SQL Compare and Code Generation Utility

Instead of having to build your own SQL Compare tool, you can download one for free. I created a set of developer utilities (Figure 5) years ago and they are available at https://github.com/PaulDSheriff/PDSC-Tools. Besides a SQL Server comparison tool, you also get a Computer Cleaner tool that helps clean up the multiple locations where Visual Studio and .NET leave a bunch of temporary files. There is also a Project Cleaner to remove folders and files that are unnecessary when you're doing an initial check-in to source control, or you just want to zip up the project to send to a colleague. The next utility is a Property Generator that helps you build different types of properties for your classes. Included are auto-properties, full properties, and raise property changed properties. Simple text templates are available for you to modify these or add your own types of property generation.

Figure 5: Download the free set of PDSC Developer Utilities to get the SQL Compare tool, plus much more.
Figure 5: Download the free set of PDSC Developer Utilities to get the SQL Compare tool, plus much more.

Two other tools (JSON Generator and XML Generator) let you choose a table, view, or type in your own SQL and generate either XML or JSON from your table. The C# Data Repo lets you choose a table, view, or type in your SQL and generate a hard-coded C# class with data from the SQL. This is useful if you're building a prototype and don't want to mess with a database. The Code Generator tool generates full CRUD classes using the Entity Framework. It also can generate full CRUD MVC pages using .NET 6/7. You can customize this code generator. You may also add your own sets of classes, pages, or anything you want.

Summary

In this article, you learned how to retrieve metadata about your database using the .NET Framework's GetSchema() method. This method provides a standard method to retrieve data about your database objects. However, be aware that the data that's returned from each call is different from one data provider to another. Retrieving metadata from your database can be useful to create all sorts of utilities such as a SQL comparison tool, or a code generator. You should take some time to understand the system tables in your specific database server as they can provide you with much more information than the GetSchema() method.