Making Sense of the XML DataType in SQL Server 2005

As database developers, many of us have had to dip our feet into the wide ocean of XML.

It should come as good news that in SQL Server 2005, you can store XML in the database with a new XML datatype. Although this is good news, many developers have been storing XML in the database for some time now. Without implicit support for XML, developers have been shoving XML documents into text fields since XML’s inception.

SQL Server 2000 included some XML features out of the box. Key among these features was the ability to return results as XML using the FOR XML clause. SQL Server 2005’s functionality is markedly different. In SQL Server 2005, XML is a genuine data type, which means that you can use XML as a column in tables and views, in T-SQL statements, or as parameters of stored procedures. You can now store, query, and manage XML documents directly in the database.

SQL Server 2005's XML functionality is markedly different from what SQL Server 2000 provides.

More importantly, you can also now specify the schema to which your XML must conform. Aside from providing a mechanism to validate your XML in the database, this also allows you to describe complex types of data to be stored and to have an engine that enforces those rules.

Using the XML Datatype

The XML datatype is not substantially different than any other datatype in SQL Server. It can be used in any place you would ordinarily use any SQL datatype. For example, the following creates an XML variable and fills it with a XML:

DECLARE @doc xml
    
SELECT @doc = '<Team name="Braves" />'

Although literal XML is useful, you can also fill an XML variable using a query and the SQL Server’s FOR XML syntax:

SELECT @doc =
  (SELECT * FROM Person.Contact FOR XML AUTO)

The XML datatype is not limited to use as a variable. You can also use the XML data type in table columns. You can assign default values and the NOT NULL constraint is supported:

CREATE TABLE Team
(
  TeamID int identity not null,
  TeamDoc xml DEFAULT '<Team />' NOT NULL
)

Inserting XML data into tables is just a matter of specifying the XML to add in the form of a string:

-- Insert a couple of records
INSERT INTO Team (TeamDoc)
VALUES ('<Team name="Braves">
           <Players>
             <Pitcher name="John Smoltz"
                      role="Closer"/>
           </Players>
         </Team>');
    
INSERT INTO Team (TeamDoc)
VALUES ('<Team name="Red Sox">
           <Players>
             <Pitcher name="Petro Martinez"
                      role="Starter"/>
           </Players>
        </Team>');

When creating instances of XML in SQL Server 2005, the only conversion is from a string to XML. Similarly, going in the reverse direction, you can only convert to a string. Converting to and from text and ntext is not allowed.

Limitations of the XML Data Type

Although the XML datatype is treated like many other datatypes in SQL Server 2005, there are specific limitations to how it is used. These limitations are:

XML types cannot convert to text or ntext data types.
No data type other than one of the string types can be cast to XML.
XML columns cannot be used in GROUP BYstatements.
Distributed partitioned views or materialized views cannot contain XML data types.
Use of the sql_variant instances cannot include XML as a subtype.
XML columns cannot be part of a primary or foreign key.
XML columns cannot be designated as unique.
Collation (COLLATE clause) cannot be used on XML columns.
XML columns cannot participate in rules.
The only built-in scalar functions that apply to XML columns are ISNULL and COALESCE. No other scalar built-in functions are supported for use against XML types.
Tables can have only 32 XML columns.
Tables with XML columns cannot have a primary key with more than 15 columns.
Tables with XML columns cannot have a timestamp data type as part of their primary key.
Only 128 levels of hierarchy are supported within XML stored in the database.

XML Type Methods

Up to this point, the examples have shown the XML datatype being used as just a blob of data, but this is where the real power of the XML data type shows itself. The XML data type supports several methods that can be called using the UDT dot syntax (myXml.operation()) syntax. The supported operations are listed in Table 1.

You can use XML as a column in tables and views, in T-SQL statements, or as parameters of stored procedures.

For the following sections, you will use a table called Team that contains a row for every team name. In each row, there is a TeamDoc row that contains XML about the team:

CREATE TABLE Team
(
  TeamID int identity not null,
  TeamDoc xml DEFAULT '<Team />' NOT NULL
)

In the examples, assume that the following XML document exists in the Braves row of the table:

<Team name="Braves">
  <Players>
    <Pitcher name="John Smoltz" role="Closer"/>
    <Pitcher name="Russ Ortiz" role="Starter" />
    <ThirdBase name="Chipper Jones"
               role="Starter" bats="switch"/>
  </Players>
</Team>

Query Method

The query method allows you to specify an XQuery or XPath expression to evaluate. The result of the query method is an XML data type object. The specific syntax of the query method is:

query(XQuery)

The first parameter is always an XQuery expression. The following example uses a query to return an XML document with information about each team’s pitcher:

SELECT TeamDoc.query('/Team/Players/Pitcher')
FROM Team

This produces the following results:

----------------------------------------------
<Pitcher name="John Smoltz" role="Closer" />
<Pitcher name="Russ Ortiz" role="Starter" />
    
(1 row(s) affected)

The query method allows you to find and return nodes lists that match the XQuery expression you specify. The real power of the query method comes from the XQuery syntax, which is covered in detail later in this article.

Exist Method

The exist method is similar to the query method except that it is used to determine whether a query yields any results. The syntax for the exist method is:

exist(XQuery)

When you use the exist method, it evaluates the query and returns the value of 1 if the query yields any results. For example, this query finds the rows in the team table where the TeamDoc field has starting pitchers:

-- Simple Exist clause
SELECT Count(*)
FROM Team
WHERE TeamDoc.exist(
   '/Team/Players/Pitcher[@role="Starter"]') = 1

Value Method

There are times when you do not want to interpret a whole query’s result just to get a scalar value: this is where the value method is helpful. The value method is used to query the XML and return an atomic value. The syntax for the valuemethod is:

value(XQuery, datatype)

Use the value method when you want to get a single scalar value from the XML. You must specify the XQuery statement and the datatype you want it to return and you can return any datatype except the XML datatype. For example, if you want to get the name of the first pitcher on every team, you can write the query like this:

-- Do a Query to get an individual value
SELECT
TeamDoc.value('(/Team/Players/Pitcher/@name)[1]',
              'nvarchar(max)')
       as FirstPitcher
FROM Team

This query results in the scalar value of the first pitcher for each team returned in the result:

FirstPitcher
------------------------------
John Smoltz
    
(1 row(s) affected)

The difference between the query and value methods is that the query method returns an XML datatype that contains the results of the query, and the value method returns a non-XML datatype with the results of the query. The value method can only return a single (or scalar) value. You will get an error if you try to create an XQueryexpression that returns more than one value using the value method.

Modify Method

Although the XQuery standard does not provide a mechanism for updating XML, SQL Server 2005 supports a way of modifying parts of an XML object in place. This means that you do not have to retrieve an entire XML document just to make changes. To modify a document in place, you use a combination of the modify method and SQL Server 2005’s new XML Data Modification Language (XML DML).

The syntax for the Modify method is:

modify(<XMLDML>)

The Modify method takes only a single parameter, the XML DML statement. XML DML is similar, but not identical, to SQL’s insert, update and delete syntax. For example, you can modify the XML by using the insert DML statement:

SET @doc.modify('
  insert <Pitcher name="Jaret Wright"/> as last
  into (/Team/Players)[1]
')

You can do the same thing to modify an XML column by calling modify in an UPDATE statement:

-- Modify an XML doc without replacing
-- it completely!
UPDATE Team
SET TeamDoc.modify('
  insert <Pitcher name="Jaret Wright"/> as last
  into (/Team/Players)[1]
')
WHERE TeamDoc.exist('/Team[@name="Braves"]') = 1

Notice that the SET clause in this UPDATE statement does not follow the SET x = y pattern that you may be used to from writing SQL in the past. That syntax assumes that you will provide a complete new value to replace the old, which, in the case of XML, means a completely new document to replace the old. When using the XML type, the modify method changes the original document in place. There’s no need for generating a completely new and separate document, or of SQL Server attempting to replace an entire document with every change. The SET syntax in the example reflects the more efficient approach of updating a document in place.

There are three XML DML statements: insert, update, and delete. Not surprisingly, they are used to insert, update, and delete parts of an XML object. Each of these syntaxes are similar to SQL, but with some definite differences. Let’s look at the syntax for each statement separately.

Here is the syntax for the insert statement:

insert
   InsertExpression (
    {{as first | as last}
      into | after | before} LocationExpression
                    )

Immediately following the insert statement is the XML that you want to insert (InsertExpression). Next you specify how you want the XML inserted. Your choices are into, after, or before. The before and after clauses instruct the database to insert the InsertExpression as a sibling to the LocationExpression. The use of before or after specifies whether to insert it before or after the LocationExpression:

SET @doc.modify('
  insert <Pitcher role="Starter"
                  name="Jaret Wright"/>
  before (/Team/Players/Pitcher)[1]
')

The into clause inserts the InsertExpression as a child of the LocationExpression. The optional clauses of as start and as last are used to specify position of the insertion within the children:

-- Insertion within Team
SET @doc.modify('
  insert <Pitcher role="Starter"
                  name="Jaret Wright"/>
  into (/Team/Players)[1]
')
    
-- Insertion within Team, specifying it should
-- be inserted as the last element
SET @doc.modify('
  insert <Pitcher role="Starter"
                  name="Jaret Wright"/>
  as last into (/Team/Players)[1]
')

The syntax for the delete statement is very straightforward:

delete LocationExpression

The LocationExpression specifies what to delete from the XML data. For example, to delete all the Pitchers:

SET @doc.modify('delete /Team/Player/Pitcher')

Because the query specifies all pitcher elements, they will all be deleted. If you want to delete just a single element, you can specify identifying attributes. To delete just the pitcher named John Smoltz, you write the delete statement like so:

SET @doc.modify('delete
  /Team/Players/Pitcher[@name="John Smoltz"]')

You can also tell the delete statement to remove an individual attribute. For example, to delete the role attribute for the pitcher named John Smoltz the XML DML looks like this:

SET @doc.modify('delete
/Team/Players/Pitcher[@name="John Smoltz"]/@role')

Lastly, the replace value statement describes changes to make to the XML data. The syntax of the replace value statement is:

replace value of
   OriginalExpression
with
   ReplacementValue | if

The replace value statement is used to change discrete values in the XML. The only discrete values possible are the literal contents of a tag or the value of an attribute. The OriginalExpression must resolve to a single node or attribute. The ReplacementValue is usually a literal value to replace. Replacing the literal contents of a node requires the XQuery expression using the text() function to specify that you want to replace the text of a node. For example, to replace the inner text for a pitcher, you write the modify like this:

DECLARE @doc xml
    
SELECT @doc = '
<Team name="Braves" >
  <Players>
    <Pitcher name="John Smoltz" role="Closer">
      With team since 1989
    </Pitcher>
  </Players>
</ Team > '
    
    
SET @doc.modify('
  replace value of
      (/Team/Players/Pitcher[
       @name="John Smoltz"]/text())[1]
  with "May start in 2005"
')

Modifying an attribute is straightforward : you just need the XQuery expression to resolve to a single attribute. For example, to replace the value of the role attribute for the pitcher named John Smoltz with Starter,do this:

SET @doc.modify('
  replace value of (
   /Team/Players/Pitcher[
                  @name="John Smoltz"]/@role)[1]
  with "Starter"
')

The replace value syntax also supports conditional replacement by using the if…then…else syntax within the with clause of the replace value statement. For example, to replace John Smoltz’s role to Starter if he is a closer, but change it to a Closer if he is not a Starter, you could write the code:

SET @doc.modify('
  replace value of (
     /Team/Players/Pitcher[
          @name="John Smoltz"]/@role)[1]
    with (
   if (
        /Team/Players/Pitcher[
          @name="John Smoltz"]/@role = "Closer"
      ) then
     "Starter"
      else
        "Closer"
   )
')

Nodes Method

The purpose of the nodes method is to allow normalizing of a set of nodes returned by a query into a set of rows in a table-like result set. The syntax of the nodes method is:

nodes (XQuery) Table(Column)

The XQuery is the expression that picks the nodes to be exposed as a result set. The Table and Column are used to specify names in the result set. Note that you can only have one column and that it is automatically of type XML. For example, to query to get each of the pitchers, write the code like this:

DECLARE @doc xml
    
SELECT @doc = '
<Team name="Braves" >
  <Players>
    <Pitcher name="John Smoltz" role="Closer">
      With team since 1989
    </Pitcher>
  </Players>
</ Team > '
    
SELECT Team.player.query('.') as Pitcher
FROM @doc.nodes('/Team/Players/Pitcher')
     Team(player)

This results in a single result set containing rows for each of the Pitchers’ elements:

Pitcher
--------------------------------------------
<Pitcher name="John Smoltz" role="Closer" />
<Pitcher name="Russ Ortiz" role="Starter" />
    
(2 row(s) affected)

Notice that you used the query method to return these nodes in the result. The reason for this is the results of a nodes method may only be referred to by the XML methods (query, modify, delete, and update) or IS NULL and IS NOT NULL statements.

Gone are the days of needing to pull the entire XML document out of the database as a string, parsing it, making changes, and replacing the entire document.

More ordinarily, you may use the nodes method to break apart XML into a more useful result. For instance, you could get the players’ nodes by using the nodes method, and then retrieve them with the value method to get the individual values as scalar data:

SELECT Team.player.value(
         './@name', 'nvarchar(10)') as Name,
       Team.player.value(
         './@role', 'nvarchar(10)') as PlayerRole
FROM @doc.nodes('/Team/Players/Pitcher')
     Team(player)

This results in the following result set:

Name PlayerRole
--------------- ---------------
John Smoltz Closer
Russ Ortiz Starter
    
(2 row(s) affected)

XML Indexes

As you might expect, the speed of searches based on XML data in the database varies depending on how indexes are set up. For XML data, there are special indexes called XML Indexes. These indexes have a subset of the full configurability that standard indexes have, but they treat XML in a way that aids in the speed of performing searches though XML data.

Requirements

There are some limits regarding indexes on XML columns:

The only indexes that can be created for XML columns are XML indexes.
You can only add XML indexes to tables, views, table-valued variables with XML columns, or XML variables.
An XML index only supports indexing a single XML column.
Once XML indexes exist on a table, you cannot modify the primary key. If you need to do so, you must drop all XML indexes first.

Index Types

If your table meets the requirements, you can create indexes on the XML data in your tables.

The first of these index types is the primary XML index. Like its name suggests, there can be only one primary XML index on any table. You can use the CREATE PRIMARY XML INDEX command to create the primary XML index:

CREATE PRIMARY XML INDEX IXML_Teams
  ON Team (TeamDoc)

The primary XML index on a column creates a lookup based on the nodes of each node of the XML. Although this allows for speedy retrieval of individual nodes, there are other types of queries that benefit from their own indexes.

SQL Server 2005 supports three types of secondary indexes; PATH, PROPERTY, and VALUE. These secondary indexes are used to optimize certain types of operations as necessary. These secondary indexes are based on the Primary index and are used to tune specific types of queries. The secondary index types are listed in Table 2.

Use the CREATE XML INDEX syntax to create a secondary index. After you specify the table and column name, add the USING XMLINDEXclause.

CREATE XML INDEX IXML_Team_Path
  ON Team (TeamDoc)
  USING XML INDEX IXML_Teams FOR PATH

The USING XML INDEX clause takes the name of the primary XML index to create the secondary index and a type (PATH, PROPERTY or VALUE) of index. For example, create the PROPERTY or VALUE secondary indexes:

CREATE XML INDEX IXML_Team_Prop
  ON Team (TeamDoc)
  USING XML INDEX IXML_Teams FOR PROPERTY
    
CREATE XML INDEX IXML_Team_Value
  ON Team (TeamDoc)
  USING XML INDEX IXML_Teams FOR VALUE

Index Maintenance

Maintaining and modifying XML indexes is similar to maintaining standard indexes. You use the ALTER INDEX and DROP INDEX syntax for modifying XML indexes. These commands are the same commands you use for standard indexes:

ALTER INDEX IXML_Teams ON Teams REBUILD
    
DROP INDEX IXML_Teams ON Teams

Typed XML

The SQL language (and SQL database servers, by extension) represents a type system for storing information. You define data types in databases all the time. That is called schema. You define types tables and views), with certain attributes (columns and data types in those columns), relationships between types (foreign keys), and rules about the data that can be stored in types (constraints and triggers). The same thing happens in XML. In many situations, you want to dictate rules about what can be stored in XML data types. In SQL Server 2005, you can register XML schemas with the database. These schemas can be used to specify what XML can be used in a particular situation. In particular, using schemas can allow you to extend the type system by using XML schemas to specify complex data types.

In SQL Server 2005, you can register XML schemas with the database.

SQL Server 2005 allows both the use of XML columns and variables with generic XML (as you have seen earlier in this article) and the use of XML columns and variables that are typed with XML schema. When you use the XML data type with schema information, any XML inserted into a typed XML column is validated against the schema. In this way, the database ensures that the data stored is not only well-formed, but also conforms to the schema.

Using Typed XML

The first step in using typed XML is registering a schema. This is done by using the new CREATE XML SCHEMA COLLECTION statement. This new statement allows you to store schemas for XML that are used to validate XML stores:

CREATE XML SCHEMA COLLECTION BaseballSchema AS
' <? xml version="1.0" ?>
<xsd:schema
  xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Team">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Pitcher">
          <xsd:complexType>
            <xsd:attribute name="name"
                           type="xsd:string" />
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
      <xsd:attribute name="name"
                     type="xsd:string" />
    </xsd:complexType>
  </xsd:element>
</ xsd:schema > '

The CREATE XML SCHEMA COLLECTION statement creates a collection of schemas, any of which are used to validate XML typed with the name of the collection. This example shows a new schema called BaseballSchema being added to the database. The schema is entered as a string value. Once you have a schema registered, you can use the schema in new instances of the XML data type:

DECLARE @team xml(BaseballSchema)
    
SET @team =
    '<Team name="Braves">
       <Pitcher name="John Smoltz" />
     </Team>'
    
SELECT @team

Like you did earlier in the article, you can create a variable of XML type, but because you want to dictate the type of XML, use the parenthetical syntax specifying the schema name you registered. You can store the XML in the variable. You want a specific type of XML, one that conforms to the BaseballSchema. But what if the XML attempting to be stored does not conform to BaseballSchema? In this example, you try and store a piece of XML that has an undefined attribute (or role, if you prefer that term):

    
SET @team =
     '<Team name="Braves">
       <Pitcher name="John Smoltz"
                role="Closer" />
      </Team>'

This fails because the XML stored in the @team variable does not conform to the specific schema type. Executing the SET statement just shown yields a specific XML validation error:

XML Validation: Undefined or prohibited
attribute specified: 'role'

There may be other reasons that specifying a schema is helpful, such as aiding in queries.

As you might expect, using typed XML is straightforward in table creation:

CREATE TABLE Team
(
  TeamID int identity not null,
  TeamDoc xml(BaseballSchema)
)

Much like the XML variable example above, you can create new rows with the typed XML data, like so:

INSERT INTO Team (TeamDoc)
  VALUES ('<Team name="Braves">
             <Pitcher name="John Smoltz" />
           </Team>')

When the insertion happens, SQL Server 2005 validates the XML against the schema specified in the table declaration (BaseballSchema). The following insertion violates that schema, again by specifying an invalid role, so it will fail:

UPDATE Team
  SET TeamDocTyped =
    '<Team name="Braves">
       <Pitcher name="John Smoltz"
                role="Closer" />
     </Team>'

Being able to specify the types of XML that are allowed in a particular case is helpful to extend the type system to include complex types that are not allowed with standard SQL. If typed XML allows you to specify these complex types as type information, you need a way to manage the schemas to allow for extension and change of these types as the data matures.

Managing XML Schema

What happens when a schema changes? During development, this is likely to happen quite a bit. For example, if you wanted to add the new role attribute to the Pitcher type as defined in BaseballSchema, you would have to drop the entire schema collection:

DROP XML SCHEMA COLLECTION BaseballSchema

But this schema is in use in the Team table, so SQL Server 2005 does not allow you to drop the schema collection:

Specified collection 'BaseballSchema' cannot be
dropped because it is used by object 'Team'.

Instead, you can alter the table to drop the column referencing the schema, drop the schema itself, then re-create the schema with the new attribute, and finally, modify the table to add back the column that you dropped, this time referencing the new schema version, as seen in Listing 1.

If you were allowed to change the schema in place, you would have to re-validate all the data in the database, so SQL Server 2005’s approach of just not allowing such a change seems reasonable. There is no good solution when you need to change the schema of existing typed XML.

You can extend the schema collection by adding new schemas to them to allow new types of typed XML. This is done with the ALTER XML SCHEMA COLLECTION statement, as seen in Listing 2.

The ALTER XML SCHEMA COLLECTION statement allows you to alter a schema collection to create new top-level types to be used in the same schema collection. By using the ALTER XML SCHEMA COLLECTION syntax and adding the new Score element type, you extend the types that are allowed within the BaseballSchema. Once you add this schema, you can use the new schema types:

DECLARE @team xml(BaseballSchema)
    
SET @team =
    '<Score HomeTeam="Braves"
            AwayTeam="RedSox"
            HomeScore="5"
            AwayScore="4" />'

As mentioned earlier, because the XML is really typed to the schema collection rather than a single schema, this is perfectly acceptable. You should note that the XML data type schema name is still the same (BaseballSchema), but you can use the new types in the XML.

XQuery

Now that XML is in the database, you will want a smart way to query it. Earlier, you saw that you could use the query method of the XML datatype to search through instances of XML. In those examples, you used an XML language called XPath to search the document. XPath is a good starting point for asking simple questions of XML documents, but has limitations about asking robust questions. To this end, the World Wide Web Consortium has a working group that created a true query language for XML called XML Query (or XQuery for short).

XQuery is a language for querying XML documents. The topic of XQuery is big enough for its own book, but I’ll give you some basic information to get you started.

Every XQuery is an expression language. What this means is that every XQuery expression must evaluate to a result. Therefore, it is valid to write any expression that evaluates back to a result. That makes these valid XQuery expressions even though they do not do anything to search XML:

SELECT @doc.query('"Hello"')
SELECT @doc.query('15 * 10')

XQuery has adopted most of the XPath expression language to describe paths within XML documents. So the examples above were valid XQuery expressions, even though they looked like XPath. For example, to get the list of players, you used:

SELECT @doc.query('/Team/Players')

But to really see the power of XQuery, you want to use it to answer questions. For example, you may want to create a set of XML nodes called <Player/> that contain starting pitchers for the team:

SELECT @doc.query('
    for $b in
      /Team/Players/Pitcher[@role = "Starter"]
        return (
          <Player name="{$b/@name}"></Player>)'
               )

This XQuery expression uses several different pieces of XQuery to create the expected results. It uses the for...in keyword to say, go through all the nodes that the path expression (/Team/Players/Pitcher[@role = "Starter"]) returns (which is stored in a variable called $b), and create a <Player/> tag for each of the nodes. XQuery inserts a name attribute into the Player tag and calculates the name attribute’s value by getting the name attribute from the node ($b). This query results in an XML fragment that looks like this:

<Player name="Russ Ortiz" />
<Player name="John Thomson" />
<Player name="Mike Hampton" />
<Player name="Horacio Ramirez" />

Although this query is not rocket science, it does hint at some of the power of XQuery. Now that you’ve got a quick introduction, let’s look at some of the components of an XQuery expression.

The syntax of the XQuery language is made up of several parts:

Prolog
Iteration
Path Expressions
Conditional Expressions
Quantified Expressions

XQuery Prolog

The prolog is a place to specify any namespaces to declare for the XQuery expression. The entire prolog prefaces the body of the query. For example, you can specify the default namespace of the expression, like so:

SELECT @doc.query('
         declare namespace T="http://yukon/teams";
         return /T:Team/T:Players
    ') as Result

The purpose of declaring the namespace is to allow you to alias it across the query. You can define multiple namespaces within the prolog. In addition, the prolog can also contain a definition for the default namespace. For example:

SELECT @doc.query('
         declare default element namespace =
            "http://yukon/teams";
         return /Team/Players
    ') as Result

Other than namespaces, you can also include XML schema imports in the prolog. If you are working with typed XML, their schemas are automatically imported by the engine. For additional schemas, you can use the import schema syntax:

SELECT @doc.query('
        import schema
           "http://yukon/teams/team.xsd";
        return /Team/Players
    ') as Result

Path Expressions

XQuery adopts most of the XPath expression language to accomplish specifying paths. Therefore most of the XPath expression language is perfectly appropriate for use in XQuery. In the simple case, you can use XPath raw as the query like this:

SELECT @doc.query('/Team/Players')

You can use axis specifiers like you would in XPath as well:

SELECT @doc.query('/Team/child::Players')

Or you can use it to retrieve values from a document:

SELECT @doc.value('/Team[@name][1]',
                  'varchar(50)')

Lastly, most of XPath’s function library is part of XQuery. Therefore you can use calculations that you are already familiar with from XPath. For example, to get the count of players on the team, you can do this:

SELECT @doc.query('count(/Team/Players/*)')

Conditional Expressions

XQuery includes the if...then...else construct to allow for conditional expressions. This construct allows you to make tests to allow branching based on specific results. For example, if you want to return a result of whether the roster is full or not, you can test the team by using a conditional expression:

SELECT @doc.query('if (count(/Team/Players/*)
                       < 25)
                   then
                      "Need Players"
                   else
                      "Roster Full"')

You can nest conditional expressions to do more elaborate testing. For example, you can add a test to see if any players exists, then you can report that you have No Players, instead of just saying Need Players:

SELECT @doc.query('if (count(/Team/Players/*)
                       < 25)
                   then
                     if (count(/Team/Players/*)
                          = 0)
                     then
                       "No Players"
                     else
                       "Need Players"
                   else
                     "Roster Full"')

Conditional expressions also allow you to compound several tests together using and, or, and parentheses. This next example shows testing for both 25- and 40-man rosters:

SELECT @doc.query('
   if ((count(/Team/Players/*) = 25) or
       (count(/Team/Players/*) = 45)) then
      "Roster Full"
   else
      "Need Players"')

Quantified Expressions

There are times where you need to test for whether all or some results match some specific criteria. XQuery allows for this with Quantified Expressions. Quantified Expressions use the following syntax:

( some | every )
  <variable> in <Expression>
    satisfies <Expression>

With this syntax you can test a set of nodes based on a criteria. In this example, you are testing to see if all of the players are starters or not:

SELECT @doc.query('
        if (every $player in /Team/Player/*
          satisfies $player/@role="Starter")
        then
          "We have all Starters"
        else
          "We have Starters and others"
')

You can use the some clause to specify that at least one needs to pass the test instead of requiring all to pass the test (as seen with the use of every above). You can see this work in the example below:

SELECT @doc.query('
        if (some $player in /Team/Player/*
          satisfies $player/@role="Starter")
        then
          "We have some Starters"
        else
          "We no starters"
')

Iteration

One of the most common uses for XQuery is to iterate through all the results of a node test and perform some work. To do this, XQuery supports a set of clauses that they shorten to FLWOR (pronounced flower). FLWOR stands for the different pieces of the iteration syntax. FLWOR stands for FOR, LET, WHERE, ORDER BY, RETURN.

As you saw in the earlier example, you can use the most common parts of this syntax (the for and return clauses) to create results that concatenate results. For example, here you want to create a node for each player on the team:

SELECT @doc.query('
    for $b in /Team/Players/Pitcher[@role =
              "Starter"]
      return (<Player name="{$b/@name}">
              </Player>)')

This creates a variable called $b for each pitcher that is a starter. Then it returns a Player element with the name of the player embedded in it.

You can further enhance the query by using the where clause to specify a condition that each node must pass to become part of the node set. For example, you can make sure that every player has a name by doing this:

SELECT @doc.query('
    for $b in /Team/Players/Pitcher[@role =
                                    "Starter"]
    where (count($b/@name) > 0)
      return (<Player name="{$b/@name}">
              </Player>)')

Lastly, you can further improve this query by sorting the pitchers by name using the order by clause:

SELECT @doc.query('
    for $b in /Team/Players/Pitcher[@role =
                                    "Starter"]
    where count($b/@name) > 0
    order by ($b/@name)
      return (<Player name="{$b/@name}">
              </Player>)')

You may have noticed that I skipped the let clause. In SQL Server 2005-the let clause is unsupported.

XQuery Extension Functions

To enable better integration with the database engine, Microsoft includes two functions that extend the functionality of XQuery by giving access to columns and local variables in the SELECT clause. These functions are called sql:column and sql:variable. For example, to include the TeamID in the <Player/> elements you are creating, you can call sql:column with the name of the column in the SELECT statement:

SELECT TeamID, TeamDoc.query('
    for $b in /Team/Players/Pitcher
    where count($b/@name) > 0
    order by ($b/@name)
      return (<Player
                 team="{sql:column("TeamID")}"
                 name="{$b/@name}">
              </Player>)')
FROM Team

In addition, you can use the sql:variable function to have access to local variables. For example, if you create a local variable with the date of the roster, you can insert it into the output XML by using the sql:variable clause:

DECLARE @today datetime
SET @today = '12/31/2004'
    
SELECT TeamID, TeamDoc.query('
    for $b in /Team/Players/Pitcher
    where count($b/@name) > 0
    order by ($b/@name)
      return (<Player
                dt="{sql:variable("@today")}"
                name="{$b/@name}">
              </Player>)')
FROM Team

For more information on XQuery:

The XQuery page on w3c.org’s Web site (http://www.w3c.org/XML/Query)
The XQuery specification (http://www.w3c.org/TR/xquery/)

Conclusion

Storing XML has become a mainstay of many software architectures these days. SQL Server 2005 follows this trend by including a real XML data type, but XML is not just structured storage in this case. It represents a way to extend the SQL type system with a well known and open type system: XML Schema. By treating XML as a mature type, the database can deal with XML in an efficient way.

Gone are the days of needing to pull the entire XML document out of the database as a string, parsing it, making changes, and replacing the entire document. SQL Server 2005 lets you do searches, additions, changes and deletions of parts of a document in-place. This represents a great leap in ease and performance when using XML in the database.

Method Name	Description
query	Performs a query against the XML and returns the results of the query
exists	Performs a query against the XML and returns the value of 1 if there was a result
value	Evaluates a query to return a simple value from the XML
modify	Executes a change operation on the XML document in place
nodes	Allows you to break apart XML into a table structure