T-SQL Tuesday #010: Indexes: To Be or Not To Be (a B-tree)
This blog entry is participating in T-SQL Tuesday #010, hosted this month by Michael J Swart (Blog|@MJSwart). You are invited to visit his blog to join the blog party. You are welcome to write your own participating blog post for the party or just to read more blogs participating in this month’s topic: indexes.
For my part in this month’s T-SQL Tuesday, I’m going to talk about index structures that are not B-tree structures. Everyone knows that indexes are stored in B-tree structures, right? so when are they not stored as B-tree’s?
What is a B-tree?
The B-tree structure reminds me of the way I used to draw trees when I was a kid. The tree tops were simple triangles. When you diagram an index structure, you end up with a similar structure. The difference here is that the root of the B-tree structure is at the top. The leaf (bottom most) layer of the B-tree structure contains the data and is often referred to as the index level. This is where SQL Server gets all of the good stuff from. All of the layers above that are simply pointers telling SQL Server how to get to the correct index page at the bottom. As there are more an more pages in the leaf level, more and more pages are required for tracking the path to the leaf level. As the index grows, the layers push upward and always end with only a single page at the root (very top) level.
As a side note, if the index is small enough that all of the data fits on a single page, the leaf level is the root level and no extra pages to track the path to the index page are required. To left is a simple rendering of what an index structure may look like. The bottom level with the most pages is the leaf level where the data is stored. The levels above it have pointers to which page at the next level is the correct path to get to specific data entries.
This is the standard structure used for indexes in SQL Server and most enterprise RDBMS’s. But there are occasions when SQL Server doesn’t use this structure. I’m not referring to the above mentioned scenario where the index only has a single level. Indexes that do use B-tree structures include clustered indexes, nonclustered indexes, and spatial indexes.
What does not use a B-tree?
This brings us to the whole point of this post. What kind of index doesn’t use a B-tree structure? The obvious answer is full-text indexes, but that’s too easy. Full-text is its own engine for managing and querying full-text catalogs. I’m not going to take the easy way out and go with full-text. The other index type that does not utilize B-tree structures is XML indexes.
I’m going to avoid going off on a tangent explaining how XML indexes work and how you use them. I just want to look at the structure. And the structure used to store XML indexes is …. an X-tree. No, not really. There’s no such thing as an X-tree structure as far as I know. Actually, XML indexes are stored in internal (hidden) tables. The tables themselves will even have indexes on them. So you now have B-tree clustered indexes on your non-B-tree XML indexes.
Let’s Prove It
Let’s take a look at the structure. For this test, I’m going to create a new database on my SQL Server 2008 R2 instance named TestXMLIndex. Then I will create a table with an XML column in it and populate it with a few records. I will create a primary XML index on the column and then take a look at the structures. The code below is commented to make it easy to follow along.
Create database TestXMLIndex; Go Use TestXMLIndex; Go -- Verify no XML indexes exist and -- no internal tables exist for XML indexes Select * From sys.xml_indexes Select * From sys.internal_tables Where internal_type_desc = 'XML_INDEX_NODES' Go -- Create table with an XML columns Create Table dbo.TestXML ( XMLID int identity(1, 1) not null primary key, XMLCol XML not null) Go -- Populate some XML data Insert Into dbo.TestXML (XMLCol) Values ('<root><test1>Test 1</test1></root>'), ('<root><test1>Test 2</test1></root>'), ('<root><test1>Test 3</test1></root>') Go -- Create a primary XML index CREATE PRIMARY XML INDEX PXML_TestXML_XMLCol ON dbo.TestXML (XMLCol); GO -- Verify that the XML index exists and -- that there is now an internal table for it Select * From sys.xml_indexes Select * From sys.internal_tables Where internal_type_desc = 'XML_INDEX_NODES' Go -- There is even an index on the internal table -- Both the XML index and the index on the internal table -- are stored as parent objects of the table Select IT.name As InternalTable, I.name As InternalTableIndex, XI.name As XMLIndexName From sys.internal_tables IT Inner Join sys.indexes I On I.object_id = IT.parent_object_id And I.type_desc = 'CLUSTERED' Inner Join sys.indexes XI On XI.object_id = IT.parent_object_id And XI.type_desc = 'XML' Where IT.internal_type_desc = 'XML_INDEX_NODES' Go -- If it's a table, let's look at the data Select * From sys.xml_index_nodes_2105058535_256000
Msg 208, Level 16, State 1, Line 2 Invalid object name 'sys.xml_index_nodes_2105058535_256000'.
Now that we are at the point where we try to query the table, you see that the table is inaccessible and returns an error if you try to query it. There is a workaround though. If you connect to a new query window via the dedicated admin connection (DAC), you will be able to query the table.