Content by Category
.NET 1.x
.NET 2.0
.NET 3.0
.NET 3.5
.NET 4.0
.NET 4.5
.NET Assemblies
.NET Framework
.NET Getting Started
Accessibility
ADO.NET
Advertorials
Agile Development
AJAX
Amazon Web Services
Analysis Services
Android
Architecture
Arduino
ASP .NET Web API
ASP.NET
ASP.NET MVC
ASP.NET WebForms
Azure
B2B (Business Integration)
BDD
Big Data
Bing
BizTalk
Book Excerpts
Build and Deploy
Business Intelligence
C#
C++
ClickOnce
Cloud Computing
Code Contracts
CODE Framework Info - non Technical
CODE on the Road!
COM+
Community
Conferences
Continuous Integration
Crystal Reports
CSLA.NET
CSS
Data
Debugger
Design Patterns
Development Process
Display Technologies
Distributed Computing
Document Database
DotNetNuke
DSL
Dynamic Languages
Dynamic Programming
Editorials
Enterprise Services ("COM+")
Entity Framework
Events
Expression Blend
F#
Fox to Fox
Frameworks
Functional Programming
Git
Graphics
HTML 5
Internet Explorer 8.0
Interviews
IOS
iPhone
Iron Ruby
Java
Java Script
JavaScript
jQuery
JSON
Lightswitch
LINQ
Linux
LUA
Mac OS X
MDX
Messaging
Metro
Microsoft Application Blocks
Microsoft Business Rules Framework
Microsoft Dynamics
Microsoft Expression
Microsoft Office
Mobile Development
Mobile PC
Mono
MsBuild
MVVM
MySQL
Network
NHibernate
node.js
NOSQL
Nuget
Object Oriented Development
Objective C
Odata
OLAP
Open Source
Opinion
Opinions
Oracle
ORM
Other Languages
Parallel Programming
Patterns
PHP
Podcasts
Post Mortem
PowerPoint
Print/Output
Prism
Product News
Product Reviews
Project Management
Prolog
Python
Q&A
Rails
Rake
Razor
Reporting Services
REST
RIA Services
Ruby
Ruby on Rails
Scheme
Search
Security
Services
SharePoint
SignalR
Silverlight
SOA
Social Networks
Software & Law
Software Business
Source Control
Speech-Enabled Applications
SQL Server
SQL Server 2000
SQL Server 2005
SQL Server 2008
SQL Server 2012
SQL Server CE/AnyWhere/Mobile/Compact
SSIS
Subversion
Sync Framework
Tablet PC
TDD
Team System
Techniques
Testing and Quality Control
TFS
Tips
TypeScript
UI Design
UML
User Groups
VB Script
VB.NET
Version Control
VFP and .NET
VFP and SQL Server
Virtual Earth
Vista
Visual Basic
Visual Basic 6 (and older)
Visual FoxPro
Visual Studio .NET
Visual Studio 11
Visual Studio 2005
Visual Studio 2008
Visual Studio 2010
Visual Studio 2011
Visual Studio 2012
Visual Studio Tools for Office
VSX
WCF
Web Development (general)
Web Services
WebMatrix
WF
Whitepapers
Windows 7
Windows 8
Windows Azure
Windows Live
Windows Phone 7
Windows Phone SDK
Windows Server
Windows Vista
WinForms
WinRT
Workflow
WPF
XAML
Xiine Documentation
XML
XNA
XSLT



SSWUG


XAMALOT
 


SSWUG

Reader rating:
Click here to read 2 comments about this article.
Article source: CoDe (SQL Server Observer)

SQL Server 2008 Under the Hood: Compression Technologies

Microsoft released SQL Server 2008 to the public this past July. There are many major new features that are striking, such as the Resource Governor and Policy-Based Management; features that are high-profile and have received a lot of attention.

But when you look under the hood, behind these high-visibility features and into the SQL Server engine, some other features also stand out as very notable: database encryption, an entirely new auditing framework, and the topic of this article: compression.

SQL Server 2008 uses compression in three prominent places:

  • Data compression allows you to compress database tables and indexes
  • Backup compression adds the capability of compressing database backups
  • Database mirroring log stream compression, which compresses the log data between the principal and mirror servers in a database mirroring session

Each of these features improves performance, but perhaps the one that will deliver the biggest gains is data compression.

Data Compression

You might initially think that data compression just refers to the way data is stored on disk, but in SQL Server 2008 it means more than that. Data compression actually refers to the way the contents of data and index pages are stored in the page; the resulting compressed data pages are then stored in their compressed format on disk.

You can compress data for tables, indexes, and indexed views. You load database data primarily into tables. But there are two types of subordinate database objects that also contain partial copies of a table’s data. Indexes contain selected subsets of a table’s data in B-tree structures, and indexed views contain portions of a table’s data based on the view definition. All three of these objects can benefit from data compression.

You can compress data at the row or page level. For row compression, the row’s data is compressed in such a way that SQL Server can reconstruct the full row based on the contents of the compressed row. As a result, the compressed row can move from page to page without changing the compressed image of the row. When you compress at the page level, row-level compression is enhanced with a dictionary-type lookup: each data page has a lookup table for repeating patterns in the rows of the page. If a row has to move from one page to another, its page-level compression has to be decoded from the one page and then recoded into the new page, so the row movement is a little more expensive, and so you may see some increased CPU usage on updates to compressed data and indexes.

The benefit of data compression is that the number of I/O’s (logical and physical) to access data in the page or index is reduced because the data is more compact. Because compression of a table at either the row or page level still keeps the same data, indexes on the table behave essentially the same as with uncompressed data, so you don’t need to worry about changing indexes because you’ve compressed some data.

Data compression has great potential for dramatic increases in performance and scalability. It can significantly reduce the number of I/Os required to access data, making it possible to store larger numbers of rows in a table. Meanwhile, existing tables will benefit from better query performance due to the reduction in I/O.

For example, I recently had the opportunity to test out data compression on a 1-million row, 20GB table. Compressing it at the page level reduced it by 60% in size, and accordingly reduced our benchmark query duration by a similar 60%. This was due to a 70% reduction in the number of logical I/Os required for the query. On the other hand, CPU usage for the query increased by 36%.

There is a catch: data compression is slated to be a feature of SQL Server 2008 Enterprise Edition only. If you compress tables or indexes in a SQL Server 2008 database that contains compressed tables or indexes, you will probably need the Enterprise Edition (or Developer Edition) to restore the database. For example, in SQL Server 2005, if you back up a database using table partitioning, an Enterprise Edition feature only, you cannot restore the database to a Standard, Workgroup, or SQL Express edition. We’ll have to wait and see whether data compression has the same restriction.

Backup Compression

There are several third-party products that will compress SQL Server backups, and they’ve been available for several years. Perhaps the best known is Quest Software’s SQL LiteSpeed, but there are others available as well, such as Red Gate’s SQL Backup.

SQL Server 2008 introduces native backup compression, and like data compression it is also an Enterprise Edition feature. However, unlike data compression (and table partitioning), any other SQL Server 2008 Edition can restore a compressed backup: the only restriction is that the Standard, Workgroup, and SQL Express Editions cannot actually create compressed backups.

Backup compression is different from data compression because it occurs only at the storage level. That is, the compression occurs when you create the backup, and the result is just a compressed backup file or image. Like data compression, backup compression has the potential to increase performance of backups. For example, in my own tests the SQL Server 2008 AdventureWorks2008 sample database backup file reduces in size from 176MB to 40.5MB, a 75+% reduction in size. The time taken for the backup also reduces because of the lower amount of I/O to disk required. On my laptop, it reduced from 15 seconds uncompressed to 10 seconds compressed. Again CPU usage is higher for the backup process because of the compression activity.

Database Mirroring and Compression of Log Stream

A third area where SQL Server 2008 makes use of compression is in database mirroring. When mirroring between two SQL Server 2008 instances, SQL Server will compress the transaction log records that are streamed from the principal server to the mirror server. This compression is automatic and SQL Server 2008 Books Online states that it will work out of the box and not require the Enterprise Edition.

The Microsoft SQL Server Customer Advisor Team (CAT team) ran some tests on the benefits of compressing log records and published their results at

http://sqlcat.com/technicalnotes/archive/2007/09/17/database-mirroring-log-compression-in-sql-server-2008-improves-throughput.aspx.

They found that the benefit was greatest on networks that had lower bandwidths. Again, they also report an increase in CPU usage cost due to the compression, but on balance the gain in transfer speed would seem to outweigh that cost. (Incidentally, their report notes the use of the SQL Server 2008 trace flag 1462, which disables database mirroring log compression.)

SQL Server 2008 Compression

Data compression is likely going to be the heavy hitter among the compression features because it has the potential to improve performance on queries that access large data warehouse fact tables. Even OLTP systems with very large tables will likely benefit, provided the tables are read-mostly. In fact, some people informally have called data compression the “killer feature” of SQL Server 2008.

But all three forms of compression are likely to be beneficial, and a welcome addition to the performance tuning array of tools. The benefit for compression is faster queries and the ability to scale up to even larger data sets, while the cost is an increase in CPU usage. You can see where this is going: it will only reinforce the use of quad-core CPU servers that have recently become widely available.

Ron Talmage

&

By: Ron Talmage

Ron Talmage is a mentor and co-founder of Solid Quality Mentors. He is a SQL Server MVP, a PASS Regional Mentor, and current president of the Pacific Northwest SQL Server Users Group. He has been writing white papers and articles on SQL Server since way back when.



How would you rate the quality of this article?
1 2 3 4 5
Poor      Outstanding

Tell us why you rated the content this way. (optional)

Average rating:
3.5 out of 5

28 people have rated this article.

TOWER 48

      LearnNow

 

SSWUG