What Changed?
The Fast Track to the Answers When Troubleshooting Microsoft SQL Server Reliability and Performance Problems
For a Microsoft SQL Server 2000, 2005 or 2008 database engine instance, the configuration metadata can be taken to include the collection of knobs, switches, settings, Data Definition Language (DDL) scripts necessary to exactly recreate that SQL instance. When unexpected database problems occur, regardless if on a production server or a unit test desktop computer, quick and easy access to the information about database configuration changes coincidental to the problem are invaluable to the troubleshooting or forward recovery effort. In fact, just having the previous configuration readily available to be able to rollback to can be the difference between meeting service level expectations or not. In other circumstances, the ability to identify configuration changes over time is useful in understanding what went wrong and, more importantly, in preventing further episodes of the same problems. For development shops and some database server farms, the ability to compare configurations between SQL Instances is indispensible to understanding - or better yet preventing - a change related system degradation or outage.
Almost every DBA has an arsenal of tools that help in some way with this all to common scenario. Unfortunately, may of the tools are expensive for the server. Many require that the DBA be connected to the faltering system and throwing some pretty heavy duty queries at the system. Allowing ad hoc research and comparison activities on live SQL Instances can not only impact performance but may very well make matters worse. There is also an increased likelihood that the unexpected - and it will almost never be a desirable side effect - will happen. For example, if a server is running stressed, say at 80%+ CPU for extended periods, the likelihood that any administrative or analytical query activity can impact performance is greater than if the CPU hovers around 20% or less most of time. Even if the problem is not specifically performance related, the research can cause blocking and unintended failure of application operation at the database. And if the database instance is
The DBAs troubleshooting arsenal is tested to some degree. The dedicate admin connection is also helpful. Analysts and developers on the other hand tend to follow a linear trouble shooting path based more on thier knowledge of the application than the implications of their actions on the SQL Server under inspection. For this reason, especially if database reliability and performance are important for the organization, internal analysis should not require direct analyst or developer access to live systems. Instead, monitoring for change that is presented as standardized configuration and performance metadata, sharing of knowledge, and asynchronous dissemination of information across the team supporting the SQL Server application is preferred to give everyone the information they need to properly support the database and at the same time protect the live system from unnecessary ad hoc access.
Typically, the development team will have a master source reference stored in a source control database such as CVS, Sourcesafe or Surround. This master copy goes a long way toward providing much of the information a developer needs to recreate a problem in a one-off environment. However, that can be a time consuming operation. Futher, the accuracy of the "master" source for SQL Server database object scripts is not guaranteed to be accurate. Many developers are painfuly aware of this disconnet between the master source and the database that they prefer to get scripts directly off the live system when starting a maintenance development iteration rather than rely on what is in the master souce. Why SQL Server has no integrated tools to correct this fundamental flaw remains unexplain. The vendor remains mute on the topic, preferring to pretend it is a non-issue because it is a difficult problem.
From our perspective, all that matters is that the configuration on a live system or configuration extracted by directly monitoring for change in live systems is not to be confused with the master source of the code base or a suitable source control system in a development shop. Instead, the output of the change monitoring is ought to be used to regularly verify and invariably re-sync the so called master source with what actually exists on the live database server.
Isolating "Database Problems"
It is always interesting what an business folks will lable as database problem. We will save that topic for another day. Here lets assume that we have already worked through the preliminary steps to know that a problem is manifesting at at the database.
One of the more common scenarios for a run time system glitch is during – or more classically, shortly after - a planned change to the system. That is because in most cases where changes are applied to a running live system the application code is changed after the database changes are made. Especially in those shops where too many people have the ability to make changes to the production environment, the need to be able to identify who-did-what-when is often the shortest path to resolving the problem, even if only to eliminate the most likely causes first. Unfortunately, the ability to satisfy this need is not as frequently found as it is needed. Keys to quick problem identification and to future avoidance tactics include the ability to quickly identify:
· what changed
· when the change occurred
· the state before the change
· the actions, commands and TSQL statement used to effect the changewho made the change
A good monitoring strategy will assure that the necessary ongoing data colection happens and that the data remains available to the designated audience for an appropriate time. Also, having automation in place to capture the information necessary to revert to a working state is important for quick and easy troubleshooting and problem resolution - especially so in the absence of release based change protocols that mandate tested rollback processes.
The default trace will contain some of this information. The problems with using the default trace in triage scenarios are:
the interseting changes may not have been written to the default trace
it can be time consuming to find what you are looking for in the default trace.
the trace may not be accessible
the database change may have occured days, weeks or months before the application code that calls it is introduces into an application layer. The problem will not manifest until the last change in entire system is made. This can make finding the "database problem" in the default trace a little like looking for a needle in the wrong haystack.
Production Environments
Every Database Administrator (DBA) spends more time fighting fires in the production environment than is comfortable. It comes with the territory. Much of the time and energy invested in defining and building monitoring systems for the run time environment are in defense of previous encounters with known digital arsonists. Unfortunately, and in spite of the evidence pointing overwhelmingly toward the need to do so, the ability to identify what has changed is seldom adequately addressed in the IT infrastructure. There are several reasons for this. Foremost, it is a hard problem. The difficulty of the task, the resource requirements to create and manage the monitors, and in some cases convincing the organization of the value of even the most obvious and urgent issues caused by change is difficult because the business has some expectation that things over in IT break when change happens. There is little doubt that every IT organization using SQL Server, working within the bounds of past experience to meet the demand of current requirements, cannot meet the challenge and reap the reliability and maintainability benefits of monitoring SQL Server configuration change.
Pre-Production Test Environments
Being able to quickly confirm that only expected differences exist between a test environment and a run time environment is often essential for effective testing and is helpful to confirm the database change list exposed in the testing. In scenarios where the introduction of change is meticulously serialized and therefore more easily managed, as in an integrative testing or production environment, comparison capabilities between Instances are very useful. These same software lifecycle environments often present security barriers that limit such comparisons.
Development Environments
In almost all development scenarios change to the environment is chaotic. Having an accurate record of configuration and DDL settings on development SQL Servers is essential to
· assure that the environment remains as similar to live environments as practical
· identify and verify change requirements from development upstream in the software lifecycle
· reduce the possibility for lost work
Being able to confirm any SQL Server Instance’s configuration against another SQL Instance and a master catalog is essential.
Development environments often require periodic data refreshing either directly for or as an accurate simulation of current production data. A typical development data refresh scenario in insipient applications is to restore the run time or production database, apply pending changes that are already implemented in production and move on. Usually it works OK, occasionally there are problems. Being able to quickly identify the differences between the run time environment and the development environment is obviously helpful in such efforts. In addition, the exercise will usually help to eliminate the use of production data in development. This could even make the security officer smile.
Unit Test Environments
One especially undesirable occurrence for a developer is loss of work. Hardware fails or for other reasons computers do stop responding and data, even metadata is lost. Computers used in development environments are notorious for both. When it happens, a day’s work can easily be lost even in a shop with good source control practices. Capturing all the versions made during the work day automatically and asynchronously to a local archive – in a way that will not impact the developers work, but as the change happens can prevent lost work.
An ideal Unit Test scenario would preserve the changes throughout the day. The developer would need to take a small amount of time to occasionally purge the automatically created history instead of the much more painful alternative of having to recreate a previous version for a table or stored procedure. The previous work would remain available for review and reuse. In this local system unit test scenario there would be little need to save the metadata history for long periods of time. Many shops expect at least daily check-in to source control for all work in progress. Once the development cycle is completed, the local data store can be purged if desired or necessary. The SQL activity on the unit test system is often such that a complete history of the TSQL changes on the SQL Instance would not require massive amounts of storage. The storage requirement would typically be accommodated within the disk storage allocated to any properly sized SQL Server Instance.
Post –Production Environments
Data quality is of utmost importance for Data Warehousing, BI and data mining data stores. The metadata and configuration for post production SQL Servers must not vary for this reason alone. The requirements to monitor metadata and configuration change are like those for live systems. In some shops, post-production SQL Server databases are also likely to use a separate and different software lifecycle development environment than the production development team. Having an ability to easily compare metadata and configuration between the different development environments would be a front line defense against the tendency for the environments to drift out of sync. For the BI folks, there are few things more disruptive than unexpected source system changes that break the ETL process.
Centralized Administration
Making sure that everyone is working from a common base is essential to effective database administration and also enables the highest quality development and the least disruptive change process possible for the organization. The common base requires an ability to apply role based security.
SQL Server DDL Events
Fortunately, beginning with Microsoft’s SQL Server 2005 the ability to identify change includes some new helpers. One small but welcome example is the finally renewed fervor to populate the forever senseless refdate column in dbo.sysobjects (albeit in a back door fashion by morphing it from a view of the modify_date column in sys.objects). Prior to SQL Server 2005 ALTER statements (ALTER TABLE, ALTER PROCEDURE, etc) did not update the catalog even with the date. There was always this totally bogus column in dbo.sysobjects pretending to pronounce the date and time the object was last changed. Now the column is correct. There is also now some metadata change information in the default trace and improved trace and event audit capabilities and even enhancements to the built-in audit features including Login Auditing, Common Criteria Compliance (after SP2) and C2 auditing. Other easily accessible examples are DDL triggers, DDL event notifications and Trace event notifications. These event based mechanisms provide a means to capture a fairly complete picture of metadata changes as they happen.
All that is left for the DBA is to develop a robust, reliable and low maintenance way to store and use this wealth of new data. It sounds simple: slap a DDL trigger on a server that writes change events to a table. The DBA need only keep a query handy that will show the changes for the last hour or day and Viola! A quick tool to see what changed is at hand. I agree. If you are running SQL Server 2005 there is no excuse not to have at least this rudimentary protection. Having the trigger is better than not having the trigger, unfortunately there are a few of the limitations to this scenario:
Finding Time and budget to develop and test
May miss changes that don’t raise an event
Server settings or database master key changes
Changes that don’t raise a DDL trigger event such as permissions
Multiple SQL Server instances adding to the complexity and time necessary for looking in all place
Resource requirements and management such as storage and security
Time requirements to analyse and build the reporting necessary for optimal use of the data collected
Rollback/Recovery support to identify not only what changed but also what it looked like before the change
Undoubtedly you can think of more.
Obviously, SQL Server’s new found ability to announce change comes at some cost. The biggest cost is the additional demands on the already over extended time of the DBA, operations and development staff to learn about, develop, test and implement these new capabilities . There is also the reality that others may not know how to use change identification tools if the DBA is not present. And there is some uncertainty that all changes are identified or that the approach taken may adversely affect the application that the monitoring is intended to protect.
SQLClue to the Rescue
After long deliberations and years of hands on experience with the problem of identifying SQL Server metadata and configuration changes I wrote Bill Wunder’s DDL Archive Utility to aid administrators of SQL Server 2007 and 2000 installations in monitoring those changes.
SQLClue Supercedes the DDL Archive Utility, and then some...
Many SQL Server DBAs have downloaded this DDL Archive Utility from SQL Server Central (http://www.SQLServerCentral.com), The SQL Server World Wide User Group (http://www.sswug.org ) and from my web pages (http://www.nyx.net/~bwunder) to automatically capture a daily history of changes to the stored procedures, tables and other database objects found on targeted SQL Server 6.5, SQL Server 7, SQL Server 2000 and even SQL Server 2005 and SQL Server 2008 SQL Server instances. The wide and ongoing interest in that tool speaks to its necessity in effective database administration.
Over time, in lock step with the improved capabilities of the Visual Studio IDE, the DDL Archive utility evolved from a COM/API based command line batch utility into a robust and more configurable .NET application with console capabilities. That DDL Archive Utility stores its change history in Visual SourceSafe.
While it has proven to be a useful and reliable tool, there are limitations to the DDL Archive Utilities value beginning with SQL Server 2005. The Archive utility is based on Microsoft’s Database Management Objects (DMO) database administrative programming interface. DMO is a COM technology that does not work with configuration new with SQL Server 2005 such as Service Broker, the encryption hierarchy, http endpoints, the xml data type and of course .NET assemblies. We are forced to use SQL Management Objects (SMO) - Microsoft’s replacement for DMO - to gain adequate access to the metadata that is new with SQL Server 2005. SMO works well with SQL Server 2005, fully exposes the SQL Server 2000 database engine metadata, and is also the administrative programming interface for SQL Server 2008.
To begin with, let’s face facts. Some of the permissions and access rights DBAs need to monitor are risky from the perspective of today’s more security conscious systems. And some of the data exposed during such monitoring is sensitive and valuable property of the organization.
The DDL Archive Utility installation has several dependencies including Visual SourceSafe - via a COM Interop, SQLXML, a relatively large file system staging area, and a licensed instance of SQL Server or some other scheduling mechanism. The DDL Archive Utility is also by necessity a somewhat resource intensive and therefore inefficient process that scripts every object on the database instance and then compares each to the latest version in the SourceSafe repository each time the schedule archive is executed. Since most database objects don’t change this requires a relatively large amount of CPU cycles and network bandwidth to identify the few changes that may have happened in the last day. The DDL Archive Utility employs Distributed Management Objects (DMO) scripting technology to produce the files that are compared to the SourceSafe repository. SQLDMO is not aware of any database features new in SQL Server 2005. Consequently it cannot completely archive a SQL Server 2005 instance.
After much more deliberation and many more years of experience with the needs and capabilities for monitoring change, I have created an upgrade path for all those folks using the DDL Archive Utility that wish to enjoy the same level of safety and security with SQL Server 2005 and SQL Server 2008 that they have realized with the Archive Utility on previous SQL Server releases.
The tool is adaptive, offering a suite of tabbed functions that can be enabled and disabled as desired. Each function is defaulted to a recommended configuration yet each component is configurable to match different requirements. Each component adapts to the environment as needed . Regardless of the SQLClue components used, monitoring with SQLClue adds only a very small administrative time cost and system performance impact.
SQLClue improves upon Bill Wunder’s DDL Archive Utility
To install and use SQLClue the prerequisite are the .NET Framework 3.5 and SQL Server 2005 or SQL Server 2008 running on the host server. SQLClue uses this host SQL Server as the SQL Configuration Repository for the archive of changes instead of Visual SourceSafe. The SQL Server can be any edition including SQL Server Express Edition. This means there need be no implicit additional licensing costs to use SQLClue. SQLClue relies on SQL Server Management Objects (SMO) to document the metadata changes. The changes documented provide an archive that is unprecedented in its completeness. For scriptable objects, a script is stored in the repository. For metadata that is not scriptable, the SMO properties are extracted from the server and stored as name-value property listing. Asynchronous Service Broker based Event Notifications are used to capture the vast majority of likely changes as they occur. Then at a configurable schedule interval, the events are processed into the repository. Those few metadata types that do not generate an event when changed are still routinely documented and compared one at a time to the latest version in the repository. However, the overall CPU and network performance load is a small fraction of that required y the DDL Archive Utility because the object types that change the most do generate events. At any time it is possible to peek into the local Event Notification Service Broker Queue to see event triggering changes since the last archive to the repository. In fact, at the discretion of the user, the Service Broker Queues can be con
SQLClue includes a richly user configurable comparison tool with Regular Expression support for object name matching and documentation line level filtering. The possibilities are nearly unlimited.
SQLClue expands Bill Wunder’s DDL Archive Utility
The Archive Utility provided a tool to Script database object DDL and exploits the add, check-in , version compression algorithm and security model of Visual Source Safe to maintain a versioned history of DMO database object scripts. SQLClue includes the SQL Configuration Repository that replaces this functionality and expands it to include CLR, Service Broker and other SQL Server 2005 database objects. The SQL Configuration Repository also monitors non-scriptable database metadata for change. In addition, SQLClue includes a Query Baseline and Analysis tool that Measures query performance to identify performance change over time by configuration detail. The tool keeps an eye on current query execution patterns on monitored server as well. That’s the Analysis part: the small random samples captured by SQLClue are made available for further consideration to DBA and developer alike using an infinite click through reporting palate safely isolated from the runtime system and sensitive runtime data. SQLClue goes even further and provides a framework for creating a Database Monitoring Strategy and a SQL Runbook of peer reviewed administration and operation protocols to facilitate and manage the sharing of database related knowledge and information and assure business continuity at the data layer.
In summary, there really is no comparison between SQLClue and Bill Wunder’s DDL Archive Utility. While SQLClue does everything that the Archive Utility could do, SQLClue provides more features and functions for the newer generations of SQL Server. The Archive Utility had one function and many external dependencies. SQLClue has several functions – part of one function replacing what the DDL Archive Utility – and no external application dependencies other than SQL Server.
If you would like to try SQLClue, I invite you to download the freely licensed SQLClue Workstation. Included in the download is a trial version of the Configuration Archive and the other SQLClue Server components.
This page was last modified on Thursday, March 26, 2009 10:36:47 AM