How does the modern DBA deal with 20-1,000 SQL Servers? There are a few choices.
The first thing a DBA could do is configure a set of alerts to catch error situations. While useful, alerts are reactive in nature and the DBA finds out there is a problem after it happens.
The second thing a DBA could do is to configure a set of scheduled jobs that gather and report on metrics that might predict a failure. One example is a script set that measures database space available and or free disk space on the drives/mount points, to help catch things like dwindling free space before it hits you at 3:00am on a Sunday. Again this is an excellent thing to do but implementation may vary by machine and it complicates deployment.
The third thing a DBA could do is to acquire a good monitoring tool. There are several: SQL Sentry, Idera SQL Diagnostic Manager, Redgate SQL Monitor, and Dell (formerly Quest) Spotlight on SQL Server Enterprise.
A DBA who has to monitor the health of more than five SQL servers needs the “force multiplier” effect of a good monitoring tool. Scripts and alerts won’t cut it. My environment uses Spotlight, and has previously dallied with Idera and SQL Sentry. The tool you choose should be determined by your comfort level with said tool, and how effectively it helps you to manage your environment.
What do packaged monitoring tools offer you?
1 . Solid monitoring
Your monitoring product should monitor each SQL instance for performance, blocking, PLE, cache hit rates and more. It should also monitor the health of services such as replication and or Availability Groups. It should be able to gather windows metrics, and if the server is a VM, it should gather at least some basic metrics from the underlying hypervisor.
2. Solid Alerting
A good monitoring tool should enable the DBA to configure alerts that page and or email. Alert thresholds need to be set to a default but customizable for a given machine or SQL Server instance. Alerts should be able to drive actions. For example a replication alert might be set up to stop a user process in the event of excess latency.
3. Solid high-level reporting.
Without making myself into a Spotlight evangelist, this is one area where the tool truly excels. It has a heat map with every instance in several colors, from green for all-OK to red for an immediate-attention item. Moreover, multiple alerts increase the size of the non-green instance icon to draw attention. Even better, there is an iPhone app that we can reference anywhere to temperature-check the environment.
4. Easy Provisioning
One should be able to commence monitoring with little to no effort, and preferably without installing client software or scripts.
5. A Solid Repository
This is important. You should be able to store metrics over time for trend reporting, and to access those metrics for custom reporting as needed. Such power, used wisely, can stabilize environments and make production DBA proactive and not reactive.
Some companies may view monitoring tools as a waste of money. I’m blessed to work someplace where management understands the need for the tools, and the end result is a stable environment that is easier for me to manage and is much more effective at meeting corporate needs.
I suggest that you obtain evaluation copies of the popular tools and see for yourself what they buy you in terms of proactive management of your SQL Server environment.