The Proper use of File Groups and Files

If you are creating a database, and it is going to be large, you may want to consider having more file groups than just PRIMARY. File groups enable the DBA to segregate objects into separate files. For example, in our shop, our largest database has the PRIMARY file group, which holds tables and clustered indexes.  Non-clustered indexes go into the INDEX file group, and WORM tables that contain BLOB data that isn’t frequently read go into our ARCHIVE file group.

File groups make it extremely easy to segregate a database into files that can be assigned to drives or LUNs that contain a certain speed of disk.  For example, one can have PRIMARY and INDEX on SSD, while the ARCHIVE group is on SATA. The DBA can gain more granular performance control by using file groups and LUNs together to give or take away speed. The DBA can control cost by using a file group to route infrequently-used data to cheaper (and usually slower) storage.

The Difference Between File Groups and Files

File Groups are assigned to objects (indexes, tables).  Each File group represents a span of data pages.  These pages are not, however, constrained to be on one file.  The DBA can assign many files to a single file group.

Why?  Again, performance control.  Also, space management objectives can be met by using multiple files per file group.

Assume that you have just the PRIMARY file group, and there is a great deal of update traffic.  If you have one file in that group, those pages must go solely to that one file.  Now, imagine that you have a second file, and it’s sized the same as the first file.  SQL server will distribute the pages in the file group fairly evenly between the files in the group.  If the drive / LUN / mount point is physically separated, you now have two paths to obtain or update data, usually resulting in a performance boost. Obviously, having the data on the same drive / mount point / LUN buys you no performance.

Using multiple files in multiple file groups multiplies the potential performance gain, provided that you’re not piling the files on the same drive or mount point or LUN.

However, you can also manage space.  As the drive holding the SQL data grows closer to full, you may need to cap one file and open another file to handle new data pages. This happened to me as the data in our new app grew from 250GB to nearly 2TB, at a rate nearly four times what was predicted before the application went live.  Certain Data files were set to grow no more as the mount point holding the file approached 100% full; a new mount point was added and a new file opened.  With mount points, this can be done with minimal or even zero downtime depending on configuration (results may vary with failover clustering – this may require bringing SQL Server down and back up after adding the drive or mount point as a resource).

To conclude, managing file groups and the files within the group is a powerful tool in the DBA’s performance arsenal.  Learn to shoot straight with it.

Leave a Reply

Your email address will not be published. Required fields are marked *