How To Collect SQL Server Metrics: Database File Growth

How To Collect SQL Server Metrics: Database File Growth

Introduction

Go grab yourself a hot cup of coffee, it’s Monday morning and we are going to collect some data file sizes so that we can make smart/er decisions. I am going to show you how to grab data from a SQL Server catalog view and have something up and running in about 5-10 minutes. This is what I will cover in this post:

  1. Building a script to capture system file size metrics
  2. Scheduling the data capture
  3. Querying said data

By the way: Please excuse the lack of tab-indenting on my scripts below. They are ugly… I know. I’ve been wrestling with my plugin. The code is also available on my GitHub account and you are welcome to use it.

[Update 2018-12-20] I have replaced the code in this post with GitHub Gists and added custom styling with help from Customizing GitHub Gists. Let me know what you think in the comments.


Capture the Current File Size

The first thing we’ll need to do is capture the file metrics. I’ve prepared the script below to capture the current file size, space used, and free space of all the files in a single database.

SELECT
@@SERVERNAME AS ServerName,
DB_ID() AS DatabaseID,
DB_NAME(DB_ID()) AS DatabaseName,
database_files.type_desc AS FileType,
database_files.name AS LogicalFileName,
database_files.physical_name AS PhysicalFileName,
database_files.size * 8 AS FileSize, -- Current size of file in 8KB pages
CAST(database_files.size / 128.0 AS DECIMAL(14, 4)) AS FileSizeMB, -- Same as (size * 8 / 1024)
CAST(FILEPROPERTY(database_files.name, 'SpaceUsed') / 128.0 AS DECIMAL(14, 4)) AS SpaceUsedMB,
CAST((database_files.size / 128.0) - (FILEPROPERTY(database_files.name, 'SpaceUsed') / 128.0) AS DECIMAL(14, 4)) AS FreeSpaceMB,
GETDATE() AS PollDate
FROM sys.database_files
WHERE database_files.type_desc IN ('ROWS', 'LOG');
view raw select.sql hosted with ❤ by GitHub

The sys.database_files catalog view has a lot of useful information, including system configuration settings such as growth, max_size, and database state. If you are looking to customize this script, and I encourage you to do so, just note that the size column can be a bit confusing. The docs list the description as “Current size of the file, in 8-KB pages.” All this means is that this field is really “the number of pages the database takes up”. In other words, multiply it by 8 to get the KB size.

More Info -> sys.database_files, FILEPROPERTY()

Executing this script as is, will generate the file stats of the database in your current database context. The table below will be a great place to store it.

CREATE TABLE dbo.DatabaseFileSize
(
ID INT NOT NULL IDENTITY(1,1),
ServerName NVARCHAR(128) NOT NULL,
DatabaseID SMALLINT NOT NULL,
DatabaseName NVARCHAR(128) NOT NULL,
FileType NVARCHAR(60) NOT NULL,
LogicalFileName SYSNAME NOT NULL,
PhisicalFileName NVARCHAR(260) NOT NULL,
FileSizeKB INT NOT NULL,
FileSizeMB DECIMAL(14,4) NOT NULL,
SpaceUsedMB DECIMAL(14,4) NOT NULL,
FreeSpaceMB DECIMAL(14,4) NOT NULL,
PollDate DATETIME NOT NULL,
PRIMARY KEY CLUSTERED (ID)
);
GO
view raw create_table.sql hosted with ❤ by GitHub

Note: If you were wondering how I have carefully chosen my data types, I found them in Microsoft’s SQL Server documentation. You can too! If you are trying to understand more about some of the useful data available “under the hood”, spend some time in those docs.


Cursors are Great

Sometimes, right? A cursor isn’t always the best choice, however, in this case it will do a fine job for iterating over all of the databases in an instance to capture the file size metrics. It also happens to be one of the few appropriate uses for one. I’ve taken the file metrics script from earlier and wrapped it in a cursor with some dynamic SQL.

/* Cursor for iterating over all the databases and capturing the file size data */
-- Create List of Target Databases
DECLARE @DatabaseList TABLE
(
DatabaseName VARCHAR(50)
);
INSERT INTO @DatabaseList (DatabaseName)
SELECT [name]
FROM sys.databases
-- Initialize Variables
DECLARE @DatabaseName VARCHAR(50);
DECLARE @stmt NVARCHAR(MAX);
-- Initialize Cursor
DECLARE db_cursor CURSOR FOR
SELECT DatabaseName
FROM @DatabaseList
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @DatabaseName
-- Begin Loop
WHILE @@FETCH_STATUS = 0
BEGIN
--Build and Execute Statement
SET @stmt =
'
USE ' + @DatabaseName + '
INSERT INTO DemoDB.dbo.DatabaseFileSize
(
ServerName,
DatabaseID,
DatabaseName,
FileType,
LogicalFileName,
PhisicalFileName,
FileSizeKB,
FileSizeMB,
SpaceUsedMB,
FreeSpaceMB,
PollDate
)
SELECT
@@SERVERNAME AS ServerName,
DB_ID() AS DatabaseID,
DB_NAME(DB_ID()) AS DatabaseName,
database_files.type_desc AS FileType,
database_files.name AS LogicalFileName,
database_files.physical_name AS PhysicalFileName,
database_files.size * 8 AS FileSize, -- Current size of file in 8KB pages
CAST(database_files.size / 128.0 AS DECIMAL(14, 4)) AS FileSizeMB, -- Same as (size * 8 / 1024)
CAST(FILEPROPERTY(database_files.name, ''SpaceUsed'') / 128.0 AS DECIMAL(14, 4)) AS SpaceUsedMB,
CAST((database_files.size / 128.0) - (FILEPROPERTY(database_files.name, ''SpaceUsed'') / 128.0) AS DECIMAL(14, 4)) AS FreeSpaceMB,
GETDATE() AS PollDate
FROM sys.database_files
WHERE database_files.type_desc IN (''ROWS'', ''LOG'')
';
EXECUTE sp_executesql @stmt
FETCH NEXT FROM db_cursor INTO @DatabaseName
END
CLOSE db_cursor
DEALLOCATE db_cursor
view raw cursor.sql hosted with ❤ by GitHub

Next, drop this script in an Agent Job and set the schedule to execute once a day to start collecting the data. Although this is a minimal set of data, it will grow along with the other data that you’ve been collecting. At some point you will want to purge some of this. So set up a job that clears out this data and configure it with a retention period that you have spent many hours on the whiteboard contemplating.


Now What?… Query it

Stopping here would be a totally acceptable approach. There are several ways using this data can give insight into the growth of your database files. See the following query for an example of using the LAG window function with some partitioning to give the data some extra context.

/* Example of a useful query with a measure for data file growth over time */
SELECT f.DatabaseName, f.LogicalFileName, f.FileType, f.FileSizeMB, f.SpaceUsedMB, f.FreeSpaceMB, f.PollDate,
(f.SpaceUsedMB) - LAG(f.SpaceUsedMB, 1) OVER(PARTITION BY f.DatabaseName, f.LogicalFileName ORDER by f.PollDate) AS Growth
FROM dbo.DatabaseFileSize f
view raw query_example_01.sql hosted with ❤ by GitHub

LAG was one of my favorite query additions in SQL Server 2012. It replaces self-joining queries, assigning row numbers, and lots of sorting. In the query above, I’ve used LAG to calculate the delta in SpaceUsedMB from the previous row; the previous row being determined by PARTITION BY and ORDER BY. Armed with this data, you could add additional measures and extract some useful analytics:

  • Growth patterns over the course of a year
  • Capacity planning for your storage infrastructure
  • The configuration of the proper file growth setting

What’s Next

A couple of things. I’m a fan of visuals, so I would like to share a method of building a custom PowerBI Dashboard. This is a preview of what I’ve been working on for visualizing this data.

Additionally, for any architects out there, I am working on a data model to put this in the proper format. Before purging your data consider sending it to your Data Warehouse and if space is an issue you could always just keep the monthly aggregates.

Hopefully this is useful to someone out there looking to make more informed decisions about their data file sizes and management. Thanks for dropping by.

Leave a Reply

Close Menu