The comments that followed brought up some good points about scenarios where you might choose not to back up a very large database. I performed operations for a BI application at Microsoft whose database was 7 TB. When I joined the operations team, I was told that they didn’t know how to back up a database that large, so they just didn’t do it. I told them they had to start, and one of the first things I helped them do was get multiple dedicated LUNs set up so they could speed up the backup by striping it across multiple drives (see Day 25 of 31 Days of Disaster Recovery: Improving Performance of Backups and Restores) so that the total backup time fit within a reasonable window. Unfortunately, their database wasn’t designed with multiple filegroups, so our options were limited.
The source system only kept inactive cases (a particular set of data that is no longer in use) for 90 days. This meant that they could reload lost data for up to 90 days, and the system allowed for a long enough downtime (RTO — see Day 7 of 31 Days of Disaster Recovery: Writing SLAs for Disaster Recovery) that they could reload a large amount of data. We opted for a schedule of backing up the whole database once per month AND immediately after any schema upgrade.
Because we opted to perform only full backups and were not doing log backups, we needed to ensure that the schema was preserved, or at least that we could identify scripts to re-apply schema changes to the database. We set this up on SQL Server 2005; SQL Server 2008 hadn’t been released yet. Though not an obvious feature, SQL Server 2008 introduced a new technology that can be used to back up and restore just the schema of a database. More on that in a minute.
One argument I’ve heard for not doing backups of very large BI systems is that the data can be rebuilt from the source systems. My first question is always, what about the schema of the database? Where will you get that from? The easy answer … from the dacpac.
What’s a Dacpac?
A dacpac is a “data-tier application package”. Yes, I know the acronym makes no sense. True story (I $#!+ you not): they chose the acronym dac for data-tier application because “the acronym dta was already taken”. Apparently, the great minds who chose dac had never heard of the dedicated admin connection (the real dac) added in SQL Server 2005. Dac is generally thought of as a way to simplify deployment of database applications by encasing the application in a package (dacpac) that can dynamically upgrade an existing system to match the new schema. But there’s another use for dacpacs that maybe you haven’t considered: schema-only backups and restores.
Schema-only Backups
If you right-click on a database and highlight Tasks in SQL Server 2008 and newer, you will see several options for the dac. Most of those options don’t concern us in this post; we’re only interested in “Extract Data-tier Application”. This task exports the data-tier application with no data. It’s basically just the database as an empty shell. Extracting the dacpac is very simple, and there aren’t really any options for it other than where to save it. Alternatively, there is an option for exporting a data-tier application, which has lots of options, allows you to export only part of a database, and includes the data. That’s a discussion for another day.
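For those who prefer scripting the extraction, newer DACFx tooling (3.0 and later, not the SQL Server 2008-era wizard) ships a command-line tool, SqlPackage.exe, whose Extract action does the same thing. The sketch below just builds and prints the command as a dry run so you can see its shape; the server, database, and file names are placeholders, not values from this post.

```shell
# Dry-run sketch: assemble the SqlPackage.exe command that extracts a
# schema-only dacpac from a live database. All names below are placeholders.
SERVER='MYSERVER\SQL2012'
DATABASE='BigWarehouse'
DACPAC='D:\Backups\BigWarehouse.dacpac'

EXTRACT_CMD="SqlPackage.exe /Action:Extract /SourceServerName:$SERVER /SourceDatabaseName:$DATABASE /TargetFile:$DACPAC"

# We only print the command here; on a server with DACFx installed you
# would run it directly.
echo "$EXTRACT_CMD"
```

On a machine with DACFx installed, running the printed command produces the .dacpac file at the target path.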
Restoring a Dacpac
So let’s suppose you had a very large database and planned to re-import the data from the source systems if the database were lost. Now, let’s assume the database is completely lost, and you want to start over with an empty version of the database deployed from the dacpac. Restoring the dacpac is actually very easy as well. The steps to restore the dacpac are:
- Right-click on the Databases node
- Click on Deploy Data-tier Application… (not import)
- Browse for and select the dacpac you want to deploy
- Click Next
- You have to delete the pre-existing version of the database, if there is one, or give the new database a different name, before you can deploy it
- Click Next
- View the Summary and click Next to start the deployment
- Click Finish if the deployment completed successfully.
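The wizard steps above also have a command-line counterpart in newer DACFx tooling: SqlPackage.exe’s Publish action deploys a dacpac to a target server. As before, this sketch only prints the command as a dry run, and all names are placeholders.

```shell
# Dry-run sketch: assemble the SqlPackage.exe command that deploys a dacpac
# to a target server as a new (empty) database. Placeholder names throughout.
DACPAC='D:\Backups\BigWarehouse.dacpac'
SERVER='MYSERVER\SQL2012'
DATABASE='BigWarehouse'

PUBLISH_CMD="SqlPackage.exe /Action:Publish /SourceFile:$DACPAC /TargetServerName:$SERVER /TargetDatabaseName:$DATABASE"

# Print only; run it for real on a server with DACFx installed.
echo "$PUBLISH_CMD"
```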
If you completed the above steps with no errors, you now have an empty copy of your database. Some things will be different: the database file settings will be missing, and extended properties for the database are not included. You can also use the dacpac to “Upgrade the Data-tier Application” and generate a change script if the database still exists. For example, if you have an older full backup from before a schema change was made, you can upgrade the database from the dacpac, and it will generate a change script to make the schema match the dacpac.
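The change-script scenario can be scripted as well: SqlPackage.exe’s Script action generates the upgrade T-SQL without applying it, which is handy for review before running it against the restored database. Again a dry-run sketch with placeholder names.

```shell
# Dry-run sketch: assemble the SqlPackage.exe command that generates (but does
# not apply) the T-SQL change script to bring a database in line with a dacpac.
DACPAC='D:\Backups\BigWarehouse.dacpac'
SERVER='MYSERVER\SQL2012'
DATABASE='BigWarehouse'
SCRIPT_OUT='D:\Backups\upgrade.sql'

SCRIPT_CMD="SqlPackage.exe /Action:Script /SourceFile:$DACPAC /TargetServerName:$SERVER /TargetDatabaseName:$DATABASE /OutputPath:$SCRIPT_OUT"

# Print only; the real run writes the change script to $SCRIPT_OUT.
echo "$SCRIPT_CMD"
```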
In terms of automation, it would be simple to programmatically export dacpacs for regular backups of the schema using PowerShell. I won’t go into the PowerShell commands here. You can check out this post on the official Data Platform Blog about the PowerShell commands for dac: DAC Powershell Samples.
Long story short, a dacpac may be a much better alternative to generating very large backups of systems where you do not need to preserve the data but want to protect the schema. It’s not exactly the same as a backup with no data included, but it is a great way to easily preserve the schema of a very large database.
Wayne
Very interesting Robert. Finally, a use for a dacpac that a DBA can get behind! 🙂
SQLSoldier
Thanks Wayne. My thoughts exactly when I first realized this.
dan holmes
A followup on the same topic:
http://dnhlmssql.blogspot.com/2014/01/headed-down-trail-with-my-dacpacthen-i.html
SQLSoldier
Great article, Dan. I had run into some of those problems with one database and had planned to follow up later.
Bob Beauchemin
Hi Robert,
Be careful with the DAC PowerShell Samples. They were written over DACFx 2.0 which doesn’t support all the database objects and has been superseded by DACFx 3.0 which does support all objects but has a completely different programming model.
Reference: http://www.sqlskills.com/blogs/bobb/dacfx-3-0-the-new-programming-api/
Cheers, Bob
SQLSoldier
Thank you, Bob. That’s good to know. We do have some PowerShell jobs running in production to export dacpacs weekly for our dev teams to use for script generation and such. I’ll verify which version we’re using and plan to update it if using the older version.
(SFTW) SQL Server Links 24/01/14 • John Sansom
[…] Schema-only Backups and Restores – Robert L. Davis(Blog|Twitter) […]
Dale
Robert, this is exactly how I have been using DACPACs for schema backups. The only part that’s been hard has been schema compare between two DACPACs, but I’m automating it all into our version control system, so that helps a lot.
SQLSoldier
We have them automated, but only to the point that we export them to a share for the developers to use.
My links of the week – January 26, 2014 | R4
[…] L. Soldier’s Schema-only Backups and Restores shows how a backup and restore of a database schema can be done using a […]
Mark Stacey
Some interesting comments.
I have indeed worked on DW projects with no backups, where we kept all the files we loaded from the source system available so that we can do a reload. Where did we get the schema? From source control of course.
I would definitely say that source control is a better repository for schema than a dacpac ~ if people are launching changes to the DB into prod without going via some form of source control, you have other issues than just no backups…
SQLSoldier
I agree completely, Mark. But I also like having something more in case I can’t get a hold of the schema out of source control.
I’ve worked with excellent dev teams where that would not have been an issue. I’ve also worked with developer teams with horrible processes who don’t even trust what was in source control.
schema only backups | sqlqueen
[…] Excerpt from http://www.sqlsoldier.com/wp/sqlserver/schemaonlybackupsandrestores […]