Data exploration and documentation can be cumbersome, classifying data can be one of those tedious but necessary things. With the introduction of such things as GDPR there is an ever greater need to classify and label your sensitive data. Microsoft attempts to help us out with this in the new SQL Data Discovery and Classification feature in SQL Server Management Studio (SSMS). This a new feature is available in Azure SQL Database and SSMS 18.0 and higher. It can be used against databases in SQL Server 2008, compatibly mode 100, and greater.
This feature will scan through your database tables and identifies columns that contain possible sensitive data. It then allows you to categorize that data as well provides a detailed report for auditing and compliance purposes.
Let’s see how it is done.
In SSMS on the database you want to use Right click on Tasks, then choose Data Discovery and Classification and Choose Classify Data. In this example, I am using the AdventureworksDW2016CTP3 database.
A results window will pop up showing how many field recommendations it has found. Click on it to view them..
When you view the data, you will see your data broken down by Schema, Table and Column. SSMS then attempts to categorize (information types) the information and estimate a sensitivity level (sensitively label) . It allows you to accept the recommendation by checking the box on left hand side. If it’s not quite what you wanted, you can adjust the Information Types and Sensitivity. Once you are satisfied with the category assignments, click on the blue “Accept selected recommendations” button located at the top and choose Save.
As shown below, I have classified 10 columns and have 64 left unclassified
It also gives me the ability to manually add a field and classification by choosing the Add Classification button at the top.
As I mentioned previously, this feature provides a way to report on the data classification.
You can retrieve the report by clicking View Report, located adjacent to the Add Classification button at the top.
*Note if your report is blank you forgot to hit SAVE while classifying.
The report breaks all your data down nicely by schema, information types and sensitivity. You can see how this can be very useful and insightful when reviews data for compliance. It does take a little time to go through and validate the results of the classification. While this process might be lengthy to accomplish, in the long run, it is well worth the time.