T-Sql Code Header Comments

In this post I want to talk about the headers that are included in most stored procedures and because I often see requirements for writing T-Sql code that include something along the lines of “all procedures MUST be well commented and adorned with a template that lists each change”, sample procedures include templates such as:

Sql procedure recommended template 1

Sql procedure recommended template 1

or:

Sql procedure recommended template 2

Sql procedure recommended template 2

These are all taken from stack overflow: http://stackoverflow.com/questions/1085672/sql-comment-header-examples

These templates all impart a sense of professionalism and care has been taken to write correct comments and we all know that dutifully ensuring comments are correct help show how well the database code has been written, except that is rarely the case. The quality of comments bears no relationship to the quality of code, in fact code is compiled by a compiler so incorrect code can be detected, incorrect comments are ignored by the compiler just as much as correct comments.

There is one section in particular that is a problem, the change history – if you use source control you have a history of changes which can be validated and checked, however if instead you rely on the comments then you are relying completely on  developers updating the comments correctly which is unlikely, either due to time constraints (i’ll fix this now and update comments later!) or potentially maliciously. If you then consider that the version history could possibly be incorrect then it can’t be trusted and if it can’t be trusted then it is irrelevant and is a waste of time.

Most templates include the name of the stored procedure, in case you don’t know how to find the name of each stored procedure is at the top of script, either after the word proc or procedure, it is sometimes surrounded by [ and ] or “‘s, if the name is in the comments then it might be after something like “Name:”, “Nombre: “, “Monkier:”, “Namesake: “, “Stage Name:”  or something more exotic:

 

name

name

 

So if we use source control (which we should) then if we take the above we have a verifiable history of who the author is (they checked in the code), when the create date is (the date it was checked in), the history which includes both comments and actual diff’s can also be taken from source control.

The above code could have been written as:

re-written with no header

re-written with no header

 

Not everything is useless, there is a real case for showing example usages, especially when there are multiple parameters passed in and it is good to see what possible parameters can be passed in but I think that is about it. You may also wish to include a copyright banner in amongst the header comments which can be included to keep lawyers happy and can be happily ignored by everyone else.

I have come up with a list of common things I see in headers and how they should be handled:

Template Replace With Why?
 Name  Script Header  It is already in an obvious place, if the editor has colouring (everything other than notepad) then it is nicely highlighted
 Author  Source control history  It is verified and easily retrievable
 Create Date  Source control history  It is verified and easily retrievable
 Change History  Source control history  It is verified and easily retrievable
 Description  The stored procedure name  If the name of the stored procedure does not accurately describe the action that it does then rename it!

In another post I would like to talk about comments within stored procedures but I think that is enough for now!

Advertisements

Unit testing constraints in Sql Server

I generally break my unit tests for T-Sql into two broad categories, the first is the schema, things like foreign keys and defaults and the second is the actual code, stored procedures and functions (triggers as well but I do my hardest not to write triggers on account of them being so awful.)

When I look at what tests to write for constraints, I ask myself the question, as I do any time I write anything “what is the point of this, do I need to write it?” and I normally come back with the answer (it is a lot of fun in my head, it is like a regular party with everyone super excited about sql server) “if there is no point in writing the tests, is there any point in the original code”, I then think whether the constraint is necessary and adds anything, if it does then it stays and I write tests for it.

I write tests for the schema as a schema is code, it might be dml instead of ddl but you better believe it is possible to get something wrong and for someone to delete something by mistake and most importantly that I or someone else will want to refactor it later and knowing what will break when a change is made is extremely important.

Sample Schema:

sample schema to demonstrate testing constraints with tSQLt

sample schema to demonstrate testing constraints with tSQLt

To write the actual tests I use tSQLt as my preferred framework and I fake the child table and the reference table and use tSQLt.ApplyConstraint to test the actual constraint. If we take this simple schema, two tables and a foreign key, if we then look at how we test the constraint, what do we care about:

– I care whether the key is defined correctly so will accept values in the child table that are in the parent table
– I care whether the key stops incorrect values going into the child table

I don’t really care if the foreign key allows all the possible values, that is Sql Servers job, I just want to ensure it does what it is I have told it to do.

To write tests, I create a new test class:

exec tSQLt.NewTestClass ‘constraint on users.users gender_id’

I could have one class per table or one per database but there isn’t much of an overhead having lots of classes and it helps to keep them separate and organised so I create one class per constraint.

Once I have the test class, I create the setup routine that will be used by my test “methods” (procedures). The Setup procedure is run before each test and it means you don’t have to keep doing the same thing so it is less rrepetitive and less error prone, remember kids no one is testing the tests!

create procedure [constraint on users.users].[SetUp]
as
exec tSQLt.FakeTable ‘gender’, ‘users’
exec tSQLt.FakeTable ‘users’, ‘users’

insert into users.gender(gender_id, gender_description)
select 1, ‘Male’

exec tSQLt.ApplyConstraint ‘users’, ‘FK_users_gender_id_genders’, ‘users’

In this case, I fake out the two tables involved in the relationship and in the parent table I add a single row. I could add every possible row and then check they work but I believe that is overkill and while it might be straightforward for gender, it is a little more involved for products that have ever been sold at a major supermarket or retailer. With the data added I then ensure that the constraint is applied and ready to allow good data to go in and block bad data and keep it out.

I then have my “happy path” test which tests that I can add data to the child table:

create procedure [constraint on users.users].[test allows male]
as
insert into users.users(user_id, gender_id)
select 100, 1

I typically always expect this to work, but it I have setup the foreign key to point to another column by accident then the insert will fail as in my test setup I only inserted a valid value into the gender_id column in the genders table.

Once I have the good path that should work if I have set everything up correctly, I then look to the unhappy path, that is what happens when something goes wrong – if I try to insert data that is incorrect, does it fail? When it fails, does it fail in the way I expect?

create procedure [constraint on users.users].[test disallows unknown values]
as
exec tSQLt.ExpectException @ExpectedMessagePattern = ‘%INSERT%conflicted%FOREIGN KEY%’
insert into users.users(user_id, gender_id)
select 100, 94898

So here all I am doing is saying that I expect to get an exception somewhere in this procedure and when I do the error is going to match the pattern “%INSERT%conflicted%FOREIGN KEY%” if it doesn’t then the test will fail. I then do the insert with a value I know not to exist (remember I am faking both tables in the Setup so know exactly what values are in each table.

Finally I need to run the tests using either “exec tSQLt.RunAll” or “exec tSQLt.RunTest ‘constraint on users.users’

The tests will run and either pass or fail, to do TDD properly you should create and run the tests, the first one should always complete so it is hard to get the red before the green but the second test should fail before the foreign key is put in, when it is added then the test will succeed as there will be an error message.

I personally think that this finds the right balance between writing tests for the sake of it (1 test for every possible value in the reference table!) and writing enough to know that the constraint is in place and it is doing what it should do – irrespective of whether the values in the table are actually correct.

Like most things in life and in IT the amount and type of tests depends on the type of application and how the application is being used, a test database for me to play around with sql doesn’t really need unit testing but a database that is critical in a medical EMR application should be well tested.

The code that goes along with this is:

https://github.com/GoEddie/Blog-UnitTestingConstraints

Enjoy!