Use extended events to show your actual io patterns

I have spent a little too much of my life worried about the performance of IO subsystems and sql server, I worry slightly less now ssd’s are more available but I still worry!

I/O patterns are pretty well understood, tools like CrystalDiskMark let you test a subsystem using random and sequential reads and writes of various sizes. Run one of these tools and then put your feet up, you know what performance you will get, job done.

If you don’t want to put your feet up you can investigate what your actual IO patterns are,to do this I like to use extended events and for this there are two events I am interested in:

  • sqlserver.file_read_completed
  • sqlserver.file_write_completed

When you use these you get some really interesting values, notably:

  • mode – this is really cool as it tells you whether the individual IO was completed using the Scatter/Gather API or whether it was Contiguous, I believe that Scatter/Gather is used by the buffeer pool to read pages and Contiguous is used to write log files – but I have no actual evidence of that!
  • duration – the time the IO took – this is really useful in its own right
  • database_id – the database
  • file_id – the data file id – we can use this to group requests to individual files
  • offset – the location of the IO
  • size – the size of the IO

By simply gathering the data and using the database_id, file_id, offset and size we can easily see for each database and file, bear in mind if you are profiling an application you may want to have filtered it to just the database you are interested in, the three important things:

  • Size
  • Type – Read or write
  • Random or Sequential

Random or Sequential we need to work out for ourselves but it is just a simple calculation of offset + size to each operation.

How do we get the data?

If you setup an extended events trace, to do this either use the wizard and add file_read_completed and file_write_completed or use this script:

CREATE EVENT SESSION [IO Patterns] ON SERVER
ADD EVENT sqlserver.file_read_completed(
ACTION(sqlserver.sql_text)),
ADD EVENT sqlserver.file_write_completed(
ACTION(sqlserver.sql_text))
ADD TARGET package0.event_file(SET filename=N’C:\SqlTrace\IOPatterns.xel’)
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)

I like to include sql_text, I realise it isn’t for everyone but it makes me feel really really good to know exactly what caused a specific IO!

If you use the wizard, find these two events:

from the new extended events session choose read and write completed

 

However you choose to create the session, start it and generate some IO, if you are testing you may need to use “DBCC DROPFREEBUFFERS” to cause some IO, SQL won’t read from disk unless it really needs to!

When you have generated your IO then you will need a query to get the data, something along the lines of this should work nicely (if slowly!):

declare @file nvarchar(255)
select @file = N’c:\SqlTrace\IOPatterns*.xel’

select
event_xml.value(‘(./@timestamp)[1]’, ‘datetime2’) as timestamp,
event_xml.value(‘(./@name)’, ‘varchar(1000)’) as event_name,
event_xml.value(‘(./data[@name=”database_id”]/value)[1]’, ‘int’) database_id,
event_xml.value(‘(./data[@name=”file_id”]/value)[1]’, ‘int’) as file_id,
event_xml.value(‘(./data[@name=”offset”]/value)[1]’, ‘int’) as offset,
event_xml.value(‘(./data[@name=”size”]/value)[1]’, ‘int’) as size,
event_xml.value(‘(./data[@name=”mode”]/value)[1]’, ‘int’) as mode,
event_xml.value(‘(./data[@name=”duration”]/value)[1]’, ‘int’) as duration,
event_xml.value(‘(./action[@name=”sql_text”]/value)[1]’, ‘varchar(4000)’) as sql_text
from (select cast(event_data as xml) xml_event_data
from sys.fn_xe_file_target_read_file(@file, null, NULL, NULL)) as event_table
cross apply xml_event_data.nodes(‘//event’) n (event_xml)

This generates output like:

output from extended events file_read_completed and file_write_completed

output from extended events file_read_completed and file_write_completed

 

The time this works best for me is if I am spec’ing out a new system for an existing application, normally when the old hardware is end of life and is being replaced and I can get a trace of an actual production system and see what it’s IO characteristics are.

Alternatives

The alternatives are to use procmon, this is how I used to do it but now we can tie IO requests to actual queries so you can forget about procmon in my mind!

The other thing that tells you about IO’s is the dynamic management function sys.dm_io_virtual_file_stats and it tells you alot but it doesn’t help me worry less about IO patterns!

Advertisements

Get told about explicit & implicit conversions in Sql 2012+ using extended events

Sql Server  2012 introduced a new extended event warning, which tells you when you get a plan with a conversion in it that causes scans instead of seeks so that you know when a plan is used that could have been better if the data types were correct. This is a massive performance issue and will essentially make a filter or join non-sargable causing a table or index scan rather than a seek. The issues of implicit conversions has been covered many times before so I won’t go over it here (see http://sqlperformance.com/2013/04/t-sql-queries/implicit-conversion-costshttps://www.sqlskills.com/blogs/jonathan/implicit-conversions-that-cause-index-scans/ or even http://lmgtfy.com/?q=performance+issues+with+implicit+conversions)

The new extended event is called “sqlserver.plan_affecting_convert” and fires every time there is a conversion in a plan that stops sql from doing a seek. There are a number of different plan_affecting_converts so check the ConvertIssue attribute for “Seek Plan” for this particular issue.

To demo this, I have created a table with a varchar datatype and a unique index on the column so a seek can be performed, however I use an nvarchar to search the column which means that every single value in the column needs to be converted from a varchar to a nvarchar before the comparison can take place, the plan looks like this and there is a warning to say that it isn’t very good:

Query and Execution Plan with Seek Warning

Now if you ware writing some T-Sql and you see this then you can change it there and then but if the code is loose on the server, previously to this event you need to do things like search the plan cache for implicit conversions remembering, there is a good example of this here:

 http://sqlblog.com/blogs/jonathan_kehayias/archive/2010/01/08/finding-implicit-column-conversions-in-the-plan-cache.aspx

and

http://www.brentozar.com/blitz/implicit-conversion/

Now with this extended event you can sit and wait and get notified when it happens, this lets you be much more pro-active in finding performance issues on your Sql Server. To create the event, either use the wizard to create the session or do something like:

 

CREATE EVENT SESSION [plans_with_conversions] ON SERVER
ADD EVENT sqlserver.plan_affecting_convert(
ACTION(package0.callstack,sqlos.cpu_id,sqlserver.client_app_name,sqlserver.client_connection_id,sqlserver.database_name,sqlserver.plan_handle,sqlserver.server_principal_name,sqlserver.sql_text,sqlserver.tsql_frame,sqlserver.tsql_stack))
WITH (STARTUP_STATE=OFF)

 

You will then get alerts that look like:

 

Convert Warning Extended Event

From this you have the text that caused the issue and the plan handle so you can track down the query that is being so mean to your cpu (and probably i/o subsysten) and make it play nice!

Hapy Extended Eventing!