Tuesday, October 26, 2010

Deadlock two ways (Microsoft SQL Server)

Think of "Duck two-ways" or some other delicious meal. This is as tasty ;-)

If you have done any development with Microsoft SQL Server with even a moderate amount of concurrent processing, then you have probably witnessed the dreaded:
Transaction (Process ID 42) was deadlocked on lock resources with another process and has been chosen as the deadlock victim
This indicates that Microsoft SQL Server has detected a deadlock and has killed one of the processes to eliminate the stale mate. Let's go through two very different deadlock conditions that have the same central theme. I have found these to be extremely common in production.

Intuitively deadlock is a "deadly embrace" that occurs when two resources are being contended over. Thus, people are often surprised when they get deadlocks and "they weren't touching the same records!". Unfortunately, in traditional pessimistic locking solutions to concurrency control, other records must be locked to guarantee ACID properties. There are other solutions to concurrency control (MVCC is the one everyone reaches for) that avoid most/all of this. However, MSSQL by default employs a pessimistic approach. Also, I don't mean to sound as if I'm picking on MSSQL-- the principles below are true in a number of database implementations. DB/2 is also particularly nasty with pessimistic locking style deadlocks. The two presented below were reported to the JTDS forums, and are from real production.

Deadlock I


This is the common unicode "mismatch" problem. This has been covered on blogs a lot. I am going to use this as a way to describe how to research deadlocks. But if you're already bored, go ahead and skip to Deadlock II
Here is a table definition (slightly changed for brevity):

CREATE TABLE [PM1](
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[KEYNAME] [varchar](50) NOT NULL,
[VALUE] [varchar](255) NULL,
CONSTRAINT [PK_PM1] PRIMARY KEY CLUSTERED
(
[ID] ASC
))
CREATE NONCLUSTERED INDEX [pmkv_idx] ON [PM1]
(
[KEYNAME] ASC,
)
Now we execute multiple, concurrent:
DELETE FROM PM1 WITH (ROWLOCK) WHERE KEYNAME = @P0

And we get a deadlock! Let's familiarize ourselves with some technical details and terms:

Clustered and Non-Clustered Indexes


There are two database structures for this DDL. One primary, clustered index, which has a key [ID]. And a secondary, non-clustered index, which has a key [KEYNAME]. A "clustered" index means that all of the data, the entire record, is stored in the leaf of the index. It is still an index--it is still searchable by [ID], but when it gets to a leaf, the entire data record is physically stored there. This is different from a secondary or non-clustered index in which the leaf of the index contains some kind of reference to the whole data record. In our secondary index, the index is "searchable" by [KEYNAME], but in the leaves of the index, the "reference" is actually the primary key value (ID), which can be used then to "search" the primary, clustered index. This is how databases use secondary indexes-- they find matching data with the index, then do a bookmark lookup, which is a join, into the clustered index to get the rest of the data.

Seeking vs Scanning


There are different actions the engine can perform on these structures: the engine can scan the index, in which case it will start somewhere (often the beginning) and start reading each record in order. As these indexes a B+-Trees, the leaf level is a doubly linked list. Thus, from a single leaf node, you can read sibling records in order without having to go "up and down" the index levels. Scanning is expensive as it is an O(n) operation which degrades as the number of records increases.

The engine can seek the index, in which case it will use a binary search to locate a leaf node (record). This is much preferable as it is a O(log n) operation-- and the base of that log is very high (the number of records per page). Thus seeking is why we put indexes on tables-- to take advantage of the "quick" searching.

Getting Information about your deadlock


In Microsoft SQL Server (2005+) there is a useful report that can give you good details about your deadlock. Without this information, it's very difficult to understand what's really happening. You can get this information two ways
  • SQL Server Profiler, capturing the Deadlock event
  • Enable Trace Flag 1222
I find it generally useful to enable T1222 as a startup parameter for SQL Server, so that whenever a deadlock occurs, it dumps the details to the SQL Server error log file. This is great, and incurs negligible performance overhead.

Trace 1222 Output


Let's look at the 1222 output and see what actually happened. There are two sections to the 1222: (1) the processes involved, (2) the wait-for graph, which shows which locks are acquired which are trying to be acquired. Here is an abbreviated 1222 for this deadlock:

<deadlock victim="process3dcab08">
 <process-list>
  <process id="process3dcab08" ...>
   <executionStack>
    <frame ... >DELETE FROM PM1 WITH (ROWLOCK) WHERE AND KEYNAME = @P0 </frame>
   </executionStack>
   <inputbuf>(@P0 nvarchar(4000))DELETE FROM PM1 WITH (ROWLOCK) WHERE KEYNAME = @P0 </inputbuf>
  </process>
  <process id="process3de1198" >
   ... (same as previous process) ...
  </process>
 </process-list>
 <resource-list>
  <keylock ... objectname="DB1.dbo.PM1" indexname="PM_IDidx" ... mode="X" ... >
   <owner-list>
    <owner id="process3de1198" mode="X"/>
   </owner-list>
   <waiter-list>
    <waiter id="process3dcab08" mode="U" requestType="wait"/>
   </waiter-list>
  </keylock>
  <keylock ... objectname="DB1.dbo.PM1" indexname="PM_IDidx" ... mode="X" ... >
   <owner-list>
    <owner id="process3dcab08" mode="X"/>
   </owner-list>
   <waiter-list>
    <waiter id="process3de1198" mode="U" requestType="wait"/>
   </waiter-list>
  </keylock>
 </resource-list>
</deadlock>

The process-list section describes all of the processes (concurrent executions) involved. I have colored one blue and one green to help distinguish them. So as we see each is holding an X (exclusive) lock, which the other is trying to get a U (update) lock for.

The database must hold an X lock before actually modifying a record, thus we expect this to be involved. However, a U lock is unexpected. Update locks are an "in between" lock, which are used by the database when scanning for potential records that need to be updated. In this case, a U lock is acquired before checking the value to see if it matches your WHERE clause. If it does, then the U is "upgraded" to an X lock, if not, then it is released. However, our WHERE claue has KEYNAME = @P0, which is indexed. Thus, we shouldn't be scanning anything-- we should be seeking on the index. When seeking-- the database goes exactly to the records which qualify, and thus X locks are requested to begin with. Thus, we expect to see only X locks from the statements above.

So what happened?


The answer is subtle: look at the input buffer for the delete statement:

<inputbuf>(@P0 nvarchar(4000))DELETE FROM PM1 WITH (ROWLOCK) WHERE KEYNAME = @P0 </inputbuf>

The parameter for the delete statement was declared by the driver as nvarchar, but if you look back at the DDL, you will see that the KEYNAME column is declared as varchar.

The NVARCHAR datatype is a unicode text datatype, whereas VARCHAR is non-unicode, 8-bit character datatype. The engine cannot directly compare an nvarchar value to a varchar value-- it must use a conversion. This conversion makes the where clause predicate not sargable, and that disallows the optimizer from seeking on it. Instead it scans the whole index, one record at a time, and for each one, requests a U lock, converts and checks the value, releases the U lock. Thus, for every delete of a single record, every record must be checked with an expensive, conflicting lock. And as seen in our case, two concurrent deletes happen to be checking and ran into each other.

The solution


So why did the driver do that? Well this was a Java application, and in Java strings are unicode by default. By default, the driver binds strings to their unicode counterpart. However, there is a setting on the driver's connection url, which instructs the driver to bind to non-unicode instead. The key is that you want to ensure that the driver is sending parameters the same as the underlying schema is defined. For JTDS the setting is sendStringParametersAsUnicode and it defaults to TRUE. Thus, if your schema is defined with all VARCHARs and you are using them in indexes, then you want to ensure this property is set to FALSE in your connection url. Here is an example of a connection url that forces the driver to send them as varchar instead of nvarchar:

jdbc:jtds:sqlserver://mssql01:1433/DB1;sendStringParametersAsUnicode=false

Deadlock II


This deadlock is a little less common, but interesting, because it also has an "unexpected" wait-for graph in the 1222 output. The schema was jBPM, which may be familiar. I need to see if someone has already fixed this in jBPMs delivered schema.

Trace 1222 Output


First, let's look at the 1222 trace:

<deadlock victim="process3de0868">
 <process-list>
  <process id="process3deb08" ...>
   ...
   <inputbuf>(@P0 bigint,@P1 int)delete from JBPM4_SWIMLANE where DBID_= @P0 and DBVERSION_= @P1</inputbuf>
  </process>
  <process id="process3de0868" ...>
   ... same as other process ...
  </process>
 </process-list>
 <resource-list>
  <keylock ... objectname="dbo.JBPM4_TASK" indexname="PK__JBPM4_TASK__2BFE89A6" mode="X" ...>
   <owner-list>
    <owner id="process3deb08" mode="X"/>
   </owner-list>
   <waiter-list>
    <waiter id="process3de0868" mode="S" requestType="wait"/>
   </waiter-list>
  </keylock>
  <keylock ... objectname="dbo.JBPM4_TASK" indexname="PK__JBPM4_TASK__2BFE89A6" mode="X" ...>
   <owner-list>
    <owner id="process3de0868" mode="X"/>
   </owner-list>
   <waiter-list>
    <waiter id="process3deb08" mode="S" requestType="wait"/>
   </waiter-list>
  </keylock>
 </resource-list>
</deadlock>

So we have two workers executing a delete by primary key... and they're deadlocking on S locks!? An S (shared) lock is used when the database needs to get a consistent read on something. Thus, by default, when you execute SELECT statements, shared locks are being issued under the covers to ensure read consistency (w.r.t the isolation level you're running at).

So what happened?


There is a hint in the wait-for graph that shows the problem: Look at the keylock that is being contended-- the indexname is PK__JBPM4_TASK__2BFE89A6. This is an auto-generated name for the primary, clustered index for the JBPM4_TASK table. We're deleting from the SWIMLANE table, so why would there be any locks on the TASK table? The TASK table has a foreign key reference (field swimlane_) to the SWIMLANE table. Thus, when you delete a SWIMLANE record, the database must ensure that there are no records referring to the SWIMLANE record. If there are, then the delete cannot proceed, because it would be a foreign key constraint violation. To do this check, the database needs to find records in the JBPM4_TASK table WHERE swimlane_ = id of swimlane being deleted. This read will be done using S locks to ensure consistency. Unfortunately, the JBPM4_TASK table does NOT have a secondary index on the swimlane_ column. Thus, the only option the database has is to scan the entire table by primary, clustered index. Thus, for each record, a S lock must be acquired, the record evaluated for the WHERE clause, and then the S lock is released. Each one of these is a deadlock opportunity if there are concurrent operations touching the TASK table.

The Solution


Adding an index on the swimlane_ column in the TASK table will reduce the n deadlock opportunities to 1, which is a significant improvement-- not to mention the performance improvement.

The Overall Theme


While these are very different deadlocks, the theme is the same: unintended scans. Scans are not only a performance problem, but a deadlock problem, as you increase the number of times that you request conflicting locks while holding locks. Thus, take care when considering which indexes to create-- its a tricky problem with many factors.

Steve

20 comments:

  1. Hi Steve,

    In related to "The parameter for the delete statement was declared by the driver as nvarchar, but if you look back at the DDL, you will see that the KEYNAME column is declared as varchar."

    Is not good also modify the table columns from varchar to nvarchar?

    Regards

    ReplyDelete
  2. @Marcos
    Yes absolutely-- you could make the columns nvarchars and the conversion would go away, making the predicate sargable. The point is to make sure that the bound parameters in your sql statement and your columns match.

    Thanks,
    Steve

    ReplyDelete
  3. Hi Steve,

    You are right!

    1. Using jtds and sendStringParametersAsUnicode = FALSE
    1.1. No deadlock founded involving @P0 nvarchar(4000)
    1.2. There was deadlock founded involving bigint, int (numbers)

    2. We just back again to jtds and sendStringParametersAsUnicode = TRUE
    as proof of concept and:
    2.1. There was deadlock involving @P0 nvarchar(4000) and bigint, int (numbers)

    3. We used Microsoft driver and there is no deadlock.

    Tomorrow we are going to be use sendStringParametersAsUnicode = TRUE and check/analyse all indexes.

    Let me send to you the deadlock via email.

    Thanks again Steve.

    ReplyDelete
  4. @Marcos
    Ok- well the MSFT driver must be defaulting differently or using some metadata to send them correctly. I've only used the MSFT driver 1.2 in production, and I'm pretty sure that it defaulted to send unicode. Maybe they've changed this in driver 2.0.

    So you still got a deadlock with using numbers? And it wasn't the deadlock described in Deadlock II? Yes please send me the 1222 trace.

    Steve

    ReplyDelete
  5. This is a nice article..
    Its very easy to understand ..
    And this article is using to learn something about it..

    c#, dot.net, php tutorial

    Thanks a lot..!

    ReplyDelete
  6. Hi Steve,I have a deadlock issue with jbpm different than the one you described in deadlock II... it throws a deadlock when trying to retrieve the token's children. The where clause is not a nvarchar, it's a bigint (numeric(19,0)):
    select ... from JBPM_TOKEN children0_ where children0_.PARENT_= @P0

    The PARENT_ column already has an NONCLUSTERED INDEX... could this be a "deadlock III"?

    I tried adding the unicode parameter as well just in case.

    What's weird is that everything works great with mysql, only sqlserver has the problem (both 2005 and 2008).

    Thanks!

    ReplyDelete
  7. @naty
    Interesting indeed. Do you have the deadlock wait-for graph (T1222 trace)? I would love to see it.

    If you are using the innodb storage engine in MySql then its not surprising that you don't see the issue. Shared locks are implemented totally different in innodb than MSSQL.

    If you can post the waitfor graph I would love to help (or make a new post for deadlock III).

    Steve

    ReplyDelete
  8. I'm not a dba, so I'm really struggling with this... any help you can give me would be fantastic.
    My client won't give up on sqlserver and I can't get it to work.

    Can I change something in my sqlserver configuration to make it more "mysql innodb" like? I've tried setting a lower isolation level in my java code with no luck.
    You can download the deadlock xdl graphs from here: http://dl.dropbox.com/u/966834/deadlock/sqllogs.rar

    Let me know if there's anything else I can provide that'd help.
    Thanks again!!!

    ReplyDelete
  9. @naty
    Yes if you are running MSSQL 2005 or 2008 then you can enable "Read committed snapshot" isolation level on the database (its a database level configuration option). Then set your connection isolation level (RCSI) to be "read committed". This doesn't use shared locks like tradition mssql pessimistic locking, and uses snapshots instead. This is a "safe" change to make in _most_ cases. I don't know specifically if jBPM is relying on locking behavior that will be different under RCSI. You can run a regression and try it or try googling. My bet would be that RCSI will be fine and will solve your problem.

    I will look at the dropbox file in a bit and let you know if I see anything useful.

    ReplyDelete
  10. @naty
    Ok I looked at the wait-for graph and RCSI will fix your problem. jPBM can fix this with a programming change, but I need to investigate a little more in what they are doing in their transactions (they are going back to select stuff after holding locks). This could be a bug, an oversight, or necessary for correctness. I don't really have time to spend in the jBPM source at the moment. But I think RCSI will be your easy fix.

    Let me know how it goes!

    I will try to do a post on this soon, because this is one example of a "3-way" deadlock!

    Steve

    ReplyDelete
  11. I set the database isolation with
    ALTER DATABASE [dbname] SET READ_COMMITTED_SNAPSHOT ON;

    And the isolation in the datasource with



    (I'm using org.apache.commons.dbcp.BasicDataSource)

    The deadlock is different now, it comes up when trying to update the table...

    (@P0 bigint,@P1 varchar(8000),@P2 bigint)update JBPM_VARIABLEINSTANCE set TASKINSTANCE_=@P0, NAME_=@P1 where ID_=@P2

    so I'm still in a deadlock...

    I uploaded the new xdls here:
    http://dl.dropbox.com/u/966834/deadlock/sqllogsv2.rar

    something else under your sleeve there?

    I really appreciate the effort, so thanks again!

    ReplyDelete
  12. I changed the driver to jtds (i was using microsoft jar to see if that solved anything) and the deadlock is now

    (@P0 bigint,@P1 int)delete from JBPM_PROCESSINSTANCE where ID_= @P0 and VERSION_= @P1

    I can see in the stack that the isolation level is "read commited" so I guess that worked... but I don't know if this deadlock if before or after the update one...

    ReplyDelete
  13. @naty
    Wow jBPM is really deadlock happy! This is a different deadlock. I don't have the schema in front of me but issuing shared locks when doing a delete (as is in the wait-for graph in the second rar file you uploaded) is characteristic of deadlocking when scanning for foreign-key constraint violations.

    So I would first look at the schema for JBPM_VARIABLEINSTANCE and see if there are any recursive relationships where a column (maybe TASKINSTANCE or NAME) are foreign keys back on to the JBPM_VARIABLEINSTANCE table. If not then look and see if there are any cascade actions on your foreign key constraints.

    See the wait-for is showing a waiting shared lock on the clustered index of the table you are updating...this only makes sense with a foreign key constraint (at least thats the only thing that comes to mind).

    Could you send me the DDL for the database?

    JTDS and MSSQL driver do things differently, but the differences that I am aware of (and have production experience with) shouldn't make a difference w.r.t. deadlocks. However, many users have reported differences. I think these differences are more coincidence than real difference-- but I don't really know. Like I said-- I have not observed a _technical_ reason for why the deadlocks should be different with the two drivers. Thus I remain skeptical that the drivers fundamentally change the deadlock "surface area".

    If you truly don't see the other deadlock, then send me the wait-for graph with the new one. And the DDL that you are using with this version of jBPM and I can try and help.

    Steve

    ReplyDelete
  14. The jbpm database has too many relationships between the tables. The relationship between variableinstance and taskinstance is mapped in the taskinstance.hbm.xml like this (I had to remove some chars to be able to post this, sorry):

    map name="variableInstances" cascade="all"
    key column="TASKINSTANCE_" foreign-key="FK_VAR_TSKINST"
    index type="string" column="NAME_"
    one-to-many class="org.jbpm.context.exe.VariableInstance" map

    I can send you a backup of the database, it's not that heavy, can you give an email where I can contact you?
    If not, here is an image of the diagram. I marked the variableinstance and taskinstance tables so you can see them without going crazy
    http://dl.dropbox.com/u/966834/deadlock/diagram.PNG

    The thing is that if I change the mapping, something stops working... for instance, if I change the cascade="all" I get a transient entity exception... so I was hoping to be able to fix this without changing the jbpm schema... but I'm starting to loose hope.

    I'm using jbpm 3.3.1, switching to jbpm4 or even 5 would be a lot of work and I don't even know if it would solve the problem...

    do you believe there's a solution for all this that doesn't involve jbpm customization?

    Thanks again!

    ReplyDelete
  15. @naty
    Cool- well that validated my guess of the self-referencing relationship in variableinstance. Try adding a nonclustered index to the TASKINSTANCE_ column in VARIABLEINSTANCE. I assume that its not currently indexed and that is why its resorting to a full scan of VARIABLEINSTANCE (issuing shared locks for each record) to validate there is no foreign key that will be orphaned.

    Yes definitely don't change the hibernate cascade relationship. In my previous comment I was referring to database cascades -- not hibernates notion of cascading (similar concepts, but implemented totally differently and have different application semantics anyways).

    I bet the index will fix this particular deadlock problem. But that doesn't mean others wont crop up. There could be another solution (turning off the foreign key constraint on TASKSCHED_) but prefer trying adding the index first.

    Many deadlocks can be fixed through index changes (as in Deadlock II) or through the driver change (as in Deadlock I). Adding indexes don't change application semantics. Well that's not necessarily true-- conceptually they don't change application semantics but when locks come into play then its not really side effect free. This is a nice demonstration of the principle of leaky abstraction

    Let me know how that goes! However, I am currently vacationing in Italy and am about to go to bed so I might not respond quickly.

    Steve

    ReplyDelete
  16. Steve, I guess I'm making your vacation geek friendly!I'm sorry about that.
    I already have that index created. Actually I have 2 added indexes:

    CREATE NONCLUSTERED INDEX [IDX_TASKINST_TK] ON [dbo].[JBPM_VARIABLEINSTANCE]([TASKINSTANCE_] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY];

    CREATE NONCLUSTERED INDEX [IDX_PROCIN_VERSION] ON [dbo].[JBPM_PROCESSINSTANCE] ([VERSION_] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY];

    The error I'm getting now is on a delete...
    (@P0 bigint)delete from JBPM_BYTEARRAY where ID_= @P0

    xdls here: http://dl.dropbox.com/u/966834/deadlock/sqllogsv3.rar

    I also have a "cleaning process" that runs when the processinstance ends (when enabled). If that's enabled the deadlock is in the delete statement, but from JBPM_PROCESSINSTACE instead of JBPM_BYTEARRAY

    http://dl.dropbox.com/u/966834/deadlock/sqllogsv4.rar

    Jbpm is "deadlock" scratch the "happy" part! :p

    grazie!

    ReplyDelete
  17. Great Post! This is a subtle difference that was causing so many problems.

    P.S

    I tired a simple CAST(@badNvarCar as varchar) in order to match our index, and the execution plans changed to use seek,

    ReplyDelete
  18. Hi Steve and team,
    I have table called Products has many columns. It is a temporary table used for reporting purpose. The data will be processed to this table concurrently by multiple user requests.(seperate Stored procedure to make DML operations to this table)
    Table Structure:
    Create table Products
    (instance uniqueidentifier,
    inserted datetime,
    col1, col2,...)

    Inserted column will be populated with GETDATE() to have when the data are inserted. And instance column have the newid() value. one user request will have one unique id; may have million rows. The below are the queries which will be executed concurrently, which causing the deadlock. Please advise me
    Query1:
    "Set transaction isolation readuncommitted
    Delete P from Products (Nolock) where instance = 'XXXX-xxx-xxx-xx'"

    Query2:
    "Set transaction isolation readuncommitted
    Delete P from Products (Nolock) where inserted<=DATEADD(hh, -10, getdate())"

    Note: The nonclustered index is created on instance column.

    PLEASE ADVISE

    ReplyDelete
  19. Vasu-

    Can you tell me all of the indexes on this table? If you can turn on trace flag 1222. This will cause MSSQL to output the "wait for" graph in the MSSQL error log. This is really the only way to know what is causing the deadlock. If you can post the output that shows up in the error log from the 1222 trace flag, I can tell you what the problem is.

    Steve

    ReplyDelete
  20. Great post, very interesting.

    Nice one here too on deadlocks:

    SQL deadlock

    ReplyDelete