By: Jeffrey Yao | Updated: 2017-01-18 | Comments (8) | Related: More > Professional Development Interviewing
Problem
We need to hire a new senior SQL Server DBA. The SQL Server DBA team will focus on the candidate's technical skills while our HR team will take care of other things, like team match, culture fit, etc. What are some of the questions we could ask to host a quality technical interview to assess a candidate?
Solution
Interviewing is an interesting topic and different interviewers may prefer different questions and approaches. For DBA positions, most of time, I prefer asking questions that can be demonstrated through coding.
There are a few reasons why coding answers is preferred:
- Concise and straight: code itself is a language that is less likely to be misunderstood than a narrative description.
- Objective and standard: the final answer will be executable code with expected results, so our evaluation on candidates can be more objective.
- Higher and stricter criteria: to answer a question by using examples, there is a lot work to do, i.e. conceiving the test case, preparing the test data, coding the script and doing the presentation and explaining the result), a candidate must have real hands-on experience and thorough knowledge on the topic.
The following are some of my favorite questions which are designed to test a DBA's knowledge, and the answers do not require long T-SQL scripts (ie. 50+ lines).
- Can you please demo scenarios of SQL Server locks and deadlocks?
- Can you please demo scenarios of SQL Server dirty reads, non-repeatable reads and phantom reads?
- Can you please demo scenarios of SQL Server nested loop / hash / merge join operations?
- Can you please demo various SQL Server locks (S / U / X / IX etc)?
- Can you please demo ACID properties of a SQL Server transaction?
- Can you please demo the difference between DELETE and TRUNCATE in SQL Server?
In my opinion, these questions are suitable for intermediate to senior DBAs.
Sample Answers to the Questions Above
1. Can you please demo scenarios of SQL Server locks and deadlocks?
Answer: In SSMS, we will open 4 windows, named W1, W2, W3 and W4.
-- In W1, we prepare the data use tempdb if object_id('dbo.t', 'U') is not null drop table dbo.t; create table dbo.t (a int primary key, b varchar(30)) go insert into dbo.t (a, b) values (1, 'hello'), (2, 'world'); go
-- In W2, we start an update trans without committing -- this session spid = 55 use tempdb -- demonstrate the blocking scenario begin tran update dbo.t set b = 'hello 2' where a = 1; --commit tran
-- in W3, do a select, and it will be waiting, this is called blocking -- this session spid = 56 use tempdb select * from dbo.t
-- in W4, we can check the locks of the sessions (spid 55, 56) exec sp_lock 56 exec sp_lock 55
In the following snapshot, we can see the W3 session (spid 56) is waiting for a Shared Key lock.
Now we will demo the deadlock scenario, we will use the previous four SSMS windows. We first cleanout W2 and W3, and then enter new code as follows:
-- This is W2, we start an update trans without committing use tempdb -- demonstrate the deadlocking scenario -- first run the first two lines in W2 and then switch to W3 to run the script in W3 begin tran update dbo.t set b = 'hello 2' where a = 1; -- after the script in W3 run, return here and run the following select -- this will cause a deadlock on this session select * from dbo.t where a = 2;
After the update is run in W2, we switch to W3 and run this script.
-- this is W3 -- run the following after run the update in W2 begin tran update dbo.t set b = 'world 2' where a = 2; -- this will block the SELECT statement in W2 select * from dbo.t with (xlock) where a = 1 -- this is blocked by W2 UPDATE statement -- return to W2 to run the SELECT statement and see the deadlocking occur
Now we return back to W2 and run the SELECT statement, we will immediately get a deadlock message, and the W2 session is rolled back as shown below:
The reason is in the W2 session, the UPDATE is blocking the W3 SELECT while its own SELECT is waiting for a lock currently held by the W3 UPDATE. At the same time, the same thing happened to the W3 statements, i.e. the W3 UPDATE is blocking the W2 SELECT while the W3 SELECT is waiting for a lock held by the W2 UPDATE. This immediately caused a deadlock.
2. Can you please demo scenarios of SQL Server dirty reads, non-repeatable reads and phantom reads?
Answer: When talking about dirty reads and the like, we are actually talking about transaction isolation level and a candidate is expected to fully grasp this basic DBA knowledge.
We will still use the SSMS windows we created before, delete all code in W2 and W3 and re-enter the following code:
-- This is W2, we start an update trans without committing use tempdb -- demonstrate the dirty read begin tran update dbo.t set b = 'hello 2' where a = 1; select * from dbo.t where a = 1; -- uncommitted tran for a = 1; waitfor delay '00:01:00' -- wait for 1 min, so W3 script can finish rollback
-- this is W3 -- run the following to show the dirty read use tempdb set transaction isolation level read uncommitted select * from dbo.t where a = 1 -- we will get b='hello 2' which is uncommitted go
-- this is W3 -- run the following to show the non-repeatable read issue use tempdb set transaction isolation level READ COMMITTED begin tran select * from dbo.t where a = 1 waitfor delay '00:00:30' -- wait for 30 SECONDS so the W2 UPDATE can finish select * from dbo.t where a = 1 commit tran -- after run the script, immediately switch to W2 and run the update go
-- This is W2, we start an update trans without committing use tempdb -- demonstrate the non-repeatable read update dbo.t set b = 'hello 2' where a = 1;
In W3, we expect the two SELECTs to return the same data, in reality it does not as shown below:
This non-repeatable read can be easily solved if we set the transaction isolation level to REPEATABLE READ in W3, so the W2 UPDATE will wait until the W3 transaction is done.
To demo the phantom read issue, we will do the following:
-- this is W3 -- run the following to show the phantom read -- solve the issue by -- set transaction isolation level serializable use tempdb set transaction isolation level REPEATABLE READ begin tran select * from dbo.t where a != 1 waitfor delay '00:00:30' -- wait for 30 SECONDS so the W2 INSERT can finish select * from dbo.t where a != 1 commit tran -- after run the script, immediately switch to W2 and run the insert go
-- This is W2, we insert a record to cause the phantom read use tempdb -- demonstrate the phantom read insert into dbo.t (a, b) values (3, 'phantom read?')
In W3, we see the two exact SELECTs in the same transaction return two different results, i.e. phantom reads:
To prevent phantom reads, we should set the isolation level in W3 to SERIALIZABLE.
3. Can you please demo various SQL Server locks (S / U / X / IX etc)?
Answer: This is pretty straight-forward, but to catch a U lock is a little bit tricky because a U lock is more like a lock in transition, it will automatically switch to a X lock when the condition is right, so to display it I have to use table hint UPDLOCK explicitly to demo it.
use tempdb set transaction isolation level repeatable read -- we need this isolation level to make the S lock is held until transaction is over begin tran select * from dbo.t with (updlock) where a = 2; -- U lock on Key a = 2 select * from dbo.t where a = 1; -- S lock on Key a = 1 update dbo.t set b = b where a = 3 -- X lock on Key a = 3; -- show the lock select resource_type, resource_description, request_mode, request_type , request_status, request_session_id from sys.dm_tran_locks where request_session_id = @@spid order by request_session_id; rollback tran
4. Can you demo ACID properties of a SQL Server transaction?
Answer: I think every DBA has heard of ACID, even if they don't know the exact definition (Atomicity Consistency Isolation Durability). A DBA should be able to show test cases with code.
We will design a test case here by first creating a Salary table.
We will open a new SSMS window and create sample table as follows:
use tempdb if object_id('dbo.EmployeeSalary', 'U') is not null drop table dbo.EmployeeSalary; create table dbo.EmployeeSalary (Name varchar(100), Salary money check (Salary < 100000) ); go --populate with two records insert into dbo.EmployeeSalary (Name, Salary) values ('john', 50000), ('mary', 60000.0); go
We then create another SSMS window and do the demo to show Atomicity:
-- demo Atomicity -- say we need to give each person a salary increase of 10% use tempdb begin tran update dbo.EmployeeSalary set Salary = Salary * 1.10 where Name = 'john'; select * from dbo.EmployeeSalary -- you will see john's salary is already updated from 50000 to 55000 waitfor delay '00:00:30' --doing something else, if anything unexpected happens during this time to disrupt this session, there will be no update to anyone -- let's say the current session id is 55, we can open another SSMS window, and runs kill 55 to simulate this current session is disrupted update dbo.EmployeeSalary set Salary = Salary * 1.10 where Name = 'mary'; commit tran -- since the session is killed, nobody's salary is updated, john's salary will be rolled back to 50000
If this current session is killed before the whole transaction is done, we can do a SELECT * from dbo.EmployeeSalary to see that no one got the 10% salary increase.
To demo Consistency, we can open another window and do the following:
-- demo Consistency -- say we need to give each person a salary increase of 10%, but by mistake, we update Mary's salary from -- Salary * 1.10 to Salary * 11.0; -- this will make Mary's salary to 660,000 instead of 66,000, and thus violates the business rule that -- everyone's salary is less than 100,000 -- so Mary's transaction will not go through. update dbo.EmployeeSalary set Salary = Salary * 11.0 where Name = 'Mary';
Isolation can actually refer to what we demoed above for DIRTY READs, because we can set the ISOLATION to several different levels to show the transaction behavior, so using the previous example to showcase this ISOLATION feature should be sufficient.
Durability is hard to demo via code, all I can think of is to commit an insert/update transaction then simply kill the SQL Server service or power off the computer (to simulate a system crash) and then once the SQL Server service restarts, search for the inserted/updated records, they should show up. This proves the DURABILITY property, i.e. once committed, it will survive any malfunction of the system.
5. Can you please demo scenarios of SQL Server Nested Loop, Merge and Hash join operators?
Answer: This is to test a DBA's knowledge about why the SQL Server optimizer chooses different join operators and the reason behind the decision.
use tempdb if object_id('dbo.t', 'U') is not null drop table dbo.t if object_id('dbo.s', 'U') is not null drop table dbo.s go create table dbo.t (id int primary key, a int); create table dbo.s (id int, a int); go ; with X1 as (select 1 as c union all select 1) , X2 as (select x1.c from X1 inner join X1 x on x1.c = x.c) , X3 as (select x2.c from X2 inner join X2 x on x2.c = x.c) , X4 as (select x3.c from X3 inner join X3 x on x3.c = x.c) , X5 as (select x4.c from X4 inner join X4 x on x4.c = x.c) insert into dbo.t (id, a) select row_number() over (order by x5.c), row_number() over (order by x5.c) from X5 ; with X1 as (select 1 as c union all select 1) , X2 as (select x1.c from X1 inner join X1 x on x1.c = x.c) , X3 as (select x2.c from X2 inner join X2 x on x2.c = x.c) , X4 as (select x3.c from X3 inner join X3 x on x3.c = x.c) , X5 as (select x4.c from X4 inner join X4 x on x4.c = x.c) insert into dbo.s (id, a) select row_number() over (order by x3.c), row_number() over (order by x3.c) from X3 go set showplan_all on go -- "Nested Loops" occurs when one table is small and another big and the big table has its joined column indexed. select * from dbo.t inner join dbo.s on s.id = t.id go set showplan_all off go use tempdb drop table dbo.t drop table dbo.s create table dbo.t (id int primary key, a int); create table dbo.s (id int primary key, a int); go ; with X1 as (select 1 as c union all select 1) , X2 as (select x1.c from X1 inner join X1 x on x1.c = x.c) , X3 as (select x2.c from X2 inner join X2 x on x2.c = x.c) , X4 as (select x3.c from X3 inner join X3 x on x3.c = x.c) , X5 as (select x4.c from X4 inner join X4 x on x4.c = x.c) insert into dbo.t (id, a) select row_number() over (order by x5.c), row_number() over (order by x5.c) from X5 ; with X1 as (select 1 as c union all select 1) , X2 as (select x1.c from X1 inner join X1 x on x1.c = x.c) , X3 as (select x2.c from X2 inner join X2 x on x2.c = x.c) , X4 as (select x3.c from X3 inner join X3 x on x3.c = x.c) , X5 as (select x3.c from X4 inner join X4 x on x4.c = x.c) insert into dbo.s (id, a) select row_number() over (order by x5.c), row_number() over (order by x5.c) from X5 go set showplan_all on go -- "Merge Join" occurs when both tables are big and the joined columns are indexed / sorted. select * from dbo.t inner join dbo.s on s.id = t.id go set showplan_all off go use tempdb drop table dbo.t drop table dbo.s create table dbo.t (id int , a int); create table dbo.s (id int , a int); go ; with X1 as (select 1 as c union all select 1) , X2 as (select x1.c from X1 inner join X1 x on x1.c = x.c) , X3 as (select x2.c from X2 inner join X2 x on x2.c = x.c) , X4 as (select x3.c from X3 inner join X3 x on x3.c = x.c) , X5 as (select x4.c from X4 inner join X4 x on x4.c = x.c) insert into dbo.t (id, a) select row_number() over (order by x5.c), row_number() over (order by x5.c) from X5 ; with X1 as (select 1 as c union all select 1) , X2 as (select x1.c from X1 inner join X1 x on x1.c = x.c) , X3 as (select x2.c from X2 inner join X2 x on x2.c = x.c) , X4 as (select x3.c from X3 inner join X3 x on x3.c = x.c) , X5 as (select x3.c from X3 inner join X4 x on x3.c = x.c) insert into dbo.s (id, a) select row_number() over (order by x5.c), row_number() over (order by x5.c) from X5 go set showplan_all on go -- "Hash Match" occurs when two big tables are not indexed on their join columns select * from dbo.t inner join dbo.s on s.id = t.id go set showplan_all off go
If you run the whole script, you will get the following result, and you can see different join operators.
Detailed technical info can be found at MSDN.
If a candidate can show you this, it means s/he is very likely to be experienced in performance tuning.
6. Can you demo the difference between DELETE and TRUNCATE in SQL Server?
Answer: Everyone knows DELETE can have a WHERE clause while TRUNCATE cannot, but experienced DBA may tell you more.
For example, I am happy to hear a candidate tell me that if a table is replicated, you cannot truncate it, it would be even better if the candidate can give a reason.
-- prepare environments use tempdb if object_id('dbo.tblChild', 'U') is not null drop table dbo.tblChild; go if object_id('dbo.tblParent', 'U') is not null drop table dbo.tblParent; go create table dbo.tblParent (id int identity primary key, a varchar(30)); create table dbo.tblChild (id int identity primary key, b varchar(30), pid int references dbo.tblParent (id)) go -- difference 1, with both tables empty, DELETE and TRUNCATE behave differently delete from dbo.tblParent -- succeeds, (0 rows(s) affected) truncate table dbo.tblParent; -- fails, Cannot truncate table 'dbo.tblParent' because it is being referenced by a FOREIGN KEY constraint. -- now let's populate some table insert into dbo.tblParent (a) values ('dad'), ('mom'), ('uncle'); insert into dbo.tblChild (b, pid) values ('son', 1), ('daughter', 2) go -- difference 2, DELETE can remove record 'uncle' with a where clause while TRUNCATE cannot delete from dbo.tblParent where a='uncle'; -- succeeds truncate table dbo.tblParent where a = 'uncle' -- syntax error -- difference 3, identity column is reset for TRUNCATE while not so for DELETE select IDENT_CURRENT('dbo.tblChild'); -- returns 2 delete from dbo.tblChild -- succeeds select IDENT_CURRENT('dbo.tblChild'); -- returns 2, notice before and after DELETE, the current identity value remains unchanged -- re-insert the deleted records insert into dbo.tblChild (b, pid) values ('son', 1), ('daughter', 2) go select IDENT_CURRENT('dbo.tblChild'); -- returns 4 truncate table dbo.tblChild -- succeeds select IDENT_CURRENT('dbo.tblChild'); -- returns 1, notice before and after TRUNCATE, the current identity value is changed from 4 (before) to 1 (after) -- other differences, performance, trigger firing, replication-involved table. log size, lock required, parallel work (in delete, another transaction can still insert while -- this is not possible, if truncate is ongoing) -- these may need more sophisticated test environment setup, I just ommit here on purpose.
One theory I'd like to hear is that a TRUNCATE is equal to a DROP and RE-CREATE of a table and with this theory, we can explain most of differences, such as why an Identity value is reset, why a table cannot be truncated if it is being referenced, etc.
Summary
This tip proposes questions for a technical interview for DBAs. It aims to make the interview tough, so only candidates who are strong in theory and hands-on experience will survive.
Because no one can memorize all the T-SQL syntax, a local copy of Books Online is recommended to be available for candidates, but not access to the internet.
Next Steps
All the above-mentioned questions are not difficult to answer verbally for a senior DBA, but to answer via code is not as simple and only those who have rich hands-on experience will stand out.
I purposely left out the SNAPSHOT isolation level in my answer of Q2, so try to code an example that demos the transaction behavior under the SNAPSHOT isolation and explain its benefits.
Prepare your own "codeable" questions for new hires and please share your thoughts on how to conduct quality interviews for DBAs!
The following list of articles provide technical details for the questions we have discussed:
- Understanding SQL Server Physical Joins
- Compare Repeatable Read and Serializable SQL Server Transaction Isolation Levels
- Understanding SQL Server Locking
- Differences between Delete and Truncate in SQL Server
- Demonstrations of Transaction Isolation Levels in SQL Server
- Comparison of SQL Server Serializable and Snapshot isolation levels
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2017-01-18