If you know nothing about Statistics, you've come to the right place. I'm not going to bore you with long technical definitions that make no sense to people who don't know anything about it. I'm going to show everything with examples and visuals. You just sit back and relax.
I have a Customers table. When I search for 'Abigail Parks' I get 5 records.
And I have 3 indexes for this table. First is the clustered index, which is created automatically for the primary key column. The second and third ones are non-clustered indexes, one for the NameSurname column and the other one is for the Birthdate column. Now I'm wondering which index SQL used for my query. Was it able to find NameSurname index? When I click on Display Estimated Execution Plan button I get this:
SQL was able to find the right index. In my next query, I try to find how many customers were born on the same day with the first 'Abigail Parks' on my list, which is 1973-06-05.
I have 512 people who were born that day. Now I check again if SQL was able to find the right index.
It could find the right Index again. What if I say: Select * From Customers Where NameSurname = 'Abigail Parks' and Birthdate = '1973-06-05'? Which Index is going to choose? Smart choice would be choosing the NameSurname Index since it has fewer rows. Let's see which one it's going to choose.
To optimize our query performance, we need to keep our statistics up to date. Evey time we rebuild or reorganize our indexes, these statistics get updated automatically. Now I want to confuse SQL and see what happens. I'm going to add 1000 more 'Abigail Parks' to my table with stored procedure and see which index SQL is going to choose.
After adding 1000 more 'Abigail Parks' I asked the same question again.
It was a wrong choice this time. The reason for that, those statistics are not updated. When I checked the statistics, I saw that 'Abigail Parks' was still recorded as 5 rows. So what's the price that we pay for not updated statistics? The good thing is we can choose which index to use at our query and by using SET STATISTICS IO ON formula we can see how many pages it has to read for every index.
This is the price that we pay. If my statistics were updated, SQL was going to read 107 pages for that query. But it wasn't updated and it used NameSurname column and had to read 2088 pages. If it was updated, my query would work 20 times faster.
What to do to force SQL to use the right index? Using the WITH condition every time wouldn't be effective. The only way to do it effectively is by updating statistics. We have 4 different ways to do that. We can rebuild or reorganize our indexes. But it might not be possible to do it every day, because when our system is working, rebuilding or reorganizing indexes slows our system. Luckily, we can update our statistics without touching our indexes by using SP_UPDATESTATS. This command updates every statistic in our server. Or we can choose which table's statistics to update by using the UPDATE command: UPDATE STATISTICS 'TableName'
To optimize our queries we need to update our statistics regularly. But doing it every time by writing a query might not be the best option. In my opinion, the best option is to add it to our database maintenance plan, which is a subject for another post.
Comments