Quick question regarding SQL Server 2000 Indexes

Evil Phil · 07-11-2005 07:08PM #1

Just a quick one folks but indexing on the field in the WHERE clause of a SELECT statement improves performance of said SELECT statement does it not?

<edit>
Perhaps I should elaborate a little bit, I'm indexing a varchar field, length 150, I know that all 150 chars will be included in the index but it's my understanding that indexing the field will speed things up (actually I know in this case it does).

rsynnott · 07-11-2005 11:31PM

In general, yes. (There's a price to pay, of course; it makes the database larger and marginally slows inserts, though not in a relevant way in most cases.) For instance, if you did: "SELECT bla FROM table WHERE name='Adolf Hitler';", indexing would be of great assistance. Now, some databases, and I seem to remember SQL Server being one of them, don't make use of the index very effectively when doing certain 'LIKE' searches, so "SELECT bla FROM table WHERE name LIKE '%Hitler%';" won't get as much of a boost. "SELECT bla FROM table WHERE name LIKE 'Adolf%';" may do better.

In general, though, any WHERE clause should usually be using indexed columns.

Kernel32 · 08-11-2005 03:42AM

If an index is used or not is also dependent on the size of the table(rows) and how up to date the database statistics are. If the statistics indicate that the table is small then the index will be ignored and a full table scan will be performed rather than going to the trouble of building an execution plan. If you know for a fact a table will remain small then indexes often have no affect on performance.

bonkey · 08-11-2005 09:27AM

rsynnott wrote:

Now, some databases, and I seem to remember SQL Server being one of them, don't make use of the index very effectively when doing certain 'LIKE' searches

Can you suggest a databases which - by your choice of words - does use an index effectively during "certain LIKE searches" (by which I assume you mean wilcard-at-start LIKE searching).

I'm asking this as a genuine question...I'm not aware of any relational database which can do it efficiently. Indeed, I'm not even sure I can think of a mathematical algorithm which would let you do it efficiently.

so "SELECT bla FROM table WHERE name LIKE '%Hitler%';" won't get as much of a boost. "SELECT bla FROM table WHERE name LIKE 'Adolf%';" may do better.

Not entirely accurate....at least regarding how MSSQL uses indexes.

'Adolf%' will be approximately as performant as 'Adolf', allowing for any disparity of records returned. In both cases, the index is used to find strings beginnnig with Adolf. In one case, only those containing only those 5 characters is returned, in the other anything beginning with those 5 is returned, but in both cases the index will be used in [pretty-much exactly the same manner.

'%Adolf%' will simply not use the index*. The index will have no impact whatsoever on SELECT performance in this case, other than to cause the optimiser to consider possible indexes for use that it will then discard.

Kernel32 wrote:

If an index is used or not is also dependent on the size of the table(rows) and how up to date the database statistics are.

I wasn't aware the age of the statistics had any impact whatsoever.

The selectivity of an index is also important - what percentage of data is likely to be returned for a given value. For non-clustered indexes, it can be (but doesn't have to be) wasteful to use an index with low selectivity (i.e. where a large percentage of rows are returned).

And you've just all reminded me that I need to go find out what has changed (and how) with indexes and query optimisation in SQL 2005.

jc

* If its a clustered index, then obviously the data will be read from what are the index leaves, but this is not the same as actually using the index to lookup the data.

amen · 08-11-2005 02:29PM

the age of the statistics is very important in MSQ SQL server
if you have an index on table that was build with say 1000 rows and the table now has 100,000 and the table statistics have not been updated
you may find that the index is ignored and full table scan is performed or the index is used and it is horribley slow

you can schedule the table stats to be update on a regular basis
I also run weekly job to recompile all stats, reindex the tables and recompile the sps
you'd be amazed at the performance differences between doing the above and not doing

bonkey · 08-11-2005 02:50PM

amen wrote:

the age of the statistics is very important in MSQ SQL server

Respectfully, I disagree.

if you have an index on table that was build with say 1000 rows and the table now has 100,000 and the table statistics have not been updated
you may find that the index is ignored and full table scan is performed or the index is used and it is horribley slow

Correct, but thats got nothing to do with the age of the statistics. It has to do with how well the statistics reflect the current data.

You could have statistics which are 2 years old which reflect the data perfecty. You could have statistics which are 5 minutes old which are utterly obsolete.

Thats what I was and am driving at - its not the age of the statistics per se which is the issue. Its the quantity and nature of data change since the statistics were built which is significant. The timeframe in which that change took place is irrelevant.

It might seem like a terribly pedantic thing to be clarifying, but the point I'm kaing is that I've seen too many people running weekly scheduled "re-optimisation" of indexes on static data because they were led to believe the age of the statistics was important.

jc

amen · 09-11-2005 02:45PM

You could have statistics which are 2 years old which reflect the data perfecty. You could have statistics which are 5 minutes old which are utterly obsolete

sorry Bonkey I absolutley agree with you. You managed to say what I wanted to say in clearer manner.
In my case I have a lot of data churn (millions of rows per week) and running the statistics is vital but of course I only run on those tables that have data changing all my static lookup tables are excluded (unless we have made change to the data which is rare)

ahh databases are fun

Quick question regarding SQL Server 2000 Indexes

Comments