Advertisement
If you have a new account but are having problems posting or verifying your account, please email us on hello@boards.ie for help. Thanks :)
Hello all! Please ensure that you are posting a new thread or question in the appropriate forum. The Feedback forum is overwhelmed with questions that are having to be moved elsewhere. If you need help to verify your account contact hello@boards.ie

Quick question regarding SQL Server 2000 Indexes

Options
  • 07-11-2005 7:08pm
    #1
    Registered Users Posts: 7,468 ✭✭✭


    Just a quick one folks but indexing on the field in the WHERE clause of a SELECT statement improves performance of said SELECT statement does it not?

    <edit>
    Perhaps I should elaborate a little bit, I'm indexing a varchar field, length 150, I know that all 150 chars will be included in the index but it's my understanding that indexing the field will speed things up (actually I know in this case it does).


Comments

  • Registered Users Posts: 4,003 ✭✭✭rsynnott


    In general, yes. (There's a price to pay, of course; it makes the database larger and marginally slows inserts, though not in a relevant way in most cases.) For instance, if you did: "SELECT bla FROM table WHERE name='Adolf Hitler';", indexing would be of great assistance. Now, some databases, and I seem to remember SQL Server being one of them, don't make use of the index very effectively when doing certain 'LIKE' searches, so "SELECT bla FROM table WHERE name LIKE '%Hitler%';" won't get as much of a boost. "SELECT bla FROM table WHERE name LIKE 'Adolf%';" may do better.

    In general, though, any WHERE clause should usually be using indexed columns.


  • Registered Users Posts: 640 ✭✭✭Kernel32


    If an index is used or not is also dependent on the size of the table(rows) and how up to date the database statistics are. If the statistics indicate that the table is small then the index will be ignored and a full table scan will be performed rather than going to the trouble of building an execution plan. If you know for a fact a table will remain small then indexes often have no affect on performance.


  • Registered Users Posts: 15,443 ✭✭✭✭bonkey


    rsynnott wrote:
    Now, some databases, and I seem to remember SQL Server being one of them, don't make use of the index very effectively when doing certain 'LIKE' searches

    Can you suggest a databases which - by your choice of words - does use an index effectively during "certain LIKE searches" (by which I assume you mean wilcard-at-start LIKE searching).

    I'm asking this as a genuine question...I'm not aware of any relational database which can do it efficiently. Indeed, I'm not even sure I can think of a mathematical algorithm which would let you do it efficiently.
    so "SELECT bla FROM table WHERE name LIKE '%Hitler%';" won't get as much of a boost. "SELECT bla FROM table WHERE name LIKE 'Adolf%';" may do better.

    Not entirely accurate....at least regarding how MSSQL uses indexes.

    'Adolf%' will be approximately as performant as 'Adolf', allowing for any disparity of records returned. In both cases, the index is used to find strings beginnnig with Adolf. In one case, only those containing only those 5 characters is returned, in the other anything beginning with those 5 is returned, but in both cases the index will be used in [pretty-much exactly the same manner.

    '%Adolf%' will simply not use the index*. The index will have no impact whatsoever on SELECT performance in this case, other than to cause the optimiser to consider possible indexes for use that it will then discard.
    Kernel32 wrote:
    If an index is used or not is also dependent on the size of the table(rows) and how up to date the database statistics are.
    I wasn't aware the age of the statistics had any impact whatsoever.

    The selectivity of an index is also important - what percentage of data is likely to be returned for a given value. For non-clustered indexes, it can be (but doesn't have to be) wasteful to use an index with low selectivity (i.e. where a large percentage of rows are returned).

    And you've just all reminded me that I need to go find out what has changed (and how) with indexes and query optimisation in SQL 2005.

    jc

    * If its a clustered index, then obviously the data will be read from what are the index leaves, but this is not the same as actually using the index to lookup the data.


  • Registered Users Posts: 2,781 ✭✭✭amen


    the age of the statistics is very important in MSQ SQL server
    if you have an index on table that was build with say 1000 rows and the table now has 100,000 and the table statistics have not been updated
    you may find that the index is ignored and full table scan is performed or the index is used and it is horribley slow

    you can schedule the table stats to be update on a regular basis
    I also run weekly job to recompile all stats, reindex the tables and recompile the sps
    you'd be amazed at the performance differences between doing the above and not doing


  • Registered Users Posts: 15,443 ✭✭✭✭bonkey


    amen wrote:
    the age of the statistics is very important in MSQ SQL server

    Respectfully, I disagree.
    if you have an index on table that was build with say 1000 rows and the table now has 100,000 and the table statistics have not been updated
    you may find that the index is ignored and full table scan is performed or the index is used and it is horribley slow
    Correct, but thats got nothing to do with the age of the statistics. It has to do with how well the statistics reflect the current data.

    You could have statistics which are 2 years old which reflect the data perfecty. You could have statistics which are 5 minutes old which are utterly obsolete.

    Thats what I was and am driving at - its not the age of the statistics per se which is the issue. Its the quantity and nature of data change since the statistics were built which is significant. The timeframe in which that change took place is irrelevant.

    It might seem like a terribly pedantic thing to be clarifying, but the point I'm kaing is that I've seen too many people running weekly scheduled "re-optimisation" of indexes on static data because they were led to believe the age of the statistics was important.

    jc


  • Advertisement
  • Registered Users Posts: 2,781 ✭✭✭amen


    You could have statistics which are 2 years old which reflect the data perfecty. You could have statistics which are 5 minutes old which are utterly obsolete

    sorry Bonkey I absolutley agree with you. You managed to say what I wanted to say in clearer manner.
    In my case I have a lot of data churn (millions of rows per week) and running the statistics is vital but of course I only run on those tables that have data changing all my static lookup tables are excluded (unless we have made change to the data which is rare)

    ahh databases are fun


Advertisement