Advertisement
If you have a new account but are having problems posting or verifying your account, please email us on hello@boards.ie for help. Thanks :)
Hello all! Please ensure that you are posting a new thread or question in the appropriate forum. The Feedback forum is overwhelmed with questions that are having to be moved elsewhere. If you need help to verify your account contact hello@boards.ie
Hi there,
There is an issue with role permissions that is being worked on at the moment.
If you are having trouble with access or permissions on regional forums please post here to get access: https://www.boards.ie/discussion/2058365403/you-do-not-have-permission-for-that#latest

Slow MySQL query

  • 02-04-2007 1:34pm
    #1
    Registered Users, Registered Users 2 Posts: 1,086 ✭✭✭


    Could someone please tell me what is the fastest way of perfroming this query. My queries seem to be running very slowly. I pasted one into my PhpMyAdmin and it took 15 seconds.

    I have 2 tables. One holding Book details and another holding Ads information.

    I want to return the 20 most recent ads placed and all their book information. The ISBNs link the ads with the book info.

    Currently I have

    SELECT ads.isbn, max(UNIX_TIMESTAMP(date_posted)) as date1, books.*
    FROM `ads`,books
    WHERE ads.isbn = books.isbn
    GROUP BY ads.isbn
    ORDER BY date1
    DESC limit 0, 20


    This is working very slow. Both tables "Books" and "Ads" are very big (7000+). Could someone please help with a quicker alternative?

    Thanks


Comments

  • Registered Users, Registered Users 2 Posts: 7,468 ✭✭✭Evil Phil


    Are there any indexes (or is it indices) set on both tables, if so what are they? It's not obvious where date_posted is coming from which is probably due to my lack of mySql knowledge. Are the max and UNIX_TIMESTAMP functions/methods slowing the query down with additional overhead?


  • Registered Users, Registered Users 2 Posts: 1,086 ✭✭✭Peter B


    Evil Phil wrote:
    Are there any indexes (or is it indices) set on both tables, if so what are they?

    I havn't set the indexes. Maybe I should set them on date_posted (ads table) and ISBN (books table)
    Evil Phil wrote:
    It's not obvious where date_posted is coming from which is probably due to my lack of mySql knowledge.

    Sorry, should have made that clearer. the date_posted value is coming from the ads table.
    Evil Phil wrote:
    Are the max and UNIX_TIMESTAMP functions/methods slowing the query down with additional overhead?

    This is true. Maybe I should deal with these values after I extract the resultset from the PHP.


    Does anyone know of a good way to check query speeds? The problem is MySQL is commiting them to cache so the second time running they are much quicker. This means I cannot compare them a multiple amount of times.

    Also I was thinking would left join (I think) be a better way of getting this info.

    Thanks


  • Registered Users, Registered Users 2 Posts: 15,443 ✭✭✭✭bonkey


    Peter B wrote:
    I want to return the 20 most recent ads placed and all their book information. The ISBNs link the ads with the book info.

    Currently I have

    SELECT ads.isbn, max(UNIX_TIMESTAMP(date_posted)) as date1, books.*
    FROM `ads`,books
    WHERE ads.isbn = books.isbn
    GROUP BY ads.isbn
    ORDER BY date1
    DESC limit 0, 20

    I'm not a MySql expert, but...

    I don't think you need a GROUP BY at all, nor a max function.

    IF you think about it, the 20 most recent ads would be given by:

    SELECT ads.*
    FROM `ads`
    ORDER BY ads.date_posted DESC
    limit 0, 20



    (I'm guessing on the limit here....I dunno how they're done in MySQl so I'm taking your code).

    Now...given that books.isbn should be unique, you should be able to just join in books to this query to get your results:


    SELECT ads.*, books.*
    FROM `ads`,books
    WHERE ads.isbn = books.isbn
    ORDER BY ads.date_posted DESC
    limit 0, 20

    This is working very slow. Both tables "Books" and "Ads" are very big (7000+).
    7000 isn't a lot of data. 7,000,000 might be a lot of data.

    Regardless of the size, though, if you don't have indexes, you should. At a guess, you need one on books.isbn, one on ads.isbn, and maybe one on ads.date_posted.

    jc


  • Registered Users, Registered Users 2 Posts: 2,931 ✭✭✭Ginger


    Also dont use .* use the actual names its quicker as the DBMS doesnt have enumerate the fields.. Indices when used (especially in SQL Server 200X) are ignored if the fields that are used in them are not specified in the query or are specified in the wrong order in the query


  • Registered Users, Registered Users 2 Posts: 3,280 ✭✭✭regi


    Can you run an EXPLAIN plan and give us the output?


  • Advertisement
  • Registered Users, Registered Users 2 Posts: 15,443 ✭✭✭✭bonkey


    Ginger wrote:
    Indices when used (especially in SQL Server 200X) are ignored if the fields that are used in them are not specified in the query or are specified in the wrong order in the query

    Indexing is a complex field which differs in implementation from DB to DB.

    Even if what you are saying is correct for MSSQL (and I would argue that its not entirely correct even there), it may be totally incorrect in a different DB system.


  • Registered Users, Registered Users 2 Posts: 2,931 ✭✭✭Ginger


    Well in SQL 2000 if you specified an index based on firstname,secondname,city and searched by secondname,firstname the index would be ignored..

    Similarly if you created the same index and searched by secondname or city it would be ignored. It will work with firstname.

    I do agree that indexing strategies vary by implementation from DB to DB tho, it was showing an example of creating indexes and how just creating them will not always increase the speed of your query.


  • Closed Accounts Posts: 19,777 ✭✭✭✭The Corinthian


    Are either ads.isbn or books.isbn primary keys? If not index them.


  • Registered Users, Registered Users 2 Posts: 15,443 ✭✭✭✭bonkey


    Ginger wrote:
    Well in SQL 2000 if you specified an index based on firstname,secondname,city and searched by secondname,firstname the index would be ignored..

    Assuming you mean a search of the form: WHERE secondname = X AND firstname = Y, then you're wrong. The index will be used in this situation, assuming there are enough records in the table to make index-usage more optimal over a table-scan. IF its not used in this situation, then it also wouldn't be used if you changed your WHERE clause to read WHERE firstname = Y AND secondname = X.

    The MSSQL optimiser is smart enough to test alternate field-order. You can test this easily with SHOWPLAN output. Write a query doing what you think will work, test that its using the index, then rearrange the WHERE clause. The same indexes should still be used.

    This limitation is one I've heard levelled at almost every RDBMS I've used. I can't think of a situation where it has been correct, unless one is talking about a version of the system which is over 10 years old.
    Similarly if you created the same index and searched by secondname or city it would be ignored.
    Not entirely correct. The index might be used, but only in special circumstances which are typically only ever discussed in stuff on advanced optimisation.

    If the only fields being used *anywhere* in the query were contained within the index, then scanning the index could be more optimal than scanning the table. In such a case, the index will be read in preference to the table and a SHOWPLAN would show an INDEX SCAN rather than a TABLE SCAN.


  • Registered Users, Registered Users 2 Posts: 2,931 ✭✭✭Ginger


    Actually retract all that and fully read your post.. boneky is correct in this case ..


  • Advertisement
  • Registered Users, Registered Users 2 Posts: 3,375 ✭✭✭kmick


    How about something along these lines (and I am no sql expert)

    select * from books where isbn in
    (Select isbn
    from ads
    order by date_posted DESC
    LIMIT 20);

    Your problem is you are return every row back in the select because of max(UNIX_TIMESTAMP(date_posted)) as date1


  • Registered Users, Registered Users 2 Posts: 2,781 ✭✭✭amen


    Regardless of the size, though, if you don't have indexes, you should.

    I don't totally agree with this statement. In certain cases on tables
    with a very small amount of data an index can actually slow the query


  • Registered Users, Registered Users 2 Posts: 15,443 ✭✭✭✭bonkey


    amen wrote:
    I don't totally agree with this statement. In certain cases on tables
    with a very small amount of data an index can actually slow the query

    Agreed, but only to the extent it would take the optimiser to say "no, I won't use this index cause a table-scan would be faster".


  • Registered Users, Registered Users 2 Posts: 1,086 ✭✭✭Peter B


    Thanks for all the help.

    I put in the indexes on all my tables and as far as I can tell it has reduced the the time from 15 seconds to 0.5 seconds.

    This is without any other changes.


Advertisement