Advertisement
If you have a new account but are having problems posting or verifying your account, please email us on hello@boards.ie for help. Thanks :)
Hello all! Please ensure that you are posting a new thread or question in the appropriate forum. The Feedback forum is overwhelmed with questions that are having to be moved elsewhere. If you need help to verify your account contact hello@boards.ie
Hi there,
There is an issue with role permissions that is being worked on at the moment.
If you are having trouble with access or permissions on regional forums please post here to get access: https://www.boards.ie/discussion/2058365403/you-do-not-have-permission-for-that#latest

SQL "NOT IN" alternative

  • 16-07-2012 10:03am
    #1
    Registered Users, Registered Users 2 Posts: 1,832 ✭✭✭


    I'm adding/updating approx 50,000 products into a table from an Excel sheet. I want to mark the other approx 200,000 products in the table as "OLD". My initial SQL statement was;
    UPDATE ProductTable SET
    [flagfield] = 'OLD'
    WHERE ProductIdColumn NOT IN (List of Imported ProductIDs)
    

    But this isn't suitable for the number of products now being imported. Any alternatives? A join perhaps?


Comments

  • Registered Users, Registered Users 2 Posts: 1,712 ✭✭✭neil_hosey


    I'm adding/updating approx 50,000 products into a table from an Excel sheet. I want to mark the other approx 200,000 products in the table as "OLD". My initial SQL statement was;
    UPDATE ProductTable SET
    [flagfield] = 'OLD'
    WHERE ProductIdColumn NOT IN (List of Imported ProductIDs)
    

    But this isn't suitable for the number of products now being imported. Any alternatives? A join perhaps?

    yeah do a join on the same table.


  • Moderators, Society & Culture Moderators Posts: 9,689 Mod ✭✭✭✭stevenmu


    One option would be to set everything to 'OLD' first, then when you add/update you can flag those records as 'NEW' (or whatever value you use).

    Other than that, I think you'd need some type of criteria on the rows by which you can determine which ones to flag as old. The obvious ones would be to have either a date modified field, then you can just flag everything where date modified is not equal to today, or maybe have a field for a batch id that you set on each batch, and then you can flag everything without that batch id as old.


  • Registered Users, Registered Users 2 Posts: 1,456 ✭✭✭FSL


    You could just do

    UPDATE ProductTable SET
    [flagfield] = 'OLD'
    WHERE ProductIdColumn NOT IN (SELECT productidfield from yourExcelSpreadsheet)


  • Closed Accounts Posts: 857 ✭✭✭rozeboosje


    UPDATE PT
    SET PT.[flagfield] = 'OLD'
    FROM ProductTable PT
    LEFT OUTER JOIN YOUREXCELSPREADSHEET ST
    ON ST.[productidfield] = PT.[productidcolumn]
    WHERE ST.[productidfield] IS NULL


  • Registered Users, Registered Users 2 Posts: 1,832 ✭✭✭CountingCrows


    FSL wrote: »
    You could just do

    UPDATE ProductTable SET
    [flagfield] = 'OLD'
    WHERE ProductIdColumn NOT IN (SELECT productidfield from yourExcelSpreadsheet)

    How do I reference the Excel Spreadsheet directly in the query - use OPENROWSET?


  • Advertisement
  • Closed Accounts Posts: 857 ✭✭✭rozeboosje


    'Fraid I can't help you with THAT one. OPENROWSET? Yeah. Maybe.


  • Closed Accounts Posts: 857 ✭✭✭rozeboosje


    Alternatively, I trust you have already created some kind of "import" routine that already works, no? So why not adapt that so that it imports the content of the excel spreadsheet into your database into a *different* table, and then Left Outer Join to that using something like the SQL I posted earlier. I would recommend using the left outer join though. I would imagine that that would be a LOT more efficient than the "NOT IN" syntax.


  • Registered Users, Registered Users 2 Posts: 1,832 ✭✭✭CountingCrows


    rozeboosje wrote: »
    Alternatively, I trust you have already created some kind of "import" routine that already works, no? So why not adapt that so that it imports the content of the excel spreadsheet into your database into a *different* table, and then Left Outer Join to that using something like the SQL I posted earlier. I would recommend using the left outer join though. I would imagine that that would be a LOT more efficient than the "NOT IN" syntax.

    Changing the existing database structure is really what I'm trying to avoid but looks like I might have little choice as performance is critical.


  • Closed Accounts Posts: 857 ✭✭✭rozeboosje


    You should be able to do this without changing the existing database structure, i.e. any of the tables that are already there; the "different" table would just be used for temporarily storing your spreadsheet data while you're importing it.


  • Closed Accounts Posts: 2,930 ✭✭✭COYW


    rozeboosje wrote: »
    You should be able to do this without changing the existing database structure, i.e. any of the tables that are already there; the "different" table would just be used for temporarily storing your spreadsheet data while you're importing it.

    I think this is the best and standard option. Once you are finished with this table OP you can just delete it.

    UPDATE ProductTable SET
    [flagfield] = 'OLD'
    WHERE ProductIdColumn NOT IN (SELECT productidfield from ProductTableExcel)

    DROP TABLE ProductTableExcel

    Finally, no harm in doing some normalizing by creating a small table for the 'flagfield' and joining that table to your ProductTable.


  • Advertisement
  • Registered Users, Registered Users 2 Posts: 1,456 ✭✭✭FSL


    I always use opendatasource to access Excel Spreadsheets eg

    OpenDataSource( 'Microsoft.Jet.OLEDB.4.0',
    'Data Source="D:\spreadsheetname.xls";Extended properties=Excel 8.0')...[Sheet1$]


Advertisement