XPath vs SQL

  • 20-02-2008 11:02am
    #1
    Registered Users Posts: 2,931 ✭✭✭


    Hi folks

    A difficult one.

    I am developing a large Windows Workflow that will be hosted in SharePoint 2007. Part of this workflow will submit information to another system in the form of XML and files. The other system is a Linux-based web application.

    As an example of the workflow, the file, along with its metadata in XML, will be placed on the Linux computer. The web app will then use the XML metadata to allow people to search for the file. Make sense so far?

    The other developer will mostly be using PHP, which is fine. What he wants (and this is the major thing I have a problem with) is for the XML file that I will be sending him to be maintained by me.. Ok.. not really.

    This XML file will hold all the metadata for every file that I send to him or have sent to him in the past. In other words, I would maintain his XML datastore, rather than sending just the XML data for the current file and possibly an update file (which would contain instructions to, for example, remove older data or update other parts).

    Now the XML file that he wants me to send grows by about 2K for each file detailed in it, and the expected usage is 10,000 files. So that is 20,000K, or about 20MB. That is problem number 1.

    Problem 2 as I see it, based on his sample code, is that he will use XPath to search the XML file for the metadata. I think that is a bit nuts on a web application with a worst-case estimated load of 1000 simultaneous users and an average of 200. Also, looking at his code, it's suboptimal: it returns multiple records and then loops through them to evaluate each one, rather than returning just 1 record as I would do it. No server-side sorting where I can help it; let the database format everything for you where practical.
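
To make the "multiple records then loop" versus "just 1 record" difference concrete, here is a minimal sketch. It is written in Python rather than the PHP used in the thread, and the `<files>`/`<file>` layout, the `id` attribute and both function names are assumptions for illustration only, since the real schema is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata layout; the thread does not show the real schema.
SAMPLE = """
<files>
  <file id="1"><title>Budget 2007</title><author>ann</author></file>
  <file id="2"><title>Budget 2008</title><author>bob</author></file>
  <file id="3"><title>Minutes</title><author>ann</author></file>
</files>
"""


def find_by_loop(root, file_id):
    """Select every <file> node, then filter in application code."""
    for node in root.findall("./file"):
        if node.get("id") == file_id:
            return node
    return None


def find_by_predicate(root, file_id):
    """Let the XPath predicate pick out the single matching record."""
    return root.find(f"./file[@id='{file_id}']")


root = ET.fromstring(SAMPLE)
loop_hit = find_by_loop(root, "2")
pred_hit = find_by_predicate(root, "2")
```

Both calls find the same node. Either way the whole document has already been parsed before the search starts, so the predicate form tidies up the query but does not remove the cost of loading the file.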

    I suggested the following solution. I send the files along with an XML instruction file that would allow him to process the files and put the XML into a database with an indexing strategy that ensures quick retrieval. That way I don't need to send a massive file each time and he just has to query the database. Unfortunately he sees this as too complex.
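
A rough sketch of what the receiving side of that proposal could look like, again in Python rather than PHP and using SQLite as a stand-in for whatever database would really be chosen. The `<upsert>` and `<delete>` instruction elements, the table layout and the function names are all hypothetical; the thread does not define an instruction format.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical instruction file: one small document per delivery.
INSTRUCTIONS = """
<instructions>
  <upsert id="1" title="Budget 2007" author="ann"/>
  <upsert id="2" title="Budget 2008" author="bob"/>
  <upsert id="1" title="Budget 2007 (revised)" author="ann"/>
  <delete id="2"/>
</instructions>
"""


def open_store(path=":memory:"):
    """Create the metadata table plus an index on the searched column."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(id TEXT PRIMARY KEY, title TEXT, author TEXT)"
    )
    db.execute("CREATE INDEX IF NOT EXISTS idx_files_author ON files (author)")
    return db


def apply_instructions(db, xml_text):
    """Apply each instruction in document order, then commit once."""
    for op in ET.fromstring(xml_text):
        if op.tag == "upsert":
            db.execute(
                "INSERT OR REPLACE INTO files (id, title, author) VALUES (?, ?, ?)",
                (op.get("id"), op.get("title"), op.get("author")),
            )
        elif op.tag == "delete":
            db.execute("DELETE FROM files WHERE id = ?", (op.get("id"),))
    db.commit()


db = open_store()
apply_instructions(db, INSTRUCTIONS)
rows = db.execute(
    "SELECT id, title FROM files WHERE author = ? ORDER BY id", ("ann",)
).fetchall()
```

Each delivery then only carries the changes, and searches become indexed queries instead of a scan over one large XML file.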

    So my questions

    Is the solution above too complex?

    When a file is opened with doc(), is the full file loaded into memory? Also, is it possible to allow shared access to the file, so that it's only loaded once?

    These are meant to be loosely coupled solutions, and I am thinking the instruction file solution allows us to change the whole system on either side provided the XML contracts are maintained.

    Any advice would be welcomed.


Comments

  • Closed Accounts Posts: 81 ✭✭dzy


    I agree. I think it makes sense just to send the diff. Like you said, over time the size of the data to send might become too large.

    It sounds like he is being lazy. Perhaps as a compromise you could work together on building an updatable storage system.

    I'd also go with the database storage option. If you keep the documents as a large XML object in memory, you'll have to deal with update concurrency, which could be a bit of a pain. Plus, you'll need to make sure you have enough memory to hold the object. I would imagine lookup times would be faster when using a database than with XPath. Plus, it seems a more standard and common-sense approach to use a database.
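
As a back-of-the-envelope check on the memory point, the figures from the original post can be multiplied out. This sketch assumes the pessimistic case where every request loads its own copy of the file, which the thread does not confirm; the function names are illustrative only.

```python
# Figures quoted in the original post.
KB_PER_FILE = 2
EXPECTED_FILES = 10_000
AVERAGE_USERS = 200
WORST_CASE_USERS = 1_000


def store_size_kb(files=EXPECTED_FILES, kb_per_file=KB_PER_FILE):
    """Size of the single XML metadata file, in KB."""
    return files * kb_per_file


def raw_kb_if_loaded_per_request(users):
    """Raw XML text held at once if each request loads its own copy (assumption)."""
    return store_size_kb() * users


store_kb = store_size_kb()
average_kb = raw_kb_if_loaded_per_request(AVERAGE_USERS)
worst_kb = raw_kb_if_loaded_per_request(WORST_CASE_USERS)
```

That comes to 20,000K for the file itself, about 4,000,000K (roughly 4GB) at the average load and about 20,000,000K (roughly 20GB) in the worst case. A parsed tree usually takes more memory than the raw text, so these are best read as lower bounds for that scenario.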


  • Registered Users Posts: 2,931 ✭✭✭Ginger


    Time to organise a meeting, methinks.. I am fairly sure there will be resistance from him when I propose this, but overall it will be easy enough to convince the main sponsor!

