
scraping a website for info

  • 29-05-2012 7:54am
    #1
    Registered Users, Registered Users 2 Posts: 6,088 ✭✭✭


    Apologies if this has been asked before.

    I'm in the process of having a website redeveloped & would like to be able to scrape some other sites for info (product pricing) to display on my own site.

    In the manner of "how long is a piece of string", would this be a difficult thing to do? Would it be expensive to have developed into the site?

    Your advice/assistance is appreciated.


Comments

  • Registered Users, Registered Users 2 Posts: 11,264 ✭✭✭✭jester77


    Scraping could leave you with some legal issues unless you have permission from the site's owner. If you have permission, ask them to expose the data through an API. Not only would it be quicker to access the data, it would also be easier to develop against, and you wouldn't have to worry about UI changes on the site.

    I've not used any dedicated scraping tools, but I have used HtmlUnit for testing and you could easily use that for scraping. It's very straightforward to use, and you can access elements by ID, CSS selector or XPath.
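
    To give a feel for what looking an element up by ID involves, here is a minimal sketch in Python rather than HtmlUnit, using only the standard library. The markup, the `name` and `price` IDs and the `text_by_id` helper are all made up for illustration, and it assumes the page is well-formed; a real page would be fetched over HTTP and would usually need a more forgiving HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product markup used only for this sketch.
PAGE = """
<html>
  <body>
    <div class="product">
      <h2 id="name">Widget</h2>
      <span id="price">19.99</span>
    </div>
  </body>
</html>
"""


def text_by_id(root, element_id):
    """Return the text of the first element with the given id, or None."""
    # ElementTree supports a small XPath subset, including [@attr='value'].
    node = root.find(f".//*[@id='{element_id}']")
    return node.text if node is not None else None


root = ET.fromstring(PAGE)
name = text_by_id(root, "name")
price = float(text_by_id(root, "price"))
print(name, price)
```

    The same lookup in HtmlUnit or any other tool comes down to the same idea: find the element by a stable identifier, then read its text.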


  • Registered Users, Registered Users 2 Posts: 648 ✭✭✭Freddio


    It all depends on how good the HTML you are scraping is. For example:
    The total is &#8364; <span id="total">99.99</span>
    

    is a format that would give more consistent results than:
    The total is &#8364; 99.99
    

    If the sites you are scraping change you will have to rewrite your parsing code.

    Having said that, it doesn't take very long to write scraping code (in PHP anyway).
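
    As a rough sketch of why the first format is easier to work with, the snippet below (Python standard library rather than PHP) reads the total from the `id="total"` span, and falls back to a text pattern for the plain version. The class and function names and the reworded sample strings are assumptions made up for illustration, not anything from a real site.

```python
import re
from html.parser import HTMLParser


class IdTextExtractor(HTMLParser):
    """Collects the text inside the first element whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.capturing = False
        self.text = None

    def handle_starttag(self, tag, attrs):
        if self.text is None and dict(attrs).get("id") == self.target_id:
            self.capturing = True

    def handle_data(self, data):
        if self.capturing:
            self.text = data.strip()
            self.capturing = False

    def handle_endtag(self, tag):
        # Stop capturing if the target element turned out to be empty.
        self.capturing = False


def total_by_id(html):
    """Read the total from the element marked id="total"."""
    parser = IdTextExtractor("total")
    parser.feed(html)
    parser.close()
    return float(parser.text) if parser.text else None


def total_by_regex(html):
    """Read the total from plain text; depends on the exact wording."""
    match = re.search(r"The total is\s*(?:&#8364;|€)\s*([0-9]+(?:\.[0-9]+)?)", html)
    return float(match.group(1)) if match else None


marked_up = 'The total is &#8364; <span id="total">99.99</span>'
plain = "The total is &#8364; 99.99"
# Hypothetical rewording of the same page after a site update.
reworded_marked_up = 'Your order comes to &#8364; <span id="total">99.99</span>'
reworded_plain = "Your order comes to &#8364; 99.99"

print(total_by_id(marked_up), total_by_regex(plain))
print(total_by_id(reworded_marked_up), total_by_regex(reworded_plain))
```

    The ID-based lookup keeps working when the surrounding wording changes, while the text pattern has to be rewritten, which is the maintenance cost mentioned above.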


  • Registered Users, Registered Users 2 Posts: 6,088 ✭✭✭OU812


    Thank you both for your replies. I'm not in the development industry and would prefer to speak to someone face to face or over the phone before diving in. Is there an area on here where I can contact someone to explain exactly what is needed before getting a quote for the work?


  • Registered Users, Registered Users 2 Posts: 648 ✭✭✭Freddio


    pm sent

