Scraping info from website

  • 05-11-2010 7:52pm
    #1
    Registered Users, Registered Users 2 Posts: 207 ✭✭


    I was hoping to start downloading information from a website on a daily basis in order to analyse the data. The files are published in HTML and in XML format. I then want to store the data on the hard drive of a laptop that has gone past its usefulness and operate it almost like a network.

    My query: is there a “scraper” that would allow me to download this publicly available information from the website on a daily basis and put it in a database for analysis? I am mildly technically minded and obviously use computers on a daily basis, but this is pushing the limits of my abilities, and I apologise if the query is a bit naive.

    Any help is appreciated.

    Shakeydude


Comments

  • Registered Users, Registered Users 2 Posts: 2,370 ✭✭✭Knasher


    There is probably an application that can parse websites in some automated way, but unless somebody comes up with a better suggestion, I'd recommend just using regular expressions to parse the raw XML/HTML and pull the data you need into a database yourself. Provided they publish the data in a reasonably standardized way (which their use of XML suggests they do), it really shouldn't be all that difficult.
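
    As a rough sketch of the regular-expression approach, assuming the feed looks something like the made-up sample below (the `item`, `name` and `price` tags are hypothetical; the real site's markup will differ and the pattern would need adjusting to match it):

```python
import re

# Hypothetical sample of the published XML; the real feed's tags will differ.
SAMPLE = """<items>
  <item><name>Widget</name><price>9.99</price></item>
  <item><name>Gadget</name><price>24.50</price></item>
</items>"""

# One match per <item>, capturing the name and the price.
# re.DOTALL lets the pattern span line breaks inside an item.
ITEM_RE = re.compile(
    r"<item>\s*<name>(.*?)</name>\s*<price>(.*?)</price>\s*</item>",
    re.DOTALL,
)


def extract_items(text):
    """Return a list of (name, price) tuples found in the raw text."""
    return [(name.strip(), float(price)) for name, price in ITEM_RE.findall(text)]


if __name__ == "__main__":
    for name, price in extract_items(SAMPLE):
        print(name, price)
```

    This only holds up while the markup stays as regular as the sample; if the layout changes, the pattern has to change with it.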


  • Registered Users, Registered Users 2 Posts: 1,530 ✭✭✭CptSternn


    Yeah, if the data is already in XML format, what would you need a scraper for?

    Just write a script that will download the XML and put it into a database.
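
    Something along these lines might do as a starting point. It is only a sketch: the URL, the `item`/`name`/`price` tags, and the table layout are all assumptions made up for illustration, and it uses SQLite simply because it ships with Python.

```python
import sqlite3
import urllib.request
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical address; replace with the real location of the XML file.
FEED_URL = "https://example.com/data.xml"


def fetch_xml(url):
    """Download the raw XML document as bytes."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def parse_items(xml_bytes):
    """Turn the XML into (name, price) rows; tag names are assumed."""
    root = ET.fromstring(xml_bytes)
    return [
        (item.findtext("name", default="").strip(),
         float(item.findtext("price", default="0")))
        for item in root.iter("item")
    ]


def store_items(conn, rows, fetched_on):
    """Append the rows to a table, tagged with the download date."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(fetched_on TEXT, name TEXT, price REAL)"
    )
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [(fetched_on, name, price) for name, price in rows],
    )
    conn.commit()


def run_once(db_path="scrape.db"):
    """One download-parse-store pass; meant to be run once a day."""
    rows = parse_items(fetch_xml(FEED_URL))
    conn = sqlite3.connect(db_path)
    try:
        store_items(conn, rows, date.today().isoformat())
    finally:
        conn.close()


if __name__ == "__main__":
    run_once()
```

    To run it daily, the script could be scheduled with cron on Linux or Task Scheduler on Windows on the old laptop.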

