Screen scraping web program

  • 05-06-2003 08:42PM
    #1
    Registered Users, Registered Users 2 Posts: 237 ✭✭


    Would anyone know how easy or difficult it would be to write a program that interrogates a website and downloads all fares? E.g. for ryanair.com, put in a route and then get all dates and prices over a one-month range.
    aerlingus.com is a bit painful because it's slow and I have to keep putting in the towns and dates each time.


Comments

  • Registered Users, Registered Users 2 Posts: 7,742 ✭✭✭mneylon


    I don't know how hard it would be to write it tbh, but you could look at some of the scrapers available for JSP


  • Banned (with Prison Access) Posts: 16,659 ✭✭✭✭dahamsta


    Your best bet for a question like this is ILUG, there's some world-class scraper-writers on the list. Justin Mason's pretty hot at it, as I recall.

    adam


  • Closed Accounts Posts: 304 ✭✭Zaltais


    As regards this specific application, I don't know how valuable it would be, as airline rates are based on availability and your information would become 'stale' very quickly.


  • Registered Users, Registered Users 2 Posts: 1,842 ✭✭✭phaxx


    Take a look at www.flytowork.ie - isn't that what you're trying to do?


  • Registered Users, Registered Users 2 Posts: 7,812 ✭✭✭jmcc


    Originally posted by lukegriffen
    Would anyone know how easy or difficult it would be to write a program that interrogates a website and downloads all fares? E.g. for ryanair.com, put in a route and then get all dates and prices over a one-month range.
    aerlingus.com is a bit painful because it's slow and I have to keep putting in the towns and dates each time.

    Not particularly difficult, unless sessions are involved, provided you have a good knowledge of the language you want to use, of HTTP and of regular expressions. It is basically four operations:

    Creating the query.
    Presenting the query.
    Grabbing the results page.
    Processing the results.
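
    A rough sketch of those four operations in Python, for illustration only. The endpoint, the parameter names (from, to, date) and the HTML layout matched by the regular expression are all made up; a real airline site will look different and may need session handling on top.

```python
import re
import urllib.parse
import urllib.request

# Hypothetical endpoint and parameter names; a real airline site will differ.
BASE_URL = "https://fares.example.com/search"


def build_query(origin, destination, date):
    """Operation 1: create the query for one route and one date."""
    params = {"from": origin, "to": destination, "date": date}
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch(url):
    """Operations 2 and 3: present the query and grab the results page."""
    request = urllib.request.Request(url, headers={"User-Agent": "fare-check/0.1"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


# Assumed markup: <td class="date">2003-06-14</td><td class="fare">19.99</td>
FARE_ROW = re.compile(
    r'<td class="date">(\d{4}-\d{2}-\d{2})</td>\s*'
    r'<td class="fare">(\d+\.\d{2})</td>'
)


def parse_fares(html):
    """Operation 4: process the results page into (date, price) pairs."""
    return [(date, float(price)) for date, price in FARE_ROW.findall(html)]
```

    In use it would be something like parse_fares(fetch(build_query("DUB", "STN", "2003-06-14"))), repeated for each date of interest.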

    If you are running a comparison of fares over routes, then you would need some sort of database to do it properly. Ryanair's site seems to be well integrated. The problem with the data is that it is volatile - seats/booking data changes are live etc.
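
    As a sketch of the database side, something as small as SQLite would probably do for personal use. The table layout and function names below are assumptions for illustration. Each row carries the time it was scraped, so stale prices can be spotted given how volatile the data is.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small fare store; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS fares (
               airline TEXT, origin TEXT, destination TEXT,
               travel_date TEXT, price REAL, scraped_at TEXT)"""
    )
    return conn


def record_fare(conn, airline, origin, destination, travel_date, price, scraped_at):
    """Store one scraped fare together with the time it was scraped."""
    conn.execute(
        "INSERT INTO fares VALUES (?, ?, ?, ?, ?, ?)",
        (airline, origin, destination, travel_date, price, scraped_at),
    )
    conn.commit()


def cheapest_by_date(conn, origin, destination):
    """Return (travel_date, lowest price) for a route, ordered by date."""
    rows = conn.execute(
        """SELECT travel_date, MIN(price) FROM fares
           WHERE origin = ? AND destination = ?
           GROUP BY travel_date ORDER BY travel_date""",
        (origin, destination),
    )
    return rows.fetchall()
```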

    Is this just a passing idea or is it intended to be part of a website? Also, what language/platform will be used?

    The application is a bit more complex than a simple scraper/spider such as the ones used by search engines. (Writing spiders can be a bit more complex than writing scrapers ;) as the data changes so often and some webdevs invent their own META data categories.)

    This link may be useful but you would need a Perl capable box, some Perl familiarity and perhaps some Perl heads to sort out a handler (scraper). http://www.newsclipper.com/

    There is a tendency among SysAdmins to block scrapers.
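
    One way to reduce the chance of being blocked is to check the site's robots.txt before fetching anything. The sketch below uses Python's standard urllib.robotparser; the user agent name and the example rules are made up, and passing this check does not mean a site's terms of use permit scraping.

```python
import urllib.robotparser


def allowed_by_robots(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: everything under /search is off limits to all robots.
EXAMPLE_RULES = "User-agent: *\nDisallow: /search\n"
```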

    Regards...jmcc




  • Registered Users, Registered Users 2 Posts: 237 ✭✭lukegriffen


    The information would be just for my own use, for booking weekends away, so once I got the info , I'd then book within the hour. It would just save a lot of time trying to key in different dates on different routes.
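
    The keying-in part could be sketched roughly like this in Python: build every route and date combination once, then feed each one to whatever does the lookup. Treating Friday as the departure day for a weekend away is an assumption, as are the function names.

```python
from datetime import date, timedelta
from itertools import product


def date_range(start, days):
    """All dates from start, for the given number of days."""
    return [start + timedelta(days=n) for n in range(days)]


def weekend_queries(routes, start, days=30):
    """Pair every route with every Friday in the window (assumed departure day)."""
    fridays = [d for d in date_range(start, days) if d.weekday() == 4]
    return [
        (origin, destination, day.isoformat())
        for (origin, destination), day in product(routes, fridays)
    ]
```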

    Thanks for all the replies.

