May I have a suggestion for a web scraper to capture information from popup links?

  • 19-04-2015 01:46AM
    #1
    Registered Users, Registered Users 2 Posts: 74 ✭✭


    I have been web scraping with import.io for the past few days and it was working perfectly, but I have now arrived at a business directory that uses popup links. You can only see the information when you click a link on the page, so it cannot be opened in a new tab or anything like that. Import.io does not recognize the information when you select it.

    I downloaded the free trial of Outwit, and it looks a bit promising. I can see the information that I want in a list, which is one step of the way, but it is not grouped with the rest of the relevant info. At least it recognizes it.

    I sent Outwit an email asking whether I would be able to do this if I purchased the full version, but I thought I'd ask here too to get your opinion, or to find out if you know of another place I could go for an answer. I'm very new to web scraping myself.

    Thanks

    Thomas :)


Comments

  • Registered Users, Registered Users 2 Posts: 7,188 ✭✭✭Talisman


    Do you have some coding experience? If you can write some code, implementing your own solution wouldn't be that difficult.

    If the page that you want to scrape uses JavaScript to show the popups, it is most likely calling the window.open() method to open the popup window. You can override window.open() with your own implementation, which could be as simple as opening the URL in the current window.

    Something like:
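    window.open = function (open) {
        // `open` holds the original window.open in case you want to restore it later.
        return function (url, name, specs) {
            // Load the popup's URL in the top-level window instead of a new one.
            window.top.location = url;
            // window.open() normally returns a window reference, or null on failure.
            return null;
        };
    }(window.open);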
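
    If you would rather stay on the directory page and just collect the popup addresses to scrape afterwards, a variation on the same override is to record each URL instead of navigating to it. This is only a rough sketch: capturePopupUrls is a made-up helper name, fakeWindow is a stand-in object so the snippet runs outside a browser, and the /listing/... paths are placeholders. It also assumes the site really does go through window.open().

```javascript
// Hypothetical helper (not part of any library): replaces win.open with a
// function that records the requested URL instead of opening a popup.
function capturePopupUrls(win) {
    var captured = [];
    var originalOpen = win.open;

    win.open = function (url, name, specs) {
        captured.push(String(url)); // remember the address for later scraping
        return null;                // tell the caller that no window was opened
    };

    return {
        urls: captured,
        // Put the original window.open back when you are done.
        restore: function () { win.open = originalOpen; }
    };
}

// Stand-in for the browser's window object; in a browser console you would
// pass `window` itself. The paths below are placeholders.
var fakeWindow = { open: function () { return {}; } };
var capture = capturePopupUrls(fakeWindow);

fakeWindow.open("/listing/1", "popup", "width=400");
fakeWindow.open("/listing/2", "popup", "width=400");

console.log(capture.urls);
```

    Returning null may upset page scripts that expect a window object back from window.open(), so treat this as a starting point and check how the site reacts.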
    

    There are web scraping frameworks that can process JavaScript (see "Web Scraping Ajax and Javascript Sites"), and some are themselves written in JavaScript, e.g. pjscrape or CasperJS.

