
Ireland wide DNS resolution issues

  • 03-03-2015 11:47am
    #1
    Moderators, Motoring & Transport Moderators Posts: 6,521 Mod ✭✭✭✭


    Folks,

    Not sure if here or Net & comms is the best place, but I guess that as it's primarily a broadband issue, I'll start here, and we'll see where this goes.

    I do computer support work for a number of people, and in recent weeks I've been seeing a broadband issue that cuts across pretty much all Irish ISPs. On that basis, I'm looking for any network techs who may be able to help with guidance, and maybe logs, that will assist with analysing the problem.

    The issue is that selecting a web page is often failing, with either an error message or a long (close on a minute) delay before the page eventually displays. If I hit stop and refresh, I can usually guarantee that the page will then display very rapidly.

    I am personally using OpenDNS as my DNS service, rather than the ISP (Imagine WiMAX) servers, and it's less of a problem here than it is for some of the other affected users, who are using their ISP's DNS service.

    This seems to be happening across a wide range of ISPs (wired, wireless and cable/fibre) and across the spectrum of operating systems and browsers, on both Windows and smartphone devices (I don't get involved with Apple Macs). The only thing that seems to be common to them is that it appears to be related to DNS resolution.

    So, how many other users out there are seeing this issue, and have you had any luck in resolving it?

    My own suspicion is that it's either a server in the INEX loop around Dublin that's being overloaded or DDoS attacked, or there is a server or router in the core loop that is corrupting packets and not replying correctly to the server that issued the request. That would result in a thread hang until timeout, at which point the command is reissued or fails, depending on the type of request, and gets through on the next attempt.

    The problem with analysing this issue is that it's probably outside the ISP networks, somewhere upstream in the servers that are beyond the ISPs and part of the overall web structure. As such, the end user is going to have trouble getting enough information, and analysing it in enough depth, to determine what's going wrong. And if it's not within the ISPs' local networks, they may be less inclined to get deeply involved in trying to resolve the issue unless enough users are screaming at them.

    In order to pin down a bit more information, I am guessing that using something like Wireshark would get some information about the point in the transaction where the hang is happening, but I doubt that will give enough detail about the intermediate servers in the link that are all involved in handling and routing the packets, and analysing Wireshark logs to the depth needed is going to be a long process.

    So, am I barking up the wrong tree here, or is this an issue that other users are seeing on a regular basis?

    Shore, if it was easy, everybody would be doin it.😁




Comments

  • Registered Users Posts: 2,213 ✭✭✭MajesticDonkey


    Now that you mention it....

    Over the last few days I'm having terrible issues with my connection akin to what you described - web pages (most of the time) not loading at all, and sometimes taking a long time to connect and then loading fine.


  • Registered Users Posts: 36,166 ✭✭✭✭ED E


    I've only just skimmed this but to jump in quickly, the ISP DNS servers won't be live querying, they'll cache results, so you should only see that behaviour with obscure sites if they were being held up further along.

    I'll grab my samknows data now.

    Last 6 months of DNS queries on UPC:

    [Image: vTYdqNh.png]


  • Registered Users Posts: 57 ✭✭nickhilliard


    I am personally using OpenDNS as my DNS service, rather than the ISP (Imagine WiMAX) servers, and it's less of a problem here than it is for some of the other affected users, who are using their ISP's DNS service.

    if you're seeing the same sort of dns resolution failures on opendns as you're seeing on ISP DNS servers, that provides a pretty authoritative answer to your suspicion. I.e. if there is a problem, it's not a problem which is related to Irish dns infrastructure, because the nearest opendns server to the Imagine network is based in London, and any dns request you make is unlikely to use dns infrastructure over here.
    My own suspicion is that it's either a server in the INEX loop around Dublin that's being overloaded or DDoS attacked
    Eh no. There are two different root DNS server systems connected to INEX (I-root and J-root), and also a large number of top-level domains (.com, .net, .org, .biz, .ie and lots of other ccTLDs). None of these systems is overloaded.
    or there is a server or router in the core loop that is corrupting packets and not replying correctly to the server that issued the request
    You're not getting the degree to which DNS is a highly distributed service. If a single server or router starts acting up in a dns resolution request, it will often be temporarily cut out of future dns resolution requests by the dns server.
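To make the "cut out of future requests" idea a bit more concrete, here is a toy model in Python. It is only a sketch of the general behaviour described above, not how any particular DNS server implements it: real resolvers typically rank name servers by measured response time, and the `ServerPicker` class and the 30 second penalty are invented for illustration.

```python
import time


class ServerPicker:
    """Toy model of a resolver skipping name servers that recently failed."""

    def __init__(self, servers, penalty_seconds=30.0, clock=time.monotonic):
        self._servers = list(servers)
        self._penalty = penalty_seconds      # assumed value, purely illustrative
        self._clock = clock
        self._skip_until = {}                # server -> time it may be used again

    def report_failure(self, server):
        """Record a timeout or bad reply so the server is avoided for a while."""
        self._skip_until[server] = self._clock() + self._penalty

    def candidates(self):
        """Return the servers worth trying now, in their original order."""
        now = self._clock()
        healthy = [
            s for s in self._servers
            if self._skip_until.get(s, float("-inf")) <= now
        ]
        # If every server is penalised, fall back to trying all of them.
        return healthy or list(self._servers)
```

The point of the sketch is that one misbehaving box only costs a single failed attempt before traffic moves elsewhere, which is why a single bad server rarely produces a sustained, country-wide symptom.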

    If you want to diagnose this further, the tool to use is called "dig". It's available on any unix lookalike system (you may need to install BIND). E.g. this command will do a full trace for what happens when you look up www.boards.ie:
    dig a www.boards.ie +trace
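If there's no Unix box handy to run dig on, a rough way to at least see whether the name lookup itself is the slow part is to time it from Python. This is only a sketch: it goes through the operating system's normal resolver (so it does nothing like `+trace` and may be answered from a local cache), and the function names `time_lookup` and `summarise` are made up for illustration.

```python
import socket
import time


def time_lookup(hostname, resolve=socket.getaddrinfo, clock=time.perf_counter):
    """Return (seconds_taken, error_or_None) for one name lookup.

    `resolve` and `clock` can be swapped out so the function can be
    exercised without a network; by default the OS resolver is used.
    """
    start = clock()
    try:
        resolve(hostname, 80)
        error = None
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        error = exc
    return clock() - start, error


def summarise(hostname, attempts=5, **kwargs):
    """Run several lookups and report the slowest time and the failure count."""
    results = [time_lookup(hostname, **kwargs) for _ in range(attempts)]
    slowest = max(seconds for seconds, _ in results)
    failures = sum(1 for _, error in results if error is not None)
    return {"attempts": attempts, "slowest": slowest, "failures": failures}
```

Calling `summarise("www.boards.ie")` a few times while the problem is happening should show whether lookups are taking seconds or failing outright. If they come back quickly every time, the delay is probably somewhere other than DNS.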
    

    Nick


  • Registered Users Posts: 57 ✭✭nickhilliard


    ED E wrote: »
    I've only just skimmed this but to jump in quickly, the ISP DNS servers wont be live querying, they'll cache results
    all dns resolvers cache results, regardless of whether it's opendns, google or your ISP's DNS. This is an explicit part of the DNS protocol and the entire dns system would fall over within milliseconds if result caching didn't exist.
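For anyone unfamiliar with how that caching works, a minimal sketch in Python: each answer comes with a time-to-live (TTL), and the resolver reuses the answer until the TTL runs out. The `TtlCache` class and the `upstream` callable are hypothetical names for illustration; real resolvers also cache negative answers and handle many record types.

```python
import time


class TtlCache:
    """Toy resolver cache: answers are reused until their TTL expires."""

    def __init__(self, upstream, clock=time.monotonic):
        self._upstream = upstream      # callable: name -> (address, ttl_seconds)
        self._clock = clock
        self._entries = {}             # name -> (address, expiry_time)
        self.upstream_queries = 0      # how often we had to ask upstream

    def resolve(self, name):
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]           # still fresh: no upstream traffic at all
        address, ttl = self._upstream(name)
        self.upstream_queries += 1
        self._entries[name] = (address, now + ttl)
        return address
```

With a 60 second TTL, any number of lookups inside that minute cost one upstream query. The flip side is that a bad answer, once cached, is handed to every client until it expires.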

    Nick


  • Registered Users Posts: 57 ✭✭nickhilliard


    I am personally using OpenDNS as my DNS service, rather than the ISP (Imagine WiMAX) servers
    On a separate issue, you will generally do yourself and your clients a massive favour by using the ISP's DNS server that they hand out when you connect to the service. The reason for this is that many Content Distribution Networks use the IP address of the DNS server to decide how to send traffic to you. E.g. if you use your local ISP's DNS server, the CDN will figure out that the request is coming from Ireland and it will attempt to serve you from a nearby network.

    On the other hand, if you use DNS servers in the UK or the US, the CDN may try to serve you from a UK or a US based CDN deployment. This may result in a measurable drop in performance.
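A rough sketch of the decision a CDN's DNS has to make, in Python. Everything here is assumed for illustration: the prefix table uses documentation-only address ranges, the region labels are invented, and real CDNs use far richer data. It only shows why the resolver's address matters when no EDNS Client Subnet hint is passed along.

```python
import ipaddress

# Hypothetical prefix -> region table (documentation-only address ranges).
REGION_BY_PREFIX = {
    ipaddress.ip_network("192.0.2.0/24"): "IE",
    ipaddress.ip_network("198.51.100.0/24"): "UK",
    ipaddress.ip_network("203.0.113.0/24"): "US",
}


def region_for(address, default="unknown"):
    """Map a single IP address to a region using the table above."""
    ip = ipaddress.ip_address(address)
    for prefix, region in REGION_BY_PREFIX.items():
        if ip in prefix:
            return region
    return default


def pick_region(resolver_ip, client_subnet=None):
    """Choose a serving region roughly the way a CDN's DNS might.

    If the resolver forwards an EDNS Client Subnet hint, use that;
    otherwise all the CDN can see is the resolver's own address.
    """
    if client_subnet is not None:
        hint = ipaddress.ip_network(client_subnet, strict=False).network_address
        return region_for(str(hint))
    return region_for(resolver_ip)
```

So a user in the "IE" range asking through a resolver in the "UK" range gets mapped to "UK" unless the resolver passes on the client subnet.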

    If you want the technical details, they're all here: http://www.cdnplanet.com/blog/which-cdns-support-edns-client-subnet/

    The CDN support table is still mostly accurate.

    Nick


  • Registered Users Posts: 7,325 ✭✭✭jmcc


    The issue is that selecting a web page is often failing, with either an error message or a long (close on a minute) delay before the page eventually displays. If I hit stop and refresh, I can usually guarantee that the page will then display very rapidly.
    The problem could be the multitude of web beacon, analytics and other external site calls in a web page. If one of these is slow or hangs, it can stop a webpage loading. It may not be a DNS issue at all.
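One quick way to see how many outside hosts a page depends on is to list them from the saved HTML. The sketch below is an assumption-laden illustration in Python: it only looks at `src` attributes and `<link href>` in static markup, so anything injected by scripts at run time will be missed, and the names `ExternalHostFinder` and `external_hosts` are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalHostFinder(HTMLParser):
    """Collect hostnames of resources a page loads from other sites."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host
        self.external_hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # src on any tag, plus href on <link> (stylesheets, icons, etc.)
            is_resource = name == "src" or (tag == "link" and name == "href")
            if is_resource and value:
                host = urlparse(value).hostname
                if host and host != self.page_host:
                    self.external_hosts.add(host)


def external_hosts(html, page_host):
    """Return a sorted list of third-party hosts referenced by the page."""
    finder = ExternalHostFinder(page_host)
    finder.feed(html)
    return sorted(finder.external_hosts)
```

Each host in that list is another name to resolve and another server that can stall, so a long list makes a single slow beacon or analytics call a plausible culprit.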

    Regards...jmcc


  • Banned (with Prison Access) Posts: 31 yumyum10


    all dns resolvers cache results, regardless of whether it's opendns, google or your ISP's DNS. This is an explicit part of the DNS protocol and the entire dns system would fall over within milliseconds if result caching didn't exist.

    Nick
    Why do you put your name at the end of every post?


  • Moderators, Motoring & Transport Moderators Posts: 6,521 Mod ✭✭✭✭Irish Steve


    Nick (and others),

    Lot there to think about, and appreciated.
    Open DNS is a lot less sticky than the local services, and my understanding of the later versions of Firefox (which is not authoritative) is that the timer spins anticlockwise while it's getting the resolution information, and then clockwise when it's getting the information it needs to display the page, and most of the time when it hangs, or fails to display, it never gets beyond the resolution phase.

    I don't have a non windows box here I can use right now, but I think it will be worth setting one up just to be able to give dig a run and see what's happening.

    The comments about using the local service DNS are interesting, in that with the recent issues, using OpenDNS has been providing faster responses with fewer errors than using the Imagine DNS servers, which is strange based on the other comments about performance.

    Perhaps I need to give the Imagine servers another chance, and see if it makes any difference or not.

    I guess the bigger issue that this thread is (hopefully) getting out into the open is that there is some sort of issue going on at the moment that's wider than just an individual ISP or even a selection of ISPs, and I'm not sure that we're yet at the point of knowing whether there is a web-wide issue for Ireland.

    At the moment, I'm aware of similar issues with Imagine, Vodafone, UPC, Ripplecom and Sky. Two of them are wireless providers, UPC is cable, and Vodafone is non-fibre. On the Sky side of things, I don't know what services are affected, only that it's being discussed as a problem on their network.

    So, I guess the best thing is to let this run for a while, see how many responses we get, and then see how to investigate further.

    Shore, if it was easy, everybody would be doin it.😁



  • Registered Users Posts: 36,166 ✭✭✭✭ED E


    UPC have specifically redirected to mobile CDNs before, so I bypass them and use Google now. (Off topic)


  • Banned (with Prison Access) Posts: 31 yumyum10


    It's all congestion, whether it's at the local exchange or on the routing in Europe that the ISPs use, with the ISPs not paying for or investing in enough bandwidth. That's the problem.


  • Registered Users Posts: 7,325 ✭✭✭jmcc


    Nick (and others),

    Lot there to think about, and appreciated.
    Open DNS is a lot less sticky than the local services, and my understanding of the later versions of Firefox (which is not authoritative) is that the timer spins anticlockwise while it's getting the resolution information, and then clockwise when it's getting the information it needs to display the page, and most of the time when it hangs, or fails to display, it never gets beyond the resolution phase.
    It might be pointless, but did you try similar tests on problematic websites with Chrome? (From what I read, Chrome seems to handle Javascript a lot better than older versions of Firefox.)

    Regards...jmcc


  • Closed Accounts Posts: 5,361 ✭✭✭Boskowski


    Same problem here. Particularly off-putting was that it coincided with my fibre activation. Anyway.
    Putting the Google DNS servers into my config seems to serve as a workaround.


  • Registered Users Posts: 7,325 ✭✭✭jmcc


    Getting back to the point that Nick raised earlier, if some of the sites are doing IP sniffing to determine the location of the user in order to serve country-specific content, they might be using one of those dodgy IP>country databases which often only go down to a /24 or Class C. Some new IP ranges used by ISPs are often not included or are included in the IP allocations for the ISP's parent company and thus appear as having the same country as the parent company. (UPC had problems with this and I think that some sites were reporting the user's location as being Austria rather than Ireland because these databases hadn't been updated.)
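To illustrate how a stale database produces the wrong country, here is a small Python sketch of a longest-prefix lookup. The address ranges and country assignments are entirely made up (private address space is used on purpose), and `lookup_country` is a hypothetical helper, not any real geolocation product's API.

```python
import ipaddress

# Hypothetical data: the parent company's large block is registered in "AT".
STALE_DB = [
    (ipaddress.ip_network("10.20.0.0/16"), "AT"),
]

# The same database after the Irish ISP's newer /24 has been added.
FRESH_DB = STALE_DB + [
    (ipaddress.ip_network("10.20.30.0/24"), "IE"),
]


def lookup_country(address, database):
    """Return the country of the most specific prefix containing the address."""
    ip = ipaddress.ip_address(address)
    matches = [(net, country) for net, country in database if ip in net]
    if not matches:
        return None
    # Longest prefix wins, as with routing tables.
    return max(matches, key=lambda item: item[0].prefixlen)[1]
```

Looking up 10.20.30.7 in the stale copy falls through to the parent block and returns "AT"; with the /24 present it returns "IE". Nothing about the user's connection changed, only the database.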

    Regards...jmcc


  • Moderators, Motoring & Transport Moderators Posts: 6,521 Mod ✭✭✭✭Irish Steve


    jmcc wrote: »
    It might be pointless, but did you try similar tests on problematic websites with Chrome? (From what I read, Chrome seems to handle Javascript a lot better than older versions of Firefox.)

    Regards...jmcc

    Yes, tried a number of browsers, and as for Firefox, the machine here (and most other users I'm involved with) is on the latest non-beta release, which is V36.

    Shore, if it was easy, everybody would be doin it.😁



  • Registered Users Posts: 57 ✭✭nickhilliard


    At the moment, I'm aware of similar issues with Imagine, Vodafone, UPC, Ripplecom and Sky. Two of them are wireless providers, UPC is cable, and Vodafone is non-fibre. On the Sky side of things, I don't know what services are affected, only that it's being discussed as a problem on their network.
    then you have your answer: these service providers use different transit providers, different DNS recursor server software, different physical transport media (vdsl/adsl/wireless/cable/etc) and are run by different people. In fact, there is pretty much nothing in common between all the companies, other than the fact that they all happen to operate on our favourite lump of rock in the atlantic ocean. This suggests that if you're seeing a problem common to all, it's not likely to be a problem on the service provider side.

    Nick


  • Registered Users Posts: 9,152 ✭✭✭limnam


    Kardashians on another photoshoot?

    Did you take a pcap of the issue and see where it's "slow" ?


  • Registered Users Posts: 36,166 ✭✭✭✭ED E


    As Nick has said, there's no commonality there. UPC is totally independent with HFC - own core - own transit. The DSL providers share transit and some infrastructure, and the WISPs piggyback on several different wholesalers. To see an issue with all of them is unusual.

    Time to get wireshark out at several of the sites and try and find a pattern.
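Once there is a capture, the pattern to look for is DNS queries that are answered late or not at all. A rough Python sketch of that matching step follows. It assumes the capture has already been reduced to simple (timestamp, transaction ID, is_response) tuples, which is an invented format; a real analysis would also key on client port and query name, since transaction IDs repeat.

```python
def slow_or_unanswered(events, threshold=1.0):
    """Pair DNS queries with responses by transaction ID.

    `events` is an iterable of (timestamp_seconds, txid, is_response)
    tuples. Returns (slow, unanswered): a list of (txid, delay) for
    answers slower than `threshold` seconds, and a sorted list of txids
    that never got a response.
    """
    pending = {}
    slow = []
    for timestamp, txid, is_response in sorted(events):
        if not is_response:
            # Keep the first send time so retries count towards the delay.
            pending.setdefault(txid, timestamp)
        elif txid in pending:
            delay = timestamp - pending.pop(txid)
            if delay > threshold:
                slow.append((txid, delay))
    return slow, sorted(pending)
```

Running the same check against captures from several sites and ISPs would show whether the slow lookups share a name, a record type or a client application.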


  • Moderators, Motoring & Transport Moderators Posts: 6,521 Mod ✭✭✭✭Irish Steve


    Looks that way, will be interesting to see what the response to this thread is from other users and other areas of the country.

    Shore, if it was easy, everybody would be doin it.😁



  • Registered Users Posts: 14,330 ✭✭✭✭jimmycrackcorm


    Interesting thread, as I'd noticed some web page failures and unusual slowness on my UPC connection. I was putting it down to my laptop being under strain, but if it happens again I might try changing the DNS.


  • Moderators, Computer Games Moderators, Technology & Internet Moderators, Help & Feedback Category Moderators Posts: 25,101 CMod ✭✭✭✭Spear


    This was caused by Firefox 36's release.

    They changed its DNS behaviour, which caused it to make spurious DNS requests for ANY records. This led to all sorts of issues, particularly in cases where some ISPs' DNS servers cached dud responses. You can read about some of the fallout here:

    https://bugzilla.mozilla.org/show_bug.cgi?id=1093983
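For anyone wondering what an ANY request actually is on the wire, here is a minimal Python sketch that builds a DNS query packet by hand, following the RFC 1035 message layout. It is only an illustration: the `build_query` helper and the fixed transaction ID are made up, and it does not reproduce what Firefox 36 itself sent.

```python
import struct

QTYPE_A = 1      # ordinary IPv4 address lookup
QTYPE_ANY = 255  # "*" in RFC 1035: a request for all records


def build_query(name, qtype, txid=0x1234):
    """Build a minimal DNS query packet in RFC 1035 wire format."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, then three zero counts.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # The name is sent as length-prefixed labels ending in a zero byte.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", qtype, 1)  # class IN
    return header + question
```

The two packets for `www.boards.ie` are identical apart from the two type bytes, but the answers can be very different: an ANY reply from a caching resolver may contain whatever records it happens to hold for the name, which is not necessarily the address the browser needs.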


  • Registered Users Posts: 5,684 ✭✭✭jd


    Boskowski wrote: »
    Same problem here. Particularly off throwing was that it coincided with my fibre activation. Anyway.
    Putting the Google DNS servers into my config seems to serve as a workaround.
    It could be the router was set up to act as a proxy for DNS. Try putting your ISP's DNS servers directly in the config rather than Google's.


  • Registered Users Posts: 57 ✭✭nickhilliard


    Interesting thread, as I'd noticed some web page failures and unusual slowness on my UPC connection. I was putting it down to my laptop being under strain, but if it happens again I might try changing the DNS.
    if you want better web performance, use a fast browser with a low memory footprint, install adblock, ghostery and noscript on your browser and leave your dns settings to be whatever is auto-assigned by the ISP. And get more memory on your computer + an SSD, and make sure no-one else is using upstream bandwidth for bittorrenting or uploading photos+videos from their phones to icloud/facebook/instagram or whatever.

    In other words, fix the things which have an actual link to performance.

    Nick


  • Moderators, Motoring & Transport Moderators Posts: 6,521 Mod ✭✭✭✭Irish Steve


    Spear wrote: »
    This was caused by Firefox 36's release.

    They changed its DNS behaviour, which caused it to make spurious DNS requests for ANY records. This led to all sorts of issues, particularly in cases where some ISPs' DNS servers cached dud responses. You can read about some of the fallout here:

    https://bugzilla.mozilla.org/show_bug.cgi?id=1093983

    My brain hurts :D There's clearly something strange and unexpected going on with Firefox, and they've decided it needs to be backed out sooner rather than later, and it looks like it's gone into the beta version, so I will do some checking, and see what the performance is like with that.

    Appreciate the heads up on that, looks like it was worth creating this thread, as finding that bug would not have been easy.

    Shore, if it was easy, everybody would be doin it.😁



  • Moderators, Computer Games Moderators, Technology & Internet Moderators, Help & Feedback Category Moderators Posts: 25,101 CMod ✭✭✭✭Spear


    My brain hurts :D There's clearly something strange and unexpected going on with Firefox, and they've decided it needs to be backed out sooner rather than later, and it looks like it's gone into the beta version, so I will do some checking, and see what the performance is like with that.

    Appreciate the heads up on that, looks like it was worth creating this thread, as finding that bug would not have been easy.

    This was the production release of Firefox that went live on the 24th or so, with the issue spreading over the next few days as people updated. You may not see any improvement, because the issue leaves you at the whim of other users: it's random whether or not a Firefox 36 user makes the DNS request that ends up with a dud answer cached on an ISP's DNS server. So either the DNS servers get reconfigured/patched, or Mozilla pushes out a patch for Firefox promptly.


  • Registered Users Posts: 57 ✭✭nickhilliard


    Spear wrote: »
    since it's random whether or not a Firefox 36 user makes the DNS request that ends up with a dud answer cached on an ISP's DNS server. So either the DNS servers get reconfigured/patched, or Mozilla pushes out a patch for Firefox promptly.
    Interesting. The bug discussion makes it clear that this is not a dns server problem, but a bug in FF36 caused by the FF developers misunderstanding what a DNS ANY query does. Andrew Sullivan's analysis is - as always - spot-on: https://lists.dns-oarc.net/pipermail/dns-operations/2015-February/012896.html

    Good catch @Spear! Mozilla will need to fix this. Alternatively it should go away if you drop back to FF35.

    @Irish Steve, this problem affects FF36 on windows only, not any other application or any other operating system.

    Nick


  • Registered Users Posts: 36,166 ✭✭✭✭ED E


    This is kinda like the 512K bug again, one hiccup in software causing distributed intermittent performance issues, though this time it's surprisingly in a browser.


  • Moderators, Motoring & Transport Moderators Posts: 6,521 Mod ✭✭✭✭Irish Steve


    OK, thanks for the follow up guys. Even if my analysis was wrong, I'm glad this came out, as the potential for wide-ranging effects was clearly significant, and I would be prepared to bet that it's been a factor in quite a few of the issues that have been reported in the last number of days.

    I'm going to go and have a closer look at the Firefox patch availability, and if it's not available as a patch for now, either use the beta that will have the changes already in it, or revert to 35. Hopefully the discussion here will have given a lot of other users a heads up about it, along with some of the network techs who are probably delighted to have something specific that can be blamed for the issues of the last week.

    Now if I could only get 3 to sort their phone network issues out....................

    Cheers

    Shore, if it was easy, everybody would be doin it.😁



  • Moderators, Motoring & Transport Moderators Posts: 6,521 Mod ✭✭✭✭Irish Steve


    Update for anyone that's using Firefox 36, which is causing some of the issues with page hangs and slow performance.

    Using Firefox, go into about:config, accept the warning if it gives you one, and then search for network.dns.get-ttl and change this setting to false

    According to the support team at Mozilla, this will make Firefox work in the way it did before the version 36 release, and a patch due in the next update cycle (due in a few days) should resolve this issue.

    Once this change has been applied, it should also be OK to revert to using the local ISP DNS servers, as there can be local content that is only served correctly by the ISP servers. If everything is working correctly, there's no essential requirement to change back to local DNS.

    After making this change, Firefox has to be closed and restarted for the config change to take effect.
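If the same setting has to go onto a lot of machines, one option is to put it in a `user.js` file in each Firefox profile folder, which Firefox reads at startup. The Python below is only a sketch of that approach: the profile folder location varies per machine and has to be supplied by you, and the `set_user_pref` helper is a made-up name.

```python
from pathlib import Path


def set_user_pref(profile_dir, name, value):
    """Add a boolean preference line to user.js in a Firefox profile folder."""
    line = 'user_pref("{}", {});\n'.format(name, "true" if value else "false")
    user_js = Path(profile_dir) / "user.js"
    existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
    if line not in existing:
        # Make sure the new line starts on its own line.
        if existing and not existing.endswith("\n"):
            existing += "\n"
        user_js.write_text(existing + line, encoding="utf-8")
    return user_js
```

For this thread's workaround the call would be `set_user_pref(profile_dir, "network.dns.get-ttl", False)`, with `profile_dir` pointing at the user's profile folder. The line should be removed again once the fixed Firefox release is installed, since user.js settings are reapplied on every start.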

    Hope this helps some of the people out there that have been having issues over the last week

    Shore, if it was easy, everybody would be doin it.😁



  • Registered Users Posts: 57 ✭✭nickhilliard


    Once this change has been applied, it should also be OK to revert to using the local ISP DNS servers, as there can be local content that is only served correctly by the ISP servers. If everything is working correctly, there's no essential requirement to change back to local DNS.

    just to clarify, the DNS ANY query bug in Firefox 36 has got absolutely nothing whatsoever to do with what DNS servers are used.

    It's simply a bug in the windows build of FF36 which has now been fixed in FF36.0.1: https://www.mozilla.org/en-US/firefox/36.0.1/releasenotes/

    Nick

