Webgraphs and the Irish Webscape

  • 15-05-2013 3:03pm
    #1
    Registered Users, Registered Users 2 Posts: 7,521 ✭✭✭


    I've been working on building an Irish Search Engine (still). One of the aspects is modeling the webgraph of the Irish webscape. What is astounding is the way that Social Media has changed things. The preliminary stats are below:
    | Link | % of all sites | % of active sites |
    | www.facebook.com | 10.9835 | 25.1549 |
    | twitter.com | 6.9223 | 15.8538 |
    | www.youtube.com | 3.0860 | 7.0677 |
    | www.linkedin.com | 1.5461 | 3.5409 |
    | www.twitter.com | 1.4191 | 3.2500 |
    | www.blacknight.com | 1.4053 | 3.2184 |
    | www.adobe.com | 1.2411 | 2.8425 |
    | wordpress.org | 0.9411 | 2.1554 |
    | plus.google.com | 0.7059 | 1.6166 |
    | validator.w3.org | 0.7028 | 1.6095 |
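
A quick arithmetic check on the table: if both percentage columns divide the same count of linking sites by two different totals (an assumption; the post does not spell this out), then the ratio of the two columns gives the share of all surveyed sites that counted as active. A minimal sketch, with the rows copied from the table and a hypothetical helper name:

```python
# Rows copied from the table: (link target, % of all sites, % of active sites).
rows = [
    ("www.facebook.com", 10.9835, 25.1549),
    ("twitter.com", 6.9223, 15.8538),
    ("www.youtube.com", 3.0860, 7.0677),
    ("www.linkedin.com", 1.5461, 3.5409),
    ("www.twitter.com", 1.4191, 3.2500),
    ("www.blacknight.com", 1.4053, 3.2184),
    ("www.adobe.com", 1.2411, 2.8425),
    ("wordpress.org", 0.9411, 2.1554),
    ("plus.google.com", 0.7059, 1.6166),
    ("validator.w3.org", 0.7028, 1.6095),
]


def implied_active_share(pct_all, pct_active):
    """Assumes both percentages share one numerator (number of linking sites).

    Then pct_all / pct_active == active_sites / all_sites.
    """
    return pct_all / pct_active


shares = [implied_active_share(pct_all, pct_active) for _, pct_all, pct_active in rows]

for (host, _, _), share in zip(rows, shares):
    print(f"{host:20s} implied active share: {share:.4f}")
```

Under that assumption every row gives nearly the same ratio, which suggests that a little under 44% of the surveyed sites were treated as active.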

    It is a very crude survey and I haven't analysed anchor text or deep site links. These are index-page-only linkages. Inactive, hacked, holding-page, for-sale, PPC-parked and in-zone redirect sites have been removed. Some of the FB and Twitter links may be the "Like This" or "Tweet this" links.

    The methodology is quite crude and I am working on a better link extractor/processor in Python. Much of this came about through building a better web spam filter.
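
For readers wondering what an index-page link survey of this kind might look like, here is a rough sketch using only the Python standard library. It is not the extractor described in the post: the names `LinkCollector`, `external_hosts` and `link_percentages`, the sample pages and the site totals are all made up for illustration. It assumes each site's index page HTML has already been fetched, and that a site counts once per external host no matter how many links it carries.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def external_hosts(site_host, html):
    """Return the set of external hostnames linked from one index page."""
    parser = LinkCollector()
    parser.feed(html)
    base = f"http://{site_host}/"
    hosts = set()
    for href in parser.hrefs:
        try:
            # Resolve relative and protocol-relative links against the site.
            host = urlsplit(urljoin(base, href)).hostname
        except ValueError:
            continue  # skip malformed URLs
        if host and host != site_host:
            hosts.add(host)
    return hosts


def link_percentages(pages, total_sites):
    """Map each external host to (% of all sites, % of active sites).

    `pages` holds only the active sites; `total_sites` is the count of
    all surveyed sites, including those filtered out as inactive.
    """
    counts = Counter()
    for site_host, html in pages.items():
        counts.update(external_hosts(site_host, html))
    active_sites = len(pages)
    return {
        host: (100 * n / total_sites, 100 * n / active_sites)
        for host, n in counts.items()
    }


# Hypothetical sample: two active sites out of four surveyed.
sample_pages = {
    "example.ie": '<a href="https://www.facebook.com/example">FB</a>'
                  '<a href="/about">About</a>',
    "example2.ie": '<a href="http://www.facebook.com/x">FB</a>'
                   '<a href="//twitter.com/y">Twitter</a>',
}

for host, (pct_all, pct_active) in sorted(link_percentages(sample_pages, 4).items()):
    print(f"| {host} | {pct_all:.4f} | {pct_active:.4f} |")
```

Hostnames are compared exactly, so `twitter.com` and `www.twitter.com` stay separate rows, as they do in the table. Merging them would need an extra normalisation step.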

    Regards...jmcc


Comments

  • Registered Users, Registered Users 2 Posts: 300 ✭✭Speculator


    How are you getting on with the search engine?


  • Registered Users, Registered Users 2 Posts: 7,521 ✭✭✭jmcc


    Speculator wrote: »
    How are you getting on with the search engine?
    I have had a beta in operation since August, but I had to run a few website-to-IP-address surveys of all websites in com/net/org/biz/info/mobi/asia/us.

    Regards...jmcc

