
Marathon results data

  • 02-11-2014 10:26pm
    #1
    Registered Users Posts: 18


    Hi,

    I'm doing a data analytics project in college where I'm comparing the results of the Dublin marathon to other marathons such as Chicago, New York etc.

    Does anybody know of any good sources for archive data or know of any scripts/tools I could use to extract the data from the race websites themselves?

    The problem I'm coming across is the layout of the various sites and the fact that you often have to page through the results, e.g. on the Dublin marathon site it's 10 results at a time.

    I've also taken a look at MarathonGuide.com but they don't always have the level of detail I'm looking for, e.g. split times.


Comments

  • Registered Users Posts: 10,446 ✭✭✭✭Murph_D


    I would imagine the first thing any good researcher would do is to contact the organisations running these various marathons and ask them for the data in a suitable form, as well as asking permission to use the data in the manner you wish. If it's all public domain, which most of it seems to be, I'd imagine there would be few objections in principle, although you would probably have to be pretty persistent in your approach. Best of luck. What kind of things are you hoping to find?


  • Registered Users Posts: 15,704 ✭✭✭✭RayCun


    and go direct to the timing companies, not the races


  • Registered Users Posts: 6,184 ✭✭✭crisco10


    http://www.tdl.ltd.uk/race-results.php?year=2014&event_prefill=1806

    There is a link here to "Download All" but it doesn't work for DCM for me...? I wanted to have a look at the data too!


  • Registered Users Posts: 10,567 ✭✭✭✭28064212


    Cross-posted to the DCM thread, but pretty relevant here too. To get the data, I built a scraper that pulls down all the results from the TDL website and puts them in a file.
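A minimal sketch of that kind of paging scraper, as an illustration only and not the actual script behind these stats. It assumes the results come back as plain HTML table rows and that pages are selected with a `page` query parameter; both are assumptions, as are all the function names, so check the real site's markup and URLs before relying on any of it.

```python
import csv
from html.parser import HTMLParser
from typing import Callable, Iterable, Iterator, List
from urllib.parse import urlencode
from urllib.request import urlopen


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._row: List[str] = []
        self._cell: List[str] = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True
            self._cell = []

    def handle_data(self, data):
        if self._in_cell:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
            self._row.append("".join(self._cell).strip())
        elif tag == "tr" and self._row:
            # Rows with no <td> cells (e.g. header rows of <th>) are skipped.
            self.rows.append(self._row)
            self._row = []


def fetch_page(base_url: str, page: int) -> str:
    """Download one page of results.

    Assumption: base_url has no query string and the site accepts a
    hypothetical `page` parameter.
    """
    with urlopen(f"{base_url}?{urlencode({'page': page})}") as resp:
        return resp.read().decode("utf-8", errors="replace")


def scrape_all(fetch: Callable[[int], str], max_pages: int = 5000) -> Iterator[List[str]]:
    """Yield result rows page by page until a page contains no data rows."""
    for page in range(1, max_pages + 1):
        parser = TableRowParser()
        parser.feed(fetch(page))
        if not parser.rows:
            return
        yield from parser.rows


def save_csv(rows: Iterable[List[str]], path: str) -> None:
    """Write the scraped rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh).writerows(rows)
```

Usage would be something like `save_csv(scrape_all(lambda p: fetch_page(RESULTS_URL, p)), "results.csv")`, where `RESULTS_URL` is a placeholder for the real results page. Passing the fetch function in keeps the paging logic testable without hitting the site, and it is worth adding a short pause between requests when running it for real.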

    Inspired by Krusty_Clown's post last year, I threw together some stats for DCM 2014
    Stat explanations (all times are chip only):
    • Runners - Total number of runners
    • N/A - No split available (no chip time registered at half)
    • Equal Split - Same time taken for first and second half
    • Neg Split - Number of runners who ran the second half quicker
    • Pos Split - Number of runners who ran the second half slower
    • % Positive - Percentage of runners who had a positive split
    • Avg Neg - Of the runners who ran a negative split, the average time difference between the two halves
    • Avg Pos - Of the runners who ran a positive split, the average time difference between the two halves
    • /km 10k - Average pace (mins/km) for runners up to 10k
    • /km Half - Average pace (mins/km) for runners from 10k to 21.098k
    • /km 30k - Average pace (mins/km) for runners from 21.098k to 30k
    • /km End - Average pace (mins/km) for runners from 30k to finish
    • Avg Time - Average finishing time
    Stat|Sub 3|3 to Sub 4|4+|Top 1000|All
    Runners|||||12576
    DNF|||||354
    Finishers|316|4434|7472|1000|12222
    N/A|3|12|15|4|95
    Equal Split|0|1|0|0|1
    Neg Split|40|327|185|98|552
    Pos Split|273|4094|7272|898|11639
    % Positive|86.39%|92.33%|97.32%|89.80%|95.23%
    Avg Neg|-00:00:56|-00:02:22|-00:09:11|-00:01:25|-00:04:33
    Avg Pos|00:04:36|00:12:21|00:25:47|00:07:17|00:20:34
    /km 10k|00:03:59|00:05:01|00:06:11|00:04:10|00:05:47
    /km Half|00:03:57|00:04:58|00:06:19|00:04:08|00:05:52
    /km 30k|00:04:05|00:05:13|00:07:05|00:04:23|00:06:19
    /km End|00:04:12|00:05:38|00:07:40|00:04:40|00:06:50
    Avg Time|02:51:20|03:38:09|04:48:01|03:05:11|04:19:39
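The split columns above can be reproduced with something like the following sketch. It assumes each finisher is a pair of chip-time strings `(half, finish)` in `HH:MM:SS`, with `half` set to `None` when no halfway time was registered; that input layout and the function names are assumptions for illustration. The percentage is taken over all finishers in the band, which is consistent with the table (for Sub 3, 273 / 316 = 86.39%).

```python
from datetime import timedelta
from typing import Dict, Iterable, Optional, Tuple


def parse_hms(text: str) -> timedelta:
    """Parse an 'HH:MM:SS' string into a timedelta."""
    h, m, s = (int(part) for part in text.split(":"))
    return timedelta(hours=h, minutes=m, seconds=s)


def split_stats(results: Iterable[Tuple[Optional[str], str]]) -> Dict[str, object]:
    """Classify finishers by split type and average the differences.

    A negative difference means the second half was quicker.
    Differences are returned in seconds.
    """
    diffs = []
    missing = 0
    for half, finish in results:
        if half is None:
            missing += 1  # counted as N/A
            continue
        first = parse_hms(half)
        second = parse_hms(finish) - first
        diffs.append((second - first).total_seconds())

    neg = [d for d in diffs if d < 0]
    pos = [d for d in diffs if d > 0]
    finishers = len(diffs) + missing
    return {
        "finishers": finishers,
        "n/a": missing,
        "equal": sum(1 for d in diffs if d == 0),
        "neg": len(neg),
        "pos": len(pos),
        "pct_pos": 100 * len(pos) / finishers if finishers else 0.0,
        "avg_neg": sum(neg) / len(neg) if neg else None,
        "avg_pos": sum(pos) / len(pos) if pos else None,
    }
```

Filtering the input by finish time before calling `split_stats` would give the per-band columns (Sub 3, 3 to Sub 4, and so on).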

    And for comparison, 2013 (note that the 3rd split was at 20 mile, not 30k):

    Stat|Sub 3|3 to Sub 4|4+|Top 1000|All
    Runners|||||N/A
    DNF|||||N/A
    Finishers|387|5121|6776|1000|12284
    N/A|2|8|13|3|152
    Equal Split|0|10|0|0|10
    Neg Split|100|820|472|165|1392
    Pos Split|285|4283|6291|832|10859
    % Positive|73.64%|83.64%|92.84%|83.20%|88.40%
    Avg Neg|-00:01:37|-00:02:29|-00:04:47|-00:01:37|-00:03:12
    Avg Pos|00:03:25|00:08:48|00:20:00|00:05:15|00:15:09
    /km 10k|00:04:01|00:05:00|00:06:20|00:04:14|00:05:42
    /km Half|00:04:02|00:05:00|00:06:27|00:04:15|00:05:47
    /km 20m|00:04:03|00:05:09|00:07:04|00:04:18|00:06:11
    /km End|00:04:12|00:05:32|00:07:29|00:04:35|00:06:34
    Avg Time|02:51:42|03:37:53|04:48:13|03:02:56|04:15:14

    And 2012:
    Stat|Sub 3|3 to Sub 4|4+|Top 1000|All
    Runners|||||12377
    DNF|||||301
    Finishers|427|5318|6331|1000|12076
    N/A|10|59|93|17|292
    Equal Split|0|7|0|1|7
    Neg Split|137|1180|563|268|1880
    Pos Split|280|4072|5675|714|10027
    % Positive|65.57%|76.57%|89.64%|71.40%|83.03%
    Avg Neg|-00:01:26|-00:02:28|-00:05:00|-00:01:34|-00:03:09
    Avg Pos|00:02:50|00:07:36|00:19:15|00:04:01|00:14:04
    /km 10k|00:04:01|00:05:03|00:06:23|00:04:15|00:05:44
    /km Half|00:04:01|00:05:00|00:06:29|00:04:15|00:05:46
    /km 20m|00:04:02|00:05:09|00:07:03|00:04:20|00:06:10
    /km End|00:04:09|00:05:26|00:07:27|00:04:27|00:06:26
    Avg Time|02:51:00|03:37:26|04:48:50|03:01:21|04:13:13
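Because the third checkpoint moved from 20 miles (2012, 2013) to 30k (2014), the per-segment paces need the right distances for each year. A rough sketch for a single runner, assuming cumulative chip times in seconds at each checkpoint; the names are illustrative, and whether the tables average per-runner paces or total time over total distance is not something this sketch settles.

```python
from typing import List, Sequence

MILE_KM = 1.609344
HALF_KM = 21.098      # rounding used in the stat explanations
MARATHON_KM = 42.195

# Checkpoint distances in km: third split at 30k in 2014, 20 miles before that.
CHECKPOINTS_2014 = [10.0, HALF_KM, 30.0, MARATHON_KM]
CHECKPOINTS_20_MILE = [10.0, HALF_KM, 20 * MILE_KM, MARATHON_KM]


def segment_paces(cumulative_secs: Sequence[float], checkpoints_km: Sequence[float]) -> List[float]:
    """Return pace in seconds per km for each segment between checkpoints."""
    paces = []
    prev_t, prev_d = 0.0, 0.0
    for t, d in zip(cumulative_secs, checkpoints_km):
        paces.append((t - prev_t) / (d - prev_d))
        prev_t, prev_d = t, d
    return paces


def format_pace(sec_per_km: float) -> str:
    """Format a pace as HH:MM:SS per km, matching the tables."""
    minutes, seconds = divmod(round(sec_per_km), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
```

With these, the "/km 30k" and "/km 20m" rows are not directly comparable: the 20-mile segment runs from halfway to about 32.19 km, roughly 2.2 km longer than the 30k one.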

    Just in case you needed any more confirmation that 2014 was a particularly tough year! Times were significantly slower across the board, and the number of people running positive splits rose significantly at all levels.




  • Registered Users Posts: 15,704 ✭✭✭✭RayCun


    Thanks for posting that.
    Interesting that in previous years, the 3+ runners have a noticeable slowdown from the halfway mark, and the 4+ runners even more, but the sub-3 runners tended to hold it together more until the third checkpoint. This year everyone cracked earlier.


  • Registered Users Posts: 2,116 ✭✭✭Peterx


    Humidity was sky high this year; I think it was the single biggest factor in people tiring earlier than in other years. And it was very windy.

