
Analysing pedigrees


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    I'm dragging up an old thread.
    For the past few months I've been doing a bit of database programming. The idea is to compare flat (not fences or hurdles) pedigrees to race results.

    At this stage I have not done the comparison, but I have the basics on the pedigree side.
    I have a program that analyses pedigrees. I can load any number of horses, start it, and come back later for the results.
    I'll be tinkering with it a little, adding bits and pieces, putting in timing to find slow spots, and rewriting to speed things.

    What gave me a boost was a poster asking me by PM to look at the Tattersalls February catalogue. It has 488 lots.
    I could look at each lot on screen in my commercial pedigree program (TesioPower) and pick out the best imo.
    Instead I decided to put in some extra effort to finish the database program.

    It analysed the 488 Tattersalls lots in 44.53 seconds, or approx 0.09 seconds a horse.
    That was on my slow seven year old PC.
    I bought a new PC in November and that runs things in 32% of the time of the old PC, so it will analyse a horse pedigree in under 0.03 of a second.
    First it gathers the 126 ancestors in the first six generations (2,4,8,16,32,64), then analyses them.
    Fwiw in a sales catalogue they print the first three generations (14 ancestors).

    Plan
    Add new features to the six generation analysis (group winners, Derby winners, group winner producer).
    Compare the analysis to race performance.
    Give up if no link found between pedigree and performance :), or make changes.
    My next plan will be to analyse 7, 8, 9, 10, 11, 12 generations. This should be easy as I only have to increase the size of the ancestor database.
    Of course if I go from 6 generations to 7 generations the data doubles.
    A 12 generation analysis is much bigger (and slower) than a 6 generation, 8,190 horses v 126 horses, 65 times the size.
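
    The generation arithmetic above can be checked with a few lines of Python. This is only a sketch, not the database program described in the post; the function name is made up for illustration.

```python
def ancestors_in_generations(generations: int) -> int:
    """Total ancestors in the first `generations` generations: 2 + 4 + ... + 2**generations."""
    return sum(2 ** g for g in range(1, generations + 1))


# Sales catalogue depth, the current analysis, the next step, and the deepest plan.
for gens in (3, 6, 7, 12):
    print(gens, ancestors_in_generations(gens))

# How much bigger a 12 generation analysis is than a 6 generation one.
ratio = ancestors_in_generations(12) // ancestors_in_generations(6)
print("12 v 6 generations:", ratio, "times the size")
```

    Going from 6 to 7 generations takes the count from 126 to 254, and 12 generations gives 8,190, which is 65 times 126.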

    A byproduct of running the program is that it gave some strange results for a few horses, many from the 1800s.
    These are horses with incomplete pedigrees (some of these are half-bred non-thoroughbred).
    I go back, fix the data if I can, and run it again.
    The reason to test the program against so many pedigrees is to test it against complex pedigrees.

    I am very interested in full siblings in pedigrees
    (horses with same sire and dam e.g. sires Sadler's Wells and Fairy King are both by the sire Northern Dancer out of the dam Fairy Bridge).
    And I look for 3/4, 7/8 siblings.
    Examples of recent 3/4 siblings are:
    Frankel; Highland Reel; Intello; Roderic O'Connor; Sir Isaac Newton; Teofilo, all by Galileo out of a Danehill dam.
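
    A minimal Python sketch of how full and 3/4 siblings might be told apart, assuming a simple horse -> (sire, dam) lookup. The dictionary, the function name, and the working definition of a 3/4 sibling (one shared parent, with the two differing parents sharing exactly one parent of their own) are assumptions for illustration. The dams Kind and Speirbhean are supplied from general knowledge rather than from the post, the names of their dams are placeholders, and 7/8 siblings are not handled.

```python
# Hypothetical mini-pedigree: horse -> (sire, dam). Only the parents needed
# for the examples are filled in; a real database would hold every ancestor.
PARENTS = {
    "Sadler's Wells": ("Northern Dancer", "Fairy Bridge"),
    "Fairy King": ("Northern Dancer", "Fairy Bridge"),
    "Frankel": ("Galileo", "Kind"),
    "Teofilo": ("Galileo", "Speirbhean"),
    "Kind": ("Danehill", "dam of Kind"),              # placeholder dam name
    "Speirbhean": ("Danehill", "dam of Speirbhean"),  # placeholder dam name
}


def relationship(a, b):
    """Classify two horses as full, 3/4 or half siblings, or unrelated at this depth."""
    sire_a, dam_a = PARENTS[a]
    sire_b, dam_b = PARENTS[b]
    same_sire, same_dam = sire_a == sire_b, dam_a == dam_b
    if same_sire and same_dam:
        return "full"
    if same_sire or same_dam:
        # One parent is shared; compare the parents of the two differing parents.
        other_a, other_b = (dam_a, dam_b) if same_sire else (sire_a, sire_b)
        shared = set(PARENTS.get(other_a, ())) & set(PARENTS.get(other_b, ()))
        if len(shared) == 2:
            return "full in blood"   # the differing parents are themselves full siblings
        return "3/4" if len(shared) == 1 else "half"
    return "unrelated"


print(relationship("Sadler's Wells", "Fairy King"))
print(relationship("Frankel", "Teofilo"))
```

    Sadler's Wells and Fairy King come out as full siblings, and Frankel and Teofilo as 3/4 siblings (same sire, dams by the same sire).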

    Below is a summary decade by decade of full siblings in 6 generation pedigrees from 1800 to now.
    It gives an indication that horses were much more closely inbred in the past, probably because you walked your sire to a local mare.
    (Counting rule for full siblings A and B: one appearance of horse A and four of horse B are counted as five.)
    Horse populations in the early 1800s stayed in the same area (as did humans).
    When trains were invented (1830s) you could travel your mare.
    Motor transport (1890s) made travel even easier.
    Now you can fly the mare or stallion anywhere.

    In the table below you will see a spike in full siblings in pedigrees in the 1860s and 1870s.
    My guess is brothers Stockwell (1849) ("the emperor of stallions") and Rataplan (1850) are heavily involved.
    Galileo traces back in direct male line to Stockwell, as does almost everything else running today.
    The low full sibling numbers for the 1990s, 2000s, 2010s might be because many of these are low quality running horses, not breeding horses.

    Decade   Horses   Full Sib   Average
    180?          6         23       3.8
    181?         40        121       3.0
    182?        311        863       2.8
    183?        948       2407       2.5
    184?       1285       3642       2.8
    185?       1826       8846       4.8
    186?       2374      19072       8.0
    187?       3145      26993       8.6
    188?       4612      26988       5.9
    189?       5622      24467       4.4
    190?       6543      31694       4.8
    191?       7831      35914       4.6
    192?       9965      30673       3.1
    193?      11835      32875       2.8
    194?      16631      40744       2.4
    195?      25365      38647       1.5
    196?      27277      36406       1.3
    197?      36611      82902       2.3
    198?      50620     128464       2.5
    199?      69801     107813       1.5
    200?      83274      52267       0.6
    201?      15249       6288       0.4

    Total    381171                  3.4
    (the 3.4 is the mean of the 22 decade averages)


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    This week I completed database programs that analysed 159,660 horse pedigrees and output the results.
    Next is trying to learn from the results whether there are differences between better horses and lesser horses, and whether they are significant.

    The results data is 4116 rows by 40 columns = 164,640 cells.
    37 of the 40 columns are features in each pedigree, 16 from duplicated stallions, 16 from duplicated mares (duplicated mares are rare).
    Very few of the 37 fields from 4116 rows are filled (37x4116 = 152,292). 125,695 cells have a zero result (82.5%). Only 17.5% of cells are filled.

    It is an inbreeding/linebreeding analysis of six generations, counting the number of duplicated horses, groups of duplicated horses, sex of offspring of duplicated horses, siblings in pedigrees and so on.
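
    One way the duplicate counting could look in Python, as a rough sketch only. The horse -> (sire, dam) dictionary, the function names and the toy horse names are all hypothetical; the actual program and its data layout are not shown in the post.

```python
from collections import Counter


def gather_ancestors(horse, parents, generations=6):
    """Return a list of (generation, name) for every known ancestor.

    `parents` maps a horse to its (sire, dam). Horses missing from the
    mapping are skipped, which is how an incomplete pedigree would show up.
    """
    found = []
    current = [horse]
    for gen in range(1, generations + 1):
        next_generation = []
        for name in current:
            for parent in parents.get(name, ()):
                found.append((gen, parent))
                next_generation.append(parent)
        current = next_generation
    return found


def duplicated(ancestors):
    """Return {name: appearances} for ancestors that appear more than once."""
    counts = Counter(name for _, name in ancestors)
    return {name: n for name, n in counts.items() if n > 1}


# Toy two generation pedigree: "GS1" is the sire of both the sire and the dam.
toy_parents = {
    "Foal": ("Sire", "Dam"),
    "Sire": ("GS1", "GD1"),
    "Dam": ("GS1", "GD2"),
}
toy_ancestors = gather_ancestors("Foal", toy_parents, generations=2)
print(len(toy_ancestors), duplicated(toy_ancestors))
```

    With a complete six generation pedigree the list would hold 126 entries, and the duplicated names are the inbreedings/linebreedings to be analysed further.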
    If the results are useless I will move on to other ideas.


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    I don't know much about statistics but I think the Chi Squared test might be the way to see if my results are relevant.
    I've been swotting up on it in the last few days so caution is advised.

    First I split the data into three files: colts, geldings, fillies.
    There are 37,057 colts and that is where I started the tests.
    I split the colts into two groups 10,212 and 26,842 = 37,054 (3 missing?).

    Group A are the lower rated colts, Group B the higher rated colts.
    Of course the highest rated colt in group A will only be a fraction below the lowest rated colt in Group B.
    In hindsight I could have picked a higher point so that the split would be closer to 50/50.

    Actual Frequency       Group A     Group B     Total
    0                         4453       11394     15847
    1                         3876       10158     14034
    2                         1505        4116      5621
    3                          317         983      1300
    4                           58         171       229
    5                            3          20        23
    Total                    10212       26842     37054


    Expected Frequency     Group A     Group B     Total
    0                     4,367.40   11,479.60     15847
    1                     3,867.74   10,166.26     14034
    2                     1,549.14    4,071.86      5621
    3                       358.28      941.72      1300
    4                        63.11      165.89       229
    5                         6.34       16.66        23
    Total                    10212       26842     37054

    p-value 0.01806328 <--- chitest (actual: expected)


    Chi-Square Terms       Group A     Group B
    0                         1.68        0.64     = (4453-4367.4)^2/4367.4
    1                         0.02        0.01     (differences squared
    2                         1.26        0.48      to turn them positive,
    3                         4.76        1.81      then divided by expected)
    4                         0.41        0.16
    5                         1.76        0.67

    Chi-Square             13.64     <--- sum of the Chi-Square Terms values above
    Degrees of Freedom     5         <--- (data rows -1) * (data columns -1)
    Alpha                  0.01      <--- 1% level
    Critical Value         15.09     <--- critical chi-square value for 5 degrees of freedom (x2 distribution table)

    Decision               Reject - Group A & Group B are not different at 1% level

    Explanation            If the "Chi-Square" number is bigger than the "Critical Value"
                           then the differences between the groups are large
                           and not caused by chance


    This is from an Excel spreadsheet.
    The actual numbers are at the top.
    Then there is a calculation of the "expected" numbers.
    (4367.40 is 4453 x 15847/37054 and so on)
    The Chi-Squared Terms are the differences squared (to turn negative differences positive), divided by the expected number.
    The idea is that if the differences are big then Group A and Group B are very different, and the difference is not due to chance.

    The "Decision" near the end is a comment saying if Group A and Group B are similar or different.
    What I want is the pedigree factor I am testing to show a difference between the groups.
    I want Group B to have more of the factor, and that difference to be so large that it is not caused by chance.
    The above test "failed" at the 1% level: a difference this large would turn up by chance more often than one time in a hundred.
    But if I change the Alpha to 0.05 it changes to
    "Accept - Group A and Group B are different - significant at 5% level"
    or in other words a difference this large would turn up by chance less than one time in twenty.
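
    The spreadsheet steps above can be reproduced in plain Python. This is a sketch that mirrors the calculation described in the post, not the author's own program; the critical values are the standard chi-square table figures for 5 degrees of freedom, and the variable names are made up.

```python
# Observed counts from the post: rows are factor occurrences 0..5,
# columns are Group A (lower rated colts) and Group B (higher rated colts).
observed = [
    (4453, 11394),
    (3876, 10158),
    (1505, 4116),
    (317, 983),
    (58, 171),
    (3, 20),
]

row_totals = [a + b for a, b in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = row total * column total / grand total.
expected = [tuple(r * c / grand_total for c in col_totals) for r in row_totals]

# Sum of (observed - expected)^2 / expected over every cell.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
dof = (len(observed) - 1) * (len(col_totals) - 1)

# Critical values for 5 degrees of freedom, from a standard chi-square table.
CRITICAL = {0.05: 11.070, 0.01: 15.086}

print(f"chi-square = {chi_square:.2f}, degrees of freedom = {dof}")
for alpha, critical in CRITICAL.items():
    verdict = "different" if chi_square > critical else "not shown to differ"
    print(f"alpha {alpha}: critical value {critical} -> groups {verdict}")
```

    The statistic comes out at about 13.64 with 5 degrees of freedom, which is above the 5% critical value (11.070) and below the 1% critical value (15.086).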

    Then I thought I would see what the average ratings were for the colts with this factor.
    Please remember that there are 37 pedigree factors, and this is just one of the 37 factors, a factor I think might produce better colts.
    The other 36 factors are not yet tested statistically.
    I will try to write a database program to calculate the statistical result at 1% and 5% for colts, geldings, fillies, and for all 37 factors.
    It might be worthwhile to split the data further into 10% chunks from slowest horses to fastest.
    Another possibility is to use a random factor to split the horses randomly so that I do not use my opinions to select groups to test.

    What would happen if I calculated the average ratings of the 37,057 colts that have the factor once, twice, three times and upwards?
    Almost half the colts do not have this factor in their first six generations.
    The colts without this factor have an average rating of 74.36.
    The higher the count of the factor, it appears, the higher the rating.
    There are 23 horses with 5 occurrences. The 20 ratings listed here average 106.40:
    45, 65, 74, 83, 88, 90, 96, 107, 111, 112, 115, 118, 120, 121, 124, 127, 129, 131, 132, 140
    I should point out that much of the data is of the best horses over the past forty years, and small numbers are unreliable.


    Occurrence    Count   Average     Diff
    0            15,847     74.36
    1            14,034     76.32     1.96
    2             5,621     79.12     2.80
    3             1,300     85.05     5.93
    4               229     89.69     4.64
    5                23    106.40    16.71
    6                 3     91.00   -15.40

    Total        37,057
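
    A small Python sketch of the "average rating by occurrence count" table. The function name and the (count, rating) record layout are assumptions for illustration; the only real data used is the list of 20 ratings quoted above for the colts with 5 occurrences.

```python
from collections import defaultdict
from statistics import mean


def average_by_count(records):
    """records: iterable of (factor_count, rating) -> {count: (horses, average)}."""
    groups = defaultdict(list)
    for count, rating in records:
        groups[count].append(rating)
    return {c: (len(r), round(mean(r), 2)) for c, r in sorted(groups.items())}


# The 20 ratings listed in the post for colts with 5 occurrences.
five_occurrences = [45, 65, 74, 83, 88, 90, 96, 107, 111, 112,
                    115, 118, 120, 121, 124, 127, 129, 131, 132, 140]

result = average_by_count((5, rating) for rating in five_occurrences)
print(result)
```

    Feeding in every colt's (count, rating) pair instead of this one group would rebuild the whole table, and the Diff column is then just each average minus the one before it.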


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    I posted this (username amadán) on Irish Bloodstock Forums
    http://www.irishbloodstock.com/phpBB2/viewtopic.php?sid=236adfb3b370d928596f1c03e1bd7b3c&t=4342

    This is an old thread so this post might be overlooked.

    In Jan/Feb 2017 I wrote programs to compare a horse's six generation pedigree with its rating.
    To analyse deeper into a pedigree (7,8,9,10,11,12 gens) just needs the program to use a bigger ancestor database (I have those ready).
    The present 6 gen ancestor database is 126 horses (2+4+8+16+32+64).

    Many of these programs were written in earlier years but dusted off and completed after I finished the endless data collection at the end of 2016.
    The pedigree and rating data was collected over 23 years.

    I checked the program results statistically (by writing programs to do chi squared tests).
    The program records 9 sire and 9 dam “factors” in each pedigree, (plus a few other extra factors) and the count of those factors in each pedigree.

    The data was 159,222 horses.
    These were four groups: colts; fillies; geldings; colts & geldings.
    The first tests were for those four sex groups.
    The second tests compared nine groups by racing quality (within the sex groups)
    e.g. compare lowest group with second lowest group, all the way to comparing lowest group to highest group (9 groups is 9 x 8 /2 = 36 comparisons).
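
    The 36 comparisons can be listed mechanically. A short Python sketch, assuming the nine quality groups are simply numbered 1 (lowest) to 9 (highest):

```python
from itertools import combinations

groups = list(range(1, 10))             # nine quality groups, 1 = lowest rated
pairs = list(combinations(groups, 2))   # every unordered pair exactly once

print(len(pairs))
print(pairs[0], pairs[-1])
```

    Each pair would then get its own chi squared test, so the lowest v highest comparison (1, 9) sits in the same list as the neighbouring groups (1, 2).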

    About four of the nine factors are positive (two very), a few neutral, and a couple negative. I need time to review the results files.

    It should be possible to analyse pedigrees in volume and rank horses.
    I recently analysed a sales catalogue of ~450 lots in about 40 seconds.
    This was on my slow PC. My new PC is 3 times faster.
    My new PC will be useful if I want to analyse / test more generations (7,8,9,10,11,12), or do more tests.

    For increases of the count of some factors there is an increase in running rating: 0 count, 1 count, 2 count, and so on. (tested & proved statistically)
    Higher rated groups of horses have more occurrences of the positive factors than lower rated groups. (tested & proved statistically)
    Counts go from 0 to 27, but usually up to about 7.

    Average ratings increase may only be a point or so for an increase in factor occurrence, but this is averaged over tens of thousands of horses.
    But increasing from a count of 0 of a factor, to 1, to 2, to 3, to 4, gives a ratings increase for each jump in factor count.
    (The other eight factors (or 17 factors) might affect the ratings increase.)

    One of the results files is 5,184 lines.
    The Chi squared test "Accepts" or "Rejects" each group comparison at 5% and 1%
    i.e. a 1% Accept means the positive result has less than a 1% chance of occurring by chance.

    This gives an idea of the test volumes (from one test).

    Occ   Group A   Group B
    0        4453     11394
    1        3876     10158
    2        1505      4116
    3         317       983
    4          58       171


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    I'm staying in the house today waiting for a DHL courier.

    This is an extract from the results, and might be of interest. There are four sex groups and 18+ factors in each, so 72+ pieces of information.

    One piece of information was picked at random (and luckily it is informative)
    Average rating (factor occurrence): 80.7 (0); 81.2 (1); 81.9 (2); 82.7 (3); 83.6 (4); 85.4 (5); 87.2 (6); 91.1 (7); 105.1 (8); 75.5 (9); 76.5 (10)
    Average rating increase cumulatively: n/a (0); +0.5 (1); +1.2 (2); +2.0 (3); +2.9 (4); +4.7 (5); +6.5 (6); +9.4 (7); +24.4 (8); -5.2 (9); -4.2 (10)

    You can see there is an increase in rating for each increase in this factor count, from 0 up to 8. Then there is a massive drop for counts 9 and 10.
    One of the earlier groups has over 25,000 horses so the average is reliable, group 9 has only 4 horses and group 10 has 2 horses. Group 6 has 1,496, group 7 has 313, group 8 has 32 horses.

    Some people say only one generation matters, the sire and dam. Sales catalogues show a 3 generation pedigree. I'm working with 6 generations. Is it time to look at 7, 8, 9, 10, 11, 12 generations?


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    I have often wondered if it is possible to purchase an average mare, and breed a good horse.
    The idea is to first breed a filly from the average mare, then to breed another horse that might be useful from that filly.
    A test-mating is a theoretical mating on paper (or computer) of a stallion and mare.
    I prepared test-matings on computer of an average horse picked at random, test-mating her with 393 stallions currently at stud in Ireland, England, France, Germany, Italy.
    Using those 393 test-matings I again test-mated the first offspring with the same 393 stallions. I assumed that the results of the first test-matings were all fillies.
    Of course you would not breed twice to the same sire, e.g. breed to Invincible Spirit, and breed that filly with Invincible Spirit.

    The average horse picked was Mary Sea (2000) by Selkirk out of Mary Astor by Groom Dancer.
    She didn’t win in nine starts.
    Her best Racing Post Rating was 71, and Official Rating 60.
    She has four offspring in my data: Bullyseye Babe (rated 53); Elsie Bay (69); Jamaica Grande (64); Sea Tobougie (48).

    I wanted to see if in two steps it was possible to produce a horse with many of the factors mentioned in my earlier post.
    The test sample was 154,842 (393 x 393)+393.
    The result was: 1 horse (9 count of the above mentioned factor); 2 (8 count); 75 (7 count); 754 (6); 5394 (5); 21701 (4); 45713 (3); 53685 (2); 25222 (1); 2295 (0).
    You can see how difficult it is to produce a good horse (theoretically), and this mirrors the reality.
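
    The size of the two-step experiment can be sketched in Python as below. The stallion names are placeholders and the tuple layout is an assumption for illustration; the factor counting itself is not reproduced, only the enumeration and the totals quoted above.

```python
from itertools import product


def two_step_matings(mare, stallions):
    """Yield every first-step and second-step test-mating as nested tuples."""
    for first in stallions:
        yield (first, mare)                    # step 1: stallion x mare
    for first, second in product(stallions, repeat=2):
        yield (second, (first, mare))          # step 2: stallion x step-1 filly


stallions = [f"stallion_{i}" for i in range(393)]   # placeholder names
total = sum(1 for _ in two_step_matings("Mary Sea", stallions))
print(total)

# Factor count -> number of test-matings, as listed in the post.
distribution = {9: 1, 8: 2, 7: 75, 6: 754, 5: 5394,
                4: 21701, 3: 45713, 2: 53685, 1: 25222, 0: 2295}
print(sum(distribution.values()))
```

    Both totals come to 154,842. Test-matings with a count of 7 or more make up 78 of them, about one in two thousand.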

    A strange outcome is it might be possible to produce a horse with an “8 count”.
    One of the offspring of Mary Sea is the filly Sea Tobougie (2007) by Tobougg out of Mary Sea.
    A test-mating of that combination with the current sire Zambezi Sun (fee €3k) gives an “8 count”.
    Sea Tobougie was only rated RPR 47, OR 40, and failed to win in twelve starts.
    In fact any broodmare by Tobougg mated with Zambezi Sun would give a similar (not same) result.

    Is it possible to take a very average filly like Sea Tobougie and breed a good horse? It seems unlikely.
    But a Japanese breeder, Kihachiro Watanabe, bought Irish three-year-old filly, Saddlers Gal (RPR 52, nine starts, no wins, no places, earnings £0).
    He bred from her El Condor Pasa, who was second in the Prix de l’Arc de Triomphe (8 wins, 3 seconds from 11 starts).


  • Registered Users Posts: 16,352 ✭✭✭✭Francie Barrett


    It'd be interesting to do a distribution of the various combinations.

    Good mare/good stallion.
    Average mare/good stallion.
    Good mare/average stallion.
    Average mare/average stallion.

    Maybe you could run a query, get the average ratings of the offspring for each combination?

    I know you've talked about buying a horse, has your data analysis narrowed down the criteria on what you'd be using to select something?


  • Registered Users Posts: 2,702 ✭✭✭tryfix


    diomed wrote: »
    I have often wondered if it is possible to purchase an average mare, and breed a good horse.
    The idea is to first breed a filly from the average mare, then to breed another horse that might be useful from that filly.
    A test-mating is a theoretical mating on paper (or computer) of a stallion and mare.
    I prepared test-matings on computer of an average horse picked at random, test-mating her with 393 stallions
    Is it possible to take a very average filly like Sea Tobougie and breed a good horse? It seems unlikely.
    But a Japanese breeder, Kihachiro Watanabe, bought Irish three-year-old filly, Saddlers Gal (RPR 52, nine starts, no wins, no places, earnings £0).
    He bred from her El Condor Pasa, who was second in the Prix de l’Arc de Triomphe (8 wins, 3 seconds from 11 starts).

    Just on the moderateness of Saddlers Gal. The only thing moderate about her was her performance on the track, she's a blue blood through and through.

    Saddlers Gal is by the mighty Sadler's Wells, out of the mare Glenveagh (Seattle Slew x Lisadell). Glenveagh is a half sister to Gp1 winners Fatherland (National Stakes) and the mighty Yeats (multiple Gp1 winner), both by Sadler's Wells.

    Sending Glenveagh to Sadler's Wells was a matter of sending her to a stallion that had produced 2 Gp1 winners from that nick. Sending Saddlers Gal to Kingmambo was once again repeating the inbreeding to the supremely influential broodmare Special.

    El Condor Pasa was a vindication of brilliant bloodlines, not the result of some random mating.


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    Francie Barrett wrote: »
    It'd be interesting to do a distribution of the various combinations.

    Good mare/good stallion.
    Average mare/good stallion.
    Good mare/average stallion.
    Average mare/average stallion.

    Maybe you could run a query, get the average ratings of the offspring for each combination?

    I know you've talked about buying a horse, has your data analysis narrowed down the criteria on what you'd be using to select something?
    “Good” and “average” are not easy to define. I tend to use numbers. I produced foal averages for stallions. People would be surprised at how little the averages differ, only a few points. And of course we do not know how many foals were culled. We may just be seeing the best on the racecourse. It is a business.

    A while back I grouped mares into 5 rating points bands (e.g. 1-5, 6-10, 11-15 up to 140+). The higher the dam ratings the higher the average foal rating. I confess I made errors and would have to run it again. The link between sire rating and foal rating was similar. Most sires are 120+. The dam is the weak part of the pedigree, and the dam’s dam the weakest.

    The variation in foal ratings for a sire’s lifetime crop is very large, and most of the variation is imo due to good/bad pedigrees instead of the dam rating. By bad pedigree I mean little in common between the horses in the sire and in the dam pedigree i.e. little or no duplications, or male duplications of a sire only. The concept of good sires, good broodmares does not make sense to me. Good horses are the product of good pedigrees that match the ancestors of sire and dam (others may disagree).

    Ratings used might be suspect. They are a combination of well known ratings, the highest gained by the runner as a 3yo or older. Earlier ratings (1960s, 1970s) seem to have been reviewed and lowered, so if you take them from old books you might get 135 for a horse, and if you see the same horse now it might be a 127. And 2yo ratings (free handicaps) were used when no 3yo+ rating was found. An example here is Fairy Bridge, the dam of Sadler’s Wells. She only ran as a 2yo, was rated 124 (actually stones and pounds) in the Irish Free Handicap, and retired to stud.
    The USA Experimental Free Handicap has an upper limit of 126 which might under rate those horses. The introduction of International Classifications helps even out the ratings of the major racing countries. Some countries may have been optimistic in the past, and their ratings useful in promoting local bloodstock.

    Has my analysis narrowed down my criteria?
    (1) avoid numerous duplications of a sire that produces only male offspring e.g. Northern Dancer, especially if these are the only or the majority of duplications in the pedigree.*
    (2) if buying to breed buy fillies with horses in their 3rd, 4th, 5th generation that are full siblings of horses in the pedigrees of stallions at stud (or 3/4, 7/8 siblings). These are not as common as you might think.
    (3) breed the filly on paper with all sires at stud before you buy her (test-matings).

    * I started to record the sire lines for each of the 393 stallions now at stud, and the sire line of the dams of those 393 stallions. After 22 stallions I stopped to do other work. 17 of those 22 stallions were Northern Dancer sire line, and 12 of their dams were Northern Dancer sire line (9 had both). Careful pedigree planning is needed to avoid this.
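
    Recording a sire line by hand is slow, and it could be automated along these lines. A minimal Python sketch, assuming a horse -> sire lookup; the dictionary and function names are hypothetical, and only two well known links are filled in.

```python
def sire_line(horse, sires):
    """Follow sire, sire's sire, and so on until the data runs out."""
    line = []
    seen = set()                      # guards against a loop in bad data
    while horse in sires and horse not in seen:
        seen.add(horse)
        horse = sires[horse]
        line.append(horse)
    return line


def in_sire_line(horse, ancestor, sires):
    """True if `ancestor` is in the direct male line of `horse`."""
    return ancestor in sire_line(horse, sires)


# Tiny illustrative lookup: horse -> sire.
SIRES = {
    "Galileo": "Sadler's Wells",
    "Sadler's Wells": "Northern Dancer",
}

print(sire_line("Galileo", SIRES))
print(in_sire_line("Galileo", "Northern Dancer", SIRES))
```

    Running the same check on each stallion and on each stallion's dam would give the "Northern Dancer sire line on both sides" count without doing it by eye.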

    The example I gave of El Condor Pasa should be examined. His dam Saddlers Gal has the full siblings Special and Lisadell (both by Forli out of Thong) in her pedigree 3 x 2, too close to make her a good runner, but this close inbreeding often makes the mare a good producer. The Japanese breeder knew what he was doing. He bred her to the sire Kingmambo, who also has Special in his pedigree.

    The next work is to go beyond the basic 6 generation analysis and prepare something that goes deeper into the pedigree, isolating the features that top males have and top females have (identified visually on screen). The analysis so far was a ready reckoner.


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    Saddlers Gal sold for 22,000 guineas as a yearling in 1990, the lowest price for a Sadler's Wells yearling that year. Then she ran nine times with a best placing 5th of 6. I can't find her sales price but I think she might have been entered in a Tattersalls mare sale without selling. My guess is she sold to her Japanese owner for a lot less than her original sale price.


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    Just got knocked out of a tournament on PokerStars (5:20 am). Time for a cup of coffee and a donut.

    Found this on the internet a few minutes ago

    Takashi Watanabe's interest in horseracing is driven by his knowledge of pedigrees which resulted in his breeding of El Condor Pasa, the newest Japanese star to shine in Europe. The owner of a trucking and transportation company in Tokyo, Watanabe was introduced to the sport by his father.
    In 1992, he commissioned an agent to attend the Tattersalls December Sales to purchase the mare Saddlers Gal, a daughter of Sadler's Wells, who had failed to win in nine starts in Ireland.
    The mare was withdrawn from the catalogue, but he was so determined to secure her that the representative, Morio Sakurai, was ordered to track her down.
    "Mr Watanabe asked me to find her," said Sakurai. "I located her on a farm in Ireland and he told me to buy her."
    The subsequent mating between Saddlers Gal and the top French miler Kingmambo was to produce El Condor Pasa, who Watanabe named after a Simon and Garfunkel hit.
    The colt was placed with Yoshitaka Ninomiya and last year became the first three-year-old to win the Japan Cup, doing so by two and a half lengths, the widest margin ever.
    Ninomiya, a trainer since 1990, has a team of just 10 horses.
    Approaching 50 years of age, he had previously served as assistant to renowned horseman Teruo Hashimoto for 12 years.
    But he has also gained valuable experience for El Condor Pasa's current international programme during spells with Sir Michael Stoute in Newmarket and with US trainer Bruce Headley in California.


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    I analyse a number of things in pedigrees, including duplications.
    This is an example of a six generation pedigree with duplicated horses in colour.
    Some of the analysis factors I mention would be what you see here.

    Modern pedigrees are much less inbred, often with only three or four highlighted horses. Here over twenty are highlighted.
    The Tetrarch, foaled in Co Kildare, raced in England and was unbeaten.
    Note that there is nothing duplicated in the first three generations. Sales catalogues show three generations.
    A picture is worth a thousand words.

    [Attached image: six generation pedigree chart with duplicated horses highlighted in colour]


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    This is the pedigree of Mary Sea, the average filly I chose for the test-mating experiment mentioned in previous posts.
    Note the few connections between the pedigrees of her sire Selkirk and dam Mary Astor.

    [Attached image: six generation pedigree of Mary Sea]


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    My plan now is to mix it up a bit and work on the ideas in my ideas file.

    I've always thought that the lower down in a pedigree the weaker the pedigree.
    For example, in the above pedigree of Mary Sea (rated 71) the dam line of Mary Astor (86), Djallybrook (105), Hollybrook (rating not found), La Vagabonde (non runner) was not too weak. This area is often a lot weaker than the rest of the pedigree, full of minor sires.

    I'm going to find out if possible the average rating of all the 126 positions (2+4+8+16+32+64) in the six generation pedigrees in my files. I'm not sure what good that is, but if I redo it for different ratings bands e.g. (0-20, 20-40, 40-60, 60-80, 80-100, 100-120, 120+) there might be a lesson.
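
    A rough Python sketch of averaging ratings by pedigree position. The numbering scheme (subject = 1, sire of position n = 2n, dam = 2n + 1, so the 126 ancestors are positions 2 to 127) is an assumption chosen for convenience, as are the function names and dictionary layout. The sample uses only the names and ratings quoted in this thread; Groom Dancer is deliberately left unrated to show how a missing rating is skipped.

```python
from collections import defaultdict


def positions(horse, parents, generations=6):
    """Yield (position, ancestor): subject = 1, sire of n = 2n, dam of n = 2n + 1."""
    frontier = [(1, horse)]
    for _ in range(generations):
        next_frontier = []
        for pos, name in frontier:
            sire, dam = parents.get(name, (None, None))
            for p, parent in ((2 * pos, sire), (2 * pos + 1, dam)):
                if parent is not None:
                    next_frontier.append((p, parent))
                    yield p, parent
        frontier = next_frontier


def average_by_position(horses, parents, ratings, generations=6):
    """Return {position: (average rating, rated horses counted)}."""
    totals = defaultdict(lambda: [0.0, 0])    # position -> [sum of ratings, count]
    for horse in horses:
        for pos, ancestor in positions(horse, parents, generations):
            if ancestor in ratings:           # unrated ancestors are skipped
                totals[pos][0] += ratings[ancestor]
                totals[pos][1] += 1
    return {pos: (s / n, n) for pos, (s, n) in sorted(totals.items())}


parents = {
    "Mary Sea": ("Selkirk", "Mary Astor"),
    "Mary Astor": ("Groom Dancer", "Djallybrook"),
}
ratings = {"Selkirk": 129, "Mary Astor": 86, "Djallybrook": 105}

by_position = average_by_position(["Mary Sea"], parents, ratings)
print(by_position)
```

    Position 2 is the sire, 3 the dam and 7 the dam's dam, so the bottom dam line is always the highest position number in each generation. Running it separately for each ratings band would give one table per band.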


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    These are the average ratings for each pedigree position for a large number of pedigrees.
    Shown are average rating & number of rated horses averaged for that position.
    What is amazing to me is that the averages for sires in the 3rd, 4th, 5th and 6th generations are all over 130.
    The 1st generation at 124.3, and 2nd generation at 129.6 and 127.6 suggest that many of the current horses will not survive in future pedigrees.
    Of course many of the current horses making up the 78.9 average are geldings or colts that will not be used for breeding.
    Lower counts for dams probably reflect: unraced dams; dams of one foal and rating not found / investigated; imported dams from USA and other.
    The lowest average dam ratings are on the bottom dam line as anticipated.

    [Attached image: pedigree chart showing the average rating and number of rated horses for each position]


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    Numbers in above pedigree chart

    Rated horses: 159,230
    Ancestors: 20,213,071
    (159,230 x 127 = 20,222,210)
    (difference of 9,139 is incomplete pedigrees, non-thoroughbreds)

    Rated ancestors: 10,132,337
    Rated sire ancestors: 9,190,434
    Rated dam ancestors: 941,903
    (Unrated sire ancestors: 836,575)
    (Unrated dam ancestors: 9,244,159)

    Sire & dam ancestors born before 1960: 15,720,422 (77.77%)
    (ratings unlikely before 1960)
    Rated sires before 1960: 7,099,930
    Rated dams before 1960: 61,687
    Rated sires 1960 and after: 2,090,504
    Rated dams 1960 and after: 880,216

    Count of Northern Dancer 203,614 (2.03% of male ancestors)
    Count of Hyperion 174,978 (1.75%)
    Count of Native Dancer 246,062 (2.45%)
    Count of Nasrullah 242,636 (2.42%)
    Count of Princequillo 121,707 (1.21%)
    Count of Sadler’s Wells 18,648 (0.19%)
    Count of Galileo 1,446 (0.01%)

    Count of horses rated 130+: 7,650,747 (number of horses rated 130+: 1,845)
    Count of horses rated 120-129: 1,559,099
    Count of horses rated 110-119: 308,594
    Count of horses rated 100-109: 130,723
    (explanation: horses rated 130+ produced many offspring, and their sons and daughters also produced many offspring)


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    These are a few more ideas that are work in progress.

    PREPOTENT
    You often hear about prepotent horses (showing great effectiveness in transmitting hereditary characteristics to its offspring).
    The term chef-de-race is used for these horses in the Dosage Index calculation. They are super sires.
    The Dosage Index calculation uses chef-de-race sires in the first four generations of a pedigree.
    Some of these sires have sired winners of great races.
    Is there evidence that their influence lasts many generations? I have not seen the analysis.

    I have a big file of the six generation pedigrees of 159,230 horses.
    The file is about 20 million lines (159,230 x 127).
    The idea I have is to give all the 127 horses in a six generation pedigree the same rating.
    I’ll use the example of Mary Sea (2000) above to explain. Her rating is 71.
    I’ll give her sire Selkirk 71, and her dam Mary Astor 71, and all the other horses in her first six generations 71 for this exercise.
    Selkirk is actually 129 and Mary Astor 86.

    Selkirk will get a different rating for every pedigree he appears in as a sire, whatever generation he is in.
    For Mary Sea he gets 71, for Altieri he gets 122, for Emerging (by Mount Nelson) he gets 90 (Selkirk is in the pedigree of Mount Nelson).
    At the end of the work all the Selkirk ratings are averaged, and Selkirk gets a number.
    The average of all the 159,230 horses is about 80.
    If a sire is “prepotent” his average should pop up above all others like a cork in water.
    His influence will have boosted the rating of every pedigree in which he appears.
    This should work for dams too, although they appear much less in pedigrees as they produce fewer offspring.
    Sires and dams that appear in only a few pedigrees will be filtered out of the results.
    I have about half the work done on this, the difficult part.
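    A minimal sketch of the averaging idea described above, assuming the data is held as one record per runner with the list of horses in its pedigree. It is an illustration only, not the actual program. The three ratings (71, 122, 90) are the ones quoted above; the pedigree lists are cut down to a few names, and the function name and `min_count` parameter are hypothetical.

```python
from collections import defaultdict

# (runner, runner's own rating, horses in its six-generation pedigree).
# Ratings are the ones quoted above; the pedigree lists are shortened to a
# few names for illustration (a real one would hold 127 entries).
pedigrees = [
    ("Mary Sea", 71, ["Mary Sea", "Selkirk", "Mary Astor"]),
    ("Altieri", 122, ["Altieri", "Selkirk"]),
    ("Emerging", 90, ["Emerging", "Mount Nelson", "Selkirk"]),
]

def ancestor_averages(pedigrees, min_count=1):
    """Average the runner ratings credited to each horse, dropping rare ones."""
    total = defaultdict(float)
    count = defaultdict(int)
    for _runner, rating, horses in pedigrees:
        for name in horses:
            total[name] += rating   # every horse in the pedigree gets the runner's rating
            count[name] += 1
    return {
        name: (total[name] / count[name], count[name])
        for name in total
        if count[name] >= min_count
    }

averages = ancestor_averages(pedigrees, min_count=2)
print(averages)
```

    With only these three pedigrees Selkirk averages (71 + 122 + 90) / 3, about 94.3, and every horse that appears just once is filtered out by `min_count=2`.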


    AGE
    In this week’s Irish Field trainers and breeders were asked “would they buy the produce of an old mare”. A variation on this is do you believe birth order is important.

    I could easily produce information on these questions, but I think other factors are ignored in these studies.
    I’ve seen a study where the sample was only a few hundred, and the conclusion was the status quo: earlier birth rank is better; younger broodmares are better.

    My guess is the owner of a new broodmare sends his expensive purchase to the best sire he can afford, and probably repeats this for a few years.
    After a few disappointing foals he loses faith in the mare (and loses his money) and sends her to cheaper stallions, and still cheaper stallions until the end of her productive life.
    Have you ever heard of a mare owner starting her out with the cheapest stallion, and sending her to the champion sire in her old age?

    I am interested not in the age of the sire and dam but in the dates of birth in the pedigree.
    A few numbers might throw some light on this.
    These are the average age of all horses (by generation) in my large data sample.
    There are more horses in the 6th generation so I'll say the average age of all is about 10.5 years.
    I'm not using this average age for anything at present. I just thought people might be interested in it.

    Gen Age
    0
    1 11.25
    2 10.02
    3 10.00
    4 10.11
    5 10.34
    6 10.51
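    One way to turn the by-generation figures above into a single overall number is a weighted average. This sketch assumes every generation is complete, so generation g is weighted by its 2**g pedigree positions; that weighting is an assumption made for the example, not something stated above.

```python
# Average age by generation, as listed in the table above.
avg_age_by_generation = {1: 11.25, 2: 10.02, 3: 10.00, 4: 10.11, 5: 10.34, 6: 10.51}

# Assumption: generation g fills all 2**g pedigree positions (2, 4, ..., 64).
weights = {g: 2 ** g for g in avg_age_by_generation}

weighted_sum = sum(avg_age_by_generation[g] * weights[g] for g in weights)
overall_avg_age = weighted_sum / sum(weights.values())

print(round(overall_avg_age, 2))
```

    Under that assumption the overall figure comes out a little under the "about 10.5" quoted, because the sixth generation carries just over half of the weight and the earlier generations pull it down slightly.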



    The table below needs a bit of explaining.
    These eight horses (A B C D E F G H) are four super sires, three disappointing sires, and one unproven recent sire.
    Five of the sires were excellent on the racecourse, but two of these disappointed at stud.
    One of the super sires was a disappointment on the racecourse.

    The numbers are the dates of birth of horses in the same pedigree positions, in the same generation of their pedigrees.
    The first date of birth (DOB) is in the first quarter of the pedigree, the next in the second quarter (1st and 2nd quarters are in the sire pedigree).
    The last two DOBs are in the dam pedigree of these horses.
    I could have listed all the DOBs in that generation but am using four only for demonstration.

                     A     B     C     D     E     F     G     H
    Sire's sire 1/4  1925  1935  1913  1920  1913  1935  1935  1920
    Sire's dam 1/4   1920  1961  1935  1919  1935  1935  1927  1935
    Dam's sire 1/4   1908  1954  1913  1942  1942  1919  1919  1937
    Dam's dam 1/4    1909  1954  1913  1934  1930  1917  1928  1935


    Comments:
    A) notice the imbalance between the 1925, 1920 and the 1908, 1909 in the dam side. This is like the Leaning tower of Pisa.
    My untested theory is that this imbalance often acts like an outcrossed pedigree, and might be a reason for the high ability.
    But my idea is this imbalance is a negative when that horse goes to stud.
    Half his pedigree doesn't match the pedigree of his mares because it is too old.
    B) Unproven sire as yet. Excellent runner.
    C) A super runner and super sire. Good DOB balance for three of the four ancestors.
    D) A very good runner, but a major disappointment as a sire.
    The sire side of his pedigree is much older, about twenty years.
    E) A high class runner and super sire. One horse is a good bit older (1913 DOB), but three are about the same age.
    The big worry is when the two on the sire side, and the two on the dam side differ greatly.
    F) This was a very good juvenile, but a failure as a sire.
    Again note the large gap between the sire side (1935, 1935) and the dam side (1919, 1917).
    G) A poor runner but a sensational sire.
    H) A good runner, not a top runner, and a very successful sire.

    I haven't given the name of the horses as I don't want to offend people.
    This is just an idea from looking at many pedigrees, and trying to figure out why some very good horses didn't make the grade as sires.
    These sires and runners did not get their class just from a few matching dates of birth.
    My point is unmatched DOBs might be a negative.

    It might be a tricky task to program and test this. How do I extract "bad" DOBs?
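    One possible starting point for the question above, offered as a rough sketch only: compare the mean DOB of the sire side with the mean DOB of the dam side and flag large gaps. The DOBs are the ones in the table; the `side_gap` function and the 14-year cut-off are hypothetical, picked only so that the example flags the three cases called out in the comments (A, D, F).

```python
# Dates of birth from the table above, one list per horse:
# [sire's sire side, sire's dam side, dam's sire side, dam's dam side]
dobs = {
    "A": [1925, 1920, 1908, 1909],
    "B": [1935, 1961, 1954, 1954],
    "C": [1913, 1935, 1913, 1913],
    "D": [1920, 1919, 1942, 1934],
    "E": [1913, 1935, 1942, 1930],
    "F": [1935, 1935, 1919, 1917],
    "G": [1935, 1927, 1919, 1928],
    "H": [1920, 1935, 1937, 1935],
}

def side_gap(four_dobs):
    """Mean DOB of the sire side minus mean DOB of the dam side, in years."""
    sire_side = sum(four_dobs[:2]) / 2
    dam_side = sum(four_dobs[2:]) / 2
    return sire_side - dam_side

THRESHOLD_YEARS = 14  # hypothetical cut-off, chosen only for this illustration

gaps = {horse: side_gap(d) for horse, d in dobs.items()}
flagged = sorted(h for h, gap in gaps.items() if abs(gap) >= THRESHOLD_YEARS)
print(gaps)
print(flagged)
```

    A positive gap means the sire side is the younger half, a negative gap means it is the older half. Whether any cut-off separates good sires from disappointing ones would have to be tested on the full data; four DOBs per horse is far too little to say.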


  • Registered Users Posts: 2,484 ✭✭✭Peintre Celebre


    Diomed re a good horse from a bad mare. Hyperion's grand dam won a seller


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    Peintre Celebre wrote: »
    Diomed re a good horse from a bad mare. Hyperion's grand dam won a seller
    Good horses have run in sellers (and bad ones).

    Virago (1851) ran in a seller as a two year old and lost.
    In John Porter's autobiography he says "I have in my time seen many great fillies, but I regard Virago as perhaps the greatest of them all".
    Virago won the English 1000 Guineas as a 3yo.
    John Porter trained 21 English Classic winners.

    "The filly's participation in the event (a 2yo selling race) was a colossal piece of bluff, the purpose of which was to deceive those whose duty it was to frame the big handicaps of the following spring. .... "entered to be sold for £80, a bit of bunkum ... Day's head man accompanied the filly to the starting-post, ostensibly with a view to ensuring her getting well away ..... Goater appeared to be taken by surprise when the starter dropped his flag, and Virago was "left" a long way behind the others. She of course finished "nowhere" as intended. .... She was not among the first three, though she could have carried eleven stone and won. She could not have been bought for £5,000."

    I have many horses in my data with the runner rated over 100 and its dam in the 20s. Often a dig into the low rating shows a horse who ran well but disappointed many times later and ended the year with a low rating (that I've used) although it ran at a higher rating during the year. The most unusual case was a rating of 3 as a three year old. The horse finished its two year old racing with a 99 rating, but finished last (?) every time as a 3yo and was retired.

    I use very large volumes of data in an attempt to even out the questionable ratings.


  • Registered Users Posts: 8,611 ✭✭✭Mooooo


    Totally out of my field here but in dairy breeding genomics is really taking hold, would each foal not be genomically tested before sale, and is there an index at which those figures are judged on in terms of performance and heritable traits. Ireland has EBI, economic breeding index, for cows which has subindexes given different weightings for milk, fertility, etc. NZ has similar with different weightings called the BW, breeding worth. I dunno the data available for horses in terms of performance, accuracy etc but is there not something similar for horses and if so how does it fit in with your figures?


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    Mooooo wrote: »
    Totally out of my field here but in dairy breeding genomics is really taking hold, would each foal not be genomically tested before sale, and is there an index at which those figures are judged on in terms of performance and heritable traits. Ireland has EBI, economic breeding index, for cows which has subindexes given different weightings for milk, fertility, etc. NZ has similar with different weightings called the BW, breeding worth. I dunno the data available for horses in terms of performance, accuracy etc but is there not something similar for horses and if so how does it fit in with your figures?
    Equinome have a gene speed test at €590 a horse.
    Equinome have an Elite Performance Test at €1,450 a horse.

    would each foal not be genomically tested before sale?
    I don't know. Perhaps some horses are tested before sale but I do not know if the results must be revealed. They might be entered in a sale if they are tested and the results are unfavourable.

    is there an index at which those figures are judged on in terms of performance and heritable traits?
    Equinome have results for horses they tested. It looks like customers get results for one horse, not data for their horse and all other horses tested.
    There are general statistics on the Equinome site but not with horse names.

    Is there not something similar for horses and if so how does it fit in with your figures?
    That data is valuable and I have nothing. My guess is you pay for one horse information only.

    Genetic testing categorises a living foal by distance suitability and quality.
    I'm trying to predict quality by analysing ancestors in a pedigree, and predicting potential quality before a foal is bred, or if already bred and unraced.


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    Further up the thread I mentioned the idea of prepotent sires, and if it might be possible to identify them.

    "If a sire is “prepotent” his average should pop up above all others like a cork in water.
    His influence will have boosted the rating of every pedigree in which he appears."


    I had to move the data files to my new fast PC as the number crunching task was large.
    The program took each of 285,693 ancestors, found him/her in the 20,213,071-line file that contained the six generation pedigrees of 159,230 rated horses, and calculated an average rating and a count of the number of times the horse was in 6 gen pedigrees.
    An example result is: NORTHERN DANCER1961, average 79.30, count 203,614
    On the new fast PC it took eight hours for averages, and another eight hours for counts.

    Below is a sample of the results. When I saw them I realised Australian and Japanese horses featured prominently. Why?
    Because I will only have the highest rated horses from those countries in my data gleaned from year-end international classifications, and none of the ordinary horses from those countries.
    Another "feature" of the results is many females appear only because they are the dam of famous sires. In the list below Wind In Her Hair gets in because she is the dam of Japanese horse Deep Impact.
    I think the analysis will only be useful for Ireland and England horses with a good few crops / horses who have finished their stud careers.
    It is their influence in pedigrees I'm looking for, not as current sires.


    This extract is DOB >= "1990" AND rate_count > 200 AND rate_avg > 85
    The overall average is about 78.9

    name sex rate_avg rate_count
    DEHERE1991 M 92.55 201
    DISTORTED HUMOR1993 M 90.19 255
    DUBAI MILLENNIUM1996 M 86.88 453
    DUBAWI2002 M 89.69 333
    ENCOSTA DE LAGO1993 M 97.26 239
    FLYING SPUR1992 M 89.07 270
    FRENCH DEPUTY1992 M 97.14 213
    GALILEO1998 M 87.04 1446
    HELSINKI1993 F 87.38 396
    OCTAGONAL1992 M 86.86 275
    PULPIT1994 M 88.96 301
    REDOUTE'S CHOICE1996 M 97.81 339
    SHAMARDAL2002 M 87.57 385
    SHANTHA'S CHOICE1992 F 97.84 350
    SMART STRIKE1992 M 89.24 388
    SPINNING WORLD1993 M 85.13 377
    STREET CRY1998 M 86.85 342
    TALE OF THE CAT1994 M 87.99 257
    THUNDER GULCH1992 M 86.95 315
    UNBRIDLED'S SONG1993 M 91.78 332
    WIND IN HER HAIR1991 F 94.33 270
    ZOMARADAH1995 F 90.01 341
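    The extract condition above reads like a database filter. Here is a small sketch of the same filter using Python's built-in sqlite3 module, with a handful of rows taken from the figures in this thread. The table name `ancestor_stats` and the separate `dob` column are assumptions for the example (in the results above the year is attached to the name); `rate_avg` and `rate_count` are the column names used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ancestor_stats (name TEXT, sex TEXT, dob INTEGER, "
    "rate_avg REAL, rate_count INTEGER)"
)
# A few rows taken from the figures quoted in this thread.
conn.executemany(
    "INSERT INTO ancestor_stats VALUES (?, ?, ?, ?, ?)",
    [
        ("GALILEO", "M", 1998, 87.04, 1446),
        ("DUBAWI", "M", 2002, 89.69, 333),
        ("WIND IN HER HAIR", "F", 1991, 94.33, 270),
        ("NORTHERN DANCER", "M", 1961, 79.30, 203614),
        ("HYPERION", "M", 1930, 77.43, 174978),
    ],
)

# Same three conditions as the extract: recent DOB, enough appearances, high average.
rows = conn.execute(
    "SELECT name, rate_avg, rate_count FROM ancestor_stats "
    "WHERE dob >= 1990 AND rate_count > 200 AND rate_avg > 85 "
    "ORDER BY name"
).fetchall()
print(rows)
```

    Northern Dancer and Hyperion drop out on the DOB condition (and on the average), which is why the older high-count horses only show up in a separate extract on count alone.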



    rate_count >= 60000

    name sex rate_avg rate_count
    ALMAHMOUD1947 F 79.47 154064
    BOLD RULER1954 M 79.25 73470
    FAIR TRIAL1932 M 74.95 72787
    GEISHA1943 F 79.02 105516
    HAIL TO REASON1958 M 80.91 69518
    HYPERION1930 M 77.43 174978
    LADY ANGELA1944 F 79.02 134678
    LALUN1952 F 78.85 64813
    MAHMOUD1933 M 79.87 87270
    MUMTAZ BEGUM1932 F 78.12 92140
    NASRULLAH1940 M 78.34 242636
    NATALMA1957 F 79.35 201694
    NATIVE DANCER1950 M 79.08 246062
    NEARCO1935 M 77.66 305727
    NEARCTIC1954 M 79.23 192901
    NOGARA1928 F 78.71 97612
    NORTHERN DANCER1961 M 79.3 203614
    PHAROS1920 M 79.24 111520
    POLYNESIAN1942 M 78.98 117049
    PRINCE ROSE1928 M 78.42 61264
    PRINCEQUILLO1940 M 79.13 121707
    RAISE A NATIVE1961 M 80.18 81709
    RAISE YOU1946 F 80.26 68205
    SOMETHINGROYAL1952 F 78.54 63531
    TOM FOOL1949 M 79.19 66444
    TUDOR MINSTREL1944 M 75.08 68115
    TURN-TO1951 M 79.28 99468


    Northern Dancer has a 79.30 average in all pedigrees.

    His influence was
    1st gen 113.92 (109 rated horses)
    2nd gen 83.57 (6,760)
    3rd gen 77.96 (40,880)
    4th gen 78.84 (79,505)
    5th gen 79.92 (57,181)
    6th gen 80.45 (19,178)

    Is the trend similar for all sires?
    Why is there a gradual increase from gen 3 to gen 6?
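    A by-generation breakdown like the one above needs the generation of each pedigree slot. A minimal sketch, assuming the 127 slots are numbered 1 to 127 in the usual genealogical way (1 = the runner, 2 = sire, 3 = dam, 4 to 7 = grandparents, and so on), which may not be how the actual file is laid out. The ratings in the sample rows are invented.

```python
from collections import defaultdict

def generation(position):
    """Generation of a pedigree slot numbered 1..127 (1 = the runner itself)."""
    return position.bit_length() - 1

# Hypothetical rows: (runner rating, ancestor name, pedigree position).
rows = [
    (120, "Northern Dancer", 2),    # sire of the runner
    (110, "Northern Dancer", 2),
    (84, "Northern Dancer", 4),     # sire's sire
    (80, "Northern Dancer", 8),     # third-generation slots
    (76, "Northern Dancer", 12),
]

def averages_by_generation(rows, name):
    """Average runner rating and count for one ancestor, split by generation."""
    total = defaultdict(float)
    count = defaultdict(int)
    for rating, ancestor, position in rows:
        if ancestor == name:
            gen = generation(position)
            total[gen] += rating
            count[gen] += 1
    return {gen: (total[gen] / count[gen], count[gen]) for gen in sorted(total)}

by_gen = averages_by_generation(rows, "Northern Dancer")
print(by_gen)
```

    With that numbering slots 2-3 are generation 1, 4-7 generation 2, up to 64-127 for generation 6, so no lookup table is needed.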


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    The Northern Dancer example above is misleading. He was born in 1961.
    His first generation horses in my data (Gen 1) are probably his expensive foals imported to Europe.
    I do not have ratings for most of his foals.

    This example is better. Sadler's Wells, Darshaan, and Rainbow Quest were all born in 1981.
    They all became sires, and their stud careers are over.
    They had similar ratings, 132, 133, 134, and actually finished 1,2,3 in a race.
    The 90.00 for Rainbow Quest in Gen 5 should be ignored (only two horses).

    Gen  Sadler's Wells  Darshaan  Rainbow Quest
    0
    1    88.28           87.18     84.20
    2    79.80           79.31     79.24
    3    78.58           77.15     79.24
    4    77.99           76.95     81.30
    5    77.64           75.00     90.00
    6


    Super sires do not produce crops much above the average.
    But they, and other sires, produce some exceptional horses.


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    PREPOTENT

    I'm moving on from the question of do sires have an effect that lasts through generations.
    I've sliced and diced the info a number of ways, and will review it in the future. A problem is the data.
    The data for Ireland and UK sires will be more reliable.
    The averages for USA, Japan and Australia sires will be higher, as the data for them will probably contain very few low-rated horses.
    avg1 is the average rating of the rated foals of the sires (or dams)

    This is a sample for generations 0 (the sire or dam) to generation 6
    (extract is count >= 1500 and dob >= 1980)

    name sex avg_1to6 avg0 avg1 avg2 avg3 avg4 avg5 avg6 coun0 coun1 coun2 coun3 coun4 coun5 coun6
    Alzao 1980 M 77.8 117 80.0 74.2 81.1 77.6 101.0 0.0 1 545 1074 795 141 1 0
    Barathea 1990 M 75.9 129 76.9 74.8 78.6 78.0 0.0 0.0 1 601 979 153 1 0 0
    Bluebird 1984 M 74.0 125 73.7 70.8 80.6 73.2 0.0 0.0 1 327 938 458 81 0 0
    Cadeaux Genereux 1985 M 75.3 131 79.6 73.4 74.5 70.6 0.0 0.0 1 579 1153 465 11 0 0
    Caerleon 1980 M 80.6 132 87.3 80.6 79.7 78.4 74.3 0.0 1 425 1507 1523 621 40 0
    Chief's Crown 1982 M 75.3 141 85.4 75.5 75.6 70.1 49.3 0.0 1 93 1286 1315 318 3 0
    Cozzene 1980 M 79.3 132 92.1 83.1 82.3 74.7 81.4 0.0 1 136 400 901 1333 39 0
    Dancing Brave 1983 M 82.6 140 86.8 84.0 81.5 83.5 67.7 0.0 1 96 437 1292 549 6 0
    Danehill Dancer 1993 M 79.3 118 82.7 75.9 78.6 0.0 0.0 0.0 1 887 915 18 0 0 0
    Danehill 1986 M 82.0 126 91.4 82.0 79.6 79.3 0.0 0.0 1 790 7898 3131 34 0 0
    Darshaan 1981 M 78.8 133 87.2 79.3 77.2 77.0 75.0 0.0 1 429 2256 2191 594 8 0
    Diesis 1980 M 76.9 133 84.8 75.1 76.4 79.9 110.0 0.0 1 444 1800 1363 134 1 0
    El Gran Senor 1981 M 81.4 136 89.9 76.8 82.1 81.3 0.0 0.0 1 144 395 750 288 0 0
    Fairy King 1982 M 77.2 132 82.5 75.7 78.0 80.2 0.0 0.0 1 317 1536 630 25 0 0
    Gone West 1984 M 80.9 129 87.8 82.6 78.8 78.4 83.0 0.0 1 252 2102 2313 324 2 0
    Green Desert 1983 M 77.9 127 82.5 77.9 76.0 77.4 77.2 0.0 1 758 4702 1905 187 82 0
    Groom Dancer 1984 M 74.2 128 76.6 73.7 74.6 72.1 75.4 0.0 1 268 778 612 295 8 0
    Gulch 1984 M 80.4 127 81.3 80.5 79.9 83.9 0.0 0.0 1 183 764 866 34 0 0
    Highest Honor 1983 M 77.5 130 85.2 77.0 73.7 84.3 0.0 0.0 1 283 1217 441 12 0 0
    In The Wings 1986 M 80.3 128 82.9 79.6 79.9 82.0 0.0 0.0 1 374 1236 372 6 0 0
    Indian Ridge 1985 M 74.8 123 81.0 72.9 75.1 75.0 0.0 0.0 1 564 1959 377 27 0 0
    Kahyasi 1985 M 81.2 135 79.3 80.3 83.9 77.2 58.0 0.0 1 269 302 854 407 1 0
    Kingmambo 1990 M 81.9 125 88.8 81.4 79.6 76.0 0.0 0.0 1 328 1928 547 1 0 0
    Last Tycoon 1983 M 80.8 131 78.8 80.5 81.8 79.9 0.0 0.0 1 235 1419 1042 194 0 0
    Lear Fan 1981 M 80.2 130 87.3 78.4 80.5 75.8 69.0 0.0 1 208 448 813 234 1 0
    Linamix 1987 M 80.1 127 86.4 78.5 77.7 85.1 0.0 0.0 1 369 867 414 15 0 0
    Lomond 1980 M 78.1 128 82.0 76.2 78.6 77.1 78.6 0.0 1 132 377 1138 383 51 0
    Machiavellian 1987 M 80.3 125 85.8 78.1 82.1 91.8 0.0 0.0 1 410 2167 1188 28 0 0
    Mendez 1981 M 80.0 133 81.3 85.8 78.4 77.6 85.1 0.0 1 11 385 899 416 15 0
    Night Shift 1980 M 77.0 126 77.8 76.4 76.9 79.7 0.0 0.0 1 656 1187 459 141 0 0
    Pivotal 1993 M 78.2 124 83.2 75.1 81.4 0.0 0.0 0.0 1 758 1248 39 0 0 0
    Polar Falcon 1987 M 77.6 126 79.6 80.3 74.8 81.8 0.0 0.0 1 243 1122 1383 40 0 0
    Rahy 1985 M 83.4 115 87.6 80.2 85.5 83.3 94.0 0.0 1 245 928 852 939 5 0
    Rainbow Quest 1981 M 80.1 134 84.2 79.2 79.2 81.3 90.0 0.0 1 583 1798 1774 528 2 0
    Red Ransom 1987 M 80.1 136 83.6 79.0 78.2 82.3 0.0 0.0 1 460 975 328 15 0 0
    Royal Academy 1987 M 80.5 133 83.4 79.1 80.1 88.4 0.0 0.0 1 458 1034 422 15 0 0
    Sadler's Wells 1981 M 79.8 132 88.3 79.8 78.6 78.0 77.6 0.0 1 1380 7999 7765 1470 33 0
    Seeking The Gold 1985 M 85.0 135 88.3 83.6 85.3 88.8 0.0 0.0 1 175 682 952 19 0 0
    Shareef Dancer 1980 M 78.4 135 75.8 74.0 78.2 84.3 80.8 0.0 1 234 618 619 576 16 0
    Storm Cat 1983 M 84.9 119 92.8 85.2 83.8 84.9 71.1 0.0 1 261 2807 2659 209 16 0
    Sunday Silence 1986 M 98.8 132 106.5 98.0 97.9 102.0 0.0 0.0 1 176 1687 176 1 0 0
    Unfuwain 1985 M 76.6 131 81.0 76.1 74.3 83.6 0.0 0.0 1 307 810 483 25 0 0
    Waajib 1983 M 75.4 121 70.6 73.7 77.4 80.6 0.0 0.0 1 127 777 638 134 0 0
    Warning 1985 M 73.2 136 84.5 71.8 72.4 74.7 0.0 0.0 1 234 1587 784 103 0 0
    Woodman 1983 M 78.5 126 83.9 76.8 79.1 81.6 0.0 0.0 1 326 1740 1305 148 0 0
    Zafonic 1990 M 78.8 130 84.5 77.3 78.2 88.0 0.0 0.0 1 317 1104 275 1 0 0

    Annie Edge 1980 F 79.4 118 99.1 83.2 76.5 77.1 0.0 0.0 1 8 604 725 184 0 0
    Brocade 1981 F 76.5 121 98.3 77.7 75.2 78.7 78.0 0.0 1 10 643 1006 156 1 0
    Coup De Folie 1982 F 79.9 112 106.8 83.7 78.0 81.2 89.9 0.0 1 8 598 2420 1344 33 0
    Fearless Revival 1987 F 77.8 105 79.0 83.1 74.6 81.4 0.0 0.0 1 5 768 1315 39 0 0
    High Hawk 1980 F 80.4 124 98.8 82.8 79.6 79.9 82.0 0.0 1 9 387 1239 372 6 0
    La Papagena 1983 F 75.9 0 89.5 77.8 74.9 72.7 0.0 0.0 0 11 546 819 127 0 0
    Marie D'Argonne 1981 F 77.7 121 84.6 79.7 80.4 74.8 81.8 0.0 1 7 258 1172 1383 40 0
    Miesque 1984 F 81.7 133 106.9 85.4 81.4 79.5 76.0 0.0 1 8 432 2147 550 1 0
    Mira Adonde 1986 F 79.3 0 99.0 82.7 75.9 75.8 0.0 0.0 0 7 913 921 20 0 0
    Park Appeal 1982 F 79.6 122 105.0 82.5 77.0 74.2 78.0 0.0 1 8 775 784 88 1 0
    Razyana 1981 F 81.8 69 105.7 87.9 81.9 79.6 79.3 0.0 1 9 931 7945 3138 34 0
    Stufida 1981 F 77.8 0 97.5 71.8 83.1 74.6 81.4 0.0 0 2 14 779 1316 39 0
    Urban Sea 1989 F 86.9 124 120.0 88.5 83.6 0.0 0.0 0.0 1 9 1013 582 0 0 0
    Zaizafon 1982 F 78.8 119 110.2 81.9 77.1 78.3 88.0 0.0 1 5 640 1205 278 1 0



    This is some extra info on the females listed
    Annie Edge 1980 - dam of Selkirk
    Brocade 1981 - dam of Barathea
    Coup De Folie 1982 - dam of Exit To Nowhere, Machiavellian
    Fearless Revival 1987 - dam of Pivotal
    High Hawk 1980 - dam of In The Wings
    La Papagena 1983 - dam of Grand Lodge
    Marie D'Argonne 1981 - dam of Polar Falcon
    Miesque 1984 - dam of Kingmambo, Miesque's Son
    Mira Adonde 1986 - dam of Danehill Dancer
    Park Appeal 1982 - dam of Cape Cross
    Razyana 1981 - dam of Danehill
    Stufida 1981 - 2nd dam of Pivotal
    Urban Sea 1989 - dam of Galileo, Sea The Stars
    Zaizafon 1982 - dam of Zafonic, Zamindar



  • Registered Users Posts: 2,484 ✭✭✭Peintre Celebre


    Diomed I have an interest in breeding, it's above your normal punter but nowhere near a pro. Are there any racing books you'd recommend on the history of the breed to get into? Have read A Century of Champions many times after struggling to get my hands on it. E-mailed the author when I couldn't find one as to how I'd get one 'I'll tell you the same thing I told John Oxx and Lord Derby when they asked could I get one for them'. Like gold dust, an outstanding book.


  • Registered Users Posts: 2,484 ✭✭✭Peintre Celebre


    P.S. if anyone wants to hear one of the profiles of a horse I'll gladly post it. One of my favourite lines in the whole book as he describes Secretariat '...that magical afternoon in New York'.

    They rated Sea Bird the best of the previous century


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    Peintre Celebre wrote: »
    Diomed I have an interest in breeding, it's above your normal punter but nowhere near a pro. Are there any racing books you'd recommend on the history of the breed to get into? Have read A Century of Champions many times after struggling to get my hands on it. E-mailed the author when I couldn't find one as to how I'd get one 'I'll tell you the same thing I told John Oxx and Lord Derby when they asked could I get one for them'. Like gold dust, an outstanding book.
    As you say many books sell out and then are not offered second-hand.
    My guess is they come on the market second-hand in estate sales.
    I've been lucky in that I often get fliers in the post when a new book is published.

    I had about 800 books on thoroughbred horses at last count.
    These have been accumulated over about twenty five years.
    Perhaps one third are stud books, one third form books, and the rest biographies and pedigree analysis books.
    One thing to keep in mind is that books and pedigrees have errors.

    I bought the bulk of my books from Way Books https://www.way-books.co.uk/ Newmarket, England (Greg Way).
    Previously I bought from J A Allen (acquired by Hale Books in 1999 http://halebooks.com/ )
    Racing Post (for Michael Church books, and Century Of Champions etc)
    The Russell Meerdink Company (http://www.horseinfo.com/ good for USA authors, and Ken McLean books)
    Weatherbys for new copies of the General Stud Book (£270 every four years), and Statistical Record etc

    Just before Christmas I bought 32 second-hand books from Way Books, mostly biographies.
    A few minutes ago I e-mailed Greg Way asking if he could supply four Italian Stud Books (£25 each, 1925-29, 1948-51, 1952-55, 1960-63).

    I think a good general book on the history of the breed is Sir Charles Leicester's "Bloodstock Breeding" published in 1957 cost £30
    http://halebooks.com/shop/j-a-allen/al5/bloodstock-breeding/
    I think he lived in Co Meath and died in Bray.
    It analysed each year's Derby and top races, a chapter for each year.
    But there are many other chapters full of useful insights.
    He was imo remarkable in his analysis, examining many of the old wives' tales about age of mares and sires, birth rank, and the usual fairy tales.
    He was not afraid to roll up his sleeves and examine things in detail.
    I'm not a fan of the saying "breed the best to the best and hope for the best". That saying tells you to not analyse things, just pay the stud fee.

    Another for old time stuff is "The History Of The Racing Calendar and Stud-Book" by C M Prior (1926).
    This would not help with breeding or racing but is full of interesting trivia:
    a weaver given a present of a filly in the 1700s walked 300 miles to collect her and then walked home with her
    a mare ridden 300 miles in three days to win a wager
    if you didn't pay when you lost a bet in a club you were put in a basket, hauled up by rope and left there
    the first mention of jockeys' colours in 1716, with the seventeen owners and their colours
    before c. 1784 the term jockey was used to mean owner, but the meaning changed when owners ceased to ride their horses
    Pocahontas (filly, DOB 1837), perhaps the most important horse of either sex, was sold for 14 guineas, small compared with 2,500 guineas for a smart 2yo years earlier
    a stud groom told to shoot a mare who was useless but shot the owner's good mare

    My most useful book is The Thoroughbred Breeders Handbook by New Zealander Clive Harper published in 1997.
    It is only 107 pages including appendix, bibliography, index.
    It explains how to analyse pedigrees by analysing the ancestor inbreeding / linebreeding. His work follows the ideas of Harold Hampton.
    My analysis earlier in this thread is very similar, although I've a few extra ideas still to analyse.
    You will be lucky if you locate a copy.
    In December I e-mailed Clive Harper's widow as I was told she had copies of his last book, Pattern Of Patterns in Thoroughbred Pedigrees. No reply. Clive Harper died in 2012.


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    Here is some summary data for horses, sires, dams grouped in ratings bands.

    rate_band  name_count  name_avg_s  sire_count  name_avg_d  dam_count
    145 - 149           1       120.0           5         0.0          0
    140 - 144           6        89.7         448         0.0          0
    135 - 139          45        81.9        4686        92.2         58
    130 - 134         167        82.4       14357        90.9        280
    125 - 129         384        78.3       20786        93.3        467
    120 - 124         776        74.6       13767        91.3       1389
    115 - 119        1681        72.9        8656        88.5       2474
    110 - 114        2200        69.5        2797        86.4       3506
    105 - 109        3122        67.8         998        84.5       4656
    100 - 104        3603        68.0         490        82.8       4992
    095 - 099        3863        66.9         248        80.2       4844
    090 - 094        4164        62.7         197        78.5       5111
    085 - 089        4793        61.5         111        77.7       5352
    080 - 084        5646        63.4         189        75.9       6028
    075 - 079        6110        49.3          21        74.0       5781
    070 - 074        6196        69.8          99        72.5       5617
    065 - 069        5759        55.4          16        71.8       4884
    060 - 064        5057        64.1          23        70.8       3978
    055 - 059        4306        58.0           4        70.5       2838
    050 - 054        3537        49.9           7        69.6       2001
    045 - 049        2643        34.7           3        67.3       1169
    040 - 044        1538           -           0        65.4        823
    035 - 039         783        58.0           1        65.3        435
    030 - 034         511           -           0        68.2        277
    025 - 029         329           -           0        68.5        170
    020 - 024         216        64.0           2        63.3        113
    015 - 019         146           -           0        66.3         55
    010 - 014         206        48.8           5        74.9        532
    005 - 009          81           -           0        69.2         64
    000 - 004          47           -           0        64.1         22
    (- means no sires in that band, so no average)

                 runners    sires     dams
    count          67916    67916    67916
    average        78.94   123.60    85.44
    average dob   1999.8   1988.5   1989.9


    Comments:
    These are 67,916 runners with ratings, who have sires with ratings, and dams with ratings.
    Some great horses are missing e.g. Derby winner Golden Horn. His unraced dam has no rating.
    Although the average sire rating is 123.60, and the average dam rating 85.44, the average foal rating is only 78.94.

    It is a bit tricky to read so I'll give examples
    1,169 runners were out of dams who were themselves rated in the band 045-049 (dam_count column); those runners had an average rating of 67.3.
    13,767 runners were by sires who were themselves rated 120-124 (sire_count column); those runners had an average rating of 74.6.
    name_count is the count of runners in each rate band. 3,603 runners were rated 100-104.

    You might notice large numbers of horses in the 010-014 band.
    The majority of these are raced but unrated horses. They ran, but ran so badly they were not given a rating.
    I gave them 10 to distinguish them from all the unraced horses (0 rating) or horses who raced abroad where no rating was available (0 rating).
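    For anyone wanting to build a similar table, here is a minimal sketch of the banding and grouping, assuming one record per runner holding its own rating, its sire's rating and its dam's rating. It is not the program used for the table above. The first record uses the Mary Sea ratings quoted earlier in the thread (71, sire 129, dam 86); the other two records and the function names are hypothetical.

```python
from collections import defaultdict

def rate_band(rating, width=5):
    """Label like '120 - 124' for the band containing a rating."""
    low = (int(rating) // width) * width
    return f"{low:03d} - {low + width - 1:03d}"

# (runner rating, sire rating, dam rating) triples; only the first is real.
runners = [
    (71, 129, 86),
    (95, 122, 47),
    (60, 121, 10),   # dam raced but was never rated, so stored as 10
]

def runner_average_by_parent_band(runners, parent_index):
    """Average runner rating grouped by the band of the sire (1) or dam (2)."""
    total = defaultdict(float)
    count = defaultdict(int)
    for row in runners:
        band = rate_band(row[parent_index])
        total[band] += row[0]
        count[band] += 1
    return {band: (total[band] / count[band], count[band]) for band in total}

by_sire_band = runner_average_by_parent_band(runners, parent_index=1)
print(by_sire_band)
```

    Calling the same function with `parent_index=2` groups by the dam's band instead, which gives the dam-side pair of columns.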

    A bit of extra analysis of the 17,822 dams rated 100+
    dams rated 100+ mated with sires under 100 ... foal average 76.18 (49 dams)
    dams rated 100+ mated with sires 100-109 ..... foal average 85.07 (179 dams)
    dams rated 100+ mated with sires 110-119 ..... foal average 82.51 (1485 dams)
    dams rated 100+ mated with sires 120-129 ..... foal average 84.38 (8198 dams)
    dams rated 100+ mated with sires 130+ .......... foal average 88.05 (7911 dams)
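    The five groups above can be pooled into one overall figure for dams rated 100+. This sketch treats the count in brackets as the weight behind each average, which is an assumption about how those averages were calculated.

```python
# (sire band, foal average, count) for dams rated 100+, from the list above.
groups = [
    ("under 100", 76.18, 49),
    ("100-109", 85.07, 179),
    ("110-119", 82.51, 1485),
    ("120-129", 84.38, 8198),
    ("130+", 88.05, 7911),
]

total_count = sum(count for _band, _avg, count in groups)
pooled_average = sum(avg * count for _band, avg, count in groups) / total_count

print(total_count, round(pooled_average, 2))
```

    The counts add back to the 17,822 quoted, and on that weighting the pooled foal average is a little under 86, dominated by the two largest groups (sires 120-129 and 130+).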

    Conclusion:
    Breeding the best to the best will on average produce a horse above the 78.94 average, but not much above that average.

    For every increase in dam rating there is (almost) always an increase in foal rating.
    For every increase in sire rating there is (almost) always an increase in foal rating.
    For the same rating band for sire and dam the dams appear to produce a better result
    e.g. at the 120-124 band the sires produced foals average 74.6 while the dams produced foals average 91.3.
    This does not mean that similarly rated dams outperform sires.
    The sires in 120-124 are mated with many much lower rated dams, rated 85.44 on average, while the 120-124 rated dams will be mated with sires rated 123.60 on average.


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    I keep a database of Group 1, Group 2, Group 3 race winners for Ireland, England, France, Germany, Italy, USA (Grade 1 only).
    I also have the winners of these races from 1900 before they were designated "Group races".
    Group races were introduced in Europe in 1971 and in the USA in 1973.
    A horse who finished in the first three in the Group 1, Group 2, Group 3, Listed races can be entered in a sales catalogue in bold type ("black type").
    For a few years fourth place finishers were also "black type" horses but that is now limited to fourth in Group 1 only.

    I don't keep records of Listed race winners unless the race was at one time a Group 3 race or higher.
    Group races can go up and down in Grade based on the quality (rating) of the horses running in them.
    I don't record 2nd, 3rd, 4th place finishers. The reason is it would take too much time, and often it is only easy to find the winner.
    Any volunteers to find the 2nd and 3rd place finishers (and their pedigrees) in the Group 3 Italian race Premio Carlo E Francesco Aloisi at Capannelle on 18/11/73 (strangely I have a book with this info)?

    I'm not a fan of "black type". It is imo a lazy way of analysing a sales catalogue.
    Just look at the page and if you see plenty of bold type it is a good horse!
    I'm reading the Mark Johnston biography at the moment. I remember I was at The Curragh when his 2yo filly Millstream won the Group 3 Curragh Stakes on 09/07/94.
    There were three runners so they all qualified for black type, even Sharpness In Mind who was beaten 13 lengths in 3rd.
    The first two finishers were fillies, the 3rd was a gelding.
    If the 3rd finisher was a filly would you know in a sale catalogue that "her" black type was as a 13 length 3rd of three runners. Not a chance.
    (I also remember Mark Johnston being interviewed by the press in the ring after the race (was it that day?) in the rain. The press all had umbrellas, and Mark did not. It was a lengthy interview, and he was getting wet in his expensive suit.)

    Analysis to come
    Over the next week I'll try to put up some stuff about black type winners (and nothing about black type placed).
    Are black type winners produced by black type dams? Most people (mistakenly?) think they are.
    How about the second dam and the third dam of black type winners?
    You see these dams in sales catalogues backing up the sales lot.
    Is there any difference between the second and third dams of black type winners and the rest of the racing population?
    I'm sure I'll think of a few more questions as I work on the data.

    The following table has a few major handicaps that I'll filter out (Wokingham, Lincoln)

    1900 to 2015: 36,144 races, 21,113 individual Group race winners
    country  group_1  group_2  group_3  listed  grade_1  grade_2  grade_3  handicap
    IRELAND      731      612     1628     325        0        0        0         0
    ENGLAND     2242     2521     4118     734        0        0        0       548
    FRANCE      2322     2068     5283     393        0        0        0         0
    GERMANY      647      939     1343       1        0        0        0         0
    ITALY        867      553      961     698        0        0        0        21
    USA            0        0        0       6     6199      335       49         0
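    The rows of the table can be cross-checked against the headline race count. This is a minimal Python sketch, not the database program itself; the name races_by_country is an illustrative assumption, and the figures are copied from the table.

```python
# Race counts per country, copied from the table above.
# Column order: group_1, group_2, group_3, listed,
#               grade_1, grade_2, grade_3, handicap
races_by_country = {
    "IRELAND": [731, 612, 1628, 325, 0, 0, 0, 0],
    "ENGLAND": [2242, 2521, 4118, 734, 0, 0, 0, 548],
    "FRANCE":  [2322, 2068, 5283, 393, 0, 0, 0, 0],
    "GERMANY": [647, 939, 1343, 1, 0, 0, 0, 0],
    "ITALY":   [867, 553, 961, 698, 0, 0, 0, 21],
    "USA":     [0, 0, 0, 6, 6199, 335, 49, 0],
}

# Total races per country, then across all countries.
row_totals = {country: sum(counts) for country, counts in races_by_country.items()}
grand_total = sum(row_totals.values())

# Handicaps are the last column; these are the races to be filtered out.
handicaps = sum(counts[-1] for counts in races_by_country.values())

for country, total in row_totals.items():
    print(f"{country:8s} {total:6d}")
print(f"{'TOTAL':8s} {grand_total:6d}")
print(f"handicaps {handicaps}")
```

    The country rows add up to the 36,144 races quoted above the table, of which 569 are handicaps.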


  • Closed Accounts Posts: 4,744 ✭✭✭diomed


    I posted this preliminary work on Group race winners and their dams a few minutes ago.

    Some stats on Group race winners and their dams 1900 to 2015: Ireland, England, France, Germany, Italy, USA (Grade 1 only)

    36,122 Group races, 21,059 individual winners

    Group 1, Group 2, Group 3, Listed (a few), Grade 1, Grade 2 (a few), Grade 3 (a few), Handicaps (a few).
    The "a few" comment refers to races that drifted between Listed and Group 3. The same applies to the USA: the sample was Grade 1 only, but the grades of those races vary over time.
    I treat races before 1971 (pattern introduction) as Group races, e.g. the Epsom Derby from 1900 to 1970 counts as a Group 1.

    I've been tidying up the data before analysis so the above may change.

    group wins by dam's foals   dams  races  runners
    0 *                            0    132      119
    1                           9525   9525     9525
    2                           3497   6994     4324
    3                           1705   5115     2454
    4                           1004   4016     1604
    5                            553   2765      967
    6                            330   1980      631
    7                            215   1505      435
    8                            137   1096      287
    9                             88    792      215
    10                            56    560      137
    11                            39    429      106
    12                            17    204       53
    13                            14    182       41
    14                            12    168       31
    15                            11    165       30
    16                            10    160       33
    17                             2     34        7
    18                             3     54        8
    20                             3     60       17
    21                             4     84       14
    22                             1     22        6
    23                             1     23        4
    25                             1     25        7
    32                             1     32        4
    Totals                     17229  36122    21059

    * I have the names of the 119 runners but not their dams.
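    The table is internally consistent, which can be confirmed with a short Python sketch. The variable names are illustrative assumptions; the figures are copied from the table. In every row except the first, races equals the number of wins multiplied by the number of dams.

```python
# (group wins by dam's foals, dams, races, runners), copied from the table.
rows = [
    (0, 0, 132, 119),  # winners whose dams are not recorded
    (1, 9525, 9525, 9525), (2, 3497, 6994, 4324), (3, 1705, 5115, 2454),
    (4, 1004, 4016, 1604), (5, 553, 2765, 967), (6, 330, 1980, 631),
    (7, 215, 1505, 435), (8, 137, 1096, 287), (9, 88, 792, 215),
    (10, 56, 560, 137), (11, 39, 429, 106), (12, 17, 204, 53),
    (13, 14, 182, 41), (14, 12, 168, 31), (15, 11, 165, 30),
    (16, 10, 160, 33), (17, 2, 34, 7), (18, 3, 54, 8),
    (20, 3, 60, 17), (21, 4, 84, 14), (22, 1, 22, 6),
    (23, 1, 23, 4), (25, 1, 25, 7), (32, 1, 32, 4),
]

# Rows where races != wins * dams (the first row is skipped: no dams known).
mismatches = [r for r in rows if r[0] > 0 and r[2] != r[0] * r[1]]

# Column totals, to compare with the "Totals" line.
total_dams = sum(r[1] for r in rows)
total_races = sum(r[2] for r in rows)
total_runners = sum(r[3] for r in rows)

print(mismatches)
print(total_dams, total_races, total_runners)
```

    No row fails the check, and the column sums match the Totals line (17229, 36122, 21059).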

    Which dam had four runners that won 32 Group races?
    Buonamica (1943), dam of Barbara Sirani (10 wins), Bonnard (1 win), Botticelli (14 wins) and Braque (7 wins) = 32 wins,
    almost all in Italy at San Siro and Capannelle, except two in England: the Ascot Gold Cup and the Doncaster Cup.

    How many dams of Group winners 1900-2015 won a Group race ............... 1,827 (10.6%)
    How many dams of Group winners 1900-2015 did not win a Group race .... 15,402 (89.4%)


    (dams of the "group" winners in the first years of the 20th century obviously could not have won Group races in the 20th century but could have won "group" races in the 19th century)
    (many races run today were inaugurated only recently, so the dams of those winners may also have had fewer opportunities)

    A count of dams of Group winners born 1970+:
    won a Group race themselves ............................................. 871 (11.1%)
    didn't win a Group race themselves ................................... 6,972 (88.9%)
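    The percentages in both counts follow directly from the dam numbers. A minimal Python sketch of the arithmetic (the helper name share is an illustrative assumption):

```python
def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# All dams of Group winners, 1900-2015.
won_all, not_won_all = 1827, 15402
total_all = won_all + not_won_all

# Dams of Group winners born 1970 or later.
won_1970, not_won_1970 = 871, 6972
total_1970 = won_1970 + not_won_1970

print(total_all, share(won_all, total_all), share(not_won_all, total_all))
print(total_1970, share(won_1970, total_1970), share(not_won_1970, total_1970))
```

    The first total, 17,229, equals the number of dams in the table above, and the shares come out at 10.6% / 89.4% for all dams and 11.1% / 88.9% for dams born 1970+.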


    (it makes me wonder why sales catalogues have dams in black type)
    (my guess is that it is mostly for horses placed in Gr 1, Gr 2, Gr 3 and Listed races. I ignore these cheap "black types". They are not in my analysis)

    Conclusion:
    Horses that win group races do not get that ability from their dam's racing ability.

