Dosage Index

diomed · 24-05-2015 07:12PM #1

History
Many people use the Dosage Index to decide if a horse will last the Epsom Derby 1m 4f.
For about twenty years I have had doubts about the Dosage Index.
About 100 years ago the Aga Khan’s pedigree adviser, J J Vuillier, identified the important sires that appeared in the 12-generation pedigrees of major winners.
In the 1960s/1970s Franco Varola built on the work, but assigned each important sire to one of five categories: brilliant, intermediate, classic, stout, professional, i.e. from sprinter to classic to stayer.
And of course he added the important sires for the years since the earlier Col Vuillier work.
In 1977 an article about dosage by Abram Hewitt appeared in the Blood-Horse magazine.
In the 1980s(?) Steven Roman introduced his modern Dosage Index.

Disagreement
When Roman's system came into operation there was disagreement (Hewitt/Roman?) as to which aptitudinal category some sires belonged, so they split those sires 50/50 between two categories.
So we have split sires: brilliant/classic (18); classic/professional (7) [two categories apart]; and others that are split brilliant/intermediate (18); classic/solid (14); intermediate/classic (22); solid/professional (2) [adjacent categories].
And of course we have the plain brilliant, intermediate, classic, solid, professional sires where the two experts agreed.
81 sires were split into two categories due to a disagreement (?) from a total of about 180 sires.

Missing Sires
When questions are asked as to why some important sires are not chef-de-race sires and not in the Dosage Index the answer will be something like “they are not prepotent for one of the five stamina categories”, i.e. they produce good horses but at all distances. This is strange, as they already had no problem splitting 81 sires into two classifications.

Different Calculation
I am going to prepare my own index. This is Europe, but imo the Dosage Index market is the USA and the Dosage Index is dominated by USA sires.
I haven’t yet done a count of the number of Group winners, just the Group wins. As far as I know Steven Roman does his calculation on all runners by a sire. I don’t have that data, but I have the Group winners. I think the chef-de-race sires are claimed to do better with a dam than an "ordinary" sire did, and that improvement is what identifies them.

My pedigree database has 316,000 horses. There are 4,598 sires that produced a Group race winner in Ireland, England, France, Italy, Germany, USA (Gr 1 only). I counted the number of wins by the offspring of each sire at each distance from 5f to 20f (1f increments), and also did a calculation of the average winning distance.
For example Caerleon: 5f(1),6f(6),7f(13),8f(20),9f(8),10f(21),11f(4),12f(29),14f(5),15f(4),18f(1) – average 9.9f. There are few races at 9f and 11f, so I might adjust for that.
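
The Caerleon figure can be checked as a wins-weighted mean. This is only a minimal Python sketch of that arithmetic, not the program used for the database; the function and variable names are made up for illustration, and the win counts are the ones quoted above.

```python
def average_winning_distance(wins_by_furlong):
    """Wins-weighted mean distance in furlongs.

    wins_by_furlong maps a race distance (furlongs) to the number of
    Group wins by the sire's offspring at that distance.
    """
    total_wins = sum(wins_by_furlong.values())
    if total_wins == 0:
        raise ValueError("sire has no recorded wins")
    total_furlongs = sum(dist * wins for dist, wins in wins_by_furlong.items())
    return total_furlongs / total_wins


# Caerleon's Group wins by distance, as listed in the post.
caerleon = {5: 1, 6: 6, 7: 13, 8: 20, 9: 8, 10: 21,
            11: 4, 12: 29, 14: 5, 15: 4, 18: 1}

caerleon_awd = average_winning_distance(caerleon)
print(f"{sum(caerleon.values())} wins, awd {caerleon_awd:.1f}f")
```

The counts add up to 112 wins and the mean rounds to 9.9f, which agrees with the figure quoted. Any adjustment for the scarcity of 9f and 11f races would have to be applied to the counts before this step.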

These seven sires have 100+ Group wins but are not chef-de-race sires: Caerleon; Danehill Dancer; Danehill; Dansili; Galileo; Pivotal; Storm Cat.
There are 48 non-chef-de-race sires with 50+ Group wins. Alleged has 93 wins (awd 11.3f), and Green Desert 92 wins (awd 7.1f).
Then you have the problem of good sires like Troy (7 offspring group wins) who had very short sire careers.

Dosage Index discrepancies
The Dosage Index ignores many major sires but I am not going to ignore them. I am also going to review the Dosage Index chef-de-race classifications against my average winning distance calculations, as I am not happy that almost half of the chef-de-race classifications were decided because two pedigree experts could not agree.

Danzig with 167 wins and an awd of 7.7f has an IC (intermediate/classic) designation, but so has Riverman with 120 wins and an awd of 9.1f.
Nureyev with 164 wins and an awd of 8.1f has a C (classic) designation, but Pharis with 94 offspring Group wins and an awd of 9.9f has a B (brilliant) classification,
i.e. Pharis is categorised as a sire of speedier stock than Nureyev although his Group winners won at distances 1.8f longer on average.
Pharis awd 9.9f (a brilliant sire) has 22 12f wins from 94 wins and 11 wins greater than 12f.
Nureyev awd 8.1f (a classic sire) has 12 12f wins from 164 wins with no wins greater than 12f.

tryfix · 24-05-2015 07:38PM

Good stuff Diomed, let us know how you get on.

diomed · 24-05-2015 10:54PM

I'm surprised I got so many thanks. When I re-read it just now I had difficulty understanding it.

diomed · 26-05-2015 02:51PM

Done.
It took the mother of all programming sessions.

It uses 942 sires which is more than the ~ 220 sires used by the Dosage Index. When I have time I will look at the results in detail and see if there are any major gaps.
I already have one sire in mind that was not selected due to few runners in my data, but those few runners have since produced a large number in later generations.
An obvious next step is to run the calculation against my Group winners database, 1900 to 2014, 39744 winners.
Then I can see if the sprint winners got low furlong numbers, the milers got mile numbers, and the middle-distance and cup horses got high numbers.

Some of my thinking does not seem logical. I have a handful of sires with a best distance of less than two furlongs (my idea), and some sires with a best distance of over twenty furlongs.
When I was arriving at this conclusion I remembered something in Tim Fitzgeorge-Parker's book, Training The Racehorse.
He said "One word of warning -the old adage of "speed to speed" is sound, but do not take the chance when it has been overdone; for example, the result of a sprint horse out of a sprint mare will probably fly for three furlongs, but unfortunately there are no three furlong races!"

I used the winners produced by sires and the distances at which they won.
Now a 5f sire can never produce hundreds of winners that have an average winning distance of 5.0f. Some of his progeny might try a race at 6f or longer and ruin his stats.
The same with a 12f sire. Some of his winners will win at distances less than 12f.
A sprint sire's runners will always have an awd greater than the sire's contribution, and a 12f sire will always have runners with an awd less than his contribution.

How could a sire with a best distance of 1.8f ever sire a winner? With a miler mare.
I am not talking about a sire who could only run two furlongs, it is a sire who produces runners who have difficulty getting five furlongs.

Anyway this is only the Mark I version. I'm sure you would like some numbers. It calculates 667 runners in under 18 seconds, or about 40 runners a second.

Did you expect the calculation to give 5.0f for a five furlong runner, and give 12.0f for a twelve furlong runner?
To get a result like that you would need sprinters to have ancestors who only produced 5f sprinters, and middle-distance horses whose ancestors only produced 12f horses.

These are the numbers for the 2015 Derby field (in furlongs), and the Derby winners from 1960 to 2014. The year in brackets is the year of foaling.

Derby 2015
Best Of Times (2012) Derby 2015 [ 8.7f ]
Carbon Dating (2012) Derby 2015 [ 9.1f ]
Elm Park (2012) Derby 2015 [ 8.9f ]
Epicuris (2012) Derby 2015 [ 8.8f ]
Giovanni Canaletto (2012) Derby 2015 [ 9.5f ]
Gleneagles (2012) Derby 2015 [ 9.3f ]
Golden Horn (2012) Derby 2015 [ 7.9f ]
Great Glen (2012) Derby 2015 [ 10.5f ]
Hans Holbein (2012) Derby 2015 [ 11.3f ]
Jack Hobbs (2012) Derby 2015 [ 10.3f ]
Kilimanjaro (2012) Derby 2015 [ 10.3f ]
Moheet (2012) Derby 2015 [ 9.2f ]
Prince Gagarin (2012) Derby 2015 [ 9.4f ]
Rocky Rider (2012) Derby 2015 [ 10.1f ]
Rogue Runner (2012) Derby 2015 [ 10.3f ]
Storm The Stars (2012) Derby 2015 [ 9.0f ]
Success Days (2012) Derby 2015 [ 8.0f ]
Sumbal (2012) Derby 2015 [ 9.2f ]
Zawraq (2012) Derby 2015 [ 8.6f ]
Average 9.4f

Derby winners
St Paddy (1957) Derby winner [ 12.2f ]
Psidium (1958) Derby winner [ 11.1f ]
Larkspur (1959) Derby winner [ 10.6f ]
Relko (1960) Derby winner [ 11.1f ]
Santa Claus (1961) Derby winner [ 11.5f ]
Sea-Bird (1962) Derby winner [ 10.1f ]
Charlottown (1963) Derby winner [ 11.5f ]
Royal Palace (1964) Derby winner [ 11.0f ]
Sir Ivor (1965) Derby winner [ 9.7f ]
Blakeney (1966) Derby winner [ 9.7f ]
Nijinsky (1967) Derby winner [ 8.3f ]
Mill Reef (1968) Derby winner [ 9.2f ]
Roberto (1969) Derby winner [ 9.1f ]
Morston (1970) Derby winner [ 11.5f ]
Snow Knight (1971) Derby winner [ 11.7f ]
Grundy (1972) Derby winner [ 9.2f ]
Empery (1973) Derby winner [ 10.9f ]
The Minstrel (1974) Derby winner [ 8.2f ]
Shirley Heights (1975) Derby winner [ 10.3f ]
Troy (1976) Derby winner [ 9.7f ]
Henbit (1977) Derby winner [ 9.6f ]
Shergar (1978) Derby winner [ 10.0f ]
Golden Fleece (1979) Derby winner [ 9.5f ]
Teenoso (1980) Derby winner [ 11.2f ]
Secreto (1981) Derby winner [ 8.3f ]
Slip Anchor (1982) Derby winner [ 10.9f ]
Shahrastani (1983) Derby winner [ 9.3f ]
Reference Point (1984) Derby winner [ 9.3f ]
Kahyasi (1985) Derby winner [ 10.3f ]
Nashwan (1986) Derby winner [ 9.3f ]
Quest For Fame (1987) Derby winner [ 10.3f ]
Generous (1988) Derby winner [ 9.4f ]
Dr Devious (1989) Derby winner [ 8.6f ]
Commander In Chief (1990) Derby winner [ 8.7f ]
Erhaab (1991) Derby winner [ 7.9f ]
Lammtarra (1992) Derby winner [ 8.9f ]
Shaamit (1993) Derby winner [ 10.9f ]
Benny The Dip (1994) Derby winner [ 9.6f ]
High-Rise (1995) Derby winner [ 11.8f ]
Oath (1996) Derby winner [ 8.6f ]
Sinndar (1997) Derby winner [ 9.1f ]
Galileo (1998) Derby winner [ 9.7f ]
High Chaparral (1999) Derby winner [ 10.1f ]
Kris Kin (2000) Derby winner [ 10.1f ]
North Light (2001) Derby winner [ 9.3f ]
Motivator (2002) Derby winner [ 10.0f ]
Sir Percy (2003) Derby winner [ 10.1f ]
Authorized (2004) Derby winner [ 11.7f ]
New Approach (2005) Derby winner [ 9.4f ]
Sea The Stars (2006) Derby winner [ 8.7f ]
Workforce (2007) Derby winner [ 10.0f ]
Pour Moi (2008) Derby winner [ 11.1f ]
Ruler Of The World (2010) Derby winner [ 9.5f ]
Australia (2011) Derby winner [ 9.5f ]
Average 10.0f

Derby winners (ten short runners)
Erhaab (1991) Derby winner [ 7.9f ]
The Minstrel (1974) Derby winner [ 8.2f ]
Nijinsky (1967) Derby winner [ 8.3f ]
Secreto (1981) Derby winner [ 8.3f ]
Dr Devious (1989) Derby winner [ 8.6f ]
Oath (1996) Derby winner [ 8.6f ]
Commander In Chief (1990) Derby winner [ 8.7f ]
Sea The Stars (2006) Derby winner [ 8.7f ]
Lammtarra (1992) Derby winner [ 8.9f ]
Roberto (1969) Derby winner [ 9.1f ]

Derby winners (ten long runners)
Relko (1960) Derby winner [ 11.1f ]
Pour Moi (2008) Derby winner [ 11.1f ]
Teenoso (1980) Derby winner [ 11.2f ]
Santa Claus (1961) Derby winner [ 11.5f ]
Charlottown (1963) Derby winner [ 11.5f ]
Morston (1970) Derby winner [ 11.5f ]
Snow Knight (1971) Derby winner [ 11.7f ]
Authorized (2004) Derby winner [ 11.7f ]
High-Rise (1995) Derby winner [ 11.8f ]
St Paddy (1957) Derby winner [ 12.2f ]

diomed · 27-05-2015 01:37PM

diomed wrote: »

I have a handful of sires with a best distance of less than two furlongs (my idea), and some sires with a best distance of over twenty furlongs.

An example might illustrate my thinking.

I calculate the likely best running distance of a horse based on the 31 sires in its 5 generation pedigree of 62 ancestors. Most people would look at the Racing Post website and look at the average winning distance of the sire. Job done. But the RP awd is based on hundreds of matings with mares that probably ran at every distance from 5f to 20f. I am only betting on one horse, and I can know a lot more about this horse than I can know about all those mares.

Montjeu
Calculation of sire distance contribution
Average winning distance of all 36,000+ Group winners 9.3f.
Average winning distance of Montjeu Group winners 11.8f
( Montjeu contribution + Group winner average) / 2 = 11.8f
( Montjeu contribution + 9.3f ) / 2 = 11.8f
( Montjeu contribution + 9.3f ) = 11.8f x 2
Montjeu contribution = 23.6f – 9.3f
Montjeu contribution = 14.3f

Compton Place
( Compton Place contribution + 9.3f) / 2 = 5.6f
Compton Place contribution = 1.8f
I know Chookie Hamilton, sired by Compton Place, won four times at 14f in very low grade races.
When I use the Compton Place 1.8f I also use the numbers for the other 30 sires in the 5 generation pedigree.
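
The rearrangement above can be written as a couple of lines of Python. This is a sketch of the arithmetic only, using the rounded figures quoted in this post; the function names are my own for illustration.

```python
POPULATION_AWD = 9.3  # average winning distance of all Group winners, from the post


def sire_contribution(sire_awd, population_awd=POPULATION_AWD):
    """Solve (contribution + population_awd) / 2 = sire_awd for the contribution."""
    return 2 * sire_awd - population_awd


def expected_awd(contribution, population_awd=POPULATION_AWD):
    """Inverse step: the awd expected when the sire meets population-average mares."""
    return (contribution + population_awd) / 2


montjeu = sire_contribution(11.8)
compton_place = sire_contribution(5.6)
print(f"Montjeu {montjeu:.1f}f, Compton Place {compton_place:.1f}f")
```

With the rounded inputs the Montjeu figure comes out at 14.3f as above. The Compton Place figure comes out at 1.9f rather than 1.8f, presumably because the 1.8f was worked from unrounded averages.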

Sharp thinkers will spot a fudge in my calculation. I assume the mares producing the 113 Group wins by Montjeu’s offspring are the same as, or similar to, the mares producing all the 35,000+ Group wins.
I also assume a 12f sire and a 9f mare produce a 10.5f horse. This is more likely than they produce a 5f horse or a 16f horse.

Montjeu’s racing record in Group races: 1x10f; 1x11f; 7x12f = average 11.7 f.
Montjeu’s dam, Floripedes, won a 15f Group 3 and was 2nd in a 15.5f Group 1.
Montjeu’s average winning distance on the Racing Post website is 11.6f, which is similar to my figure of 11.8f, i.e. (Montjeu’s contribution of 14.3f + population average 9.3f) / 2.

I use my calculated contribution from each of the 31 sires in the 5 generation pedigree as I think it is more accurate.
Example - Hans Holbein: 1st gen Montjeu 14.2f; 2nd gen Sadler’s Wells 12.6f & 2nd gen Shirley Heights 13.4 and so on.

Francie Barrett · 29-05-2015 03:36PM

Just want to say thanks for doing these numbers. I guess the obvious thing that leaps out at you is just how low an index Golden Horn gets. It's so low that it defies belief that this could be such a hot market favourite. Of course, if anyone watched the Dante a few weeks ago, it's a bit clearer why this is so well backed. The manner of defeat that was inflicted on Jack Hobbs looks, on an initial view, quite decisive. To throw a spanner into the works though, that race looked like it was perfect for Golden Horn. I don't have the numbers, but the early pace looked that bit slow, so the sprint finish played nicely into William Buick's hands that day.

Looking further down through the runners, I find it hard to find anything of interest. Zawraq by your numbers has a slightly better index than Golden Horn, but there are a few other negatives in its form guide (has not run beyond 8f, will not have had a run in 55 days) and I find it hard to be enthused by Dermot Weld's Derby record. Elm Park didn't look near the level of Jack Hobbs and nothing in the form guide or breeding suggests that a step up in trip would help. That doesn't leave you with a whole pile; the O'Brien horses look like an afterthought (what the heck is Kilimanjaro doing in the Derby?). The only one left in the pack I can't write off is the French horse, but only because I know nothing about it.

Seems to me that you just could not go wrong with Jack Hobbs at 7/1 e/w. Even if it does not win, the step up should keep it there or thereabouts.

diomed · 29-05-2015 05:04PM

Thanks for the comments Francie.
I also thought the Dante was slow early, because of the in-running comments: Nafaqa "took fierce hold", and Elm Park "keen early". The final time was a good Dante time.

Since I worked out the "best" distance of each sire mentioned above I've noticed some discrepancies between those numbers and "best" distance numbers from another calculation I do. When time permits the next job is to line the two sets up and investigate the bigger differences.

I had a bet on Jack Hobbs at 15.5 after the Dante as I think he is the most solid in a confusing Derby. Another bet is on Storm The Stars at an average 66, and a small bet on Prince Gagarin at 466. Storm The Stars has had too many races in a short time, with a shortish break to the Derby. He might not run. Also small bets on Kilimanjaro and Hans Holbein at good odds to cover.

Fwiw have a look at Jack Hobbs training on the track at the Press day. He looked good.

diomed · 29-05-2015 05:18PM

https://www.youtube.com/watch?v=JZTM_2o3-Ms

Francie Barrett · 29-05-2015 11:43PM

Francie Barrett wrote: »

I don't have the numbers, but the early pace looked that bit slow, so the sprint finish played nicely into William Buick's hands that day.

Wanted to find out more about this, and thankfully Google saved me getting the stopwatch out. Timeform say that the last 3 furlongs of this year's Dante were quick. However, I still have a fear that the Derby this year could be run at a wretched early pace.

https://www.timeform.com/Racing/Articles/sectional-debrief-york,-dante-meeting-days-one-and-two-1552015

diomed · 25-06-2015 12:26AM

The work earlier in the thread was on sire distance.
Below is a bit of information on sire quality calculations.

First sire quality calculation.
Average all the sire offspring's ratings.

Example
Sadler's Wells (b.1981) has 1,241 rated horses and 249 unrated horses in the database. His 249 unrated horses probably raced abroad, or never raced (usually fillies).
His sire average is calculated as 86.93 from the ratings of his 1,241 rated offspring.
When this number is compared with the population average rating for over 100k horses, the Sadler's Wells rating is increased as follows:
(Sadler's Wells rating + population average rating) / 2 = 86.93, so Sadler's Wells rating = 86.93 x 2 - population average rating.
This assumes the mares mated with Sadler's Wells are population average, but most people would think this unlikely.

Second sire quality calculation.
Each dam in the database has offspring with individual ratings.
Calculate the average rating of the dam offspring group. [dams must have >=3 rated offspring]
Divide the individual offspring rating by the dam offspring group rating.
The idea is to find out which sire performed best with the mare, an indication of sire quality.

Example
Aiglonne (b.1987) has five offspring in the database, four with ratings (72,84,79,105 = 85 avg. rating).
1. Democrate (rated 105) by Dalakhani, so 105/85 = 1.24 rating for Dalakhani.
2. Ciceron (rated 84) by Pivotal, so 84/85 = 0.99 rating for Pivotal.
3. Crosswing (rated 79) by Cape Cross, so 79/85 = 0.93 for Cape Cross.
4. Apophis (rated 72) by Rainbow Quest gives a 72/85 = 0.85 for Rainbow Quest.

Sire ratings are averaged for the database (e.g. Sadler's Wells 1241 rated; Dalakhani 173, Pivotal 537, Cape Cross 520, Rainbow Quest 539).[sires must have >= 10 rated horses]
The average tells how a sire did with all the mares he was mated with (in competition with the other sires she was mated with).
This should eliminate the advantage of sires who were only mated with top class mares.
A sire with a number > 1.0 for all his mares has done better than average with the mares.
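
The second calculation can be sketched in Python as follows. This is a minimal illustration of the steps described above, not the actual program; the record layout (dam, sire, rating) and the function name are assumptions made for the example, and only the thresholds and the Aiglonne ratings come from this post.

```python
from collections import defaultdict
from statistics import mean

MIN_DAM_RATED = 3    # dams must have >= 3 rated offspring
MIN_SIRE_RATED = 10  # sires must have >= 10 rated horses


def sire_quality(records, min_dam=MIN_DAM_RATED, min_sire=MIN_SIRE_RATED):
    """records: iterable of (dam, sire, rating) tuples for rated offspring only.

    Returns {sire: average of (offspring rating / dam's offspring average)}.
    """
    by_dam = defaultdict(list)
    for dam, sire, rating in records:
        by_dam[dam].append((sire, rating))

    ratios = defaultdict(list)
    for offspring in by_dam.values():
        if len(offspring) < min_dam:
            continue  # too few rated offspring to judge this mare
        dam_avg = mean(rating for _, rating in offspring)
        for sire, rating in offspring:
            ratios[sire].append(rating / dam_avg)

    # Keep only sires with enough rated horses behind the average.
    return {sire: mean(vals) for sire, vals in ratios.items()
            if len(vals) >= min_sire}


# The four rated offspring of Aiglonne from the example above.
aiglonne = [
    ("Aiglonne", "Dalakhani", 105),
    ("Aiglonne", "Pivotal", 84),
    ("Aiglonne", "Cape Cross", 79),
    ("Aiglonne", "Rainbow Quest", 72),
]

# min_sire=1 only so that a single mare can be shown on her own.
one_mare = sire_quality(aiglonne, min_sire=1)
for sire, ratio in one_mare.items():
    print(f"{sire}: {ratio:.2f}")
```

Run on the one mare, this gives the same four ratios as the worked example. With the real thresholds a sire only gets a number once he has at least ten such ratios to average.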

diomed · 25-06-2015 01:42AM

Example

Sire                rated  unrated  sire_qual  qual_avg
American Post          57        2      79.91      0.94
Antonius Pius          78        3      61.93      0.99
Azamour                83       25      93.79      1.02
Bachelor Duke          53        2      75.97      1.00
Dubawi                146       81     101.45      1.03
Footstepsinthesand    115       17      84.51      1.04
Haafhd                 96        5      79.09      0.99
Hurricane Run          67       12      94.05      1.01
Iceman                 51        1      69.69      0.98
Iffraaj                65       24      86.81      1.04
Kheleyf               122       17      68.17      1.03
King Kamehameha        55        0     128.95      0.97
Lucky Story            60        6      60.95      1.04
Motivator              82       18      94.45      1.01
One Cool Cat          134        2      74.81      0.98
Oratorio              124       13      80.75      0.99
Pastoral Pursuits      52        9      73.75      1.10
Shamardal             153       61     105.63      1.08
Whipper                73       14      86.91      1.02

diomed · 18-11-2016 01:59AM

http://www.chef-de-race.com/farewell.htm

diomed · 18-11-2016 02:08AM

diomed wrote: »

Disagreement
When Roman's system came into operation there was disagreement (Hewitt/Roman?) as to which aptitudinal category some sires belonged, so they split those sires 50/50 between two categories.

My error. The disagreement was between Abram S Hewitt and Franco Varola.
I don't think it was a face-to-face disagreement, probably Hewitt placed sires in categories that differed from the earlier Varola books.

Original Daily Racing Form newspaper article
http://www.chef-de-race.com/dosage/drf_series/original_drf_pt1.htm
