Comments on: Tony O’Hagan Responds (Not on Behalf of RMS) http://cstpr.colorado.edu/prometheus/?p=5162 Wed, 29 Jul 2009 22:36:51 -0600 http://wordpress.org/?v=2.9.1 hourly 1 By: An Elicitation of Expert Monkeys: (Hurricane related.) | The Blackboard http://cstpr.colorado.edu/prometheus/?p=5162&cpage=1#comment-13706 An Elicitation of Expert Monkeys: (Hurricane related.) | The Blackboard Sat, 02 May 2009 19:10:42 +0000 http://sciencepolicy.colorado.edu/prometheus/?p=5162#comment-13706 [...] My uncertainty about my inability to distinguish hurricane experts from monkeys is the fruit of a computationally intensive calculation inspired by a long conversation that began with Inexpert Elicitation by RMS on Hurricanes, by Roger Pielke Jr. and continued in Tony O’Hagan Responds (Not on Behalf of RMS). [...]

]]>
By: Mark Bahner http://cstpr.colorado.edu/prometheus/?p=5162&cpage=1#comment-13696 Mark Bahner Thu, 30 Apr 2009 22:28:41 +0000 http://sciencepolicy.colorado.edu/prometheus/?p=5162#comment-13696 Hi Roger,

It seems to me this whole exchange has been somewhat unfortunate. It seems to me that both you and Dr. O’Hagan are honest men, making honest points.

Regarding your points:

1) You question the value of the expert elicitation, if 9 times out of 10 it produced a prediction indistinguishable from monkeys. You write that makes you “raise an eyebrow.” Well, I agree with that. There are only a few instances where I would say that it *would* be important to get that 1-out-of-10 difference. For example, if there were asteroids 1 km across that the monkeys said every time that the probability that the asteroid would strike the earth was less than 1 in 10,000, and the experts agreed 9 out of 10 times, but one time thought that the probability was greater than 50/50, I’d definitely want to get the experts’ opinions all 10 times.

2) You write, “What I have argued is that the landfall rate produced by the elicitation of experts has been in every instance (not “one”) indistinguishable (not “similar”) to a landfall rate provided by monkeys.”

This is where I’m a bit handicapped by not caring very much about this whole subject. ;-) Exactly how many elicitations have there been? Two? Three? Ten? Scores? Hundreds? (I’m guessing not “hundreds”. ;-) )

3) You write, “Perhaps there will be exceptions to this in the future.”

Well, there’s a hugely important question. So it seems to me that Lucia’s question about the width of the distribution of model predictions is important. If the distribution of model predictions is very narrow, then it wouldn’t be expected that there’d ever be a significant difference. And if there have been scores or hundreds of previous elicitations, and every one has returned a predicted landfall number that is essentially identical to what monkeys would do, it would be reasonable not to expect future exceptions, under the Albert Einstein insanity definition. (I realize the previous sentence may contain an error in logic, but I expect geek points for invoking “the Albert Einstein insanity definition.” ;-) )
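Lucia’s question about the width of the model distribution can be sketched with a toy simulation (all landfall numbers below are hypothetical, not RMS’s): if the models nearly agree, any weighting at all, expert or monkey, lands in roughly the same place, so the expert result can never look different from random.

```python
import random

def weighted_mean(preds, weights):
    """Combine model predictions with a set of weights."""
    total = sum(weights)
    return sum(p * w for p, w in zip(preds, weights)) / total

def monkey_range(preds, trials=10000, seed=0):
    """Spread of combined forecasts when the weights are chosen at random."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        weights = [rng.random() for _ in preds]
        results.append(weighted_mean(preds, weights))
    return min(results), max(results)

# Hypothetical model predictions (landfalls per year)
narrow = [0.68, 0.70, 0.71, 0.72, 0.74]   # models nearly agree
wide   = [0.40, 0.55, 0.70, 0.90, 1.20]   # models disagree

lo_n, hi_n = monkey_range(narrow)
lo_w, hi_w = monkey_range(wide)
print(f"narrow models: monkey forecasts span {hi_n - lo_n:.3f}")
print(f"wide models:   monkey forecasts span {hi_w - lo_w:.3f}")
```

Because a weighted mean always lies inside the span of the model predictions, a narrow model distribution forces every possible combined forecast, however the weights are assigned, into a narrow band.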

Best wishes,
Mark

]]>
By: ¿Se puede distinguir entre un experto en huracanes y un mono? « PlazaMoyua.org http://cstpr.colorado.edu/prometheus/?p=5162&cpage=1#comment-13693 ¿Se puede distinguir entre un experto en huracanes y un mono? « PlazaMoyua.org Thu, 30 Apr 2009 20:57:23 +0000 http://sciencepolicy.colorado.edu/prometheus/?p=5162#comment-13693 [...] My doubt comes from an intensive computational calculation inspired by an old discussion that began with Inexpert Elicitation by RMS on Hurricanes, by Roger Pielke, and continued in Tony O’Hagan Responds (Not on Behalf of RMS). [...]

]]>
By: SteveF http://cstpr.colorado.edu/prometheus/?p=5162&cpage=1#comment-13688 SteveF Thu, 30 Apr 2009 18:38:45 +0000 http://sciencepolicy.colorado.edu/prometheus/?p=5162#comment-13688 39 – Roger:

Thanks for the reference. It looks to me like a self-serving hodge-podge produced by RMS staffers, which is not so surprising. Every model prediction is above the long term average, and every model prediction is above the medium term average…. There is no possibility the expert-weighted average prediction could be for anything except substantial increases in landfalls. Could a desire for profits at RMS somehow be involved in this research? I hope their customers call to complain 5 years from now about the accuracy of these predictions.

Of course, it won’t be much of a conversation if RMS cuts cost and a monkey answers the phone.

]]>
By: KevinUK http://cstpr.colorado.edu/prometheus/?p=5162&cpage=1#comment-13686 KevinUK Thu, 30 Apr 2009 18:04:29 +0000 http://sciencepolicy.colorado.edu/prometheus/?p=5162#comment-13686 37 – stevef

“I fear Roger will not be hired again as an expert by RMS. ”

Thanks to Lucia’s sterling work I can now confirm that I have been given the gig! Cheers Lucia, it’s always good to take full credit for other people’s work, as Gavin (I’m not interested in the Super Bowl) Schmidt knows.

Roger didn’t stand a chance this year as, sadly, he is an expert and not an opinionated monkey like me. Having made the business case to RMS, it became a ‘no brainer’ for them. After all, didn’t you all know there is a credit crunch on? Why pay good cash to be told what you want to hear when peanuts will do? Sorry Roger!

KevinUK

]]>
By: lucia http://cstpr.colorado.edu/prometheus/?p=5162&cpage=1#comment-13685 lucia Thu, 30 Apr 2009 18:00:41 +0000 http://sciencepolicy.colorado.edu/prometheus/?p=5162#comment-13685 If they based forecasts on elicitations and not one has ever come up with a result different from what we would get if the panel consisted of monkeys, then that is worth noticing. Only after noticing can anyone investigate why this might occur.
As I commented above, because the field of “hurricane studies” consists of a relatively small number of people and the experts are, to some extent, the originators or promoters of the models discussed in the literature, there is some circularity in the entire process.

In broader fields where you can separate those who develop models from those weighting them, elicitation would probably be very useful. But in a small field?

What if the entire field consists of 20 experts, who among them proposed the 20 models? The 20 experts might even be subdivided into “teams”, as in “William Gray’s students” vs. “The Florida Crowd” and “Unaffiliated”. Then RMS picks 7 experts out of the 20 and each just casts partisan votes. Will this look different than if 20 models exist and RMS gets to pick 7 experts out of 10,000 existing experts?

I don’t know the answer to this. But it seems to me that this sort of thing could affect the value of elicitation. In particular, it’s possible that in small fields, one will tend to see the results of forecasts based on expert elicitation about models that cannot be differentiated from those based on elicitations of monkeys.
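One way to make “cannot be differentiated from monkeys” concrete is a toy significance test: ask how often purely random weights produce a combined forecast at least as far from the models’ simple average as the expert-weighted forecast landed. A large p means the expert result is statistically indistinguishable from a monkey result. All numbers here are hypothetical, not taken from any actual RMS elicitation.

```python
import random

def weighted_mean(preds, weights):
    """Combine model predictions with a set of weights."""
    total = sum(weights)
    return sum(p * w for p, w in zip(preds, weights)) / total

def monkey_p_value(preds, expert_forecast, trials=20000, seed=1):
    """Two-sided test: fraction of random-weight ('monkey') forecasts
    at least as far from the unweighted mean as the expert forecast."""
    rng = random.Random(seed)
    center = sum(preds) / len(preds)
    obs = abs(expert_forecast - center)
    hits = sum(
        abs(weighted_mean(preds, [rng.random() for _ in preds]) - center) >= obs
        for _ in range(trials)
    )
    return hits / trials

# Hypothetical model predictions and an expert forecast near their average
preds = [0.40, 0.55, 0.70, 0.90, 1.20]
p = monkey_p_value(preds, expert_forecast=0.78)
# a large p: random weights frequently land at least this far out,
# so the expert weighting cannot be told apart from monkey weighting
print(f"p = {p:.2f}")
```

Note this only tests the single combined number; as Tony O’Hagan argued, the full set of weights the experts assigned could still be distinguishable from random even when the final forecast is not.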

Mark Bahner–
I agree that Tony has shown the weights assigned by the experts could not have been cast by 7 monkeys.

]]>
By: Roger Pielke, Jr. http://cstpr.colorado.edu/prometheus/?p=5162&cpage=1#comment-13683 Roger Pielke, Jr. Thu, 30 Apr 2009 17:45:40 +0000 http://sciencepolicy.colorado.edu/prometheus/?p=5162#comment-13683 -37-SteveF

The models used in 2006 and 2007 (a subset of those used in 2008) can be seen here:

http://www.rms-research.com/references/Jewson.pdf

No, I don’t imagine I’ll be invited back ;-)

]]>
By: Roger Pielke, Jr. http://cstpr.colorado.edu/prometheus/?p=5162&cpage=1#comment-13682 Roger Pielke, Jr. Thu, 30 Apr 2009 17:36:54 +0000 http://sciencepolicy.colorado.edu/prometheus/?p=5162#comment-13682 -Mark-

Prof O’Hagan is correct about many things, among them:

1. The experts produce a PDF entirely distinguishable from that of monkeys

2. In theory the process would allow a result from the experts different from that of monkeys

However, we disagree about a few things:

3. He writes: “Even if 9 times out of 10 the results were to be indistinguishable from a simulated monkey exercise, I’d still want to use the experts for that 1 time in 10 when they give a noticeably different prediction.”

The problem is that every time the elicitation has been done in this manner it has resulted in average landfall rates indistinguishable from monkeys. If the elicitation failed to distinguish results from monkeys (i.e., a 90% chance of being no different from random), yes, I think that would be problematic. So too, I think, would many people who rely on RMS.

4. He writes: “[in Roger's] opinion the only thing that matters is the final predicted landfalling rate”

Well no. I think a lot more matters, but as far as I am aware RMS uses only the final predicted landfall rate as input to its model. So while there are many academic sorts of things that could be said about the (potential) value of the elicitation, my focus has been narrowly on its practical application by RMS.

5. He writes: “To argue as Roger does, that the experts are just like monkeys because on one occasion they came up with a very similar final prediction, is just perverse.”

This is not what I’ve argued. (Just look at me and you can see I am no monkey ;-) What I have argued is that the landfall rate produced by the elicitation of experts has been in every instance (not “one”) indistinguishable (not “similar”) from a landfall rate provided by monkeys. Perhaps there will be exceptions to this in the future.

Does 3, 4, 5 above make me raise an eyebrow about the process? You bet.

Thanks all for the exchange.

]]>
By: SteveF http://cstpr.colorado.edu/prometheus/?p=5162&cpage=1#comment-13681 SteveF Thu, 30 Apr 2009 17:34:47 +0000 http://sciencepolicy.colorado.edu/prometheus/?p=5162#comment-13681 An interesting exchange. The discussion of (statistical) types of monkeys is also quite entertaining.

I am sorry that Tony O’Hagan is gone, since his contribution was really positive and informative once he got over his (apparent) anger at Roger’s comments; I fear Roger will not be hired again as an expert by RMS. Certainly Tony is correct that a panel of experts is better able to evaluate the technical merits of the models used to predict the number of hurricane landfalls than an equal number of monkeys, even Lucia’s highly opinionated monkeys. Still, I can understand Lucia’s approach, since the majority of experts I have met seem to have very little skill and very strong opinions.

But I think this all really misses the point: do the MODELS, however originally selected by RMS and no matter how they were weighted by experts, really have any significant skill at predicting hurricane landfalls?

Perhaps if Tony (or RMS) could release a list of the models’ predicted landfall rates, without revealing the identities of the models or the experts’ weightings of those models, we could at least see if the range of possible predictions is reasonable. Assuming that all of the models are technically serious (i.e., not based on how the New York Mets play between April 1 and June 1), then does the range of predicted landfalls include the known average landfall rate for the last 50 years and the average number of landfalls for 5 year periods over the last 50 years? If most or all the predictions are above or below the historical rates, then it seems quite possible, if not likely, that the expert-weighted combination of the models will overstate or understate the risk of future landfalls, since the historical landfall rate over 50 years does not appear to have any statistically significant trend.
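SteveF’s proposed sanity check, whether the models’ predictions even bracket the historical landfall rate, is simple to state. The numbers below are placeholders, not the actual RMS models or averages; the point is only that a weighted mean can never escape the span of its inputs.

```python
def brackets_history(model_preds, historical_rate):
    """True if the historical rate lies within the span of model predictions."""
    return min(model_preds) <= historical_rate <= max(model_preds)

# Placeholder numbers: five model landfall rates and a long-term average
models = [0.68, 0.71, 0.74, 0.78, 0.83]   # every model above the historical rate
historical = 0.63                          # hypothetical 50-year average

if not brackets_history(models, historical):
    # Any weighted combination of the models must also exceed the
    # historical rate, no matter how the experts assign weights.
    print("All models sit on one side of history: garbage in, garbage out.")
```

If every candidate model predicts above the historical rate, the expert weighting stage cannot possibly produce a forecast at or below it, which is SteveF’s worry in a single line of arithmetic.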

After all, garbage in, garbage out.

]]>
By: PaulM http://cstpr.colorado.edu/prometheus/?p=5162&cpage=1#comment-13680 PaulM Thu, 30 Apr 2009 17:09:43 +0000 http://sciencepolicy.colorado.edu/prometheus/?p=5162#comment-13680 Who were these 7 experts?
And how were they chosen?
Did they include Christopher Landsea, the hurricane expert with dozens of papers on hurricanes over 20 years, who resigned from the IPCC saying
“All previous and current research in the area of hurricane variability has shown no reliable, long-term trend up in the frequency or intensity of tropical cyclones, either in the Atlantic or any other basin. … It is beyond me why my colleagues would utilize the media to push an unsupported agenda that recent hurricane activity has been due to global warming” ?

]]>