NavList:
A Community Devoted to the Preservation and Practice of Celestial Navigation and Other Methods of Traditional Wayfinding
Re: Rejecting outliers
From: Peter Hakel
Date: 2011 Jan 8, 05:56 -0800
From: George Huxtable <george@hux.me.uk>
To: NavList@fer3.com
Sent: Sat, January 8, 2011 3:47:48 AM
Subject: [NavList] Re: Rejecting outliers
[parts deleted by PH]
My "well-known skepticism" remains undimmed, I hope. I was most sceptical
about the use, in Peter's computer analysis, of a slope of 24' (over 5
minutes of time) deduced from a short sequence of highly-scattered data.
Indeed, there was so much scatter in it that a slope of 32' was,
statistically speaking, perfectly compatible with that set, if a somewhat
worse fit than the 24', deduced and employed in Peter Hakel's analysis. Is
Peter prepared to defend his value against the other?
Response from PH: You provided the best defense when you wrote:
"So let's have a look at the data that Peter Hakel's estimate was based on,
in the attachment. I would agree that the better fit, with no other
information to go on than those plotted points, would be the continuous line
at a slope of 24. Peter's program says so, an Excel fit says so, and my eye
says so."
The subject of this thread is "Rejecting outliers", which is a good reminder of why I entered this debate in the first place. You expressed discomfort with "common-sense" elimination of outliers by visual inspection on a plot with a predetermined slope. I recalled that weighted least squares can attach a numerical value (the weight) to each data point, expressing the degree to which that point is rejected, and I became curious about what it could do for us here. The "free Scatter parameter" is not really free, since its choice can be guided by the value of the normalized chi^2. Although my implementation does have a heuristic element in it (Eq2!), it does provide a justifiable method of detecting and rejecting outliers without having to plot the data.
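For readers who want to experiment, the general idea of weighted least squares with numerical down-weighting of outliers can be sketched roughly as follows. This is only an illustration under stated assumptions: the weight rule 1/(1 + (residual/scatter)^2) is a common robust choice picked here for the example and is not claimed to be the Eq2 referred to above; the function names and the data are made up.

```python
def weighted_line_fit(t, h, w):
    """Weighted least-squares fit of h = a + b*t; returns (a, b)."""
    sw = sum(w)
    tm = sum(wi * ti for wi, ti in zip(w, t)) / sw
    hm = sum(wi * hi for wi, hi in zip(w, h)) / sw
    stt = sum(wi * (ti - tm) ** 2 for wi, ti in zip(w, t))
    sth = sum(wi * (ti - tm) * (hi - hm) for wi, ti, hi in zip(w, t, h))
    b = sth / stt
    return hm - b * tm, b


def robust_line_fit(t, h, scatter, iterations=20):
    """Refit repeatedly, down-weighting points with large residuals.

    Returns (a, b, weights, normalized_chi2). Needs at least 3 points.
    """
    w = [1.0] * len(t)  # start from an ordinary (unweighted) fit
    for _ in range(iterations):
        a, b = weighted_line_fit(t, h, w)
        resid = [hi - (a + b * ti) for ti, hi in zip(t, h)]
        # ASSUMPTION: hypothetical weight rule, not the thread's Eq2.
        w = [1.0 / (1.0 + (r / scatter) ** 2) for r in resid]
    # Normalized chi^2: weighted squared residuals per degree of freedom.
    chi2 = sum(wi * (r / scatter) ** 2 for wi, r in zip(w, resid)) / (len(t) - 2)
    return a, b, w, chi2


# Synthetic data (invented for illustration): time in minutes, altitude in
# arc-minutes, true slope 6'/min, one deliberate 20' blunder at index 3.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
alts = [100.0 + 6.0 * ti for ti in times]
alts[3] += 20.0

a, b, weights, chi2 = robust_line_fit(times, alts, scatter=1.0)
print("slope:", round(b, 3), "weights:", [round(wi, 3) for wi in weights])
```

In a sketch like this, a weight near 1 means the point is effectively kept and a weight near 0 means it is effectively rejected, so no plot is needed to see which sight was discarded. Trying several values of `scatter` and watching the normalized chi^2 is one way such a parameter could be guided rather than chosen freely.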
You continued:
"I pointed to the details of the observation that was made, because it tells
us (unless there was some major error as yet undisclosed, which is always
conceivable) that indeed 32' was the known, correct, slope, and the slope
of 24' which Peter Hakel derived from that data and used in his analysis
was simply way-out from that truth. Peter Fogg's data provides a good
example of a case where precalculating slope can be useful.
Under other conditions, a different situation might well arise, in which a
very uncertain DR generates greater uncertainties in precalculating a slope
than does the observed trend of altitudes, and perhaps Peter Hakel and I
might agree that one should choose the most appropriate method depending on
the circumstances."
Yes, I agree.
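As a rough illustration of the two approaches discussed above, the sketch below fits a line with the slope left free (reporting a standard error for the fitted slope) and, alternatively, fits only the intercept with a precalculated slope held fixed. The function names, the two-sigma compatibility test, and the numbers are assumptions made for this example; they are not taken from the data in the thread.

```python
import math


def fit_free_slope(t, h):
    """Ordinary least-squares line h = a + b*t.

    Returns (a, b, se_b), where se_b is the standard error of the slope.
    Needs at least 3 points.
    """
    n = len(t)
    tm = sum(t) / n
    hm = sum(h) / n
    stt = sum((ti - tm) ** 2 for ti in t)
    b = sum((ti - tm) * (hi - hm) for ti, hi in zip(t, h)) / stt
    a = hm - b * tm
    ss = sum((hi - (a + b * ti)) ** 2 for ti, hi in zip(t, h))
    se_b = math.sqrt(ss / (n - 2)) / math.sqrt(stt)
    return a, b, se_b


def fit_fixed_slope(t, h, slope):
    """Least-squares intercept when the slope is precalculated and held fixed."""
    return sum(hi - slope * ti for ti, hi in zip(t, h)) / len(t)


def slopes_compatible(b_fit, se_b, b_pre, n_sigma=2.0):
    """ASSUMPTION: call the slopes compatible if they differ by < n_sigma * se_b."""
    return abs(b_fit - b_pre) < n_sigma * se_b


# Made-up scattered data, for illustration only.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
h = [0.0, 1.5, 1.0, 3.5, 4.0]

a_free, b_free, se_b = fit_free_slope(t, h)
a_fixed = fit_fixed_slope(t, h, slope=1.3)
print(b_free, se_b, a_fixed, slopes_compatible(b_free, se_b, 1.3))
```

With few, scattered points the standard error of the fitted slope is large, so a precalculated slope can be statistically compatible with the data even when it is not the best fit; which slope to trust then depends on how reliable the DR behind the precalculation is.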
Peter Hakel