Statistics Part 2: Combining Uncertainties
Background
In the last set of notes, I showed the most thorough way to estimate uncertainties: do a
full end-to-end simulation of your observation. This is the most accurate method, but can
also be very time-consuming.
In this set of notes, I show some shortcuts, which in many situations will save you a
lot of time, and allow you to make quick estimates. But be warned – these shortcuts can
be dangerous.
Assumptions
The assumptions we have to make to use these shortcuts are:
• All sources of noise follow the Gaussian distribution. If you have outlying points and don't get rid of them effectively, the shortcuts discussed here will not work. Poisson distributions will also give you trouble, unless µ is large (> 10), in which case the Poisson distribution looks a lot like a Gaussian!
• When you apply an operation (such as multiplication or taking the log) to a variable with Gaussian noise, the resultant number is assumed also to have Gaussian noise. This can give you grief, as it is not always true! A classic example is converting fluxes into magnitudes, which involves taking a log. This works fine if the noise σ is much less than the mean flux. But if they are comparable, you can get near-zero fluxes, or even negative ones. Taking the log of these will give you near-infinite magnitudes! (See the sketch below.)
If we make these assumptions, we can approximate the noise on any variable as a Gaussian, which is fully specified by its standard deviation σ. This is much simpler than needing the full probability density function for each variable.
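To see the magnitude pitfall concretely, here is a minimal sketch (my own illustration, not part of the original notes) that pushes Gaussian flux noise through m = -2.5 log10(flux). With σ much smaller than the mean flux the magnitudes are well behaved; with σ comparable to the mean, some fluxes go negative and the magnitude scatter blows up:

```python
import numpy as np

rng = np.random.default_rng(42)

def mag_scatter(mean_flux, sigma_flux, n=100_000):
    """Push Gaussian flux noise through m = -2.5*log10(flux)."""
    flux = rng.normal(mean_flux, sigma_flux, n)
    flux = flux[flux > 0]                 # negative fluxes have no magnitude!
    mags = -2.5 * np.log10(flux)
    return mags.std(), len(flux) / n      # magnitude scatter, usable fraction

# sigma << mean flux: the magnitudes stay nearly Gaussian
print(mag_scatter(1000.0, 10.0))
# sigma ~ mean flux: badly skewed, and some draws are lost entirely
print(mag_scatter(10.0, 8.0))
```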
The Procedure
1. Choose the statistic x you wish to compute – the one that tells you what you are
scientifically interested in. It will in general be a function of the quantities u, v, w
that you observe.
2. Work out what the uncertainty is in each of the observed quantities (σu, σv…)
3. Use the error propagation equation to work out what the uncertainty in your
statistic is (σx).
4. Use your understanding of the Gaussian distribution to work out whether this
predicted uncertainty is acceptable.
Why do you want to do this?
Here are some typical astronomical situations in which you’d want to go through this
procedure:
• Writing a telescope proposal. You need to demonstrate what telescope you need to solve your scientific problem, and how much exposure time you require.
• Processing data: you need to understand how your data reduction algorithms affect the final data, so that you can optimise them.
• Testing new models: you may have come up with a new theoretical model. This procedure will tell you whether your model is or is not consistent with existing data.

Estimating your noise
Your first step is to estimate what the uncertainty is in each of the things you measure. As we discussed in the last set of notes, two major sources of noise are:

• The random arrival of photons.
• Electronic noise.

The noise caused by the random arrival of photons can be computed using the Poisson distribution. Remember – the standard deviation of a Poisson distribution of mean µ is √µ. The amount of electronic noise depends on the details of the electronics – you will normally look this up in a manual.

You can (and should) check the amount of predicted noise. The usual way to do this is to observe nothing repeatedly (e.g. a blank bit of sky) and calculate the standard deviation of all these measurements – this is an estimate of the noise, and should agree with what you compute. If it doesn't, work hard to understand why!
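As a concrete illustration of this check, here is a minimal sketch (my own, borrowing the hypothetical numbers used in Example 1 below: 1000 counts of sky per pixel and 5.3 counts rms read-out noise) that simulates repeated blank-sky measurements and compares the measured scatter with the predicted one:

```python
import numpy as np

rng = np.random.default_rng(1)

mean_counts, read_noise = 1000.0, 5.3     # sky level and rms read-out noise
n_measurements = 10_000

# "Observe nothing" repeatedly: Poisson photon noise plus Gaussian read noise
blank_sky = (rng.poisson(mean_counts, n_measurements)
             + rng.normal(0.0, read_noise, n_measurements))

predicted = np.hypot(np.sqrt(mean_counts), read_noise)   # quadrature sum
print(f"measured noise:  {blank_sky.std():.2f} counts")
print(f"predicted noise: {predicted:.2f} counts")         # these should agree
```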

Combining noise
Let us say that there is some statistic x we wish to compute. In most cases we do not directly measure x; instead we measure a number of other parameters u, v, …, and work out x using a function x = f(u, v, …). For example, x might be the absolute magnitude of a star, which is a function of u, the measured apparent magnitude of the star, and v, the distance to the star. Let us imagine that our measurement of each of u, v, … is afflicted by noise, and that the noise is Gaussian with standard deviations σu, σv, …. It is fairly straightforward then to deduce the error propagation equation:

$$\sigma_x^2 = \sigma_u^2\left(\frac{\partial x}{\partial u}\right)^2 + \sigma_v^2\left(\frac{\partial x}{\partial v}\right)^2 + \cdots + 2\sigma_{uv}^2\left(\frac{\partial x}{\partial u}\right)\left(\frac{\partial x}{\partial v}\right) + \cdots$$

where σx is the standard deviation of the (assumed Gaussian) noise in the final statistic x. Note that the square of the standard deviation is called the variance. This is basically a Taylor series expansion, modified by the fact that standard deviations are computed by squaring values and adding them together. The first two terms are straightforward: the noise in x depends on how much noise each of u, v, … have, and how strongly the value of x depends on that of u, v, … (the partial derivatives). The last term may need more explanation. σuv is the covariance of u and v, and is defined by the equation:

$$\sigma_{uv}^2 = \lim_{N \to \infty} \frac{1}{N}\sum_{i=1}^{N}\left[(u_i - \bar{u})(v_i - \bar{v})\right],$$

where $\bar{u}$ denotes the mean value of u.

If u, v, … are uncorrelated, then all the covariance terms are zero. If, for example, x is the absolute magnitude of a star, u its apparent magnitude and v its distance, then there is no particular reason why the noise in a given measurement of the distance should correlate with the noise in a given measurement of the apparent brightness. In a case like this, the covariance will sum to zero.
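To make this concrete, here is a minimal sketch (my own, with made-up numbers) for the absolute-magnitude example, using M = m - 5 log10(d/10) with d in parsecs. It evaluates the two partial-derivative terms of the error propagation equation (the covariance term is zero, as the measurements are independent) and checks the result against a brute-force Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical star: apparent magnitude u = 15.0 +/- 0.05, distance v = 100 +/- 5 pc
u, sigma_u = 15.0, 0.05
v, sigma_v = 100.0, 5.0

def abs_mag(m, d):
    """Absolute magnitude from apparent magnitude m and distance d (parsecs)."""
    return m - 5.0 * np.log10(d / 10.0)

# Error propagation equation with the covariance term set to zero
dx_du = 1.0                            # partial derivative of x w.r.t. u
dx_dv = -5.0 / (v * np.log(10.0))      # partial derivative of x w.r.t. v
sigma_x = np.sqrt((sigma_u * dx_du)**2 + (sigma_v * dx_dv)**2)

# Monte Carlo check: draw noisy u and v, look at the scatter of x
x = abs_mag(rng.normal(u, sigma_u, 100_000), rng.normal(v, sigma_v, 100_000))
print(f"propagated: {sigma_x:.4f}   simulated: {x.std():.4f}")  # should agree
```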

…2.512^(30−26) = 39 times weaker. So to detect these objects, we will have to decrease our noise by a factor of 39. Using the √n rule, this means that instead of a single 1 hour exposure, we would need to take 39² exposures – i.e. 1521 exposures, i.e. 6 months' worth of clear nights!

But would the result really be an image reaching 30th magnitude? Almost certainly not, because there are two types of noise – the random noise (which obeys the √n rule) and systematic errors, such as confusion noise, scattered light, flat fielding errors, weak cosmic rays, unstable biases etc. The systematic errors are relatively small, but because they are systematic, they can be present in the same way in every image (rather than being uncorrelated) and hence they do not obey the √n rule and diminish. Eventually they will dominate, and then taking more exposures won't help you.

Back to uncertainty addition: remember that the uncertainty in a statistic is calculated from the sum of the squares of the individual uncertainties. This has an important consequence – if there are several sources of noise, the biggest source of uncertainty is much more important than the rest. Suppose, for example, you are trying to work out the age of an elliptical galaxy, based on the strength of some absorption lines in its spectrum, which you compare to a theoretical model. The model depends on the metallicity, which you do not know very accurately. Let us say that the uncertain metallicity introduces a 10% uncertainty in the age, while the noise on your measurement of the absorption lines introduces a 7% uncertainty. The total uncertainty is thus √(7² + 10²) = √149 = 12.2% (not 17%).

Now let's say you want to improve this measurement, to discriminate between two theories of elliptical galaxy formation. You could improve your measurement of the metallicity, which would drop this uncertainty by 4% (from 10% to 6%), or you could get a better measurement of the absorption lines, which would drop this uncertainty by 4% (from 7% to 3%). Which should you do? Working on the metallicity drops your final uncertainty from 12.2% to √(7² + 6²) = 9.2%. Working on the absorption lines, however, only drops it to √(3² + 10²) = 10.4%. So the motto is – always find out what your biggest source of uncertainty is and work hard on reducing that: the gains from reducing the small contributors to the uncertainty are relatively tiny.
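In code, the quadrature bookkeeping for this example is a one-liner; a quick sketch (illustrative only):

```python
import numpy as np

def combined(*fractional_errors):
    """Quadrature sum of independent fractional uncertainties."""
    return np.sqrt(sum(e**2 for e in fractional_errors))

print(combined(0.07, 0.10))   # 0.122: the total is 12.2%, not 17%
print(combined(0.07, 0.06))   # metallicity improved 10% -> 6%: total 9.2%
print(combined(0.03, 0.10))   # absorption lines improved 7% -> 3%: total 10.4%
```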

Optimal Weighting
Let us imagine that we have lots of measurements xi of a particular statistic, each with an associated uncertainty σi.

For example, we might have dozens of different measurements of the Hubble Constant made by different research groups, or we might have taken lots of images of a given part of the sky. We want to deduce the best possible value of x, i.e. the value with the smallest uncertainty, by combining all these different estimates.

You might think that the obvious thing to do is to average (take the mean of) all the xi values. And you would be right, provided that all the different measurements had the same uncertainties. But what if some of the measurements were better than the others? Surely you'd want to give them more prominence? Averaging bad data with good won't produce very pretty results. One approach would be to just take the best measurement and throw the rest away. But surely there is some information in all the worse measurements – they may have larger errors, but if the uncertainties are not THAT much larger, they should still be able to contribute.

The answer is to do a weighted average. So instead of the standard mean:

$$\bar{x} = \frac{1}{n}\sum_{i} x_i,$$

we compute a weighted mean:

$$\bar{x} = \sum_{i} w_i x_i,$$

where wi is the weighting of data point number i. To be a true average, we require that

$$\sum_i w_i = 1.$$

The better data points should be given larger weights, and the worse ones smaller weights. But how exactly should these weights be chosen? This is a fairly straightforward calculation using the error addition equation: it turns out that the optimum strategy is inverse variance weighting – i.e. the weight applied to a given point should be inversely proportional to its variance. The variance, remember, is the standard deviation squared. So the optimum weights are given by

$$w_i \propto \frac{1}{\sigma_i^2},$$

with the constant of proportionality set by the requirement that the sum of the weights be 1.

This is very widely used in astronomy, and can make a big difference. But, as usual, be careful. This assumes Gaussian independent errors. While, in principle, adding crap data to good, albeit with a low weight, will improve your result, it is often worth drawing the line at the most crap data – you are probably adding in some low-level systematic errors which will more than outweigh the benefits.
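A minimal sketch of inverse variance weighting (with made-up Hubble Constant numbers; the expression for the uncertainty of the weighted mean follows from applying the error propagation equation to the weighted sum):

```python
import numpy as np

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its uncertainty.

    Assumes independent Gaussian errors on each measurement.
    """
    values, sigmas = np.asarray(values), np.asarray(sigmas)
    weights = 1.0 / sigmas**2
    weights = weights / weights.sum()       # normalise: weights sum to 1
    mean = np.sum(weights * values)
    sigma_mean = 1.0 / np.sqrt(np.sum(1.0 / sigmas**2))
    return mean, sigma_mean

# Hypothetical Hubble Constant measurements (km/s/Mpc) of varying quality
print(weighted_mean([68.0, 74.0, 70.5], [1.0, 2.0, 4.0]))
# The answer sits close to the most precise measurement, but every point helps.
```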

Multiplying Uncertainties
If x is the product of u and v, scaled by some constant a,

$$x = auv,$$

then using the error propagation equation, we find that

$$\left(\frac{\sigma_x}{x}\right)^2 = \left(\frac{\sigma_u}{u}\right)^2 + \left(\frac{\sigma_v}{v}\right)^2 + \frac{2\sigma_{uv}^2}{uv}.$$

As usual, the last term disappears if the uncertainties in u and v are independent, in

which case the equation simply says that the fractional (percentage) error in x is equal to the quadrature sum of the fractional errors in u and v. You can derive similar equations for different functions – e.g. when your statistic is the log, exponential or sine of the measured quantities: see a stats textbook or derive them yourself from the error propagation equation. But in practice, adding and multiplying are the most useful equations.
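Here is a minimal sketch (my own, with arbitrary numbers) checking the multiplication rule against a Monte Carlo simulation for independent u and v:

```python
import numpy as np

rng = np.random.default_rng(7)

a, u, v = 2.0, 50.0, 8.0           # hypothetical x = a*u*v
sigma_u, sigma_v = 2.0, 0.4        # 4% and 5% fractional errors

# Quadrature sum of fractional errors (covariance term zero)
frac = np.hypot(sigma_u / u, sigma_v / v)
print(f"predicted fractional error: {frac:.4f}")

# Monte Carlo check
x = a * rng.normal(u, sigma_u, 100_000) * rng.normal(v, sigma_v, 100_000)
print(f"simulated fractional error: {x.std() / x.mean():.4f}")
```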

What does your final uncertainty mean?
OK – let's say you've done all this maths, and computed the uncertainty in your statistic. What does this mean?

This all depends on what you wanted the statistic for in the first place. As always, you have to be clear about this. Let's look at a couple of examples:

Standard deviations from the mean   Probability within this range   Probability of a larger deviation
4.0                                 99.9937%                        6×10⁻⁵
4.5                                 99.9993%                        7×10⁻⁶
5.0                                 99.99994%                       6×10⁻⁷

How do you use these values? Some examples will hopefully help.
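These tail probabilities can be reproduced from the Gaussian complementary error function; a quick sketch:

```python
from math import erfc, sqrt

def two_sided_tail(n_sigma):
    """Probability that a Gaussian value lies more than n_sigma
    standard deviations from the mean, on either side."""
    return erfc(n_sigma / sqrt(2.0))

for s in (4.0, 4.5, 5.0):
    print(f"{s} sigma: {two_sided_tail(s):.1e}")
# 4.0 sigma: 6.3e-05, 4.5 sigma: 6.8e-06, 5.0 sigma: 5.7e-07
```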

Example 1: an image
Imagine that you want to take an image of some part of the sky. You are taking it with a CCD camera, which has a known rms read-out noise of 5.3 counts (rms = root mean squared deviation = standard deviation). It has 1024x1024 pixels. Based on your knowledge of the sky brightness, you expect about 10 counts (detected photons) per pixel per second, and you plan to expose for 100 sec.

Thus, in the absence of any objects, you would expect to get 1000 counts per pixel. As the arrival of photons is a random, quantum mechanical process, you won't get exactly 1000 in each pixel. Instead, you will get a Poisson distribution of mean 1000. The standard deviation of a Poisson distribution is the square root of the mean counts – i.e. 31.6. The Poisson distribution looks more and more like a Gaussian as the mean increases – for a mean of 1000 this is a pretty good approximation.

So if you have a bunch of pixels containing no objects, the mean number of counts will be 1000, and the standard deviation will be the quadrature sum of 31.6 (the photon Poisson noise) and 5.3 (the read-out noise) – i.e. 32.06 counts. Thus the read-out noise, being much smaller than the photon noise, is essentially irrelevant (a common situation in broad-band optical/IR imaging).

For simplicity's sake, let us assume that the pixels are pretty big, so that all the light from any object will fall in a single pixel. To find all the objects in our image, we must set a threshold value t. Any pixel that exceeds t will be listed as a detected object. What value of t should we choose?

We now need to think about what we are trying to achieve scientifically. Normally our scientific goal is to avoid abject humiliation when we publish a paper based on this image. To do this, we need to make sure that when we list all the objects we saw in our image, the objects in this list are real, and not just empty pixels where the noise has resulted in an unusually high value, so that if someone else tries to get follow-up observations of one of our objects, they will find that it is really there, and they haven't wasted their time looking at spurious detections. This means that we should set our threshold t high enough that we get few if any detections in a blank bit of sky. On the other hand, we want t to be pretty small, or we will be throwing away faint objects.

So what value of t to use? This depends on how sure you want to be that there are no spurious detections. Let's say you want to be really sure that every object you claim to have detected is really there. In this case, you want the expected number of spurious detections to be much less than 1. You have 1024x1024 ~ 1 million pixels, so you want the probability of a pixel which contains only empty sky having a measured value greater than t to be less than one in a million. Looking at the above table, you see that 7×10⁻⁶ of the time you get results more than 4.5 standard deviations out, while for 5.0 standard deviations, this probability drops to 6×10⁻⁷. These are the probabilities of getting a value either this much higher than, or this much lower than, the mean. A pixel that is anomalously low will not generate a spurious detection – it's only the high ones that are an issue, so we can divide these probabilities in half. So to have an expected number of spurious detections less than one, you'd need to set the threshold to be five standard deviations above the mean: 4.5 standard deviations isn't quite good enough. Thus your threshold should be 1000.0 + 5.0x32.06 = 1160.3. This is called "setting a five standard deviation threshold", or a "five sigma threshold".

Alternatively, you might be prepared to tolerate a few spurious points. For example, you might be trying to work out how many galaxies there are down to some magnitude limit. You expect to find 7000 galaxies in this image, so it wouldn't really matter if, say, 100 were spurious. You could work out the expected mean number of spurious galaxies and subtract it from the number you actually detected – this won't be perfectly accurate, but an error in the 100 (typically the square root of 100, i.e. 10) won't make that much difference to the 7000 galaxy counts. In this case, you need the expected number of spurious detections to be 100 – i.e. the probability of any one of the million pixels having a value above the threshold must be 10⁻⁴. Looking at the table, this occurs at about 4.0 standard deviations above the mean, so our detection threshold need only be 1000.0 + 4.0x32.06 = 1128.24. We can thus detect galaxies that are 20% (0.2 mag) fainter. A four sigma detection threshold.

If we desperately wanted to detect these 20% fainter galaxies, but were more fastidious and hence unwilling to tolerate that any of our detections might be spurious, we would need to increase our exposure time. The signal from a given galaxy is proportional to the exposure time, while the noise (being dominated by the Poisson photon noise) is proportional to the square root of the exposure time. Thus the signal-to-noise ratio of a given galaxy (the number of standard deviations it is brighter than the mean) is proportional to the square root of the exposure time. Thus to detect galaxies 20% fainter with five sigma confidence, we'd need to increase the exposure time by a factor of 1.2² = 1.44.
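A short sketch (my own) connecting the table to these thresholds: it computes the expected number of spurious detections among the ~10⁶ blank pixels for a few choices of threshold, keeping only the high tail of the Gaussian:

```python
import numpy as np
from math import erfc, sqrt

n_pixels = 1024 * 1024
mean = 1000.0
noise = np.hypot(np.sqrt(mean), 5.3)    # 32.06 counts, as derived above

def expected_spurious(n_sigma):
    """Expected blank pixels above mean + n_sigma*noise.

    Only anomalously high pixels matter, hence the one-sided erfc/2."""
    return n_pixels * 0.5 * erfc(n_sigma / sqrt(2.0))

for s in (4.0, 4.5, 5.0):
    threshold = mean + s * noise
    print(f"{s} sigma (threshold {threshold:.1f} counts): "
          f"{expected_spurious(s):.1f} spurious detections expected")
```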

Example 2: Testing a theory
Let us imagine that a particular theorist (Prof Zog) has run a supercomputer simulation of the formation of Milky-Way-like galaxies, using cold dark matter (CDM). He finds that such galaxies should have a mean of 630 dwarf galaxies in orbit around them. However, the exact number depends on the merging history of the galaxy: he ran his simulations 100 times and found numbers of dwarf galaxies ranging from 112 up to as high as 954. He found a standard deviation of 243 dwarf galaxies around the mean of 630.

You have just completed a survey of 10 nearby Milky-Way-like galaxies, counting the number of dwarf galaxies. You found the following numbers: 523, 12, 144, 15, 3, 44, 320, 2, 0, 97.

Your data have an average (mean) of 116, and a standard deviation of 165. Is this consistent with Prof Zog's simulation? What you want to know is, can you publish a paper saying that Prof Zog's simulation is inconsistent with the data? Or will all the CDM mafia pillory you if you do this?
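The pages working through this comparison are not in this preview, so the following is only a sketch of one plausible way to set it up (my reconstruction, not necessarily the notes' exact method): compare the two means, with the scatter of each mean of 10 galaxies reduced by √10, and combine the two scatters in quadrature. It reproduces the 5.5 sigma figure quoted below:

```python
import numpy as np

counts = np.array([523, 12, 144, 15, 3, 44, 320, 2, 0, 97])  # the survey
sim_mean, sim_sd = 630.0, 243.0    # Prof Zog's simulated mean and scatter

obs_mean = counts.mean()           # 116
obs_sd = counts.std()              # ~165

# Uncertainty of each mean of 10 galaxies, combined in quadrature
sigma = np.hypot(sim_sd / np.sqrt(10), obs_sd / np.sqrt(10))
print(f"discrepancy: {(sim_mean - obs_mean) / sigma:.1f} sigma")   # ~5.5
```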

…in a million observations should be thus discrepant. Thus if all the astronomy papers ever published used 5.5 sigma results, on average none of them would be in error. So this is pretty conclusive: Zog's theory is cactus.

Conclusions
The material covered in these notes is very widely used. People often talk colloquially about "that's only a 2 sigma result – I don't believe it", or "This catalogue is split into two parts, the 5 sigma detections and the 4-5 sigma ones which should be used with caution".

This approach is an approximation, only valid if everything is nice and Gaussian and uncorrelated, which is seldom perfectly the case in the real world. So be careful! Look at histograms of your data to see if they really are Gaussian. Err on the side of caution in choosing at what "sigma" value to publish. And if things are really crucial, do a full end-to-end Monte-Carlo simulation as discussed in the previous set of notes. But for many applications, this quick and dirty approach is perfectly adequate, and much quicker than the alternative!