






Introductory view of geostatistics
Typology: Study notes
Data need not be inherently numeric to be useful in an analysis. For instance, the categories male and female are commonly used in both science and everyday life to classify people, and there is nothing inherently numeric about these two categories. Similarly, we often speak of the colors of objects in broad classes such as red and blue, and there is nothing inherently numeric about these categories either. (Although you could make an argument about different wavelengths of light, it’s not necessary to have this knowledge to classify objects by color.)
The level of detail used in a system of classification should be appropriate, based on the reasons for making the classification and the uses to which the information will be put.
A discrete variable can take on only certain values, jumping from one value to the next. A continuous variable, by contrast, has an infinite number of possible values: any value within its range is possible.
Qualitative vs. Quantitative Variables: Variables can be classified as qualitative (aka, categorical) or quantitative (aka, numeric).
■ Qualitative. Qualitative variables take on values that are names or labels. The color of a ball (e.g., red, green, blue) or the breed of a dog (e.g., collie, shepherd, terrier) would be examples of qualitative or categorical variables.
■ Quantitative. Quantitative variables are numeric. They represent a measurable quantity. For example, when we speak of the population of a city, we are talking about the number of people in the city - a measurable attribute of the city. Therefore, a population would be a quantitative variable.
In algebraic equations, quantitative variables are represented by symbols (e.g., x, y, or z).
Discrete vs. Continuous Variables: Quantitative variables can be further classified as discrete or continuous. If a variable can take on any value between its minimum value and its maximum value, it is called a continuous variable; otherwise, it is called a discrete variable.
Some examples will clarify the difference between discrete and continuous variables. Continuous variables include measurements such as height, weight, temperature, and time, each of which can take any value within its range. Discrete variables include counts, such as the number of students in a class.
Univariate vs. Bivariate Data: Statistical data are often classified according to the number of variables being studied.
Measurement is the process of systematically assigning numbers to objects and their properties to facilitate the use of mathematics in studying and describing objects and their relationships.
The term level of measurement can refer to the scale on which a variable is recorded. The levels of measurement differ both in terms of the meaning of the numbers used in the measurement system and in the types of statistical procedures that can be applied appropriately to data measured at each level.
Properties of Measurement Scales: Each scale of measurement satisfies one or more of the following properties of measurement.
The nominal scale is the lowest level of measurement. Values assigned to variables represent a descriptive category but have no inherent numerical value with respect to magnitude. Gender is an example of a variable that is measured on a nominal scale.
Weights can be zero, but they cannot be negative. Ratio scales have all of the characteristics of the nominal, ordinal, and interval scales; in addition, however, ratio scales have a true zero. This is the kind of scale you used when you learned arithmetic in grade school: you assumed that the numbers had meaning, that they had a rank order (3 is larger than 2), that the intervals between consecutive numbers were equal, and that there was a zero. Four was twice two; eight was half of sixteen, and so on. These are true ratios. One can use all mathematical operations on this scale.
score as consisting of two parts: true score (T) and error (E). This is expressed by the following formula: X = T + E where X is the observed measurement, T is the true score, and E is the error. However, both T and E are hypothetical constructs. In the real world, we seldom know the precise value of the true score and therefore cannot know the exact value of the error score either. Much of the process of measurement involves estimating both quantities and maximizing the true component while minimizing error. For instance, if you took a number of measurements of one person’s body weight in a short period (so that his true weight could be assumed to have remained constant), using a recently calibrated scale, you might accept the average of all those measurements as a good estimate of that individual’s true weight. You could then consider the variance between this average and each individual measurement as the error due to the measurement process, such as slight malfunctioning in the scale or the technician’s imprecision in reading and recording the results.
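The averaging procedure just described can be sketched in a few lines of Python. The true weight and the error distribution here are hypothetical, chosen only to illustrate X = T + E:

```python
import random

# Classical measurement model: each observed score X is the (unknown)
# true score T plus a random error E, i.e. X = T + E.
# Hypothetical illustration: T = 120 lb, E ~ uniform(-2, 2) lb.
random.seed(42)
T = 120.0
observations = [T + random.uniform(-2, 2) for _ in range(1000)]

# The average of many observations serves as an estimate of T ...
estimate = sum(observations) / len(observations)

# ... and each observation's deviation from that average estimates
# the error component of that individual measurement.
errors = [x - estimate for x in observations]
print(round(estimate, 1))   # close to the assumed true weight of 120 lb
```

By construction, the estimated errors average to zero around the estimate; the quality of the estimate of T improves as the number of repeated measurements grows.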
In statistics, an error is the amount by which an observation differs from the (unobservable) true value of a quantity of interest (for example, a population mean).
A statistical error (or disturbance) is the amount by which an observation differs from its expected value, the latter being based on the whole population from which the statistical unit was chosen randomly. For example, if the mean height in a population of 21-year-old men is 1.75 meters, and one randomly chosen man is 1.80 meters tall, then the "error" is 0.05 meters; if the randomly chosen man is 1.70 meters tall, then the "error" is −0.05 meters. The expected value, being the mean of the entire population, is typically unobservable, and hence the statistical error cannot be observed either.
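The arithmetic of the height example can be checked directly. Taking the population mean as 1.75 meters (consistent with the stated errors of +0.05 and −0.05), a minimal sketch:

```python
# Assumed population mean height of 21-year-old men, in meters,
# consistent with the errors quoted in the example above.
POPULATION_MEAN = 1.75

def statistical_error(observation: float) -> float:
    """Amount by which an observation differs from its expected value."""
    return round(observation - POPULATION_MEAN, 2)

print(statistical_error(1.80))   # 0.05
print(statistical_error(1.70))   # -0.05
```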
Random error is an error due to chance: it has no particular pattern and is assumed to cancel itself out over repeated measurements. For instance, the error scores over a number of measurements of the same object are assumed to have a mean of zero. Therefore, if someone is weighed 10 times in succession on the same scale, you may observe slight differences in the number returned to you: some will be higher than the true value, and some will be lower. Assuming the true weight is 120
pounds, perhaps the first measurement will return an observed weight of 119 pounds (including an error of −1 pound), the second an observed weight of 122 pounds (for an error of +2 pounds), the third an observed weight of 118.5 pounds (an error of −1.5 pounds), and so on. If the scale is accurate and the only error is random, the average error over many trials will be 0, and the average observed weight will be 120 pounds. You can strive to reduce the amount of random error by using more accurate instruments, training your technicians to use them correctly, and so on, but you cannot expect to eliminate random error entirely.
In contrast, the systematic error has an observable pattern, is not due to chance, and often has a cause or causes that can be identified and remedied. For instance, a scale might be incorrectly calibrated to show a result that is 5 pounds over the true weight, so the average of multiple measurements of a person whose true weight is 120 pounds would be 125 pounds, not 120. Systematic error can also be due to human factors: perhaps the technician is reading the scale's display at an angle so that she sees the needle as registering higher than it is truly indicating. If a pattern is detected with systematic error, for instance, measurements drifting higher over time (so the error components are random at the beginning of the experiment, but later on are consistently high), this is useful information because we can intervene and recalibrate the scale.
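A small simulation makes the contrast concrete. The true weight, the 5-pound miscalibration, and the error distribution below are all hypothetical, chosen to mirror the examples above:

```python
import random

random.seed(0)
TRUE_WEIGHT = 120.0   # pounds (hypothetical true value)
BIAS = 5.0            # hypothetical miscalibration: scale reads 5 lb high

def read_scale(systematic_bias: float) -> float:
    # Every reading carries random error ~ uniform(-2, 2) pounds.
    return TRUE_WEIGHT + systematic_bias + random.uniform(-2, 2)

unbiased = [read_scale(0.0) for _ in range(10_000)]
biased = [read_scale(BIAS) for _ in range(10_000)]

# Averaging cancels the random error but not the systematic error:
print(round(sum(unbiased) / len(unbiased), 1))   # near 120
print(round(sum(biased) / len(biased), 1))       # near 125
```

The biased average stays about 5 pounds high no matter how many readings are taken, which is why systematic error must be removed at the source (recalibration) rather than by averaging.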
Gross errors arise from human mistakes, such as not properly reading or remembering data at the time of taking a reading, or writing and calculating carelessly and then recording the wrong figure later. Such errors can carry through into subsequent calculations and invalidate the results.
Instrumental errors occur due to a fault in the measuring device and are a common source of systematic error. Usually they appear as a zero error, a constant positive or negative offset. These errors can be removed by correcting or recalibrating the measurement device.
Random errors are caused by sudden changes in experimental conditions, by noise, and by fatigue in the people doing the work. These errors can be either positive or negative. Examples include changes in humidity, unexpected changes in temperature, and fluctuations in voltage. Such errors may be reduced by taking the average of a large number of readings.
An error of definition occurs when the variable being measured is not clearly defined: for example, the length of a box, when it is not specified which dimension of the box counts as the length.
Accuracy refers to the closeness of a measured value to a standard or known value. For example, if in lab you obtain a weight measurement of 3.2 kg for a given substance, but the actual or known weight is 10 kg, then your measurement is not accurate. In this case, your measurement is not close to the known value.
Precision refers to the closeness of two or more measurements to each other. Using the example above, if you weigh a given substance five times, and get 3.2 kg each time, then your measurement is very precise. Precision is independent of accuracy. You can be very precise but inaccurate, as described above. You can also be accurate but imprecise.
[Figure: accuracy of target groupings according to BIPM and ISO 5725. One panel shows low accuracy with poor precision but good trueness; another shows low accuracy with good precision but poor trueness.]
For example, if on average, your measurements for a given substance are close to the known value, but the measurements are far from each other, then you have accuracy without precision.
The general term "accuracy" is used to describe the closeness of a measurement to the true value. When the term is applied to sets of measurements of the same measurand, it involves a component of random error and a component of systematic error. In this case, trueness is the closeness of the mean of a set of measurement results to the actual (true) value, and precision is the closeness of agreement among the set of results.
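Under those definitions, trueness and precision can be computed directly from a set of repeated readings. The reference value and the two sets of readings below are hypothetical, echoing the 3.2 kg / 10 kg example above:

```python
import statistics

KNOWN_VALUE = 10.0  # kg, the accepted reference value (hypothetical)

# Two hypothetical sets of five repeated weighings:
precise_but_untrue = [3.2, 3.2, 3.2, 3.2, 3.2]    # tight cluster, far off
true_but_imprecise = [7.0, 13.0, 9.5, 11.5, 9.0]  # scattered, mean = 10

def trueness_error(readings):
    """Distance of the mean from the reference value (smaller is better)."""
    return abs(statistics.mean(readings) - KNOWN_VALUE)

def spread(readings):
    """Closeness of the readings to each other (smaller is better)."""
    return statistics.stdev(readings)

print(trueness_error(precise_but_untrue), spread(precise_but_untrue))
print(trueness_error(true_but_imprecise), spread(true_but_imprecise))
```

The first set is precise but not true (zero spread, mean 6.8 kg away from the reference); the second is true but imprecise (mean exactly on the reference, large spread).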
A high resolution does not help you at all if you cannot read the clock. The smallest possible increase of time that a program can actually experience is therefore called the precision.
In NTP, the precision is determined automatically, and it is measured as a power of two (of a second).
A clock not only needs to be read; it must be set, too. The accuracy determines how close the clock is to an official time reference such as UTC. Unfortunately, common clock hardware is not very accurate, simply because the oscillator frequency that makes time increase is never exactly right.
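The distinction between advertised resolution and the smallest observable tick can be probed from Python. This is not NTP itself, just the operating-system clock as exposed to the interpreter, so the numbers printed will vary by platform:

```python
import time

# Resolution of the system clock as reported to Python.
info = time.get_clock_info("time")
print(info.resolution)   # smallest advertised tick, in seconds

# Empirically: spin until the clock actually advances, then
# report one observed increment of time.
t0 = time.time()
t1 = time.time()
while t1 == t0:
    t1 = time.time()
print(t1 - t0)           # smallest increase observed in this run
```

On many systems the observed increment is larger than the advertised resolution, which is exactly the gap between resolution and usable precision described above.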
A model seeks to make a particular part or feature of the world easier to understand, define, quantify, visualize, or simulate by referencing it to existing and usually commonly accepted knowledge.
Modelling refers to the process of generating a model as a conceptual representation of some phenomenon. Typically a model will refer only to some aspects of the phenomenon in question, and two models of the same phenomenon may be essentially different, that is to say that the differences between them comprise more than just a simple renaming of components.
A scale model is most generally a physical representation of an object, which maintains accurate relationships between all important aspects of the model, although absolute values of the original properties need not be preserved. This enables it to demonstrate some behavior or property of the original object without examining the original object itself. The most familiar scale models represent the physical appearance of an object in miniature, but there are many other kinds.
With a deterministic model, the assumptions and equations you select "determine" the results. The only way the outputs change is if you change an assumption (or an equation).
With a stochastic model, an element of randomness is introduced at one or many points of the model. Every time you run the model, you get a different result. If you run it many times, this gives you a measure of variability in the process, as predicted by the model.
A stochastic model represents a situation where uncertainty is present. In other words, it is a model for a process that has some kind of randomness. The word stochastic comes from the Greek word stokhazesthai, meaning to aim or guess. In the real world, uncertainty is a part of everyday life, so a stochastic model could represent almost anything. The opposite is a deterministic model, which predicts outcomes with 100% certainty: deterministic models always have a set of equations that describe the system inputs and outputs exactly. Stochastic models, on the other hand, will likely produce different results every time the model is run.
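The contrast can be sketched with a toy linear model. The equation and the noise distribution are hypothetical, chosen only to show the deterministic/stochastic split:

```python
import random

def deterministic_model(x: float) -> float:
    """The same input always yields the same output."""
    return 2 * x + 1

def stochastic_model(x: float, rng: random.Random) -> float:
    """A random disturbance is added, so repeated runs differ."""
    return 2 * x + 1 + rng.gauss(0, 0.5)

print(deterministic_model(3.0))   # always 7.0

rng = random.Random(123)
runs = [stochastic_model(3.0, rng) for _ in range(1000)]
# Running the stochastic model many times gives a measure of the
# variability in the process, as predicted by the model.
print(min(runs), max(runs))
```

Note that changing the deterministic model's output requires changing an assumption or an equation, whereas the stochastic model varies on its own across runs.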
A conceptual model is a representation of a system, made of the composition of concepts which are used to help people know, understand, or simulate a subject the model represents. Some models are physical objects; for example, a toy model which may be assembled, and may be made to work like the object it represents.
The term conceptual model may be used to refer to models which are formed after a conceptualization or generalization process. Conceptual models are often abstractions of things in the real world, whether physical or social. Semantic studies are relevant to various stages of concept formation.
In such a population, the arithmetic average overstates the true average by over 15%. The harmonic average provides a better picture of the true average of this population, discounting the large outlier.
The harmonic average is best to use when the data contain a few large outliers relative to the rest of the values.
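The effect of a large outlier on the two averages can be seen with a small hypothetical data set (the values below are illustrative, not the population discussed above):

```python
import statistics

# Hypothetical data: four typical values and one large outlier.
values = [30, 35, 40, 45, 150]

arithmetic = statistics.mean(values)           # pulled up by the outlier
harmonic = statistics.harmonic_mean(values)    # discounts the outlier

print(arithmetic)            # 60.0
print(round(harmonic, 1))    # noticeably smaller than the arithmetic mean
```

The harmonic mean weights small values more heavily (it is the reciprocal of the average of reciprocals), so the single value of 150 barely moves it, while it raises the arithmetic mean well above the typical values.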
In statistics, the standard deviation (SD, also represented by the Greek letter sigma σ or the Latin letter s) is a measure that is used to quantify the amount of variation or dispersion of a set of data values. A low standard deviation indicates that the data points tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values.
[Figure: a plot of the normal distribution (bell-shaped curve), where each band has a width of 1 standard deviation.]
A large standard deviation indicates that the data points can spread far from the mean and a small standard deviation indicates that they are clustered closely around the mean.
For example, each of the three populations {0, 0, 14, 14}, {0, 6, 8, 14} and {6, 6, 8, 8} has a mean of 7. Their standard deviations are 7, 5, and 1, respectively.
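Those three populations can be checked with the population standard deviation from Python's standard library:

```python
import statistics

# The three populations from the example above.
populations = [[0, 0, 14, 14], [0, 6, 8, 14], [6, 6, 8, 8]]

for p in populations:
    # pstdev is the *population* standard deviation (divide by N, not N-1).
    print(statistics.mean(p), statistics.pstdev(p))
# Each population has mean 7; the standard deviations are 7, 5, and 1.
```

All three share the same mean, so the standard deviation is what distinguishes how tightly each population clusters around 7.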
Interpretation and application: Standard deviation may serve as a measure of uncertainty. In the physical sciences, for example, the reported standard deviation of a group of repeated measurements gives the precision of those measurements. When deciding whether measurements agree with a theoretical prediction, the standard deviation of those measurements is of crucial importance: if the mean of the measurements is too far away from the prediction (with the distance measured in standard deviations), then the theory being tested probably needs to be revised, since the measurements fall outside the range of values that could reasonably be expected to occur if the prediction were correct and the standard deviation appropriately quantified the uncertainty. See prediction interval. While the standard deviation does measure how far typical values tend to be from the mean, other measures are available. An example is the mean absolute deviation, which might be considered a more direct measure of average distance, compared to the root mean square distance inherent in the standard deviation.