A-level Physics (Advancing Physics)/Data Handling

Data Handling

Data Tables

Data should be collected in tables in a systematic way that makes it clear and easy to understand.

Headings should make it easy to find the information that is required and should include appropriate units and uncertainty.

Reference tables of data may be given in an appendix as they may be long.
Tables in the body of a report should be designed to convey a message clearly and may only include summary data such as averages.

The layout and contents of a results table, whether it is for recording numerical data or observations, should be decided before the experiment is performed. ‘Making it up as you go along’ often results in tables that are difficult to follow and don’t make the best use of space. Space should be allocated within the table for any manipulation of the data that will be required. The heading of each column must include both the quantity being measured and the units in which the measurement is made. Readings made directly from measuring instruments should be given to the number of decimal places that is appropriate for the measuring instrument used (for example, readings from a metre rule should be given to the nearest mm). Quantities calculated from raw data should be shown to the correct number of significant figures.
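As a rough illustration of the layout rules above, the following Python sketch builds one row of a results table. The readings are hypothetical (they do not come from any experiment in this book). They are quoted to the nearest mm, as for a metre rule, and a column is reserved in advance for the calculated mean.

```python
# Hypothetical readings: one length measured three times with a metre rule, in metres.
readings_m = [0.512, 0.514, 0.511]

def format_row(values, decimals):
    """Format raw readings to a fixed number of decimal places."""
    return [f"{v:.{decimals}f}" for v in values]

# Space for the processed value (the mean) is planned as an extra column.
mean = sum(readings_m) / len(readings_m)
row = format_row(readings_m, 3) + [f"{mean:.3f}"]

# Each heading gives the quantity and its unit.
header = ["length 1 / m", "length 2 / m", "length 3 / m", "mean length / m"]
print(" | ".join(header))
print(" | ".join(cell.rjust(len(h)) for cell, h in zip(row, header)))
```

Three decimal places in metres corresponds to the nearest mm, which matches the resolution of the instrument.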

Uncertainty

All instruments have a level of uncertainty. In an experiment, the biggest source of uncertainty is the most important one to consider and to try to reduce.

Every instrument has a resolution: the smallest change it can observe or 'see'. For example:

using a ruler to the nearest mm
a voltmeter reading to the nearest 0.1 V
a set of scales measuring to the nearest 0.01 g

Readings should only be quoted to ±1 of the last digit at best. However, there may be reasons to be even more cautious about results:

The stability. The results on a meter may flicker randomly or in response to changing conditions like being knocked.
The range. Repeated readings of the same experiment may vary.
The calibration of the instrument. Is it giving a true reading compared to other supposedly identical instruments or against a known standard?

A result should be given including ± the uncertainty.
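One way to quote a result with its uncertainty is sketched below in Python. It assumes a common convention that is not stated in this chapter: the uncertainty is rounded to one significant figure and the value is rounded to the same decimal place. The function name `format_result` and the example numbers are hypothetical.

```python
def format_result(value, uncertainty, unit=""):
    """Quote value ± uncertainty, with the uncertainty rounded to 1 s.f.

    Assumed convention: the value is rounded to the same decimal
    place as the rounded uncertainty.
    """
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # Scientific notation with no decimals gives 1 s.f., e.g. 0.05 -> '5e-02'.
    one_sf = f"{uncertainty:.0e}"
    exponent = int(one_sf.split("e")[1])
    rounded_unc = float(one_sf)
    decimals = max(-exponent, 0)
    rounded_val = round(value, -exponent)
    text = f"{rounded_val:.{decimals}f} ± {rounded_unc:.{decimals}f}"
    return f"{text} {unit}".strip()

print(format_result(12.347, 0.05, "V"))
print(format_result(0.5123, 0.001, "m"))
```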

The uncertainty can be displayed on a graph as an Error Bar.

Systematic Errors

Systematic errors can arise from the experimental procedures or other bias.

Zero error on an instrument making all readings too large or small by a set amount
- A micrometer that reads -0.01mm when fully closed
- Not realising a 30 cm ruler has an extra few mm before the scale starts.
- A set of scales that has not been zeroed first.
Calibration of an instrument giving false readings.
- An ammeter consistently giving readings that are too high.
- A measuring tape becoming stretched over years of use.
Experimental design flaws
- Friction on a sloping runway not being accounted for.
- A resistor changing its value as it gets hot.
- Resistance of connecting wires.

These will often result in a line of best fit on a graph that does not pass through the intercept where it is expected.

An experimental design can be improved to try to remove systematic errors.
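A known zero error can also be removed after the measurements have been taken. The Python sketch below assumes the micrometer example above, where the instrument reads -0.01 mm when fully closed; the readings themselves are hypothetical, as is the function name `correct_zero_error`.

```python
def correct_zero_error(readings, zero_reading, decimals=2):
    """Subtract the instrument's zero reading from every measurement.

    The result is rounded to the instrument's number of decimal places
    to hide floating-point noise.
    """
    return [round(r - zero_reading, decimals) for r in readings]

# Hypothetical micrometer readings in mm; the closed reading is -0.01 mm.
raw_mm = [2.34, 2.35]
corrected_mm = correct_zero_error(raw_mm, zero_reading=-0.01)
print(corrected_mm)
```

Because the zero reading is negative, every raw reading is too small, and the correction increases each value by 0.01 mm.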

Random Errors

These are often noise or random fluctuations in a repeated reading, for example:

The height a ball bounces to when dropped from the same height.
The small variations in voltage when repeating a reading on the same length of wire.

They may also be due to a mistake or human error in the reading, for example using your eyes to measure the height that a ball bounces may lead to an outlier due to an error in judgement. Human error would be the most likely source of an outlier.

Spread and identifying possible outliers

It is often useful to 'plot and look' as a quick way to assess the quality of data. A dot plot shows the distribution of a set of data. For example, consider the following readings:

5.5

6.2

6.7

5.9

7.7

Average = 6.4

The average of a dataset is often denoted by the Greek letter "mu", $\mu$.

awaiting image

Range: The maximum value minus the minimum value in a set of repeated readings.
Spread: ± Half the Range
Standard deviation: This is one of the most common ways to measure how spread out the data are. For example, to calculate the standard deviation of the data above, first calculate how much each data point deviates from the average, or mean, of the data. Then square that difference:

   $\begin{array}{l}(5.5-6.4)^{2}=0.81\\(6.2-6.4)^{2}=0.04\\(6.7-6.4)^{2}=0.09\\(5.9-6.4)^{2}=0.25\\(7.7-6.4)^{2}=1.69\end{array}$

The variance is the mean of these values:

$\frac{0.81+0.04+0.09+0.25+1.69}{5}=0.576$

Finally, the standard deviation is the square root of the variance:

$\sqrt{0.576}=0.759$
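The same calculation can be checked with a short Python sketch. It follows the steps above exactly (dividing by the number of readings, 5), which is what the standard library calls the population standard deviation, `statistics.pstdev`.

```python
import math
import statistics

data = [5.5, 6.2, 6.7, 5.9, 7.7]

# Step 1: the mean of the readings.
mean = sum(data) / len(data)

# Step 2: squared deviation of each reading from the mean.
squared_deviations = [(x - mean) ** 2 for x in data]

# Step 3: the variance is the mean of the squared deviations.
variance = sum(squared_deviations) / len(data)

# Step 4: the standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 1), round(variance, 3), round(std_dev, 3))

# Cross-check against the standard library's population standard deviation.
library_std_dev = statistics.pstdev(data)
```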

More generally, consider the case of discrete random variables. In the case where X takes random values from a finite data set x₁, x₂, ..., x_N, with each value having the same probability of occurrence, the standard deviation, commonly denoted by the Greek letter "sigma", $\sigma$, is

$\sigma ={\sqrt {{\frac {1}{N}}\left[(x_{1}-\mu )^{2}+(x_{2}-\mu )^{2}+\cdots +(x_{N}-\mu )^{2}\right]}},{\rm {\ \ where\ \ }}\mu ={\frac {1}{N}}(x_{1}+\cdots +x_{N}),$

or, using summation notation,

$\sigma ={\sqrt {{\frac {1}{N}}\sum _{i=1}^{N}(x_{i}-\mu )^{2}}},{\rm {\ \ where\ \ }}\mu ={\frac {1}{N}}\sum _{i=1}^{N}x_{i}.$

If, instead of having equal probabilities, the values have different probabilities, let x₁ have probability p₁, x₂ have probability p₂, ..., x_N have probability p_N. In this case, the standard deviation will be

$\sigma ={\sqrt {\sum _{i=1}^{N}p_{i}(x_{i}-\mu )^{2}}},{\rm {\ \ where\ \ }}\mu =\sum _{i=1}^{N}p_{i}x_{i}.$
Outlier: A value is likely to be an outlier if it is further than 2 × the spread from the mean. This should only be used as a guide, and the possible reasons for an anomalous result should be considered before dismissing it.
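The range, spread and outlier guide can be sketched in Python as follows. One assumption is made that the text does not state: the mean and spread are calculated from the other readings, leaving the suspect value out. This is because a value can never lie further than the full range (2 × the spread) from the mean of a set that already includes it, so the test would never flag anything. The function names are hypothetical.

```python
def spread_of(readings):
    """Return (mean, range, spread), where spread is half the range."""
    mean = sum(readings) / len(readings)
    value_range = max(readings) - min(readings)
    return mean, value_range, value_range / 2

def is_possible_outlier(suspect, other_readings):
    """Guide only: flag a value further than 2 x spread from the mean.

    Assumption: mean and spread come from the other readings,
    not including the suspect value itself.
    """
    mean, _, spread = spread_of(other_readings)
    return abs(suspect - mean) > 2 * spread

data = [5.5, 6.2, 6.7, 5.9, 7.7]
mean, value_range, spread = spread_of(data)
print(round(mean, 1), round(value_range, 1), round(spread, 1))

# Test the largest reading against the spread of the remaining four.
flag = is_possible_outlier(7.7, [5.5, 6.2, 6.7, 5.9])
print(flag)
```

A flagged value is only a candidate. As noted above, the possible reasons for it should be considered before it is dismissed.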

Graphs

Scales

How not to plot.

A better plot.

The scales on a graph must include a suitable label and units. At Advanced level, we write the axis label followed by a stroke (solidus) and then the unit, e.g. "force / N" rather than "force (N)". If the forces had all been in kilonewtons, then the axis label might read "force / 10³ N"; a graph showing cross sectional area on one axis might have the label "cross sectional area / 10⁻⁶ m²" rather than "cross sectional area (mm²)". A density scale could be labelled "density / 10³ kg m⁻³" rather than "density (thousands of kg/m³)". Notice that compound units are expressed using negative powers rather than a stroke (solidus).
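When graphs are produced by computer, the labelling convention above can be applied consistently with a small helper. The Python sketch below is only an illustration; the function name `axis_label` and its parameters are hypothetical.

```python
# Map digits and the minus sign to their Unicode superscript forms.
SUPERSCRIPTS = str.maketrans("-0123456789", "⁻⁰¹²³⁴⁵⁶⁷⁸⁹")

def axis_label(quantity, unit, power_of_ten=0):
    """Build a label of the form 'quantity / unit' or 'quantity / 10ⁿ unit'."""
    if power_of_ten == 0:
        return f"{quantity} / {unit}"
    exponent = str(power_of_ten).translate(SUPERSCRIPTS)
    return f"{quantity} / 10{exponent} {unit}"

print(axis_label("force", "N"))
print(axis_label("force", "N", 3))
print(axis_label("cross sectional area", "m²", -6))
```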

If producing graphs on computer they should be as big as is reasonably possible so that values can be read from them easily. The scales should include minor unit markers or even grid lines. Small graphs within the text of a report may be used for illustrative purposes but a full sized version should be included in an appendix.

Points should be plotted as an x, rather than a blob or dot, so it is clear exactly where the point is (where the two lines cross).

The purpose of displaying the data in a particular graph should be made clear from a caption or title.

Lines of best fit

Error bars show the uncertainty in the data. If you plot the raw data, the uncertainty in each reading sets the length of its error bar. If you plot processed data, you need to process the uncertainties too. You do not have to draw error bars on a graph if they are too small to show, but you must give the reason for this in your report, otherwise you will lose marks.
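The text does not say how uncertainties should be processed. A minimal Python sketch is given here under the simple rules commonly taught at this level, which are an assumption rather than something this chapter specifies: absolute uncertainties add when quantities are added or subtracted, and fractional uncertainties add when quantities are multiplied or divided. The function names and example values are hypothetical.

```python
def sum_uncertainty(unc_a, unc_b):
    """Uncertainty in a + b or a - b: the absolute uncertainties add."""
    return unc_a + unc_b

def divide_with_uncertainty(a, unc_a, b, unc_b):
    """Return a / b and its uncertainty: the fractional uncertainties add."""
    value = a / b
    fractional = unc_a / abs(a) + unc_b / abs(b)
    return value, abs(value) * fractional

# Hypothetical example: resistance R = V / I from V = 6.0 ± 0.1 V, I = 0.50 ± 0.01 A.
resistance, resistance_unc = divide_with_uncertainty(6.0, 0.1, 0.50, 0.01)
print(round(resistance, 2), round(resistance_unc, 2))
```

The processed uncertainty, not the raw one, is what the error bar on a plotted value of R would show.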

The scaling of the graph is important: if the plotted data use less than half of either axis, rescale the graph. (Remember, axes do not have to start from zero.)

Uncertainties in Slopes

The slope of the line of best fit gives the value used in calculating the experimental value of n.
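The text does not describe how to find the uncertainty in a slope. One common approach, assumed here rather than taken from this chapter, is to draw the steepest and shallowest straight lines that still pass through the error bars of the first and last points, and to take half the difference of their gradients as the uncertainty. The Python sketch below makes the further simplification that the line through the two end points is used as the central estimate; the function name and numbers are hypothetical.

```python
def slope_with_uncertainty(x_first, y_first, x_last, y_last, y_uncertainty):
    """Estimate a gradient and its uncertainty from the two end points.

    Assumptions: only y has an error bar (± y_uncertainty), and the
    end points are taken to lie on the central line.
    """
    run = x_last - x_first
    central = (y_last - y_first) / run
    # Steepest line: bottom of the first error bar to top of the last.
    steepest = ((y_last + y_uncertainty) - (y_first - y_uncertainty)) / run
    # Shallowest line: top of the first error bar to bottom of the last.
    shallowest = ((y_last - y_uncertainty) - (y_first + y_uncertainty)) / run
    return central, (steepest - shallowest) / 2

# Hypothetical end points (0.0, 1.0) and (4.0, 9.0), each with y known to ± 0.5.
slope, slope_unc = slope_with_uncertainty(0.0, 1.0, 4.0, 9.0, 0.5)
print(slope, slope_unc)
```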