Dr Paul Atkinson of the University of Huddersfield Centre for the History of Public Health and Medicine announces the first two workshops in his AHRC-funded (award AH/L011395/1) research support project Numerical Analysis Skills for Historians (NASH). If you have already expressed an interest in the series of events, or you think you could benefit from additional training in quantitative methods, read on.
I’m now in a position to give some details of our seminars. These will take two forms, one on the interpretation of data and the other on relationships between data series. At present we have arranged two seminars of the first form. In the next few weeks I will circulate details of the slightly more advanced seminar on relationships between data series (including regression analysis). I hope to provide details, too, of one more ‘interpretation of data’ seminar.
The initial seminars contain about four hours’ learning, running from mid-morning to mid-afternoon with a break for lunch. The style will be hands-on, using worked examples to understand how different concepts and methods work. A note on the subjects covered is attached.
At University College London on Wednesday 4 June
Presenter: Dr Andrew Hinde, University of Southampton. Dr Hinde is Head of Southampton’s Division of Social Statistics & Demography. His extensive publications include frequently used texts on demographic methods and on the population history of England.
At the University of Edinburgh on Monday 23 June
Presenter: Professor John MacInnes, University of Edinburgh. Professor MacInnes is the ESRC Strategic Advisor on Quantitative Methods Training, part of the ESRC Quantitative Methods Initiative. He sits on the advisory board of ‘getstats’, the Royal Statistical Society’s Statistical Literacy Campaign, and has created a new statistical literacy course for students from all colleges and years at the University of Edinburgh.
There is no charge for attendance. I regret that we are not able to help with travel costs or provide refreshments. If you would like to book, please email me for a booking form at email@example.com.
I look forward to seeing you,
A brief summary of the curriculum for the events follows:
Content: interpretation of data seminars
Variables, values and cases
Levels of measurement, categorical and continuous variables
Measurement and data capture as a social process, with substantial measurement error of various kinds. Validity and reliability.
Samples and populations.
Summary description of the distribution of the values for a variable, by numbers or visual methods.
Exploration of patterns of association between variables (theory, model, hypothesis generation).
Variable ‘time’ of special interest to historians: trends, rates, indices, risks/hazards.
Summary descriptions of association by numbers (correlation and regression coefficients; contingency tables) or visual methods (scatterplots, clustered bar charts or boxplots).
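As a rough illustration of the 'indices' item in the list above (not taken from the seminar materials), the sketch below rescales a yearly series so that a chosen base year equals 100. The function name `to_index` and the burial counts are hypothetical, invented purely for the example.

```python
def to_index(series, base_year):
    """Rescale a {year: value} series so that the base year equals 100."""
    base = series[base_year]
    return {year: 100 * value / base for year, value in series.items()}


# Hypothetical annual counts, invented for illustration only
burials = {1850: 200, 1851: 220, 1852: 190}

index = to_index(burials, base_year=1850)
print(index)  # {1850: 100.0, 1851: 110.0, 1852: 95.0}
```

An index of 110 for 1851 simply says the count was 10 per cent above the base year. The choice of base year is an assumption that should always be stated alongside the figures.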
Common problems with empirical description of sample data
Testing of a hypothesis, model or theory to see if it is consistent with the data collected.
Limits of testing: noisy data, common logical fallacies
Representativeness and generalisability
Historians almost always work with samples. How ‘representative’ these are depends on:
– Random variation
– Selection effects
To the extent that a sample is not fully representative of the population it is drawn from, it is less safe to generalise from the sample to that population.
Inference from samples to populations
Random samples permit calculations of the probability that sample results obtained lie within a certain distance of the (unknown) result that would be obtained were it possible to measure the population.
Data always suffers from substantial amounts of measurement error.
Such error can be thought of as ‘noise’ in the data which obscures the ‘signal’.
Conclusions are at risk when they depend upon identifying patterns in the data that may turn out to be the product of the noise rather than a signal.
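To make the point about random samples concrete, here is a minimal sketch (my own illustration, not taken from the seminar materials) of an approximate 95% interval for a sample proportion. It assumes a simple random sample and uses the normal approximation. The function name `proportion_interval` and the figures are hypothetical.

```python
import math


def proportion_interval(successes, n, z=1.96):
    """Approximate interval for a proportion from a simple random sample.

    Uses the normal approximation; z=1.96 corresponds to roughly 95%.
    """
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Hypothetical sample: 120 of 400 sampled records show the trait of interest
low, high = proportion_interval(successes=120, n=400)
print(f"{low:.3f} to {high:.3f}")  # roughly 0.255 to 0.345
```

The interval reflects random sampling variation only. It says nothing about selection effects or measurement error, either of which can matter more in historical sources.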
Less is more: spurious accuracy and too much detail
Essential information: source, N, data availability, base for percentages, indices or rates.