Data Visualisation Guide

Normalising data

Pitfalls in statistics

Consider the following table, showing electricity consumption in 2020 for the five EU member states that consumed the most:

Rank  Country  Electricity consumption (GWh)
1     Germany  490.054
2     France   420.356
3     Italy    283.814
4     Spain    227.172
5     Poland   148.241

Should we conclude from this table that the Germans are the biggest electricity consumers in the EU? In absolute numbers, yes. But if you know a thing or two about European demography, you will notice that this top 5 of electricity consumers is also the top 5 in population:

Rank  Country  Electricity consumption (GWh)  Population
1     Germany  490.054                        83.166.711
2     France   420.356                        67.320.216
3     Italy    283.814                        59.641.488
4     Spain    227.172                        47.332.614
5     Poland   148.241                        37.958.138

This makes sense, of course: more inhabitants consume more electricity. But to compare electricity consumption between countries, population has to be factored in: each country's consumption should be divided by its population.
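As a worked example of that division, here is the calculation for Germany, using the figures from the table above. Note that the tables write numbers in continental European notation (dot for thousands, comma for decimals), so 490.054 GWh means 490 054 GWh. The conversion factor of 1000 from GWh to MWh is the only thing added here; the result is rounded to one decimal.

```latex
\[
\text{per capita consumption (MWh)}
  = \frac{\text{consumption (GWh)} \times 1000}{\text{population}}
\]
\[
\text{Germany:}\quad
  \frac{490\,054 \times 1000}{83\,166\,711} \approx 5.9\ \text{MWh per inhabitant}
\]
```

The same calculation gives roughly 6.2 MWh for France, so France already overtakes Germany once population is taken into account.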

The top 5 of per capita electricity consumers in the EU looks completely different from the top 5 in absolute numbers:

Rank  Country     Electricity consumption (GWh)  Population  Per capita consumption (MWh)
1     Finland     78.144                         5.525.292   14,1
2     Sweden      125.678                        10.327.589  12,2
3     Luxembourg  6.120                          626.108     9,8
4     Austria     63.577                         8.901.064   7,1
5     Belgium     80.748                         11.522.440  7,0

When numbers relate to countries or regions and correlate with population, meaningful comparisons require normalising the data, in this case by calculating per capita numbers.
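The effect of normalising can be sketched in a few lines of Python. This is only an illustration: it uses the ten countries that appear in the tables above, not all EU member states, and the names `DATA` and `per_capita_mwh` are made up for this example. It ranks the countries once by absolute consumption and once by per capita consumption.

```python
# Figures copied from the tables above: (consumption in GWh, population).
# Only the ten countries shown in the tables are included, not the whole EU.
DATA = {
    "Germany": (490_054, 83_166_711),
    "France": (420_356, 67_320_216),
    "Italy": (283_814, 59_641_488),
    "Spain": (227_172, 47_332_614),
    "Poland": (148_241, 37_958_138),
    "Finland": (78_144, 5_525_292),
    "Sweden": (125_678, 10_327_589),
    "Luxembourg": (6_120, 626_108),
    "Austria": (63_577, 8_901_064),
    "Belgium": (80_748, 11_522_440),
}


def per_capita_mwh(consumption_gwh, population):
    """Convert GWh to MWh (1 GWh = 1000 MWh) and divide by the population."""
    return consumption_gwh * 1000 / population


# Ranking by absolute consumption, largest first.
by_total = sorted(DATA, key=lambda c: DATA[c][0], reverse=True)

# Ranking by normalised (per capita) consumption, largest first.
by_capita = sorted(DATA, key=lambda c: per_capita_mwh(*DATA[c]), reverse=True)

print("Absolute top 5:  ", by_total[:5])
print("Per capita top 5:", by_capita[:5])

for country in by_capita:
    print(f"{country:<12}{per_capita_mwh(*DATA[country]):6.1f} MWh per capita")
```

Among these ten countries, Germany drops from first place in absolute terms to seventh place per capita, while Finland moves from eighth to first.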
