10
$\begingroup$

Hypothetical question, so I don't have actual data or visualization to share, but this is a problem I might face in the future.

Let's say I have a map of a region, divided by counties. I take samples from the population of these counties. I want to represent the proportion of people aged < 15 in each of these counties. As this is a sample, there's an uncertainty associated with the observed proportions, that we can formalize with confidence intervals.
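
To make the setup concrete, here is a minimal sketch of how such intervals might be computed for each county. It uses the Wilson score interval, which is one common choice, not necessarily the one I would end up with. The county names and counts are made up for illustration, and `wilson_interval` is a hypothetical helper, not a library function.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Hypothetical county samples: (people aged < 15, sample size)
samples = {"County A": (28, 100), "County B": (7, 25)}
for name, (k, n) in samples.items():
    lo, hi = wilson_interval(k, n)
    print(f"{name}: {k / n:.2f} [{lo:.2f}, {hi:.2f}] width={hi - lo:.2f}")
```

Both counties have the same observed proportion, but the smaller sample gives a wider interval, and that difference is exactly what the map would need to convey.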

How can I represent this uncertainty on a map without making it cluttered or unreadable?

One solution that came to mind is to simply print the confidence interval on each county (e.g. in the form [0.21, 0.35]), but I suspect this would quickly become unreadable if the map includes many counties or other text (county names, major cities, etc.).

Another solution would be to give up on representing the uncertainty on the map and present the confidence intervals in a separate table. The risk I see here is that readers might overlook the table even though it holds crucial information. In addition, there may be some relation between the uncertainty and the location (e.g. perhaps the uncertainty is greater in northern counties), and that pattern would be harder to see in a table.

A third solution is to represent the uncertainty on two separate maps (one for the lower and one for the upper confidence limits), but I fear they might be overlooked too. In addition, this solution requires the reader to make an effort to link an observed value to its uncertainty, compared with showing all the information on a single map.

What would be some good alternatives, if any? Thanks.

$\endgroup$
6
  • $\begingroup$ if the visualization uses a map with the different counties anyway, I'd probably prefer conveying the age group proportion by coloring, plus the estimate and the uncertainty as a text label ( e.g. 0.35±0.05) . Depending on the map, some counties might not have enough space for it, but maybe the size of the overall figure can be adjusted. I definitely think it'd be the most intuitive way for me $\endgroup$ Commented yesterday
  • 1
    $\begingroup$ @deemel thanks for the feedback. Just as a small side note, some methods for computing confidence intervals (e.g. Wilson CI) do not necessarily generate symmetrical CIs, so one may end up with a bit more text than "±0.05". $\endgroup$ Commented yesterday
  • $\begingroup$ With some care, varying transparency or focus ("bokeh") can work. You have to experiment. $\endgroup$ Commented yesterday
  • 1
    $\begingroup$ @ischmidt You're right that there's no perfect answer: plenty of methods have been tried but none are in common use. // Uncertainties in proportions are bounded because the proportions are bounded. But proposing a restriction to bounded variables is a little puzzling, because (1) non-infinite quantities shown on any map are always bounded by their range and (2) it's not necessary for the mapping from a quantity to a graphical or geometric characteristic to be linear, anyway. $\endgroup$ Commented yesterday
  • 3
    $\begingroup$ One can illuminate geometric objects in a map (that is, alter their light values) according to levels of uncertainty. Although the illustration at gis.stackexchange.com/a/17190/664 literally shows light, it indicates what can be accomplished if you imagine the darker "nighttime" part of the map being locations of greater uncertainty. $\endgroup$ Commented yesterday

3 Answers

7
$\begingroup$

It used to be relatively common to overlay barplots on maps, I recall (one of the few times where I think bars might be useful). This only works well if you have a small-ish number of regions and the size disparity is not too large, but it's an option. With the bars, you can of course add confidence intervals. Here's a crude example of what I mean, albeit without error bars: https://www.originlab.com/doc/Origin-Help/Bar-Map

I do also like your other option of adding separate maps showing the lower and upper confidence limits, though those would suggest a positive spatial correlation in estimates which may or may not be plausible.

Hatching or other texture can also be used to communicate categorical differences. Often this is used to indicate p-values less than or greater than 0.05. I'm not sure it would be useful, but one might in principle also define bands of confidence interval width and use texture to show those. Transparency could also be used for this, with the advantage that it can communicate a continuous response, but I suspect it would be hard to do this effectively and some binning would again be helpful.
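
As a rough sketch of the binning idea, one could classify each county by the width of its interval and look up a texture or an opacity for that band. The band edges, hatch strings, and alpha values below are arbitrary assumptions chosen for illustration (the hatch strings follow matplotlib's hatch notation, but nothing here depends on matplotlib), and `uncertainty_band` is a hypothetical helper.

```python
import bisect

# Assumed band edges for interval width, with one style per resulting band
BAND_EDGES = [0.05, 0.10, 0.20]      # upper edges of the first three bands
HATCHES = ["", "..", "//", "xx"]     # denser texture = wider interval
ALPHAS = [1.0, 0.8, 0.6, 0.4]        # alternative: more transparent = wider interval

def uncertainty_band(ci_low, ci_high, edges=BAND_EDGES):
    """Return the index of the band that the interval width falls into."""
    width = ci_high - ci_low
    return bisect.bisect_right(edges, width)

# Made-up intervals for two counties
intervals = {"County A": (0.20, 0.37), "County B": (0.26, 0.30)}
styles = {}
for name, (lo, hi) in intervals.items():
    band = uncertainty_band(lo, hi)
    styles[name] = {"hatch": HATCHES[band], "alpha": ALPHAS[band]}
    print(name, styles[name])
```

The resulting `styles` dictionary could then be passed to whatever plotting tool draws the county polygons. Keeping the number of bands small (three or four) is probably what makes the textures distinguishable at a glance.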

The best solution is likely to depend heavily on the details of the mapped area and data, so you will probably have to experiment to find it.

$\endgroup$
3
  • $\begingroup$ Thanks for the suggestions and the link! (it's not clickable by the way). As you say, I strongly suspect that it heavily depends on the actual data we have. But it's interesting to learn about a range of possible options. Before accepting your answer, I'll wait a bit to see what other people have to say about the issue. $\endgroup$ Commented yesterday
  • $\begingroup$ @Dani No problem, and I agree that you shouldn't accept so soon. Mine isn't in any way a definitive answer and you will likely get other good ones. $\endgroup$ Commented yesterday
  • 2
    $\begingroup$ +1. Although posting boxplots (or other such graphical objects) can represent uncertainties, it is cartographically poor. Usually, the main point of drawing a map is to exploit gestalt visual processing to enable the viewer to see, without conscious thought or painstaking decoding, any patterns or important elements in the data. This supports your second comment about the inadvisability of separately mapping confidence limits: that clearly won't work. You are right to be cautious about transparency: my experience indicates it takes a lot of experimentation and mightn't work as intended. $\endgroup$ Commented yesterday
7
$\begingroup$

Consider using value-suppressing color palettes, which desaturate colors as uncertainty increases and bias them towards a neutral hue.

[Figure: an example of a value-suppressing color palette, from the uwdata Medium article]

Read more in a discussion from the UW Interactive Data Lab and its accompanying paper.$^1$ The entire article is worth reading, as it also mentions bivariate visual schemes that use (color, shading) to encode (value, uncertainty), as well as traditional palettes that just desaturate in the presence of uncertainty.

$^1$ Correll, Moritz and Heer. "Value-Suppressing Uncertainty Palettes." Proc. ACM Human Factors in Computing Systems (CHI), 2018.
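
To illustrate the basic idea, here is a simplified sketch that blends a county's color toward a neutral grey in proportion to its interval width. This is a plain continuous blend, not the palette construction from the paper (which, as I understand it, also reduces the number of distinguishable value classes as uncertainty grows). The color endpoints, the neutral grey, `max_width`, and all function names are assumptions made up for this example.

```python
def suppress_color(rgb, uncertainty, neutral=(0.85, 0.85, 0.85)):
    """Blend an RGB color toward a neutral grey as uncertainty goes from 0 to 1."""
    u = min(max(uncertainty, 0.0), 1.0)  # clamp to [0, 1]
    return tuple((1 - u) * c + u * g for c, g in zip(rgb, neutral))

def value_color(p, low=(0.99, 0.91, 0.64), high=(0.50, 0.0, 0.15)):
    """Linear two-color ramp for a proportion p in [0, 1]."""
    p = min(max(p, 0.0), 1.0)
    return tuple((1 - p) * a + p * b for a, b in zip(low, high))

def county_color(p_hat, ci_low, ci_high, max_width=0.3):
    """Color for one county: value ramp, suppressed by relative interval width."""
    u = (ci_high - ci_low) / max_width
    return suppress_color(value_color(p_hat), u)

# Same estimate, different uncertainty: the second color sits closer to grey
print(county_color(0.28, 0.26, 0.30))
print(county_color(0.28, 0.14, 0.42))
```

With this mapping, two counties with the same estimate but different interval widths get visibly different colors, with the less certain one washed out toward grey.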

$\endgroup$
1
  • 6
    $\begingroup$ This is a nice idea, but saturation is one of the subtlest possible graphical indicators of a variable and usually is already involved in portraying the proportions themselves. Consider lightness ("value") instead. $\endgroup$ Commented yesterday
4
$\begingroup$

Does it not depend on what the data shows that you are trying to highlight? For example, the point of plotting the point estimates on a map is usually to show the contrast between different parts of the map. If the uncertainty tends to be the same across different parts of the map, does it need to be visualized? The point of visualization is messaging, so identify the message and then build the visualization to convey that message.

That said, I like some of the ideas on this page: The Visualization of Uncertainty.

$\endgroup$
