What’s the right way to define class boundaries in a classed choropleth? One part of this is macro-level: how should you determine the classes (equal-interval, quantile, Jenks, etc.)? This is an interesting question, but it’s extensively answered elsewhere.
The question I’m interested in is this: having decided on your general classes, exactly where do you draw the boundaries?
An example: you have count (i.e. discrete) data for the population of US counties.
# Get data
pop <- read.csv("population.csv", skip = 1)
Suppose you decide on five log-scale intervals. Then I think there is one unambiguously best solution, one that clearly reflects the discreteness of the population variable.
# Discrete data
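pop$pop_count <- cut(
  pop$Population,
  breaks = c(1e2, 1e3, 1e4, 1e5, 1e6, 1e7),
  right = FALSE,         # classes are [a, b), so 1,000 starts the second class
  include.lowest = TRUE, # with right = FALSE, this closes the top class at 1e7
  labels = c("100 to 999", "1,000 to 9,999", "10,000 to 99,999",
             "100,000 to 999,999", "1,000,000 to 9,999,999"))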
plot_counties(pop, "pop_count", "Discrete Variable\nPopulation by county", "Blues")
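Whether a label like "100 to 999" is honest depends on which side of each interval `cut()` closes. By default, `cut()` uses `right = TRUE`, so a county with exactly 1,000 people would land in the first class. The sketch below checks this on a handful of made-up boundary values (the vector `x` is hypothetical and not taken from `population.csv`), comparing the default with `right = FALSE`.

```r
# Hypothetical boundary values chosen to sit on or next to the class edges
x <- c(100, 999, 1000, 9999, 10000, 1e7)
breaks <- c(1e2, 1e3, 1e4, 1e5, 1e6, 1e7)

# Default (right = TRUE): intervals are (a, b], so 1000 joins the first class
right_closed <- cut(x, breaks = breaks, include.lowest = TRUE)

# right = FALSE: intervals are [a, b), so 1000 starts the second class
left_closed <- cut(x, breaks = breaks, include.lowest = TRUE, right = FALSE)

# Side-by-side view of where each value lands under the two conventions
print(data.frame(x, right_closed, left_closed))
```

Only the `right = FALSE` version matches labels of the form "100 to 999" and "1,000 to 9,999" for count data.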