Chicago PD Hourly Wages
Created 2019-07-24.
A histogram of (converted) hourly wages of Chicago police officers.

I used the City of Chicago's "Current Employee Names, Salaries, and Position Titles" dataset, which has variables for job title, department, full- or part-time status, salary, and typical hours.
I wanted to look at data for police officers. I first visualized the distribution of their salaries using a kernel density estimate (see "KDE of police officer salaries"):
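# Packages used throughout: dplyr (filter, mutate, ...), ggplot2, and magrittr (for %$%)
library(dplyr)
library(ggplot2)
library(magrittr)
# DATA
df %>%
  filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %>%
  #
  ggplot() +
  # ELEMENT
  geom_density(aes(Annual.Salary))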
From the kernel density estimate, it appeared that there were three broad classes of payment. I used a point plot to try to confirm this (see "Point distribution of police officer salaries"):
# DATA
df %>%
  filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %>%
  #
  ggplot() +
  # ELEMENT
  geom_point(aes(0, Annual.Salary))
The point plot showed that the pay is actually divided into what looks like a finite set of classes. I checked with some R code:
> df %>%
    filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %$%
    unique(Annual.Salary)
[1] 48078 93354 68616 84054 87006 90024 72510 80016 76266 96060
58572 92316 87000 60600 68262 84450
There were 16 distinct salary levels, so I converted them to hourly wages, assuming a 40-hour work week and 50 working weeks a year:
> df %>%
    filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %$%
    {Annual.Salary / (40 * 50)} %>%
    {sprintf("%.2f", .)} %>%
    unique %>%
    sort
[1] "24.04" "29.29" "30.30" "34.13" "34.31" "36.26" "38.13" "40.01"
"42.03" "42.23" "43.50" "45.01" "46.16" "46.68" "48.03"
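Some of these hourly wages are pretty close (87000 and 87006 even round to the same "43.50", which is why 16 salary levels give only 15 wages), so I decided to merge neighbouring wages that are within a dollar of each other: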
df %>%
  filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %>%
  # TRANS
  mutate(
    Hourly.Rate = Annual.Salary / (40 * 50)
  ) %>%
  arrange(Hourly.Rate) %>%
  mutate(
    # start a new pay group wherever the gap to the previous rate exceeds $1
    Pay.Group = Hourly.Rate %>% diff %>% { . > 1 } %>% cumsum %>% {c(0,.)}
  ) %>%
  ggplot() +
  # ELEMENT
  geom_bar(aes(Pay.Group), stat='count')
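As a rough illustration of the grouping step, the sketch below applies the same diff/cumsum idea in base R to the 16 salary levels listed earlier, with one value per level instead of one row per officer. The variable names (salaries, rates, starts_new_group, pay_group) are made up for this example and do not come from the original code.

```r
# The 16 distinct salary levels reported above.
salaries <- c(48078, 93354, 68616, 84054, 87006, 90024, 72510, 80016,
              76266, 96060, 58572, 92316, 87000, 60600, 68262, 84450)

# Same conversion as in the text: 40 hours/week * 50 weeks/year = 2000 hours.
rates <- sort(salaries / (40 * 50))

# TRUE wherever the gap to the previous (smaller) rate exceeds $1,
# i.e. wherever a new pay group has to start.
starts_new_group <- diff(rates) > 1

# cumsum() numbers the groups; the smallest rate always opens group 0.
pay_group <- c(0, cumsum(starts_new_group))

# Show which rates ended up sharing a group.
print(split(round(rates, 2), pay_group))
```

On these 16 levels the rule yields 12 pay groups. Because each rate is compared only with its immediate neighbour, a run of closely spaced rates would chain into one group even if its two ends were more than a dollar apart.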
To get the final graphic, I improved this bar chart by labelling the x-axis with an hourly wage range instead of a numeric pay group (e.g. "$12.30/hr" instead of "2").


Code used to create "Chicago PD Hourly Wages"
- code.r - A fragment of the code used to generate the graphic (the original code was lost).
# Packages: dplyr for the data verbs, ggplot2 for plotting, magrittr for %$%
library(dplyr)
library(ggplot2)
library(magrittr)
# DATA
(df %>%
  filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %>%
  # TRANS
  mutate(
    Hourly.Rate = Annual.Salary / (40 * 50)
  ) %>%
  arrange(Hourly.Rate) %>%
  mutate(
    # ordered factor, so the pay groups sit on a discrete x-axis in wage order
    Pay.Group = as.ordered(Hourly.Rate %>% diff %>% { . > 1 } %>% cumsum %>% {c(0,.)})
  ) -> df_) %>%
  #
  ggplot() +
  # SCALE
  scale_x_discrete(
    # one label per pay group: a single wage, or a "min-max" range
    labels = df_ %>%
      group_by(Pay.Group) %>%
      summarize(Min=sprintf("%.2f", min(Hourly.Rate)), Max=sprintf("%.2f", max(Hourly.Rate))) %$%
      ifelse(Min == Max, sprintf("$%s/hr", Min), sprintf("$%s-%s/hr", Min, Max))
  ) +
  # GUIDE
  ggtitle("Chicago PD Hourly Wages") +
  ylab("# of Officers") +
  xlab("Pay Range") +
  # ELEMENT
  geom_bar(aes(Pay.Group), stat='count') +
  # THEME
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
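
The label-building step in the fragment above can be checked in isolation. The sketch below is a base R approximation (tapply instead of group_by/summarize) run on four of the salary levels listed earlier. The names salaries, rates, pay_group, lo, hi, and axis_labels are assumptions for this example, not part of code.r.

```r
# Four of the salary levels from the data; the middle two are $354 apart.
salaries <- c(48078, 68262, 68616, 96060)

# Hourly rates and pay groups, computed the same way as in the fragment.
rates <- sort(salaries / (40 * 50))
pay_group <- c(0, cumsum(diff(rates) > 1))

# Smallest and largest rate in each group, formatted to cents.
lo <- sprintf("%.2f", as.vector(tapply(rates, pay_group, min)))
hi <- sprintf("%.2f", as.vector(tapply(rates, pay_group, max)))

# A group with a single wage gets "$x/hr"; a merged group gets "$x-y/hr".
axis_labels <- ifelse(lo == hi,
                      sprintf("$%s/hr", lo),
                      sprintf("$%s-%s/hr", lo, hi))
print(axis_labels)
```

For these four levels the labels come out as "$24.04/hr", "$34.13-34.31/hr", and "$48.03/hr": the two middle levels fall within a dollar of each other and share one range label.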