John Lekberg


Chicago PD Hourly Wages

Created 2019-07-24.

A histogram of (converted) hourly wages of Chicago police officers.

I used the City of Chicago's "Current Employee Names, Salaries, and Position Titles", which has variables on job titles, departments, full- or part-time work, salary, and typical hours.
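The snippets below assume the data has already been read into a data frame called df, with dplyr, magrittr (for %$%), and ggplot2 attached. As a minimal sketch, and assuming the CSV export uses headers such as "Job Titles", "Full or Part-Time", and "Annual Salary" (an assumption, not confirmed here), read.csv's default check.names = TRUE would turn them into the dotted names used in the code:

```r
# Assumed raw CSV headers (hypothetical; not taken from the original post).
raw_headers <- c("Job Titles", "Full or Part-Time", "Annual Salary")

# read.csv() with check.names = TRUE passes headers through make.names(),
# which replaces characters that are invalid in R names with ".".
clean_headers <- make.names(raw_headers)
print(clean_headers)
```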

I wanted to look at data for police officers. I first visualized the distribution of their salaries using a kernel density estimate (see "KDE of police officer salaries"):

# DATA
df %>%
  filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %>%
  #
  ggplot() +
  # ELEMENT
  geom_density(aes(Annual.Salary))

From the kernel density estimate, it appeared that there were three broad classes of payment. I used a point plot to try to confirm this (see "Point distribution of police officer salaries"):

# DATA
df %>%
  filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %>%
  #
  ggplot() +
  # ELEMENT
  geom_point(aes(0, Annual.Salary))

The point plot showed that the pay is actually divided into what looks like a finite set of classes. I checked with some R code:

> df %>%
  filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %$%
  unique(Annual.Salary)

 [1] 48078 93354 68616 84054 87006 90024 72510 80016 76266 96060
     58572 92316 87000 60600 68262 84450

There were 16 distinct salary levels, so I tried to translate them into hourly wages, assuming a 40-hour work week and 50 working weeks per year:

> df %>%
  filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %$%
  {Annual.Salary / (40 * 50)} %>%
  {sprintf("%.2f", .)} %>%
  unique %>%
  sort

 [1] "24.04" "29.29" "30.30" "34.13" "34.31" "36.26" "38.13" "40.01"
     "42.03" "42.23" "43.50" "45.01" "46.16" "46.68" "48.03"
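As a quick check of the conversion, here is a base R sketch that applies the same arithmetic to the 16 salary levels listed above. The variable names are made up for illustration; the divisor 40 * 50 mirrors the snippet above (40 hours per week, 50 weeks per year).

```r
# The 16 distinct annual salaries reported above.
salaries <- c(48078, 93354, 68616, 84054, 87006, 90024, 72510, 80016,
              76266, 96060, 58572, 92316, 87000, 60600, 68262, 84450)

# Hours per year under the 40-hour, 50-week assumption.
hours_per_year <- 40 * 50

# Convert to hourly rates, format to cents, then de-duplicate and sort.
hourly <- sort(unique(sprintf("%.2f", salaries / hours_per_year)))
print(hourly)
```

The salaries 87000 and 87006 both format as 43.50, which is why 16 salary levels collapse to 15 hourly wages.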

Some of these hourly wages are pretty close, so I decided to merge them if they are within a dollar of each other:

df %>%
  filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %>%
  # TRANS
  mutate(
    Hourly.Rate = Annual.Salary / (40 * 50)
  ) %>%
  arrange(Hourly.Rate) %>%
  mutate(
    Pay.Group = Hourly.Rate %>% diff %>% { . > 1 } %>% cumsum %>% {c(0,.)}
  ) %>%
  ggplot() +
  # ELEMENT
  geom_bar(aes(Pay.Group), stat = 'count')
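
The diff/cumsum idiom in the pipeline can be hard to read at first. The following base R sketch runs it on the 16 salary levels alone (rather than on one row per officer), with illustrative variable names: a new group starts whenever the gap to the previous, lower wage exceeds one dollar.

```r
# The 16 distinct annual salaries reported earlier.
salaries <- c(48078, 93354, 68616, 84054, 87006, 90024, 72510, 80016,
              76266, 96060, 58572, 92316, 87000, 60600, 68262, 84450)

# Hourly rates in ascending order (40 hours/week, 50 weeks/year).
hourly <- sort(salaries / (40 * 50))

# TRUE wherever a wage is more than $1 above the previous one.
starts_new_group <- diff(hourly) > 1

# Running count of group starts; the lowest wage gets group 0.
pay_group <- c(0, cumsum(starts_new_group))

print(data.frame(hourly = round(hourly, 2), pay_group))
```

Because each wage is only compared with its immediate neighbor, a long run of closely spaced wages would be chained into a single group even if its endpoints were more than a dollar apart.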

To get the final graphic, I improved this by labelling the x-axis with an hourly wage range instead of a numeric pay group (e.g. "$12.30/hr" instead of "2").
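
A minimal sketch of that labelling step, in base R and on the distinct salary levels only (variable names are illustrative): each pay group is labelled with a single wage when its formatted minimum and maximum agree, and with a range otherwise.

```r
# The 16 distinct annual salaries reported earlier.
salaries <- c(48078, 93354, 68616, 84054, 87006, 90024, 72510, 80016,
              76266, 96060, 58572, 92316, 87000, 60600, 68262, 84450)

# Hourly rates and pay groups, using the same within-a-dollar rule as above.
hourly <- sort(salaries / (40 * 50))
pay_group <- c(0, cumsum(diff(hourly) > 1))

# Formatted lowest and highest wage in each group, in group order.
lo <- sprintf("%.2f", as.vector(tapply(hourly, pay_group, min)))
hi <- sprintf("%.2f", as.vector(tapply(hourly, pay_group, max)))

# One wage if both ends format identically, otherwise a range.
labels <- ifelse(lo == hi,
                 sprintf("$%s/hr", lo),
                 sprintf("$%s-%s/hr", lo, hi))
print(labels)
```

Comparing the formatted strings rather than the raw numbers means a group whose wages differ by less than a cent still gets a single-wage label.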


KDE of police officer salaries

Point distribution of police officer salaries

Code used to create "Chicago PD Hourly Wages"


code.r - A fragment of the code used to generate the graphic (the original code was lost).
# DATA
(df %>%
  filter(Job.Titles == "POLICE OFFICER" & Full.or.Part.Time == "F") %>%
  # TRANS
  mutate(
    Hourly.Rate = Annual.Salary / (40 * 50)
  ) %>%
  arrange(Hourly.Rate) %>%
  mutate(
    Pay.Group = as.ordered(Hourly.Rate %>% diff %>% { . > 1 } %>% cumsum %>% {c(0,.)})
  ) -> df_) %>%
  #
  ggplot() +
  # SCALE
  scale_x_discrete(
    labels = df_ %>%
      group_by(Pay.Group) %>%
      summarize(Min=sprintf("%.2f", min(Hourly.Rate)), Max=sprintf("%.2f", max(Hourly.Rate))) %$%
      ifelse(Min == Max, sprintf("$%s/hr", Min), sprintf("$%s-%s/hr", Min, Max))
  ) +
  # GUIDE
  ggtitle("Chicago PD Hourly Wages") +
  ylab("# of Officers") +
  xlab("Pay Range") +
  # ELEMENT
  geom_bar(aes(Pay.Group), stat='count') +
  # THEME
  theme(axis.text.x = element_text(angle = 45, hjust = 1))