Minority Report meets the NFL: Which College Football Program Is Most Likely to Lead to Future NFL Arrests?

A few weeks ago I published a deep-dive analysis of USA Today’s NFL Arrest Database.  While I received many comments (mostly via email or Twitter DM), two rose to the top:

  1. College is a formative experience. Did the college the player attended affect the likelihood of arrest (and criminal charge)?
  2. Many towns have very active football programs. Did the town or high school the player attended drive specific outcomes?

Are we getting into Minority Report territory?

The more we look at variables that could be used to classify future criminal behavior (e.g., does going to college X now indicate you will be arrested for Y seven years later?), the closer we get to the world depicted in the movie Minority Report. As such, we need to be really careful to ensure we compare “apples-to-apples” in all analysis.

This post will answer the first question. I am still processing the data on high schools before writing up the second.

The college that led to the most arrests: WVU

It is a bit tricky analyzing which college led to the most arrests. You cannot simply count arrests by NFL players and group them by college program. This would penalize colleges with great programs (a college with 200+ alumni in the NFL should have more alumni with arrests than a college with only 5 alumni). Similarly, you cannot simply look at the ratio of arrested NFL alumni to total NFL alumni (as this would penalize a college with few alumni, where a single arrest swings the ratio dramatically).

So how did I answer this question? I combined two factors. I overlaid the following:

  • Top 5% college programs with most alumni in the NFL
  • Top 5% college programs with most alumni arrested in the NFL
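
The overlay of the two lists above can be sketched as a set intersection. This is only an illustration: the program names and counts are made up, and the helper `top_share` is a hypothetical name, not something from the original analysis.

```python
def top_share(counts, share=0.05):
    """Return the names in the top `share` of `counts`, ranked by value.

    Ties are broken by insertion order (an arbitrary choice for this sketch).
    """
    k = max(1, round(len(counts) * share))
    ranked = sorted(counts, key=counts.get, reverse=True)
    return set(ranked[:k])

# Hypothetical data for 40 made-up programs (NOT the real database).
nfl_alumni = {f"College {i:02d}": 50 + i for i in range(40)}
arrested_alumni = {f"College {i:02d}": i % 5 for i in range(40)}
arrested_alumni["College 39"] = 12
arrested_alumni["College 07"] = 9

top_placement = top_share(nfl_alumni)      # most alumni in the NFL
top_arrests = top_share(arrested_alumni)   # most alumni arrested in the NFL

# Programs that appear in BOTH top-5% lists.
both = top_placement & top_arrests
print(sorted(both))
```

Only programs in both sets get highlighted, so a small school with one unlucky arrest and a huge school with a typical arrest count are both filtered out.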

Here is a visualization of the result:

Spiral chart shows how many NFL arrests were from players from each respective college program. The seven schools highlighted were schools in BOTH the top 5% for NFL placement AND top 5% for NFL arrests.

West Virginia University not only had the most arrested NFL alumni; it also had the highest rate of arrested NFL alumni among all the top college programs.

I took a look at the ratio of each of these schools’ arrested NFL alumni per 125 total NFL alumni to get an “arrests per squad that made it to the NFL” figure. The results were interesting:

  • The average “Top 5%” team had 4.53 arrests per NFL squad
  • WVU had 18.06, nearly 4x the average arrest rate
  • WVU also had almost double the next highest arrest rate: University of Miami (FL), which had 9.87 arrests per NFL squad
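
As a quick check of the arithmetic, here is a minimal sketch of the “arrests per squad” calculation, using the alumni and arrest counts reported for WVU and Miami in this post. The function name `arrests_per_squad` is my own label for illustration.

```python
SQUAD_SIZE = 125  # alumni per "squad", the normalizer used in this post

def arrests_per_squad(arrests, alumni, squad_size=SQUAD_SIZE):
    """Arrested NFL alumni per `squad_size` total NFL alumni."""
    return arrests / alumni * squad_size

wvu = arrests_per_squad(26, 180)     # West Virginia: 26 arrests, 180 alumni
miami = arrests_per_squad(24, 304)   # Miami (FL): 24 arrests, 304 alumni
top5_average = 4.53                  # average "Top 5%" rate quoted above

print(round(wvu, 2))                 # 18.06
print(round(miami, 2))               # 9.87
print(round(wvu / top5_average, 2))  # ratio of WVU to the Top 5% average
print(round(wvu / miami, 2))         # ratio of WVU to Miami
```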

Other Schools with Many Arrested NFL Alumni

As the spiral diagram above shows, WVU was not the only school in the “Top 5%” for both alumni who made it to the NFL and alumni arrested in the NFL. Seven schools made this list:

College/University    NFL Alumni    Arrests of NFL Alumni    Arrests per Squad
West Virginia         180           26                       18.06
Miami (FL)            304           24                       9.87
USC                   470           23                       6.12
Ohio State            409           22                       6.72
Florida               278           19                       8.54
Michigan              346           17                       6.14
Georgia               281           16                       7.12

Are these numbers really bad?  The answer is a definitive “Yes”.  Let’s take a look:

  • While these colleges represent 2% of all colleges that have placed players in the NFL, they represent 20% of all future NFL arrests.
  • The average college team’s NFL alumni have an “arrest per squad” rate of 4.89
  • The average “arrest per squad” rate of the teams with the highest placement of NFL players is even better: 4.53
  • These seven colleges have a much higher rate: 8.10. This is 118% higher than the average arrest rate of all the other schools with the most success placing alumni in the NFL
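
The 8.10 figure appears to be a pooled rate (total arrests over total alumni for the seven schools) rather than a simple average of the seven per-school rates. A short sketch using the table above shows the difference. The 20% share is checked against the 730 total arrests reported in my earlier post; treating that as the denominator here is an assumption, since the database may have grown since.

```python
SQUAD_SIZE = 125

# (NFL alumni, arrests of NFL alumni) from the table above
schools = {
    "West Virginia": (180, 26),
    "Miami (FL)": (304, 24),
    "USC": (470, 23),
    "Ohio State": (409, 22),
    "Florida": (278, 19),
    "Michigan": (346, 17),
    "Georgia": (281, 16),
}

total_alumni = sum(alumni for alumni, _ in schools.values())
total_arrests = sum(arrests for _, arrests in schools.values())

# Pooled rate: treat all seven schools as one big alumni pool.
pooled_rate = total_arrests / total_alumni * SQUAD_SIZE

# Simple mean of the seven per-school rates (weights every school equally).
mean_of_rates = sum(
    arrests / alumni * SQUAD_SIZE for alumni, arrests in schools.values()
) / len(schools)

# Share of all arrests, ASSUMING the 730 total from the earlier post.
ASSUMED_TOTAL_ARRESTS = 730
share_of_arrests = total_arrests / ASSUMED_TOTAL_ARRESTS

print(total_alumni, total_arrests)   # 2268 147
print(round(pooled_rate, 2))         # 8.1
print(round(mean_of_rates, 2))
print(round(share_of_arrests * 100, 1))
```

The pooled rate is lower than the mean of the rates because WVU, the biggest outlier, has the fewest alumni of the seven and so carries less weight in the pool.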

What is the path from College to NFL team to type of arrest?

After looking at my Sankey diagram of NFL team to criminal charge to legal outcome, many people asked me if I could show a similar diagram leading from college to NFL team to criminal charge. Doing this for all 158 colleges with arrested NFL alumni would be unreadable. However, here is a Sankey diagram of the flow from the seven universities with the most NFL arrests:

Sankey diagram representing the "flow" from university to NFL team to arrest (for the "Top 7" programs highlighted above).

So, do specific college programs have a higher tendency for a type of crime?

My prior analysis showed a strong correlation of criminal charge type by NFL team. This led people to ask me the following “smoking gun” question: does any college stand out as the “leader” in arrests for a particular criminal charge? The answer is No.

Yes, there is a college whose NFL alumni had the most arrests for criminal charge X. However, the numbers of arrests by criminal charge by college are so small that there is no statistically valid indication that a college team predicts any future criminal pattern. We should all be happy for that.

There might be a wider dimension than college that we could assess (e.g., conference, geographic area). However, college–in and of itself–is not a valid dimension to predict future criminal charge.

Some other interesting (and positive) insights

With all the attention on NFL arrests, it is easy to overlook the positive. My analysis of colleges showed some strong positives as well.

Very successful college programs–in general–do not equate to high arrest rates:

  • The colleges with the highest success rate placing players have a 19% lower arrest rate than the average college program. Notre Dame, UCLA, UL Monroe, Wisconsin, Syracuse, Minnesota, Boston College, Mississippi, Baylor, Indiana, Northwestern, Northwestern State (LA), and Arizona stand out as schools with the lowest alumni arrest rates.
  • The most successful program, Notre Dame (with a whopping 536 alumni who made it to the NFL), had only three alumni arrested. This corresponds to an arrest rate that is 86% lower than the average school.

Also, NFL players are arrested one-third less often than the general US population. Clearly, emulating the examples set at Notre Dame, UCLA, UL Monroe, Wisconsin, Syracuse, Minnesota, Boston College, Mississippi, Baylor, Indiana, Northwestern, Northwestern State (LA), and Arizona would lead to better outcomes for all.

More to follow later…

Data Analysis of USAToday’s NFL Arrest database: 15 Surprising Insights

As soon as I learned that USA Today had released an open database of NFL player arrests (2000 to present), the data scientist in me thought, “I imagine there are some interesting patterns in there.” Rather than wondering, I downloaded it and dived right in.

The arrest data is easily readable, but lacks some important items (such as the age of the player at the time of arrest). As such, I decided to mash up the data with two other sources: date of birth, height and weight data from NFL.com, and the strength and speed data from the NFL Combine. This would let me explore some of the more interesting (and potentially controversial) claims I heard in many TV interviews about the effect that increases in player size and strength have had on aggression and crime.
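
To give a feel for the mash-up, here is a minimal sketch of joining arrest records to player bios by name and deriving age at arrest. The records, field names, and helper names (`age_in_years`, `enrich`) are all hypothetical placeholders, not the actual schema of either source.

```python
from datetime import date

# Hypothetical records; names, dates, and fields are placeholders.
arrests = [
    {"player": "Player A", "arrest_date": date(2010, 3, 15), "charge": "DUI"},
    {"player": "Player B", "arrest_date": date(2012, 7, 1), "charge": "Assault"},
]
bios = {
    "Player A": {"dob": date(1985, 6, 20), "height_in": 74, "weight_lb": 230},
}

def age_in_years(dob, on):
    """Whole years between `dob` and the date `on`."""
    years = on.year - dob.year
    if (on.month, on.day) < (dob.month, dob.day):
        years -= 1  # birthday has not happened yet in the year of `on`
    return years

def enrich(arrests, bios):
    """Left-join arrests to bios by player name; unmatched rows are kept."""
    rows = []
    for arrest in arrests:
        row = dict(arrest)
        bio = bios.get(arrest["player"])
        if bio is not None:
            row.update(bio)
            row["age_at_arrest"] = age_in_years(bio["dob"], arrest["arrest_date"])
        else:
            row["age_at_arrest"] = None  # no bio found for this player
        rows.append(row)
    return rows

rows = enrich(arrests, bios)
for row in rows:
    print(row["player"], row["charge"], row["age_at_arrest"])
```

Keeping unmatched arrests (rather than dropping them) matters here, because not every arrested player has a record in every source.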

My findings

Here are my findings from analyzing the data:

  1. Arrest frequency is NOT increasing. It is actually down from a really bad spate from 2006-2008
  2. NFL players, in general, are one-third less likely to be arrested than everyday US residents. They have 15x the median US income and 3x the college graduation rate.
  3. However, many of those who are arrested are arrested many times throughout their career. 124 people were arrested more than once. One player was arrested 9 times. Sixty-five arrests were for multiple counts, across multiple criminal charges.
  4. Guilty verdicts (conviction, plea, or plea agreement) are the most common legal outcome. They occur almost 7x more frequently than Acquittals
  5. Nevertheless, the most common action taken by NFL teams in response to arrests is “No Response.” This occurs 84% of the time
  6. Two-thirds of arrests occur off-season. However, over 99% are arrests of players under contract. Free agent arrests are rare (although all of them later signed onto teams)
  7. Three teams (Minnesota, Cincinnati and Denver) have seen double the “normal” number of arrests per team
  8. Four criminal charges (DUI, Drugs, Domestic Violence and Assault) represent 60% of all arrests.
  9. Six charges (DUI, Drugs, Domestic Violence, Assault, Gun Charges and Disorderly Conduct) represent 80% of all arrests. Each of these has a single team with more arrests than any other.
  10. Of the most frequent charges, conviction rate varied enormously. DUIs had the highest conviction rate; Domestic Violence the lowest. While Domestic Violence pleas and convictions outnumbered acquittals 10:1, the vast majority of these cases were dropped or resolved in Diversion Programs
  11. The median arrested NFL player is 25 years, 6 months old; is 6’2” tall; weighs 230 lbs.; can run the 40-yd dash in 4.61 seconds; and can bench press 225 lbs. 21 times.
  12. While 88% of the arrests were of players under 30, age was not a factor (in arrest or criminal charge). The distribution of age at time of arrest virtually matched the distribution of ages across the NFL.
  13. There has been much talk in the media about the size of players and the potential impact on aggression. However, contrary to the opinions, neither height nor weight was a factor in likelihood of arrest or type of criminal charge.
  14. Unsurprisingly, player speed was not a factor either.
  15. However, analysis of player strength did show a pattern: not one about the strongest players, but about the least-strong players. It turns out those arrested for Sexual Assault stood out as the group with the lowest distribution of strength scores in the NFL Combine.

The data

  • 730 arrests between 2000 and the present (the database actually expanded by one entry a few days after launch to account for the arrest of Jonathan Dwyer)
  • These 730 arrests spanned 544 players (more on that below). Of these 544 players, 330 had publicly-available NFL Combine results
  • The arrests spanned 51 separate criminal charges (with some interesting concentrations, see below)
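
Counts like “730 arrests across 544 players” and “124 people were arrested more than once” fall out of a simple tally of arrests per player. A minimal sketch, using a made-up arrest log rather than the real database:

```python
from collections import Counter

# Hypothetical arrest log: one entry per arrest, keyed by player name.
arrest_log = ["Player A", "Player B", "Player A", "Player C", "Player A", "Player B"]

per_player = Counter(arrest_log)

total_arrests = sum(per_player.values())   # every arrest counted
distinct_players = len(per_player)         # each player counted once
repeat_players = sum(1 for n in per_player.values() if n > 1)
most_arrested, max_arrests = per_player.most_common(1)[0]

print(total_arrests, distinct_players, repeat_players)  # 6 3 2
print(most_arrested, max_arrests)                       # Player A 3
```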

Sankey NFL Arrest “Flow”

The diagram at the top of this post is called a Sankey Diagram. It allows exploration of which teams had players arrested for each type of criminal charge, and then of how the outcomes for those charges were distributed. You can explore this chart, for all teams and all criminal charges, in full at this page.

Deeper Dive

Sankeys allow exploration of broad patterns (such as which teams have the most Domestic Violence arrests) and interesting outliers (e.g., which team had a player arrested for Pimping, and what was the result). However, they do not make it easy to explore other dimensions of this data. The rest of this post takes a deeper dive into the data, exploring each of the 15 findings with different summaries and visualizations.
