Rows: 157,015
Columns: 20
$ raw_row_number <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,…
$ date <date> 2014-01-01, 2014-01-01, 2014-01-01, 2014-01-01, 2…
$ time <time> 05:04:00, 12:33:00, 12:27:00, 13:52:00, 03:02:00,…
$ location <chr> "700 E FRANKLIN AVE, MESA", "400 E BROWN RD, MESA"…
$ lat <dbl> 33.40238, 33.43558, 33.39370, 33.42234, 33.42212, …
$ lng <dbl> -111.8161, -111.8227, -111.7538, -111.8314, -111.8…
$ geocode_source <chr> "GM", "GM", "GM", "GM", "GM", "GM", "GM", "GM", "G…
$ subject_age <dbl> 36, 58, 19, 46, 27, 58, 24, 27, 27, 29, 22, 22, 25…
$ subject_race <chr> "hispanic", "white", "white", "black", "white", "w…
$ subject_sex <chr> "female", "female", "male", "male", "male", "male"…
$ officer_id_hash <chr> "d4eed1c4dc", "e24408e0fd", "4ab12ca221", "b1c5205…
$ type <chr> NA, "vehicular", "vehicular", "vehicular", "vehicu…
$ violation <chr> "NOISE AFTER 10PM TO 6AM", "NO PROOF OF INSURANCE"…
$ arrest_made <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TR…
$ citation_issued <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TR…
$ warning_issued <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, F…
$ outcome <chr> "citation", "citation", "citation", "citation", "c…
$ raw_race_fixed <chr> "W", "W", "W", "B", "W", "W", "W", "W", "W", "W", …
$ raw_ethnicity_fixed <chr> "H", "N", "N", "N", "N", "N", "N", "N", "N", "H", …
$ raw_charge <chr> "------A-", "-------C", "------A", "-------B", "--…
Sources
This data set is from the Stanford Open Policing Project.
Each row records a police stop and its outcome, including whether an arrest was made. The variables I decided to use were subject_sex, arrest_made, and date, to look at how arrests differ by sex and over time. There are 20 columns in total, including subject_age and subject_race.
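
As a rough sketch of how those three variables could be combined, the base R snippet below computes the share of stops that ended in an arrest, by sex and year. The `stops` data frame here is a small made-up stand-in (its rows are not from the real data), and the `year` and `arrest_rate` names are assumptions for illustration, not part of the original analysis.

```r
# Toy stand-in for the stops data; these rows are invented for illustration only.
stops <- data.frame(
  date = as.Date(c("2014-01-01", "2014-01-01", "2014-06-15",
                   "2015-03-02", "2015-03-02")),
  subject_sex = c("female", "male", "male", "female", "male"),
  arrest_made = c(FALSE, TRUE, FALSE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Keep only the three variables used in the analysis.
arrests <- stops[, c("date", "subject_sex", "arrest_made")]

# Derive the calendar year from the stop date (hypothetical helper column).
arrests$year <- as.integer(format(arrests$date, "%Y"))

# The mean of a logical vector is the proportion of TRUE values,
# so this gives the arrest rate for each sex/year group.
arrest_rate <- aggregate(arrest_made ~ subject_sex + year,
                         data = arrests, FUN = mean)
names(arrest_rate)[3] <- "arrest_rate"

print(arrest_rate)
```

With the real data, `stops` would be replaced by the full data frame shown above. Note that the formula interface of `aggregate()` drops rows with missing values in the variables used, so stops with an unknown sex or arrest status would not be counted.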