Each month in 2018, the moderators of r/dataisbeautiful will provide a dataset for the community to visualize. How to represent the data is entirely up to the contestants, and one winner is chosen at the end of each month. I've decided that this friendly competition is a great opportunity to learn new things about data visualization, and my goal is to learn about and incorporate at least one new technique each month.
January's competition was visualizing the growth rates of multiple algae species under various light and temperature conditions. More specifically, the dataset contained 18 species of algae, 4 temperatures, and 2 lighting conditions, with a growth rate supplied for each combination of these variables. After some initial exploration of the dataset, I decided that it would be interesting to highlight which species favored high lighting conditions over low lighting conditions and vice versa. To achieve this, I decided to build a line plot for each algae species, showing the difference in growth between the two lighting conditions with the tested temperatures along the x-axis. As an added challenge, I wanted to find a way to shade the area between the lines to emphasize the growth preferences of each species. Because the two lines can cross between measured temperatures, this meant calculating each crossing point so the shading could switch color there. To achieve this with ggplot, I did the following:
library(dplyr)
library(ggplot2)

# df3 is assumed to hold one row per Species and Temp, with the growth rates
# under low light (l2500) and high light (l5000) in separate columns.
# For each temperature interval, fit a line through each light series and
# find where the two lines cross (x2, y3).
df3 <- df3 %>%
  group_by(Species) %>%
  mutate(
    diff25 = l2500 - lag(l2500),
    diff50 = l5000 - lag(l5000),
    diffTemp = Temp - lag(Temp),
    slope25 = diff25 / diffTemp,
    slope50 = diff50 / diffTemp,
    intcpt25 = l2500 - slope25 * Temp,
    intcpt50 = l5000 - slope50 * Temp,
    x2 = (intcpt25 - intcpt50) / (slope50 - slope25),
    y3 = slope25 * x2 + intcpt25,
    # Discard crossings that fall outside the current temperature interval
    x2 = ifelse(x2 > Temp | x2 < lag(Temp), NA, x2),
    y3 = ifelse(x2 > Temp | x2 < lag(Temp), NA, y3),
    y4 = y3,
    seg_type = ifelse(l2500 > l5000, 'low', 'high')
  )

# Crossing points become zero-height ribbon vertices (ymin == ymax)
break_points <- df3 %>%
  filter(!(is.na(x2))) %>%
  select(Species, x = x2, ymin = y3, ymax = y4)

# Ribbon for temperatures where high light gives the higher growth rate
high_ribbon <- df3 %>%
  filter(seg_type == 'high') %>%
  select(Species, x = Temp, ymin = l2500, ymax = l5000)
high_ribbon <- bind_rows(high_ribbon, break_points)

# Ribbon for temperatures where low light gives the higher growth rate
low_ribbon <- df3 %>%
  filter(seg_type == 'low') %>%
  select(Species, x = Temp, ymin = l5000, ymax = l2500)
low_ribbon <- bind_rows(low_ribbon, break_points)

plot <- df3 %>%
  ggplot() +
  geom_hline(yintercept = 0, linetype = 'dashed', color = 'gray70', size = .3, alpha = .7) +
  geom_line(aes(x = Temp, y = l2500), color = 'dodgerblue2', size = 1) +
  geom_line(aes(x = Temp, y = l5000), color = 'orange2', size = 1) +
  geom_ribbon(data = high_ribbon, aes(x = x, ymin = ymin, ymax = ymax), fill = 'orange2', alpha = .4) +
  geom_ribbon(data = low_ribbon, aes(x = x, ymin = ymin, ymax = ymax), fill = 'dodgerblue2', alpha = .4) +
  facet_wrap(~Species, ncol = 6)
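The crossing-point arithmetic is the least obvious part of the code above, so here is a minimal base R sketch of the same slope/intercept calculation applied to a single pair of line segments. The helper name `crossing_point` and the growth rates are made up for illustration and are not part of the original analysis.

```r
# Hypothetical helper (not from the original code): find where two line
# segments that share the same x endpoints cross, using the same
# slope/intercept arithmetic as the mutate() call.
crossing_point <- function(x0, x1, a0, a1, b0, b1) {
  slope_a <- (a1 - a0) / (x1 - x0)
  slope_b <- (b1 - b0) / (x1 - x0)
  if (slope_a == slope_b) return(NULL)   # parallel segments never cross
  intcpt_a <- a1 - slope_a * x1
  intcpt_b <- b1 - slope_b * x1
  x <- (intcpt_a - intcpt_b) / (slope_b - slope_a)
  if (x < x0 || x > x1) return(NULL)     # crossing lies outside this interval
  c(x = x, y = slope_a * x + intcpt_a)
}

# Made-up growth rates: series a leads at 10 degrees, series b leads at 20.
pt <- crossing_point(x0 = 10, x1 = 20, a0 = 0.6, a1 = 0.2, b0 = 0.2, b1 = 0.6)
print(pt)
```

With these numbers the lines swap places halfway through the interval, at roughly x = 15 and y = 0.4. Adding that point to both ribbons as a zero-height vertex is what lets each shaded area taper to the crossing instead of stopping at the nearest measured temperature.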
February's dataset contains the legal status of same-sex marriage in each US state over a 21-year period from 1995 to 2015. This timeframe presents a rapidly changing legal landscape, from a near-universal ban on same-sex marriage all the way to complete legality. I saw this as a perfect opportunity to create an animated choropleth to display this change intuitively over time. Although I've made choropleths with tools like R and Tableau in the past, I wanted to use D3.js this time around to take advantage of the library's great transitions and the aesthetically pleasing nature of SVG paths. It turns out that making choropleths in D3 is pretty easy, since it has great built-in features for handling GeoJSON and TopoJSON. After locating a US map GeoJSON file, I simply had to bind each year of data to the existing JSON and set up synced transitions for the dynamic aspects of the visualization. I added some percentages and appropriately scaled circles to help convey the aggregate legal status along the way. The code for this viz can be viewed on GitHub and the full screen version can be viewed here.