In this Take-home Exercise, I will critic the submission from one of my classmates in terms of clarity and aesthetics, and also attempt to remake the visualisation design.
In this take-home exercise, I will critic Ding Yanmu’s Take-home Exercise 1 submission in terms of clarity and aesthetics. I will also remake the original design by applying the data visualisation principles and best practice we had learnt in Lesson 1 and 2.
We will first install and launch the required R packages using the code chunk below.
packages = c('tidyverse', 'ggdist', 'ggridges', 'patchwork', 'ggthemes', 'ggrepel')
for(p in packages){
if(!require(p, character.only = T)){
install.packages(p)
}
library(p, character.only = T)
}
We will also import the Participants.csv from the data folder into R using the code chunk below.
participants <- read_csv("data/Participants.csv")
glimpse(participants)
Rows: 1,011
Columns: 7
$ participantId <dbl> 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,~
$ householdSize <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, ~
$ haveKids <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU~
$ age <dbl> 36, 25, 35, 21, 43, 32, 26, 27, 20, 35, 48, 2~
$ educationLevel <chr> "HighSchoolOrCollege", "HighSchoolOrCollege",~
$ interestGroup <chr> "H", "B", "A", "I", "H", "D", "I", "A", "G", ~
$ joviality <dbl> 0.001626703, 0.328086500, 0.393469590, 0.1380~
There is a typo with the title as shown in the picture above. This affects the overall aesthetics of the article.
Should be corrected as below:
2. Data Description
The orientation of the age_group label on the x-axis can be challenging to read. Moreover, the orientation of the y-axis label is also vertically but not horizontally displayed. This coincides with one of the data-ink principles on graph labeling where the orientation of label should be reader friendly.
We will group the data using the code chunk below.
participants$age_group <- cut(participants$age,
breaks = c(-Inf,21, 26, 31, 36, 41, 46, 51, 56, Inf),
labels = c("<20", "21-25", "26-30","31-35", "36-40",
"41-45", "46-50","51-55", "56-60"),
right = FALSE)
The code chunk below plots a bar chart by using
geom_bar() of ggplot2, with the use of
coord_flip() to correct the orientation of the
age_group label and one of the theme() components
to rotate the y-axis label to horizontal position.
ggplot(data = participants,
aes(x=age_group)) +
geom_bar() +
coord_flip() +
geom_text(stat="count", aes(label=paste0(..count..)), check_overlap=TRUE, fontface="bold",
position=position_stack(vjust = 1.04)) +
xlab("Age Group") +
ylab("No. of Participants") +
ggtitle("Age Distribution") +
theme(axis.title.y= element_text(angle=0), axis.ticks.y= element_blank(),
panel.background= element_blank(), axis.line= element_line(color= 'grey'))
Beside the issue with the orientation of the labels as mentioned earlier, the legend of the graph is also missing. In addition, the choice of using line graph to display the mean statistics may not very informative.
The code chunk below plots a trellis boxplot with the use of
geom_boxplot() and facet_grid() of
ggplot2. Point labels have been revised using
stat_summary() to make them more appealing. The use of
boxplot allows the comparison of mean and median statistics across each
age group for those with or without kids.
ggplot(data=participants,
aes(y = joviality, x= age_group)) +
geom_boxplot() +
stat_summary(geom = "point",
fun="mean",
colour ="red",
size=2) +
stat_summary(aes(label = round(..y.., 2)), fun=mean, geom = "label_repel", size=3, angle=150) +
facet_grid(haveKids ~.) +
labs(y= 'Joviality', x= 'Age Group',
title = "Distribution of Joviality across Age Groups by Kids Status") +
theme(axis.title.y= element_text(angle=0), axis.ticks.x= element_blank(),
axis.line= element_line(color= 'grey'))
Beside the issue with the orientation of the labels as mentioned earlier, the legend of the graph is also not properly placed. In addition, the choice of the line/dot size seems to have resulted to much overlaps.
The code chunk below plots a line chart with the use of
stat_summary() of ggplot2. Legend has been
properly placed to the right-hand side of the graph and line/dot size
has been shrunk to avoid unnecessary overlaps. This is in accordance to
the principle of showing the data clearly.
ggplot(data = participants,
aes(x=age_group, y=joviality, color=educationLevel, group = educationLevel)) +
stat_summary(fun.y=mean, geom = "point") +
stat_summary(fun.y=mean, geom = "line") +
scale_y_continuous(expand = c(0, 0), limits = c(0, 1)) +
xlab("Age Group") +
ylab("Joviality") +
ggtitle("Joviality across Age Groups by Education Level") +
theme(axis.title.y= element_text(angle=0), axis.ticks.x= element_blank(),
panel.background= element_blank(), axis.line= element_line(color= 'grey'))
The graph title is missing and the bars are not sorted by their respective frequencies.
The code chunk below plots a bar chart by using
geom_bar() of ggplot2, with the use of
ggtitle() to add a graph title and bars are now sorted
according to their respective frequencies.
ggplot(data = participants,
aes(x=reorder(interestGroup, interestGroup, function(x)-length(x)))) +
geom_bar(fill="skyblue") +
ylim(0, 120) +
geom_text(stat="count",
aes(label=paste0(..count.., " (",
round(..count../sum(..count..)*100,
1), "%)")),
vjust=-1, size=2.7) +
xlab("Education Level") +
ylab("No. of\nParticipants") +
ggtitle("Education Level of Participants") +
theme(axis.title.y= element_text(angle=0), axis.ticks.x= element_blank(),
panel.background= element_blank(), axis.line= element_line(color= 'grey'))
The composite graph title is missing and the 3 charts have been combined in a manner that many parts have overlaps, making it hard for us to capture the information clearly from this dashboard.
The code chunk below creates a composite plot by using the
patchwork package, with the use of
plot_annotation() to add an overall graph title as well as
subplots tagging.
Note: p1 - p3 are assigned to plots presented earlier.
patchwork <- ((p1 | p2)/ p3) +
plot_annotation(tag_levels = 'I', title = 'Demographic of the city of Engagement, Ohio USA')
patchwork & theme_economist()