Data Visualization Step Two

Education Technology

Author

Affiliation

Michelle Cui

 

Published

March 6, 2025

Citation

Cui, 2025


Research Question 1

What is the relationship between teachers’ use of ICT in the classroom and their perception of the school’s attitude toward ICT?
(This question aims to explore how institutional attitudes toward information and communication technology (ICT) influence individual teachers’ behavior in adopting technology in the classroom.)

Variable selection:
it2g06a: ICT use at school when teaching
it2g14a: Your School/Use of ICT in teaching at your school/ICT is considered a priority for use in teaching
cntry: Country (Denmark, Finland, and Uruguay)
partt: Survey participation
idteach: Teacher ID

Visualize with tables

Show code
datasummary_skim(rq1_data, 
                 type = "categorical")
Variable  Level                                          N      %
cntry     DNK                                          854   20.7
          FIN                                         2451   59.4
          URY                                          818   19.8
it2g06a   Never                                         76    1.8
          Less than once a month                       194    4.7
          At least once a month but not every week     444   10.8
          At least once a week but not every day       946   22.9
          Every day                                   2463   59.7
it2g14a   Strongly Agree                              1787   43.3
          Agree                                       1948   47.2
          Disagree                                     360    8.7
          Strongly Disagree                             28    0.7

The summary table looks good so far, but it can be improved with a table-making package. Here, I added output = "gt" so that gt functions can be applied to the rest of the table. However, the summary table still looks much the same.

Show code
rq1_summary <- rq1_data %>% 
  datasummary_skim(type = "categorical", output = "gt") %>% 
  tab_header(
    title = "Summary of Teachers' ICT Perspectives (2018 & 2020)"
  ) %>% 
  cols_width(everything() ~ px(130)) %>% 
  opt_table_outline()

rq1_summary
Summary of Teachers' ICT Perspectives (2018 & 2020)
Variable  Level                                          N      %
cntry     DNK                                          854   20.7
          FIN                                         2451   59.4
          URY                                          818   19.8
it2g06a   Never                                         76    1.8
          Less than once a month                       194    4.7
          At least once a month but not every week     444   10.8
          At least once a week but not every day       946   22.9
          Every day                                   2463   59.7
it2g14a   Strongly Agree                              1787   43.3
          Agree                                       1948   47.2
          Disagree                                     360    8.7
          Strongly Disagree                             28    0.7

Then, I decided to subgroup the dataset by country and year to show the counts and percentages for each category.
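The code that reshapes the data into one row per country and category, with separate count and percentage columns for each year, is not shown in this post. The following is a minimal sketch of one way to build such a table, assuming a long data frame with the columns cntry, year, it2g06a, and it2g14a. The toy data and the names toy and wide_summary are hypothetical. The percentages are computed as shares of all responses, which is only an inference from the values in the rendered table.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for rq1_data (not the survey data)
toy <- data.frame(
  cntry   = c("DNK", "DNK", "DNK", "FIN", "FIN", "FIN"),
  year    = c(2018, 2018, 2020, 2018, 2020, 2020),
  it2g06a = c("Every day", "Every day", "Every day", "Never", "Never", "Every day"),
  it2g14a = c("Agree", "Agree", "Agree", "Disagree", "Disagree", "Agree")
)

wide_summary <- toy %>%
  # One row per country, use frequency, school priority, and year
  count(cntry, it2g06a, it2g14a, year, name = "N") %>%
  # Share of all responses (assumed denominator)
  mutate(Percent = 100 * N / sum(N)) %>%
  # Spread the years into columns named like `2018_N` and `2018_Percent`
  pivot_wider(
    names_from  = year,
    values_from = c(N, Percent),
    names_glue  = "{year}_{.value}"
  )

wide_summary
```

Combinations that occur in only one year get NA in the other year's columns, which is consistent with the NA cells in the table output.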

Show code
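# Note: rq1_summary here must be a data frame summarised by country and year
# (with columns such as `2018_N` and `2018_Percent`), not the gt table created above.
rq1_summary %>% 
  gt() %>% 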
  tab_header(title = "Teachers' ICT perspectives by country and year (2018 & 2020)") %>% 
  cols_label(
    cntry = "Country",
    it2g06a = "ICT Use Frequency",
    it2g14a = "School ICT Priority",
    `2018_N` = "N (2018)",
    `2018_Percent` = "% (2018)",
    `2020_N` = "N (2020)",
    `2020_Percent` = "% (2020)"
  ) %>% 
  fmt_number(columns = c(`2018_Percent`, `2020_Percent`), decimals = 1) %>%
  tab_options(table.width = px(100),
              table.font.size = px(12)) %>% 
  tab_style(
    style = cell_text(whitespace = "nowrap"),
    locations = cells_body(columns = c(it2g06a,it2g14a))) %>% 
  cols_width(
    it2g06a ~ px(200),
    everything() ~ px(100)) %>% 
  tab_style(
    style = list(cell_fill(color = "#9EDBDB")),
    locations = list(
      cells_body(columns = everything(), rows = it2g06a == "Every day"))
  ) %>% 
  tab_footnote(
    footnote = "Note: DNK refers to Denmark, FIN refers to Finland, and URY refers to Uruguay.")
Teachers' ICT perspectives by country and year (2018 & 2020)
Country ICT Use Frequency School ICT Priority N (2018) N (2020) % (2018) % (2020)
DNK Less than once a month Disagree 1 NA 0.0 NA
DNK At least once a month but not every week Strongly Agree 2 2 0.0 0.0
DNK At least once a month but not every week Agree 5 3 0.1 0.1
DNK At least once a month but not every week Disagree 1 NA 0.0 NA
DNK At least once a week but not every day Strongly Agree 51 21 1.2 0.5
DNK At least once a week but not every day Agree 46 22 1.1 0.5
DNK At least once a week but not every day Disagree 2 NA 0.0 NA
DNK Every day Strongly Agree 243 288 5.9 7.0
DNK Every day Agree 83 78 2.0 1.9
DNK Every day Disagree 5 1 0.1 0.0
FIN Never Agree 6 2 0.1 0.0
FIN Never Disagree 2 1 0.0 0.0
FIN Less than once a month Strongly Agree 12 10 0.3 0.2
FIN Less than once a month Agree 30 18 0.7 0.4
FIN Less than once a month Disagree 9 5 0.2 0.1
FIN At least once a month but not every week Strongly Agree 32 21 0.8 0.5
FIN At least once a month but not every week Agree 100 57 2.4 1.4
FIN At least once a month but not every week Disagree 19 8 0.5 0.2
FIN At least once a month but not every week Strongly Disagree 1 2 0.0 0.0
FIN At least once a week but not every day Strongly Agree 91 69 2.2 1.7
FIN At least once a week but not every day Agree 187 159 4.5 3.9
FIN At least once a week but not every day Disagree 37 15 0.9 0.4
FIN At least once a week but not every day Strongly Disagree 1 NA 0.0 NA
FIN Every day Strongly Agree 289 433 7.0 10.5
FIN Every day Agree 360 390 8.7 9.5
FIN Every day Disagree 45 31 1.1 0.8
FIN Every day Strongly Disagree 4 5 0.1 0.1
URY Never Strongly Agree 6 3 0.1 0.1
URY Never Agree 20 10 0.5 0.2
URY Never Disagree 16 5 0.4 0.1
URY Never Strongly Disagree 2 3 0.0 0.1
URY Less than once a month Strongly Agree 11 7 0.3 0.2
URY Less than once a month Agree 30 24 0.7 0.6
URY Less than once a month Disagree 20 15 0.5 0.4
URY Less than once a month Strongly Disagree 2 NA 0.0 NA
URY At least once a month but not every week Strongly Agree 16 15 0.4 0.4
URY At least once a month but not every week Agree 69 34 1.7 0.8
URY At least once a month but not every week Disagree 40 15 1.0 0.4
URY At least once a month but not every week Strongly Disagree 1 1 0.0 0.0
URY At least once a week but not every day Strongly Agree 29 36 0.7 0.9
URY At least once a week but not every day Agree 57 71 1.4 1.7
URY At least once a week but not every day Disagree 24 24 0.6 0.6
URY At least once a week but not every day Strongly Disagree 1 3 0.0 0.1
URY Every day Strongly Agree 43 57 1.0 1.4
URY Every day Agree 39 48 0.9 1.2
URY Every day Disagree 9 10 0.2 0.2
URY Every day Strongly Disagree NA 2 NA 0.0
Note: DNK refers to Denmark, FIN refers to Finland, and URY refers to Uruguay.

Visualize with figures

The table summary helps me get an idea of the structure of my dataset. However, it takes up too much space. Therefore, I decided to visualize the data with a plot.

My initial attempt was not that bad, but it looked too busy and the colors were too bright. So I changed the position to "dodge" in geom_bar() and applied scale_fill_OkabeIto() to make the plot color-blind friendly.

Show code
rq1_p1 <- ggplot(rq1_data, aes(x = it2g14a, fill = it2g06a)) +
  geom_bar(position = "fill") +
  facet_wrap(year ~ cntry) +
  labs(
    title = "Teacher ICT Use in Teaching by School ICT Priority",
    x = "School ICT Priority",
    y = "Proportion",
    fill = "Teacher ICT Use Frequency in Teaching"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
rq1_p1

Show code
rq1_p2 <- ggplot(rq1_data, aes(x = it2g06a, fill = it2g14a)) +
  geom_bar(position = "dodge") +
  facet_wrap(year ~ cntry) +
  labs(
    title = "Relationship between Teacher ICT Use and Schools' perspective on ICT",
    fill = "School ICT Priority",
    y = "Count",
    x = "Teacher ICT Use Frequency in Teaching"
  ) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))+
  scale_fill_OkabeIto()
rq1_p2
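
Because position = "dodge" draws raw counts, a proportion has to be computed before plotting if the y-axis is meant to show one. The following is a minimal sketch of that step on hypothetical toy data. The names toy, toy_prop, and p are assumptions for illustration, and the choice to normalise within each use-frequency group is one option among several.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for rq1_data
toy <- data.frame(
  it2g06a = c("Every day", "Every day", "Every day", "Never", "Never"),
  it2g14a = c("Agree", "Agree", "Disagree", "Agree", "Disagree")
)

# Share of each school-priority answer within each use-frequency group
toy_prop <- toy %>%
  count(it2g06a, it2g14a) %>%
  group_by(it2g06a) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# geom_col() plots the precomputed proportions as dodged bars
p <- ggplot(toy_prop, aes(x = it2g06a, y = prop, fill = it2g14a)) +
  geom_col(position = "dodge") +
  labs(
    x = "Teacher ICT Use Frequency in Teaching",
    y = "Proportion",
    fill = "School ICT Priority"
  )
```

Within each use-frequency group the bars then sum to 1, so groups of very different sizes can be compared directly.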

I was not satisfied with that version because it is difficult to tell the difference between years. So I converted year to a factor, grouped the data by country, year, ICT use frequency, and school priority, and got this.

Show code
rq1_data$year <- as.factor(rq1_data$year)

rq1_p3 <- rq1_data %>% 
  group_by(year, cntry, it2g06a,it2g14a) %>% 
  summarise(count = n(), .groups = "drop") %>% 
  ggplot(aes(x = year,
         y = count,
         group = interaction(cntry,it2g14a),
         color = it2g14a,
         linetype = cntry,
         shape = cntry))+
  geom_line() +
  geom_point(size = 2) +
  facet_grid(~ it2g06a) +
  labs(title = "Figure 1. ICT Use Across Countries (2018 & 2020)",
       x = "Year of Response",
       y = "Count",
       color = "School ICT Priority",
       linetype = "Country",
       shape = "Country"
  ) +
  scale_color_OkabeIto() +
  scale_shape_manual(values = c("DNK" = 0, "FIN" = 15, "URY" = 5))+
  theme(panel.spacing = unit(1, "lines"))

rq1_p3

Research Question 2

What is the relationship between teachers’ technology use in the classroom and their major teaching subjects?
(This question aims to explore whether teachers in different subject areas (e.g., mathematics, sciences, language arts) use technology more frequently or differently than their peers in other subjects).

Variable selection: it2g06a: ICT use at school when teaching
Main Subjects in School Year:
it2g03a: Language arts test language
it2g03b: Language arts foreign and other national languages
it2g03c: Mathematics
it2g03d: Sciences
it2g03e: Human sciences/Humanities
it2g03f: Creative arts
it2g03g: Information technology
it2g03h: Practical and vocational subjects
it2g03i: Other
Type of variable: Binary
IT2G14A: Your School/Use of ICT in teaching at your school/ICT is considered a priority for use in teaching – X-axis
cntry: Country (Denmark, Finland, and Uruguay)
partt: Survey participation
idteach: Teacher ID
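
Because each subject is stored as its own binary column, one teacher can be counted under several subjects once the data are reshaped. The following is a minimal sketch of that reshape on hypothetical toy data, assuming the coding 1 = checked and 2 = not checked. The names toy, toy_long, and subject_counts are illustrative only.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: three teachers, two subject checkboxes
toy <- data.frame(
  idteach = c(1, 2, 3),
  it2g06a = c(5, 3, 5),
  it2g03c = c(1, 2, 1),  # Mathematics
  it2g03d = c(1, 1, 2)   # Sciences
)

# One row per teacher and checked subject
toy_long <- toy %>%
  pivot_longer(
    cols      = starts_with("it2g03"),
    names_to  = "subject",
    values_to = "checked"
  ) %>%
  filter(checked == 1)

# Teachers per subject; the total can exceed the number of teachers
subject_counts <- toy_long %>%
  count(subject, name = "teachers")

subject_counts
```

Teacher 1 checks both subjects in this toy example, so the subject counts add up to more than the three teachers in the data. The same applies to any count-based plot built from the reshaped data.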

Show code
rq2_data <- mergedtp %>% 
  clean_names() %>% 
  select(idteach, cntry, partt, year, it2g06a,it2g03a, it2g03b, it2g03c, it2g03d, it2g03e, it2g03f, it2g03g, it2g03h, it2g03i) %>% 
  filter(partt == 1)

rq2_data <- na.omit(rq2_data)

rq2_data <- rq2_data %>% 
  mutate(it2g06a = factor(it2g06a,
                          levels = 1:5,
                          labels = c("Never", "Less than once a month", "At least once a month but not every week", "At least once a week but not every day", "Every day"), ordered = TRUE),
         across(starts_with("it2g03"), ~ factor(.x, levels = 1:2, labels = c("Checked", "Not checked"))))

rq2_aggregated <- rq2_data %>% 
  pivot_longer(cols = starts_with("it2g03"), names_to = "subject", values_to = "checked") %>% 
  filter(checked == "Checked") %>% 
  group_by(subject, it2g06a) %>% 
  summarise(count = n(), .groups = 'drop') %>%
  mutate(subject = factor(subject, 
                          levels = c("it2g03a", "it2g03b", "it2g03c", "it2g03d", "it2g03e", "it2g03f", "it2g03g", "it2g03h", "it2g03i"),
                          labels = c("Language arts test language", "Language arts foreign and other national languages", "Mathematics", "Sciences","Human sciences/Humanities", "Creative arts", "Information technology", "Practical and vocational subjects", "Other")))
Show code
p_subject <- rq2_aggregated %>% 
  ggplot(aes(x = it2g06a, y = subject, fill = count)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "orange") +
  labs(title = "Relationship Between Teachers’ Technology Use in School and Teaching Subjects",
       x = "Technology Use",
       y = "Teaching Subject",
       fill = "Count") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

p_subject

Show code
p_subject_2 <- rq2_data %>% 
  pivot_longer(cols = starts_with("it2g03"), names_to = "subject", values_to = "checked") %>% 
  filter(checked == "Checked") %>% 
  mutate(subject = factor(subject, 
                          levels = c("it2g03a", "it2g03b", "it2g03c", "it2g03d", "it2g03e", "it2g03f", "it2g03g", "it2g03h", "it2g03i"),
                          labels = c("Language arts test language", "Language arts foreign and other national languages", "Mathematics", "Sciences","Human sciences/Humanities", "Creative arts", "Information technology", "Practical and vocational subjects", "Other"))) 

p_subject_p2 <- p_subject_2 %>% 
  ggplot(aes(x = subject, fill = it2g06a)) +
  geom_bar(position = "dodge") +
  facet_wrap(~year)+
  labs(title = "Frequency of ICT Use by Subject",
       x = "Subject",
       y = "Number of Teachers",
       fill = "ICT Use Frequency") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))+
  scale_fill_OkabeIto()

p_subject_p2

Show code
p_subject_p3 <- p_subject_2 %>% 
  ggplot(aes(x = subject, y = as.numeric(it2g06a), fill = subject)) +
  geom_violin() +
  labs(title = "Density of Technology Use Frequency by Subject",
       x = "Subject",
       y = "ICT Use Frequency") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))+
  scale_fill_OkabeIto()

p_subject_p3

Show code
p_subject_p4 <- rq2_data %>%
  pivot_longer(cols = starts_with("it2g03"), names_to = "subject", values_to = "checked") %>% 
  filter(checked == "Checked") %>% 
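  # Count once per subject, year, and country; keeping it2g06a in the grouping
  # would draw several overlapping bars for each subject and year.
  group_by(subject, year, cntry) %>% 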
  summarise(count = n(), .groups = 'drop') %>%
  mutate(subject = factor(subject, 
                          levels = c("it2g03a", "it2g03b", "it2g03c", "it2g03d", "it2g03e", "it2g03f", "it2g03g", "it2g03h", "it2g03i"),
                          labels = c("Language arts test language", "Language arts foreign and other national languages", "Mathematics", "Sciences","Human sciences/Humanities", "Creative arts", "Information technology", "Practical and vocational subjects", "Other"))) %>% 
  ggplot(aes(x = subject, y = count, fill = factor(year))) +
  geom_bar(stat = "identity", position = position_dodge(width = 1)) +
  labs(title = "Relationship Between Teachers’ Technology Use in School and Teaching Subjects",
       x = "Teaching Subjects",
       y = "Number of teachers using ICT",
       fill = "Year")+
  facet_wrap(~ cntry)+
  theme(axis.text.x = element_text(angle = 45, hjust = 1))+
  scale_fill_OkabeIto()

p_subject_p4

Footnotes

    Citation

    For attribution, please cite this work as

    Cui (2025, March 6). Data Visualization Final Project: Data Visualization Step Two. Retrieved from https://go-m-c.github.io/edld-652-blog/posts/2025-03-06-data-visualization-step-two/

    BibTeX citation

    @misc{cui2025data,
      author = {Cui, Michelle},
      title = {Data Visualization Final Project: Data Visualization Step Two},
      url = {https://go-m-c.github.io/edld-652-blog/posts/2025-03-06-data-visualization-step-two/},
      year = {2025}
    }