Examining Global Refugee Trends and Correlations Over the Years

INFO 526 - Fall 2023 - Project 1

Datatude Dynamos: Kaarthik Sai, Kamaljeet Singh, Surya Vardhan Dama, Ayesha Khatun, Mrunali Yadav, Rajitha

Our Dataset

Our target dataset comes from the {refugees} R package, which compiles extensive data on forcibly displaced populations from three main sources: UNHCR, UNRWA, and IDMC.

  • “Refugees” dataset from TidyTuesday.
  • The dataset has been filtered to highlight the countries from which refugees are coming to the USA.
  • Plotting focuses on key variables from the 'refugees' data, such as 'year', 'coo_name', and 'coa_name'.
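The filtering step described above can be sketched in base R as follows. This is a minimal illustration, not the project's actual code: the toy `population_data` values are invented, and it assumes the USA appears in `coa_name` as "United States of America".

```r
# Toy stand-in for the TidyTuesday population table (values are invented).
population_data <- data.frame(
  year     = c(2010L, 2010L, 2011L, 2011L),
  coo_name = c("Iraq", "Somalia", "Iraq", "Somalia"),
  coa_name = c("United States of America", "Kenya",
               "United States of America", "United States of America"),
  refugees = c(1200, 35000, 1500, 800),
  stringsAsFactors = FALSE
)

# Keep only rows where the USA is the country of asylum (assumed spelling),
# and only the key columns used for plotting.
usa_refugees <- population_data[
  population_data$coa_name == "United States of America",
  c("year", "coo_name", "coa_name", "refugees")
]

print(usa_refugees)
```

Here `coo_name` is the country of origin and `coa_name` is the country of asylum, so the result holds one row per origin country and year for arrivals in the USA.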

Aim

Our project aims to present a comprehensive analysis of patterns in refugee populations globally over time. The study delves into refugee migration to the USA, tracing movements from various countries over time. By analyzing refugee data on an annual basis and adjusting for population figures, we aim to identify trends and correlate them with major global occurrences such as wars, environmental crises, policy changes, and economic shifts.

Question 01

How have the patterns in refugee populations evolved over time, and how have the stances of American political parties impacted these developments? Analyze the dynamics of refugee migration in relation to the US political environment.

Data Preparation and Pre-processing

  • Using the R dplyr package
  • First, aggregate the data by year and sum the overall number of refugees
  • Divide these totals by 1,000 to express them in thousands
library(dplyr)

# total refugees per year, expressed in thousands
total_refugee_trends <- population_data %>%
  group_by(year) %>%
  summarise(total_refugees = sum(refugees, 
                                 na.rm = TRUE) / 1000)  

Analysis: Refugee populations over time

  • Rapid growth of the refugee population over time

Analysis: Refugee populations over time

  • Only a small number of refugees have returned over time

Analysis: Refugees by year and country of origin

  • Top 10 countries with the highest total number of refugees
  • The largest numbers of refugees have come from Syria, Afghanistan, and South Sudan
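A top-N ranking like the one on this slide can be sketched in base R as below. The country labels and figures are invented placeholders, and the sketch takes the top 2 rather than the top 10 to keep the toy data small.

```r
# Toy data: three placeholder countries of origin over two years.
population_data <- data.frame(
  year     = c(2010L, 2011L, 2010L, 2011L, 2010L, 2011L),
  coo_name = c("A", "A", "B", "B", "C", "C"),
  refugees = c(100, 150, 400, NA, 50, 60),
  stringsAsFactors = FALSE
)

# Sum refugees per country of origin; the formula interface of aggregate()
# drops rows with a missing refugee count before summing.
totals <- aggregate(refugees ~ coo_name, data = population_data, FUN = sum)

# Sort from largest to smallest total and keep the first n rows.
totals <- totals[order(-totals$refugees), ]
top_n_origin <- head(totals, 2)

print(top_n_origin)
```

Changing `head(totals, 2)` to `head(totals, 10)` on the full dataset would give a top-10 list.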

Analysis: Refugees by year and country of origin

  • Top 10 countries with the highest total number of asylum seekers

Analysis: Refugee populations over time with US political context

  • Only a small number of refugees have returned over time
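One way to attach political context to the yearly totals is to label each year with the party holding the US presidency. The sketch below is an assumption about how such a label could be built, not the project's actual method: `party_of_year` is a hypothetical helper, the totals are invented, and assigning a whole calendar year to one party is an approximation because inaugurations take place in January.

```r
# Hypothetical helper: party of the US president for calendar years 2010-2022.
# 2010-2016 and 2021-2022 are labeled Democratic, 2017-2020 Republican.
party_of_year <- function(year) {
  stopifnot(all(year >= 2010 & year <= 2022))
  ifelse(year >= 2017 & year <= 2020, "Republican", "Democratic")
}

# Toy yearly totals in thousands (values are invented).
total_refugee_trends <- data.frame(
  year           = 2015:2022,
  total_refugees = c(16, 17, 20, 20, 20, 21, 21, 29)
)

# Add the party label as a new column, for example to color a trend line.
total_refugee_trends$party <- party_of_year(total_refugee_trends$year)

print(total_refugee_trends)
```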

Question 02

How does the global refugee population’s migration to the USA vary across different countries and years? Are there external factors, such as COVID-19, war, climate change, financial stability, or policy changes, that influence the number of refugees seeking asylum in the USA?

  • This presentation showcases how the refugee population migrating to the USA from various countries has evolved year by year, from 2010 to 2022.

Approach

  • Define pre-processing function:

  • Defined a function processRefugees to pre-process the refugee dataset. Selected relevant columns and handled missing data. Renamed countries to standard names and created a categorical variable for grouping. Returned the pre-processed dataset.

  • Process data and create plots:

  • Split population data into decades. Applied the pre-processing function to each decade’s data. Defined a function generateRefugeePlot to generate a ggplot map plot. Used the pre-processed data to create plots for each year.
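The split-then-apply step can be sketched in base R as follows. The plotting code later looks up data with `filtered_data[[as.character(year)]]`, so this sketch splits by year. `preprocess_stub` is a hypothetical placeholder for the real pre-processing function, and the data values are invented.

```r
# Toy stand-in for the population table (values are invented).
population_data <- data.frame(
  year     = c(2010L, 2010L, 2011L),
  coo_name = c("Iraq", "Somalia", "Iraq"),
  refugees = c(1200, 35000, 1500),
  stringsAsFactors = FALSE
)

# Hypothetical placeholder for the real pre-processing function:
# keeps the relevant columns and drops rows without a country name.
preprocess_stub <- function(df) {
  df[!is.na(df$coo_name), c("coo_name", "year", "refugees")]
}

# split() returns a list whose names are the year values ("2010", "2011").
by_year <- split(population_data, population_data$year)

# Apply the pre-processing to every piece; lapply() keeps the list names.
filtered_data <- lapply(by_year, preprocess_stub)

print(names(filtered_data))
print(filtered_data[[as.character(2010)]])
```

Because the list is named by year, a single year's data frame can be fetched by converting the year to a character key.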

Code

Glimpse of the processRefugees function
processRefugees <- function (dataset, unique_countries) {
  filtered_data <- dataset |>
    # keep only the country name, year, and refugees columns
    select(coo_name, 
           year, 
           refugees) |>
    bind_rows(
      # anti_join() returns the rows of the first dataset that have no matching rows in the second dataset on the specified key columns, i.e. the countries in unique_countries that are absent from this dataset
      anti_join(unique_countries, 
                dataset, 
                by = c("region" = "coo_name")) |>
        # set the year to the dataset's year and the number of refugees to NA
        # note: these rows carry their name in `region`, so bind_rows() leaves coo_name as NA for them
        mutate(year = as.integer(dataset[1, 
                                         "year"]), 
               refugees = NA)
    ) |>
    mutate(
      coo_name = case_when(
        coo_name == "United States of America" ~ "USA",
        coo_name == "United Kingdom of Great Britain and Northern Ireland" ~ "UK",
        coo_name == "Iran (Islamic Rep. of)" ~ "Iran",
        coo_name == "Palestinian" ~ "Palestine",
        coo_name == "Serbia and Kosovo: S/RES/1244 (1999)" ~ "Serbia",
        coo_name == "Türkiye" ~ "Turkey",
        coo_name == "Congo" ~ "Congo",
        coo_name == "Dem. Rep. of the Congo" ~ "Democratic Republic of the Congo",
        coo_name == "Cote d'Ivoire" ~ "Ivory Coast",
        coo_name == "Central African Rep." ~ "Central African Republic",
        coo_name == "United Rep. of Tanzania" ~ "Tanzania",
        coo_name == "Russian Federation" ~ "Russia",
        coo_name == "Syrian Arab Rep." ~ "Syria",
        coo_name == "Bolivia (Plurinational State of)" ~ "Bolivia",
        coo_name == "Dominican Rep." ~ "Dominican Republic",
        coo_name == "Venezuela (Bolivarian Republic of)" ~ "Venezuela",
        coo_name == "Czechia" ~ "Czech Republic",
        coo_name == "Rep. of Korea" ~ "South Korea",
        coo_name == "Dem. People's Rep. of Korea" ~ "North Korea",
        coo_name == "Lao People's Dem. Rep." ~ "Laos",
        coo_name == "Viet Nam" ~ "Vietnam",
        coo_name == "China, Hong Kong SAR" ~ "Hong Kong",
        coo_name == "Netherlands (Kingdom of the)" ~ "Netherlands",
        coo_name == "Cabo Verde" ~ "Cape Verde",
        coo_name == "China, Macao SAR" ~ "Macao",
        coo_name == "Holy See" ~ "Vatican City",
        TRUE ~ coo_name
      )
    ) |>
    # create a categorical variable refugees_m that groups countries by their number of refugees
    mutate(
      refugees_m = case_when(
        refugees < 100 ~ "<100",
        refugees >= 100 & refugees < 500 ~ "100 to 500",
        refugees >= 500 & refugees < 1000 ~ "500 to 1000",
        refugees >= 1000 & refugees < 2000 ~ "1k to 2k",
        refugees >= 2000 & refugees < 3000 ~ "2k to 3k",
        refugees >= 3000 & refugees < 4000 ~ "3k to 4k",
        refugees >= 4000 & refugees < 5000 ~ "4k to 5k",
        refugees >= 5000 & refugees < 7000 ~ "5k to 7k",
        refugees >= 7000 & refugees < 10000 ~ "7k to 10k",
        refugees >= 10000 & refugees < 20000 ~ "10k to 20k",
        refugees >= 20000 & refugees < 50000 ~ "20k to 50k",
        refugees >= 50000 & refugees < 100000 ~ "50k to 100k",
        refugees >= 100000 ~ "100k+",
        is.na(refugees) ~ "NA"
      )
    ) |>
    # fix the level order so the legend follows the bin order;
    # every label produced above must appear here, otherwise it turns into NA
    mutate(
      refugees_m = factor(refugees_m, 
                          levels = c("<100", 
                                     "100 to 500", 
                                     "500 to 1000",
                                     "1k to 2k",
                                     "2k to 3k",
                                     "3k to 4k",
                                     "4k to 5k",
                                     "5k to 7k",
                                     "7k to 10k",
                                     "10k to 20k",
                                     "20k to 50k",
                                     "50k to 100k",
                                     "100k+",
                                     "NA"))
    )

  return(filtered_data)
}
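The binning performed by the case_when() chain can also be expressed with base R's cut(). The sketch below is an alternative illustration rather than the code used in the project: `bin_refugees` is a hypothetical helper, and unlike the function above it leaves missing counts as real NA values instead of the string "NA".

```r
# Bin edges and labels mirroring the ranges used in processRefugees.
breaks <- c(0, 100, 500, 1000, 2000, 3000, 4000, 5000, 7000,
            10000, 20000, 50000, 100000, Inf)
labels <- c("<100", "100 to 500", "500 to 1000", "1k to 2k", "2k to 3k",
            "3k to 4k", "4k to 5k", "5k to 7k", "7k to 10k", "10k to 20k",
            "20k to 50k", "50k to 100k", "100k+")

# Hypothetical helper: right = FALSE makes each bin [lower, upper),
# matching the ">= lower & < upper" conditions in the case_when() chain.
bin_refugees <- function(refugees) {
  cut(refugees, breaks = breaks, labels = labels, right = FALSE)
}

binned <- bin_refugees(c(50, 100, 750, 100000, NA))
print(binned)
```

cut() returns a factor whose levels already follow the order of `labels`, so no separate factor() call is needed to order the legend.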
Function used to generate the plot
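# Assuming filtered_data is a list of data frames, one per year and named by year;
# `world` (map polygons with long/lat columns) and `color_mapping` are assumed to be defined elsewhere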
filtered_data <- lapply(filtered_data, 
                        function(df) {
  df %>%
    filter(!is.na(coo_name))
})


generateRefugeePlot <- function(year) {
  
  world_plot <- ggplot(filtered_data[[as.character(year)]], 
                       aes(map_id = coo_name)) +
    geom_map(
      aes(fill = refugees_m),
      map   = world,
      color = "#B2BEB5",
      linewidth = 0.25,
      linetype  = "blank"
    ) +
    expand_limits(x = world$long, y = world$lat) +
    scale_fill_manual(values = color_mapping, na.value = "#F2F3F4") +
    coord_fixed(ratio = 1) +
    labs(
      title = paste("Number of Refugees by Country in", 
                    year),
      subtitle = "Migrated to USA",
      caption = "Data source: TidyTuesday",
      fill = "Number of refugees"
    ) +
    theme_void() +
    theme(
      legend.position = "bottom",
      legend.direction = "horizontal",
      plot.title = element_text(size = 19, 
                                face = "bold", 
                                hjust = 0.5),
      plot.subtitle = element_text(size = 15, 
                                   color = "azure4", 
                                   hjust = 0.5),
      plot.caption = element_text(size = 12, 
                                  color = "azure4", 
                                  hjust = 0.95)
    ) +
    guides(
      fill = guide_legend(
        nrow = 1,
        direction = "horizontal",
        title.position = "top",
        title.hjust = 0.5,
        label.position = "bottom",
        label.hjust = 1,
        label.vjust = 1,
        label.theme = element_text(lineheight = 0.25, 
                                   size = 9),
        keywidth = 1,
        keyheight = 0.5
      )
    )
  return(world_plot)
}

Plot-Alpha

War Effects in Years 2010-2016

  • Early in the decade, migrations might be influenced by the aftermath of the global financial crisis of 2008, conflicts such as the wars in Afghanistan and Iraq, or ongoing issues in countries like Somalia and China.
  • This period may show increased migrations from the Middle East, particularly Syria, due to the Syrian Civil War beginning in 2011.
  • Other regions might also exhibit changes due to local conflicts or economic instability.

Economic / Climate Conditions in Years 2016-2019

  • The refugee crisis in Europe might influence numbers, with potential spillover effects on US refugee and asylum applications.
  • Policy changes in the USA regarding immigration under the new administration might also become visible.
  • Continued conflicts and economic issues in various countries could maintain or increase refugee movements.

World Health Crisis and Policy Change in Years 2019-2022

  • The COVID-19 pandemic would likely cause a significant drop in migrations worldwide due to travel restrictions and border closures.
  • Post-pandemic recovery may lead to an increase in migrations as countries lift travel bans.
  • The situation in Afghanistan post-US withdrawal could result in an increase in refugees from that region.

Eureka!

  • While not typically classified as refugees, many Chinese nationals seek to leave China for economic opportunities.
  • The relationship between China and the USA, including U.S. immigration policy, can influence refugee flows. Policies that allow for a greater number of asylum applications or provide specific provisions for individuals from China can result in higher refugee numbers. Additionally, there may be specific legislative acts or policies targeting the protection of certain groups from China, which could lead to an increase in accepted refugee applications.

Challenges faced

  • Data are available only from 2010 onward.
  • Animation and frame rate selection.
  • Errors with data types and with selecting a rendering method for the GIF during animation.

Conclusion

By analyzing annual refugee data alongside global events, our study reveals how geopolitical conflicts, natural disasters, and economic changes drive global displacement patterns. This comprehensive examination underscores the urgency of addressing the root causes of forced migration and the importance of informed humanitarian responses.

References

[1] Refugees dataset, TidyTuesday: https://github.com/rfordatascience/tidytuesday/blob/master/data/2023/2023-08-22/readme.md

[2] UNHCR Global Trends, used for global trends in the refugee population: https://www.unhcr.org/us/global-trends

[3] Quarto, for documentation and presentation

[4] ggplot, for understanding different plot types

[5] Presentation logo: https://www.vectorstock.com/royalty-free-vector/family-people-and-earth-nature-logo-vector-21169176

Any Questions?