Global Trends in Cancer Mortality: A Shiny App Exploration (1990-2019)

Proposal

Project description

Author: VizMasters - Siva Rohit, Rohit Vatsava, Surya Vardhan, Monica Kommareddy, Miki, Ajay

Affiliation: School of Information, University of Arizona

Project Overview

Cancer remains a major global health challenge, affecting mortality and quality of life worldwide. In 2022 alone, an estimated 9.7 million people died of cancer and 20 million new cases were diagnosed. Despite substantial progress in understanding and treating the disease, global cancer deaths continue to rise. Our project draws on the cancer datasets published at https://ourworldindata.org/cancer, curated by Max Roser and Hannah Ritchie. By combining several of these datasets, we aim to describe how cancer is distributed across countries, age groups, and cancer types, and how it affects different populations.

Reasons for Choosing the Dataset

  • The datasets sourced from https://ourworldindata.org/cancer offer comprehensive and reliable information on various aspects of cancer, including mortality rates, types, demographics, and geographic variations.

  • Analyzing these datasets can provide valuable insights into the prevalence, distribution, and risk factors associated with different types of cancer.

  • The availability of multiple datasets allows for a holistic analysis of cancer-related trends and patterns, contributing to a better understanding of the disease’s global burden.

Datasets:

The project utilizes the following CSV files obtained from the provided source, each containing information related to cancer statistics. Below is an overview of the data files:

01_annual-number-of-deaths-by-cause: This file contains the annual number of deaths attributed to each cause. It has 6,840 records and 35 columns: three identifiers (entity, code, year) and 32 causes of death, one of which is neoplasms (cancer). It gives total cancer deaths per year but does not break them down by cancer type.

02_total-cancer-deaths-by-type: With 6,840 records, this file gives the total number of cancer deaths by cancer type. It covers 29 cancer categories, including tracheal, bronchus, and lung cancer, breast cancer, and prostate cancer.

04_cancer-death-rates-by-age: With 6,840 records, this dataset presents cancer death rates by age group (under 5, 5-14, 15-49, 50-69, and 70 years and older), along with all-ages and age-standardized rates. It shows how cancer mortality varies across age groups.

05_share-of-population-with-cancer-crude: This file contains the crude prevalence of cancer: the number of current cancer cases per 100 people, for both sexes and all ages, without age adjustment. It has 6,780 records.

06_share-of-population-with-cancer-types: This dataset gives the age-standardized prevalence of 21 cancer types, such as tracheal, bronchus, and lung cancer and colon and rectum cancer, expressed as current cases per 100 people. It has 6,780 records.

07_number-of-people-with-cancer-by-age: This file contains the number of people with cancer in each of five age groups (under 5, 5-14, 15-49, 50-69, and 70 years and older). It has 6,468 records and shows how cancer cases are distributed across age groups.

11_cancer-deaths-rate-and-age-standardized-rate-index: This dataset contains the number of cancer deaths, the all-ages death rate, and the age-standardized death rate, in 6,840 records. It supports comparisons of cancer mortality over time and across regions, with and without adjustment for the age structure of the population.

Chosen for their depth and breadth, these datasets together give a detailed picture of cancer’s global impact. They can support public health analysis, inform policy-making, and guide research into cancer prevention and treatment. The data likely originate from established health data sources, such as national health ministries and global health organizations, compiled in a consistent format across countries and years.

Data Wrangling and Displaying the Data

01_annual-number-of-deaths-by-cause:
Rows: 6,840
Columns: 35
$ entity                                                                         <chr> …
$ code                                                                           <chr> …
$ year                                                                           <dbl> …
$ deaths_meningitis_sex_both_age_all_ages_number                                 <dbl> …
$ deaths_alzheimers_disease_and_other_dementias_sex_both_age_all_ages_number     <dbl> …
$ deaths_parkinsons_disease_sex_both_age_all_ages_number                         <dbl> …
$ deaths_nutritional_deficiencies_sex_both_age_all_ages_number                   <dbl> …
$ deaths_malaria_sex_both_age_all_ages_number                                    <dbl> …
$ deaths_drowning_sex_both_age_all_ages_number                                   <dbl> …
$ deaths_interpersonal_violence_sex_both_age_all_ages_number                     <dbl> …
$ deaths_maternal_disorders_sex_both_age_all_ages_number                         <dbl> …
$ deaths_hiv_aids_sex_both_age_all_ages_number                                   <dbl> …
$ deaths_drug_use_disorders_sex_both_age_all_ages_number                         <dbl> …
$ deaths_tuberculosis_sex_both_age_all_ages_number                               <dbl> …
$ deaths_cardiovascular_diseases_sex_both_age_all_ages_number                    <dbl> …
$ deaths_lower_respiratory_infections_sex_both_age_all_ages_number               <dbl> …
$ deaths_neonatal_disorders_sex_both_age_all_ages_number                         <dbl> …
$ deaths_alcohol_use_disorders_sex_both_age_all_ages_number                      <dbl> …
$ deaths_self_harm_sex_both_age_all_ages_number                                  <dbl> …
$ deaths_exposure_to_forces_of_nature_sex_both_age_all_ages_number               <dbl> …
$ deaths_diarrheal_diseases_sex_both_age_all_ages_number                         <dbl> …
$ deaths_environmental_heat_and_cold_exposure_sex_both_age_all_ages_number       <dbl> …
$ deaths_neoplasms_sex_both_age_all_ages_number                                  <dbl> …
$ deaths_conflict_and_terrorism_sex_both_age_all_ages_number                     <dbl> …
$ deaths_diabetes_mellitus_sex_both_age_all_ages_number                          <dbl> …
$ deaths_chronic_kidney_disease_sex_both_age_all_ages_number                     <dbl> …
$ deaths_poisonings_sex_both_age_all_ages_number                                 <dbl> …
$ deaths_protein_energy_malnutrition_sex_both_age_all_ages_number                <dbl> …
$ deaths_road_injuries_sex_both_age_all_ages_number                              <dbl> …
$ deaths_chronic_respiratory_diseases_sex_both_age_all_ages_number               <dbl> …
$ deaths_cirrhosis_and_other_chronic_liver_diseases_sex_both_age_all_ages_number <dbl> …
$ deaths_digestive_diseases_sex_both_age_all_ages_number                         <dbl> …
$ deaths_fire_heat_and_hot_substances_sex_both_age_all_ages_number               <dbl> …
$ deaths_acute_hepatitis_sex_both_age_all_ages_number                            <dbl> …
$ deaths_measles_sex_both_age_all_ages_number                                    <dbl> …

02_total-cancer-deaths-by-type:
Rows: 6,840
Columns: 32
$ entity                                                                      <chr> …
$ code                                                                        <chr> …
$ year                                                                        <dbl> …
$ deaths_liver_cancer_sex_both_age_all_ages_number                            <dbl> …
$ deaths_kidney_cancer_sex_both_age_all_ages_number                           <dbl> …
$ deaths_lip_and_oral_cavity_cancer_sex_both_age_all_ages_number              <dbl> …
$ deaths_tracheal_bronchus_and_lung_cancer_sex_both_age_all_ages_number       <dbl> …
$ deaths_larynx_cancer_sex_both_age_all_ages_number                           <dbl> …
$ deaths_gallbladder_and_biliary_tract_cancer_sex_both_age_all_ages_number    <dbl> …
$ deaths_malignant_skin_melanoma_sex_both_age_all_ages_number                 <dbl> …
$ deaths_leukemia_sex_both_age_all_ages_number                                <dbl> …
$ deaths_hodgkin_lymphoma_sex_both_age_all_ages_number                        <dbl> …
$ deaths_multiple_myeloma_sex_both_age_all_ages_number                        <dbl> …
$ deaths_other_neoplasms_sex_both_age_all_ages_number                         <dbl> …
$ deaths_breast_cancer_sex_both_age_all_ages_number                           <dbl> …
$ deaths_prostate_cancer_sex_both_age_all_ages_number                         <dbl> …
$ deaths_thyroid_cancer_sex_both_age_all_ages_number                          <dbl> …
$ deaths_stomach_cancer_sex_both_age_all_ages_number                          <dbl> …
$ deaths_bladder_cancer_sex_both_age_all_ages_number                          <dbl> …
$ deaths_uterine_cancer_sex_both_age_all_ages_number                          <dbl> …
$ deaths_ovarian_cancer_sex_both_age_all_ages_number                          <dbl> …
$ deaths_cervical_cancer_sex_both_age_all_ages_number                         <dbl> …
$ deaths_brain_and_central_nervous_system_cancer_sex_both_age_all_ages_number <dbl> …
$ deaths_non_hodgkin_lymphoma_sex_both_age_all_ages_number                    <dbl> …
$ deaths_pancreatic_cancer_sex_both_age_all_ages_number                       <dbl> …
$ deaths_esophageal_cancer_sex_both_age_all_ages_number                       <dbl> …
$ deaths_testicular_cancer_sex_both_age_all_ages_number                       <dbl> …
$ deaths_nasopharynx_cancer_sex_both_age_all_ages_number                      <dbl> …
$ deaths_other_pharynx_cancer_sex_both_age_all_ages_number                    <dbl> …
$ deaths_colon_and_rectum_cancer_sex_both_age_all_ages_number                 <dbl> …
$ deaths_non_melanoma_skin_cancer_sex_both_age_all_ages_number                <dbl> …
$ deaths_mesothelioma_sex_both_age_all_ages_number                            <dbl> …

04_cancer-death-rates-by-age:
Rows: 6,840
Columns: 10
$ entity                                              <chr> "Afghanistan", "Af…
$ code                                                <chr> "AFG", "AFG", "AFG…
$ year                                                <dbl> 1990, 1991, 1992, …
$ deaths_neoplasms_sex_both_age_70_years_rate         <dbl> 1021.49, 1013.76, …
$ deaths_neoplasms_sex_both_age_50_69_years_rate      <dbl> 407.23, 404.51, 40…
$ deaths_neoplasms_sex_both_age_15_49_years_rate      <dbl> 43.62, 40.53, 37.1…
$ deaths_neoplasms_sex_both_age_5_14_years_rate       <dbl> 9.37, 9.46, 9.74, …
$ deaths_neoplasms_sex_both_age_under_5_rate          <dbl> 21.33, 18.70, 16.8…
$ deaths_neoplasms_sex_both_age_age_standardized_rate <dbl> 159.96, 158.46, 15…
$ deaths_neoplasms_sex_both_age_all_ages_rate         <dbl> 101.41, 93.71, 84.…

05_share-of-population-with-cancer-crude:
Rows: 6,780
Columns: 4
$ entity                                                                          <chr> …
$ code                                                                            <chr> …
$ year                                                                            <dbl> …
$ current_number_of_cases_of_neoplasms_per_100_people_in_both_sexes_aged_all_ages <dbl> …

06_share-of-population-with-cancer-types:
Rows: 6,780
Columns: 24
$ entity                                                                                                                <chr> …
$ code                                                                                                                  <chr> …
$ year                                                                                                                  <dbl> …
$ current_number_of_cases_of_liver_cancer_per_100_people_in_both_sexes_aged_age_standardized                            <dbl> …
$ current_number_of_cases_of_kidney_cancer_per_100_people_in_both_sexes_aged_age_standardized                           <dbl> …
$ current_number_of_cases_of_larynx_cancer_per_100_people_in_both_sexes_aged_age_standardized                           <dbl> …
$ current_number_of_cases_of_breast_cancer_per_100_people_in_both_sexes_aged_age_standardized                           <dbl> …
$ current_number_of_cases_of_thyroid_cancer_per_100_people_in_both_sexes_aged_age_standardized                          <dbl> …
$ current_number_of_cases_of_bladder_cancer_per_100_people_in_both_sexes_aged_age_standardized                          <dbl> …
$ current_number_of_cases_of_uterine_cancer_per_100_people_in_both_sexes_aged_age_standardized                          <dbl> …
$ current_number_of_cases_of_ovarian_cancer_per_100_people_in_both_sexes_aged_age_standardized                          <dbl> …
$ current_number_of_cases_of_stomach_cancer_per_100_people_in_both_sexes_aged_age_standardized                          <dbl> …
$ current_number_of_cases_of_prostate_cancer_per_100_people_in_both_sexes_aged_age_standardized                         <dbl> …
$ current_number_of_cases_of_cervical_cancer_per_100_people_in_both_sexes_aged_age_standardized                         <dbl> …
$ current_number_of_cases_of_testicular_cancer_per_100_people_in_both_sexes_aged_age_standardized                       <dbl> …
$ current_number_of_cases_of_pancreatic_cancer_per_100_people_in_both_sexes_aged_age_standardized                       <dbl> …
$ current_number_of_cases_of_esophageal_cancer_per_100_people_in_both_sexes_aged_age_standardized                       <dbl> …
$ current_number_of_cases_of_nasopharynx_cancer_per_100_people_in_both_sexes_aged_age_standardized                      <dbl> …
$ current_number_of_cases_of_colon_and_rectum_cancer_per_100_people_in_both_sexes_aged_age_standardized                 <dbl> …
$ current_number_of_cases_of_non_melanoma_skin_cancer_per_100_people_in_both_sexes_aged_age_standardized                <dbl> …
$ current_number_of_cases_of_lip_and_oral_cavity_cancer_per_100_people_in_both_sexes_aged_age_standardized              <dbl> …
$ current_number_of_cases_of_tracheal_bronchus_and_lung_cancer_per_100_people_in_both_sexes_aged_age_standardized       <dbl> …
$ current_number_of_cases_of_brain_and_central_nervous_system_cancer_per_100_people_in_both_sexes_aged_age_standardized <dbl> …
$ current_number_of_cases_of_gallbladder_and_biliary_tract_cancer_per_100_people_in_both_sexes_aged_age_standardized    <dbl> …

07_number-of-people-with-cancer-by-age:
Rows: 6,468
Columns: 8
$ entity                                               <chr> "Afghanistan", "A…
$ code                                                 <chr> "AFG", "AFG", "AF…
$ year                                                 <dbl> 1990, 1991, 1992,…
$ prevalence_neoplasms_sex_both_age_70_years_number    <dbl> 5468.172, 5542.41…
$ prevalence_neoplasms_sex_both_age_50_69_years_number <dbl> 15160.55, 15403.3…
$ prevalence_neoplasms_sex_both_age_15_49_years_number <dbl> 11118.80, 11275.6…
$ prevalence_neoplasms_sex_both_age_5_14_years_number  <dbl> 2162.520, 2182.99…
$ prevalence_neoplasms_sex_both_age_under_5_number     <dbl> 3458.692, 3380.74…

11_cancer-deaths-rate-and-age-standardized-rate-index:
Rows: 6,840
Columns: 6
$ entity                                              <chr> "Afghanistan", "Af…
$ code                                                <chr> "AFG", "AFG", "AFG…
$ year                                                <dbl> 1990, 1991, 1992, …
$ deaths_neoplasms_sex_both_age_age_standardized_rate <dbl> 159.96, 158.46, 15…
$ deaths_neoplasms_sex_both_age_all_ages_rate         <dbl> 101.41, 93.71, 84.…
$ deaths_neoplasms_sex_both_age_all_ages_number       <dbl> 11580, 11796, 1221…
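
The cause-of-death columns listed above share a long common prefix and suffix, which makes plot labels hard to read. One possible wrangling step, which the proposal does not state, is to shorten the names before plotting. The sketch below uses base R only; `short_cause_name` is a hypothetical helper introduced here for illustration.

```r
# Hypothetical helper (not from the proposal): shorten cause-of-death column names.
short_cause_name <- function(x) {
  x <- sub("^deaths_", "", x)                   # drop the shared prefix
  sub("_sex_both_age_all_ages_number$", "", x)  # drop the shared suffix
}

# Column names copied from the structure output; identifier columns pass through unchanged.
cols <- c("entity", "year",
          "deaths_neoplasms_sex_both_age_all_ages_number",
          "deaths_hiv_aids_sex_both_age_all_ages_number")
print(short_cause_name(cols))
```

Names that do not match the pattern, such as `entity` and `year`, are returned as they are, so the helper can be applied to all column names at once.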

Questions to Answer

Question 1

What is the contribution of cancer to global mortality?

Question 2

Which age groups are most affected by cancer?

Question 3

Which types of cancer are most prevalent worldwide?

Analysis plan

Question 1

We will summarize, for each year, the total number of deaths from cancer and from all other causes, then calculate cancer deaths as a percentage of total deaths. We will use ggplot2 and plotly to build an interactive plot in which users can select years and countries.
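
A minimal base R sketch of the percentage calculation follows. The toy values are made up, and `cancer_share` is a hypothetical helper. It assumes the cause columns are mutually exclusive so that their row sum approximates total deaths; that assumption should be checked against the real file, since some causes listed in the data (for example, protein-energy malnutrition and nutritional deficiencies) may overlap.

```r
# Toy data in the shape of 01_annual-number-of-deaths-by-cause (values are made up).
deaths <- data.frame(
  entity = c("A", "A", "B", "B"),
  year   = c(1990, 2019, 1990, 2019),
  deaths_neoplasms_sex_both_age_all_ages_number               = c(100, 150, 40, 90),
  deaths_cardiovascular_diseases_sex_both_age_all_ages_number = c(300, 350, 160, 210),
  deaths_malaria_sex_both_age_all_ages_number                 = c(100, 0, 0, 0)
)

# Hypothetical helper: cancer deaths as a percentage of the summed cause columns.
# Assumption: the cause columns do not overlap, so their sum stands in for total deaths.
cancer_share <- function(df,
                         cancer_col = "deaths_neoplasms_sex_both_age_all_ages_number") {
  cause_cols <- grep("^deaths_.*_number$", names(df), value = TRUE)
  total <- rowSums(df[cause_cols], na.rm = TRUE)
  data.frame(
    entity           = df$entity,
    year             = df$year,
    cancer_deaths    = df[[cancer_col]],
    total_deaths     = total,
    cancer_share_pct = round(100 * df[[cancer_col]] / total, 1)
  )
}

shares <- cancer_share(deaths)
print(shares)
```

The resulting data frame has one row per entity and year, which is a convenient shape for a line chart of cancer share over time filtered by the country and year range a user selects.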

Question 2

We will combine the data across years to find the total or average death rate for each age group, then compare these rates to identify the age groups with the highest cancer death rates. Bar charts will show cancer deaths by age group, and we will use ggplot2 and plotly to build an interactive plot in which users can select the cancer type and the years.
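
The averaging step could look like the base R sketch below. The values are made up, `mean_rate_by_age` is a hypothetical helper, and the mean is a simple unweighted average over the selected years, which is only one of the options the plan mentions. Column names follow the structure output for 04_cancer-death-rates-by-age; the all-ages and age-standardized columns are deliberately excluded from the comparison.

```r
# Toy data in the shape of 04_cancer-death-rates-by-age (values are made up).
rates <- data.frame(
  entity = rep("World", 3),
  year   = c(1990, 2000, 2019),
  deaths_neoplasms_sex_both_age_under_5_rate          = c(6, 5, 4),
  deaths_neoplasms_sex_both_age_5_14_years_rate       = c(4, 4, 4),
  deaths_neoplasms_sex_both_age_15_49_years_rate      = c(40, 38, 36),
  deaths_neoplasms_sex_both_age_50_69_years_rate      = c(400, 380, 360),
  deaths_neoplasms_sex_both_age_70_years_rate         = c(1100, 1050, 1000),
  deaths_neoplasms_sex_both_age_age_standardized_rate = c(150, 140, 130),
  deaths_neoplasms_sex_both_age_all_ages_rate         = c(110, 115, 120)
)

# Hypothetical helper: unweighted mean death rate per age group over a year range.
mean_rate_by_age <- function(df, place, from, to) {
  keep <- df$entity == place & df$year >= from & df$year <= to
  # Keep only the age-group columns; skip all-ages and age-standardized rates.
  age_cols <- grep("_age_(under_5|[0-9_]+_years)_rate$", names(df), value = TRUE)
  means <- colMeans(df[keep, age_cols, drop = FALSE], na.rm = TRUE)
  names(means) <- sub("^.*_age_", "", sub("_rate$", "", age_cols))
  sort(means, decreasing = TRUE)
}

by_age <- mean_rate_by_age(rates, "World", 1990, 2019)
print(by_age)
```

The sorted named vector can be passed directly to a bar chart, with the age group labels on one axis and the mean rate on the other.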

Question 3

We will summarize the number of cases per 100 people for each cancer type and rank the types by prevalence to identify the most common ones. A horizontal bar chart will show the cancer types, and we will use ggplot2 and plotly to build an interactive plot in which users can select years and cancer types to compare.
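
The ranking step can be sketched in base R as follows. The values are made up, and `col_for` and `top_cancers` are hypothetical helpers; the column naming pattern is taken from the structure output for 06_share-of-population-with-cancer-types.

```r
# Hypothetical helper: build a column name in the pattern used by the prevalence file.
col_for <- function(type) {
  paste0("current_number_of_cases_of_", type,
         "_per_100_people_in_both_sexes_aged_age_standardized")
}

# Toy data in the shape of 06_share-of-population-with-cancer-types (values are made up).
prev <- data.frame(entity = "World", year = c(1990, 2019))
prev[[col_for("breast_cancer")]]   <- c(0.20, 0.25)
prev[[col_for("prostate_cancer")]] <- c(0.10, 0.15)
prev[[col_for("liver_cancer")]]    <- c(0.01, 0.02)

# Hypothetical helper: the n most prevalent cancer types for one entity and year.
top_cancers <- function(df, place, yr, n = 10) {
  row <- df[df$entity == place & df$year == yr, , drop = FALSE]
  type_cols <- grep("^current_number_of_cases_of_", names(df), value = TRUE)
  values <- unlist(row[1, type_cols])
  # Strip the shared prefix and suffix so only the cancer type remains as the label.
  names(values) <- sub("_per_100_people.*$", "",
                       sub("^current_number_of_cases_of_", "", type_cols))
  head(sort(values, decreasing = TRUE), n)
}

top <- top_cancers(prev, "World", 2019, n = 2)
print(top)
```

Because the result is already sorted, it maps naturally onto a horizontal bar chart, and the `yr` and `n` arguments correspond to the selections a user would make in the interactive plot.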

Timeline

Task                                                     Deadline     Ownership           Status
Submit project name                                      20-Mar-2024  All                 Complete
Choose project 2 dataset                                 27-Mar-2024  All                 Complete
Proposal section - Introduction                          02-Apr-2024  Monica              Complete
Proposal section - Why this data                         02-Apr-2024  Rohit V K           Complete
Proposal section - Analysis Goal                         02-Apr-2024  SRK                 Complete
Proposal section - Data Structure                        02-Apr-2024  Ajay                Complete
Proposal section - Data Wrangling                        02-Apr-2024  Surya               Complete
Proposal section - Project schedule / Timeline           02-Apr-2024  Miki                Complete
Submit proposal draft for peer review                    03-Apr-2024  All
Implement Peer review comments                           06-Apr-2024  All
Resubmit Project Proposal for Final Review by Professor  08-Apr-2024  All
Update About section of Project website                  14-Apr-2024  All
Incorporate Feedback from Professor                      14-Apr-2024  All
Publish project 2 website                                15-Apr-2024  SRK
Code - Data cleansing                                    23-Apr-2024  Surya & Ajay
Code - Data Wrangling                                    23-Apr-2024  Rohit V K & Monica
Code - Plot creation                                     23-Apr-2024  Miki & SRK
Code - Interpretation                                    23-Apr-2024  All
Internal project review                                  26-Apr-2024  All
Peer Code review                                         29-Apr-2024  Peer Team
First draft of presentation                              03-May-2024  All
Final draft of presentation                              06-May-2024  All
Final submission                                         06-May-2024  All

Citations

Source of Data:

Roser, M., & Ritchie, H. Cancer. Our World in Data. https://ourworldindata.org/cancer

World Health Organization. (2024, February 1). Global cancer burden growing, amidst mounting need for services.