Dataset: Overview of Data Gathered

Dataset | Download | Source Link | Type/Information | Description
Data1 | csv | link | Numeric/Combined Data | Obesity Percentage By Country From 1975-2016
Data2 | csv | link | Numeric/Text/Missing Values | Nutrition Physical Activity and Obesity Behavioral Risk Factor Surveillance System
Data3_1 - 3_5 | csv | link | COVID/Food/Protein/Geo-location | Food Supply by Country, Find Better Dietary Option for Stronger Body
Data4 | csv | link | Numeric/Text/Health | Acute Liver Failure Data
Data5 | csv | link | Numeric/Text/Combined | Cardiac Disease Data
Data6_1 - 6_2 | api code and csv | api documentation | Corpus/Text/Combined | Subreddit Keto and Intermittent Fasting hottest submissions for a specific gathered date
Data7 | api code (no csv available) | api documentation | Corpus/Text/Combined | Scraping News with R

Data 1: Obesity Percentage By Country From 1975-2016

Kaggle data. Data 1, sourced from the WHO, gives the adult obesity rate for each country as a percentage from 1975 to 2016. Values for males, females, and both sexes combined are included. It can be used for analyzing q1, q2, and q3. Data 1 is cleaned.
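
As a rough illustration of how Data 1 could feed the analysis, the sketch below computes the change in obesity percentage between 1975 and 2016 for each country. The column names (`country`, `year`, `sex`, `obesity_pct`), the "Both sexes" label, and the sample rows are all assumptions made up for this example; the real CSV may be laid out differently.

```python
import csv
import io
from collections import defaultdict

# Made-up placeholder rows, NOT real WHO figures. The column names
# (country, year, sex, obesity_pct) are assumptions about the layout.
SAMPLE_CSV = """country,year,sex,obesity_pct
Country A,1975,Both sexes,5.0
Country A,2016,Both sexes,12.5
Country A,2016,Male,11.0
Country B,1975,Both sexes,10.0
Country B,2016,Both sexes,30.0
"""


def change_by_country(csv_text, sex="Both sexes", start=1975, end=2016):
    """Return {country: end-year percentage minus start-year percentage}."""
    by_country = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["sex"] == sex:
            by_country[row["country"]][int(row["year"])] = float(row["obesity_pct"])
    # Keep only countries that have values for both endpoint years.
    return {
        country: years[end] - years[start]
        for country, years in by_country.items()
        if start in years and end in years
    }


changes = change_by_country(SAMPLE_CSV)
```

With the real file, `SAMPLE_CSV` would be replaced by the contents of the downloaded CSV, and the column names adjusted to match its header.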

Data 2: Nutrition Physical Activity and Obesity Behavioral Risk Factor Surveillance System

Kaggle data. Data 2 provides nutrition, physical activity, and obesity data from the Behavioral Risk Factor Surveillance System. Data 2 is not cleaned.

Data 3: Food Supply by Country, Find Better Dietary Option for Stronger Body

Data 3 includes fat quantity, energy intake (kcal), food supply quantity (kg), and protein for different categories of food (all calculated as a percentage of total intake amount). The end of the dataset contains COVID-19 related figures such as cases and deaths. The original purpose of this dataset is to find optimal dietary options and healthy eating styles to alleviate the COVID-19 crisis, looking at non-pharmaceutical interventions that help fight severe disease. Although it is not tied directly to the topic of obesity, we may find insights into what types of food help build a stronger immune system and a healthier body.
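
To make the "percentage of total intake" idea concrete, here is a minimal sketch of how such shares could be derived from raw per-category amounts. The dataset itself already stores the percentages, so this is only an illustration; the function name, category names, and amounts are hypothetical.

```python
def to_percent_shares(amounts):
    """Convert raw per-category amounts into percentages of the total."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total intake must be positive")
    return {category: 100.0 * value / total for category, value in amounts.items()}


# Hypothetical category names and made-up amounts, for illustration only.
raw_kcal = {"Cereals": 1200.0, "Meat": 300.0, "Vegetables": 500.0}
kcal_shares = to_percent_shares(raw_kcal)
```

Because every category is divided by the same total, the shares for one country always add up to 100, which gives a quick sanity check on each row of the dataset.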

- Different Food Supply in Kcal

- Different Food Fat Quantity

- Food Supply Quantity in KG

- Amount of Protein in Different Food Supply

- Food Supply Dictionary/Explanation

Data 4 & 5: Acute Liver Failure Data and Cardiac Disease Data

Kaggle data. Data 4 presents patients with acute liver failure. Data 4 and 5 are used to understand the weight and health statistics of patients with major diseases. Data 4 is cleaned; data 5 is not cleaned.

- Acute Liver Failure Patient Statistics

- Cardiac Disease Patient Statistics

Data 6: Scraping Reddit with PRAW (API) in Python

The data here was collected with an API wrapper called PRAW; see the link below for a detailed description of how to sign up for access to Reddit. The Reddit data collection focuses on healthy diets and their impact on weight loss. For now, this includes intermittent fasting and the ketogenic diet. I want to understand the general sentiment and basic evaluation people have of these weight-loss approaches, and to get an elementary understanding of whether they work. The top 1000 posts are collected from each subreddit. Further analysis requires removing useless entries based on the following criteria: fewer than 20 comments on the post; no defined keyword in the post (such as weightloss, from, etc.), which requires further investigation.
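
The collection and filtering steps described above might look roughly like the sketch below. It is not the project's actual script: the helper names, the dictionary layout, and the keyword list are hypothetical, and it assumes a post is dropped if it fails either criterion. The PRAW calls used (`praw.Reddit`, `subreddit(...).hot(limit=...)`, and the submission attributes) are standard, but the credentials must come from your own Reddit app registration.

```python
MIN_COMMENTS = 20
# Hypothetical keyword list; the final list still needs investigation.
KEYWORDS = ("weightloss", "weight loss")


def fetch_hot_posts(subreddit_name, client_id, client_secret, user_agent, limit=1000):
    """Collect hot submissions from one subreddit as plain dicts."""
    import praw  # imported here so the filter below works without PRAW installed

    reddit = praw.Reddit(
        client_id=client_id, client_secret=client_secret, user_agent=user_agent
    )
    return [
        {
            "id": post.id,
            "title": post.title,
            "selftext": post.selftext,
            "num_comments": post.num_comments,
            "score": post.score,
        }
        for post in reddit.subreddit(subreddit_name).hot(limit=limit)
    ]


def is_useful(post, min_comments=MIN_COMMENTS, keywords=KEYWORDS):
    """Keep posts with enough comments and at least one keyword."""
    if post["num_comments"] < min_comments:
        return False
    text = (post["title"] + " " + post["selftext"]).lower()
    return any(keyword in text for keyword in keywords)


def filter_posts(posts):
    """Drop entries that fail either removal criterion."""
    return [post for post in posts if is_useful(post)]
```

A typical run would call `fetch_hot_posts` once per subreddit (for example the keto and intermittent fasting communities) and pass each result through `filter_posts` before any sentiment analysis.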

- Keto Subreddit Hot Posts Data

- Intermittent Fasting Subreddit Hot Posts

Data 7: Scraping News with R

The use case for this API is to search for obesity-related news articles and find information that might help answer some of the questions. The data is not cleaned and is very basic; it requires further web scraping.
