In this hypothetical scenario, I am acting as a data analyst for a fitness company that wants to improve its recommendations for new members based on their fitness goals.
During data cleaning, we started with uniformity: we converted column names and character values to lowercase and trimmed unnecessary characters such as hidden spaces.
Next, we recoded the gender column so that every value was either male or female, converting entries such as “m” to male and “f” to female.
We then converted the age column to integers in case any values had been entered as a type that would not be recognized later in our exploration.
Finally, we restricted age to 10-100 years and weight to 60-600 lbs, since anything outside these ranges is most likely a data-entry mistake or an outlier, and we dropped any duplicate entries.
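The cleaning steps above can be sketched in plain Python. This is only an illustration of the logic, not the code used for the analysis: the function name `clean_members`, the list-of-dicts input format, and the choice to skip rows whose age or weight cannot be parsed are all assumptions made for the example.

```python
def clean_members(rows):
    """Clean raw member records (a list of dicts) following the steps above."""
    gender_map = {"m": "male", "f": "female"}
    cleaned, seen = [], set()
    for row in rows:
        # Lowercase and trim column names; lowercase and trim string values.
        rec = {
            str(k).strip().lower(): v.strip().lower() if isinstance(v, str) else v
            for k, v in row.items()
        }
        # Recode gender abbreviations to the full labels.
        rec["gender"] = gender_map.get(rec["gender"], rec["gender"])
        # Coerce age to an integer and weight to a float.
        # Assumption: rows that cannot be parsed are skipped.
        try:
            rec["age"] = int(float(rec["age"]))
            rec["weight_lb"] = float(rec["weight_lb"])
        except (TypeError, ValueError):
            continue
        # Keep only plausible ages (10-100) and weights (60-600 lbs).
        if not (10 <= rec["age"] <= 100 and 60 <= rec["weight_lb"] <= 600):
            continue
        # Drop exact duplicates, compared after normalization.
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Made-up records, for illustration only.
raw = [
    {"Age": "34", "Gender": " M ", "Weight_lb": 180.0},
    {"Age": "34", "Gender": "male", "Weight_lb": 180.0},  # duplicate once cleaned
    {"Age": 7, "Gender": "F", "Weight_lb": 65.0},         # age out of range
    {"Age": 52.0, "Gender": "f", "Weight_lb": 140.5},
]
cleaned = clean_members(raw)
print(len(cleaned))  # 2
```

Duplicates are checked after normalization so that records differing only in case or stray spaces count as the same entry.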
| column | n_na | n_unique |
|---|---|---|
| age | 0 | 42 |
| gender | 0 | 2 |
| weight_kg | 0 | 532 |
| height_m | 0 | 51 |
| max_bpm | 0 | 40 |
| avg_bpm | 0 | 50 |
| resting_bpm | 0 | 25 |
| session_duration_hours | 0 | 147 |
| calories_burned | 0 | 621 |
| workout_type | 0 | 4 |
| fat_percentage | 0 | 239 |
| water_intake_liters | 0 | 23 |
| workout_frequency_days_week | 0 | 4 |
| experience_level | 0 | 3 |
| bmi | 0 | 771 |
| weight_lb | 0 | 532 |
| hours_per_week | 0 | 297 |
In this section, we investigate the relationships between demographic and behavioral factors (gender, workout type, and weekly exercise duration) and key health outcomes, including body fat percentage and estimated cardiovascular fitness. Our goal is to identify meaningful patterns that may inform recommendations for new gym members based on their fitness goals.
The scatter plots show a clear negative correlation between body fat percentage and hours per week spent in the gym.
HIIT and cardio show the strongest negative correlation between weekly exercise hours and body fat. Strength and yoga also show a strong negative correlation, though it is less pronounced.
There doesn’t seem to be any significant correlation between body fat percentage and max BPM.
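One way to put numbers on the visual impressions above is a correlation coefficient computed separately for each workout type. The sketch below uses Pearson's r, which is an assumption, since the analysis does not state which correlation measure it relied on. The helper names and the toy records are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_group(rows, group_key, x_key, y_key):
    """Compute Pearson's r between x_key and y_key within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append((row[x_key], row[y_key]))
    return {
        group: pearson_r([x for x, _ in pts], [y for _, y in pts])
        for group, pts in groups.items()
    }


# Made-up records, for illustration only.
toy = [
    {"workout_type": "hiit", "hours_per_week": 2, "fat_percentage": 30},
    {"workout_type": "hiit", "hours_per_week": 4, "fat_percentage": 26},
    {"workout_type": "hiit", "hours_per_week": 6, "fat_percentage": 22},
    {"workout_type": "yoga", "hours_per_week": 2, "fat_percentage": 28},
    {"workout_type": "yoga", "hours_per_week": 4, "fat_percentage": 27},
    {"workout_type": "yoga", "hours_per_week": 6, "fat_percentage": 23},
]
result = correlation_by_group(toy, "workout_type", "hours_per_week", "fat_percentage")
for workout, r in result.items():
    print(workout, round(r, 3))
```

The same helper could be pointed at `max_bpm` instead of `hours_per_week` to check the weak relationship with max BPM.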
There is a clear difference in body fat between men and women, as expected given biological differences between the two groups: women on average have naturally higher body fat percentages.
| gender | workout_type | n | hours_median | hours_IQR | bpm_median | fat_median | fat_IQR |
|---|---|---|---|---|---|---|---|
| female | cardio | 126 | 3.84 | 2.6325 | 182 | 29.2 | 6.275 |
| female | hiit | 107 | 4.14 | 2.4900 | 179 | 28.8 | 5.900 |
| female | strength | 123 | 4.00 | 2.6500 | 177 | 29.4 | 6.650 |
| female | yoga | 106 | 4.26 | 3.0500 | 180 | 28.1 | 10.800 |
| male | cardio | 129 | 3.99 | 2.7400 | 178 | 23.8 | 6.200 |
| male | hiit | 114 | 4.11 | 3.5250 | 183 | 23.2 | 11.750 |
| male | strength | 135 | 4.04 | 2.5350 | 180 | 24.4 | 6.750 |
| male | yoga | 133 | 3.84 | 2.7600 | 182 | 24.0 | 6.100 |
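A grouped summary like the table above could be produced along the following lines. This is a stdlib-only sketch, not the original code. The function name `summarize_fat` is hypothetical, and the `"inclusive"` quantile method is an assumption chosen because it uses the common linear-interpolation definition of quartiles. The toy values are made up.

```python
from collections import defaultdict
from statistics import median, quantiles


def summarize_fat(rows):
    """Median and IQR of body fat percentage per (gender, workout_type)."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["gender"], row["workout_type"])].append(row["fat_percentage"])
    summary = {}
    for key, values in sorted(groups.items()):
        # quantiles(..., n=4) returns the three quartile cut points.
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        summary[key] = {
            "n": len(values),
            "fat_median": median(values),
            "fat_IQR": q3 - q1,
        }
    return summary


# Made-up records, for illustration only.
toy = [
    {"gender": "female", "workout_type": "yoga", "fat_percentage": v}
    for v in (20, 24, 28, 32, 36)
] + [
    {"gender": "male", "workout_type": "hiit", "fat_percentage": v}
    for v in (15, 25)
]
summary = summarize_fat(toy)
print(summary[("female", "yoga")])
```

The median and IQR are less sensitive to extreme values than the mean and standard deviation, which suits a dataset where outliers had to be filtered.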
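In the box plots, the differences in median body fat percentage across workout types are negligible; however, HIIT has a much wider spread, indicating greater variability within this group. In the summary table, this wider spread appears among men doing HIIT (body fat IQR of 11.75), and women doing yoga show a similarly wide spread (IQR of 10.8). Further exploration is needed to see whether this spread is due to other factors.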
The regression coefficient table below shows that, holding gender, workout type, and max BPM constant, gym-goers can expect an estimated 1.9 percentage-point decrease in body fat for each additional hour they spend in the gym per week, with no statistically significant differences between workout types.
| term | estimate | std.error | statistic | p.value | conf.low | conf.high |
|---|---|---|---|---|---|---|
| Intercept | 36.132 | 1.985 | 18.203 | 0.000 | 32.237 | 40.027 |
| Male (vs Female) | -5.212 | 0.250 | -20.825 | 0.000 | -5.704 | -4.721 |
| HIIT (vs baseline) | -0.216 | 0.359 | -0.602 | 0.547 | -0.920 | 0.488 |
| Strength (vs baseline) | 0.625 | 0.344 | 1.814 | 0.070 | -0.051 | 1.301 |
| Yoga (vs baseline) | -0.069 | 0.351 | -0.197 | 0.844 | -0.759 | 0.621 |
| Hours per week | -1.921 | 0.058 | -33.398 | 0.000 | -2.034 | -1.808 |
| Max BPM | -0.001 | 0.011 | -0.059 | 0.953 | -0.022 | 0.021 |
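To make the coefficients above concrete, the sketch below plugs them into the linear equation they describe and compares two hypothetical members. Cardio is treated as the baseline workout type because it is the only type without its own row in the table; that is an inference from the table, not something stated in the text. The function name and the example inputs are hypothetical.

```python
# Coefficients copied from the regression table above.
INTERCEPT = 36.132
MALE = -5.212
# Assumption: cardio is the baseline, so its offset is 0.
WORKOUT = {"cardio": 0.0, "hiit": -0.216, "strength": 0.625, "yoga": -0.069}
HOURS_PER_WEEK = -1.921
MAX_BPM = -0.001


def predict_fat_percentage(gender, workout_type, hours_per_week, max_bpm):
    """Predicted body fat percentage from the fitted linear model."""
    fat = INTERCEPT
    if gender == "male":
        fat += MALE
    fat += WORKOUT[workout_type]
    fat += HOURS_PER_WEEK * hours_per_week
    fat += MAX_BPM * max_bpm
    return fat


# Hypothetical member: female, cardio, max BPM of 180.
four_hours = predict_fat_percentage("female", "cardio", 4, 180)
five_hours = predict_fat_percentage("female", "cardio", 5, 180)
print(round(four_hours, 2))               # 28.27
print(round(five_hours - four_hours, 3))  # -1.921
```

The difference between the two predictions equals the hours-per-week coefficient, which is what "per additional weekly hour, holding the other variables constant" means in practice.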
| r.squared | adj.r.squared | sigma | df | p.value | AIC | BIC |
|---|---|---|---|---|---|---|
| 0.615 | 0.613 | 3.895 | 6 | 0 | 5416.406 | 5455.449 |
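As a quick consistency check on the fit statistics above, the adjusted R² can be recomputed from the reported R², the number of predictors, and the sample size. The sample size is not stated directly, so the sketch assumes the regression used every row counted in the gender-by-workout summary table (the sum of its `n` column).

```python
# Group sizes copied from the n column of the summary table.
group_sizes = [126, 107, 123, 106, 129, 114, 135, 133]
n = sum(group_sizes)  # assumption: all rows entered the regression

r_squared = 0.615  # reported r.squared
p = 6              # number of predictors (the df column)

# Adjusted R^2 penalizes R^2 for the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
print(n, round(adj_r_squared, 3))  # 973 0.613
```

The result agrees with the reported adj.r.squared of 0.613, which suggests the summary table and the regression describe the same set of members.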
From our analysis, we can confidently say that body fat percentage is strongly correlated with hours spent exercising per week, regardless of workout type, with the most variation in HIIT workouts.
We were unable to find any significant correlation between workout types and cardiovascular health, although our data are limited to a single point in time.
Based on our findings, we conclude that hours per week and body fat percentage are strongly correlated. This supports an initial recommendation to spend more time exercising and gives new gym members an idea of what to expect from exercise in general.
To understand fat loss and cardiovascular health over time, we will need to collect more data so that we can give our clients better recommendations.
Our next step would be to implement programs that help new gym members who want to lose body fat spend more time exercising.
Suggestions include meeting with personal trainers and signing up for classes such as HIIT and yoga, which give new gym members structure and accountability.
Lastly, we will want to collect more data and continue our analysis so that we can give new gym members more advanced recommendations.