2024 Week 35 | Power BI: Overlapping histogram with normal distribution curve

Introduction

Overlapping histograms with normal curve overlays are used to compare the distribution of a numerical variable across groups, such as body mass for male and female penguins. They show at a glance the central tendency, spread, and skewness of the data within each group, and how closely each group follows a normal distribution. The further apart the modes of the distributions relative to their spread, the more pronounced the difference between the groups.

Requirements

  1. Obtain the data from Kaggle
  2. Create DAX measures for the mean, standard deviation, and count of the data
  3. Create bins of body mass
  4. Add combo bar and line chart to the canvas
  5. Place 'penguins_size'[body_mass_g (bins)] on the X-Axis
  6. Place the count of penguins measure on the column Y-Axis.
  7. Group by sex by placing the sex field into the legend field well
  8. Format Bars
  9. Create a measure for finding the normal distribution curve for each sex
  10. Scale the distribution curve to the histogram by multiplying the result by the bin size (the width of each bin) and the number of data points
  11. Format lines to a smooth curve
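
The scaling in steps 9 and 10 can be sketched outside Power BI. The Python below is only an illustration of the arithmetic, not the challenge solution: the function names and the sample body masses are made up, and the curve is evaluated at each bin's midpoint, which is an assumption. In DAX, the density itself would typically come from NORM.DIST with the cumulative argument set to FALSE.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def scaled_curve(bin_starts, bin_width, values):
    """Expected count per bin if `values` followed a normal distribution.

    The density integrates to 1, while the histogram bars sum to the number
    of data points, so the density is multiplied by bin_width * n to put
    both on the same Y-axis.
    """
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Assumption: evaluate the curve at the midpoint of each bin.
    return [
        normal_pdf(start + bin_width / 2, mean, sd) * bin_width * n
        for start in bin_starts
    ]


# Hypothetical body masses in grams for one group (not the Kaggle data).
sample = [3000, 3500, 4000, 4500, 5000]
curve = scaled_curve(list(range(0, 8000, 500)), 500, sample)
```

Computing the curve once per sex, with that group's own mean, standard deviation, and count, gives one line per legend entry.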

EXTRA CREDIT: Make the bins dynamically sized

(TIP: https://radacad.com/dynamic-banding-or-grouping-in-power-bi-using-dax-measures-choose-the-count-of-bins)
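
One common way to size bins dynamically is to derive the bin width from a user-chosen number of bins and then assign each value to a bin by integer division. The Python below is a minimal sketch of that idea, not necessarily the approach in the linked tip; the function name and sample values are hypothetical.

```python
def dynamic_histogram(values, bin_count):
    """Split the range of `values` into `bin_count` equal-width bins.

    Returns the bin width and the number of values falling in each bin.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bin_count
    if width == 0:
        raise ValueError("all values are identical; cannot form bins")
    counts = [0] * bin_count
    for v in values:
        index = int((v - lo) // width)
        # The maximum value would land one past the end, so place it in the last bin.
        index = min(index, bin_count - 1)
        counts[index] += 1
    return width, counts


# Hypothetical body masses in grams (not the Kaggle data).
width, counts = dynamic_histogram([2700, 3000, 3500, 4000, 4500, 6300], 4)
```

In a report, the bin count would come from a slicer or parameter, and the same width would feed the curve scaling so the line stays aligned with the bars.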

 

Dataset

This week’s dataset is downloadable from Kaggle or via https://github.com/kolky001/Workout-Wednesday-2023

Share

After you finish your workout, share on Twitter using the hashtags #WOW2024 and #PowerBI, and tag @MMarie, @shan_gsd, @KerryKolosko. Also make sure to fill out the Submission Tracker so that we can count you as a participant this week in order to track our participation throughout the year.

Solution

Coming soon.
