Introduction
Overlapping histograms with normal curve overlays compare the distribution of a numerical variable across groups, such as body mass for male and female penguins. At a glance, they show the central tendency, spread, and skewness of the data within each group, as well as how closely each group follows a normal distribution. The further apart the peaks of the distributions are relative to their spread, the more clearly the groups differ.
Requirements
- Obtain the data from Kaggle
- Create DAX measures for the mean, standard deviation, and count for the data.
- Create bins for body mass
- Add a combo bar and line chart to the canvas
- Place 'penguins_size'[body_mass_g (bins)] on the X-Axis.
- Place the count of penguins measure on the column Y-Axis.
- Group by sex by placing the sex field into the legend field well
- Format Bars
- Create a measure for finding the normal distribution curve for each sex
- Scale the distribution curve to the histogram by multiplying the result by the bin width and the number of data points
- Format lines to a smooth curve
EXTRA CREDIT: Make the bins dynamically sized
(TIP: https://radacad.com/dynamic-banding-or-grouping-in-power-bi-using-dax-measures-choose-the-count-of-bins)
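If you want to sanity-check the scaling step outside Power BI, the arithmetic can be sketched in a few lines of Python. This is not the official solution, only a minimal illustration under stated assumptions: the function names (normal_pdf, scaled_normal_curve) are hypothetical, the sample values are made up rather than taken from the penguins dataset, the curve is evaluated at each bin's midpoint, and the scale factor is taken to be bin width times number of data points so that the curve is in the same units as the histogram counts.

```python
import math
from statistics import mean, stdev


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def scaled_normal_curve(values, bin_width):
    """Return {bin_start: expected_count} for one group.

    The density is multiplied by (number of points * bin width) so that the
    curve sits on the same scale as the histogram bar heights.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    n = len(values)

    # Bin starts are multiples of bin_width (an assumption about how the
    # bins are labelled).
    first_bin = math.floor(min(values) / bin_width) * bin_width
    last_bin = math.floor(max(values) / bin_width) * bin_width

    curve = {}
    bin_start = first_bin
    while bin_start <= last_bin:
        midpoint = bin_start + bin_width / 2  # assumption: evaluate at bin centre
        curve[bin_start] = n * bin_width * normal_pdf(midpoint, mu, sigma)
        bin_start += bin_width
    return curve


# Made-up body masses in grams for a single group (illustrative only).
sample_masses = [3000, 3200, 3400, 3400, 3600, 3600, 3600, 3800, 3800, 4000, 4200]
curve = scaled_normal_curve(sample_masses, bin_width=200)
for bin_start, expected in curve.items():
    print(bin_start, round(expected, 2))
```

Because bin_width is a parameter, the same sketch also shows why dynamically sized bins need the curve to be rescaled: changing the bin width changes both the bar heights and the scale factor together.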
Dataset
This week’s dataset is downloadable from Kaggle or via https://github.com/kolky001/Workout-Wednesday-2023
Share
After you finish your workout, share on Twitter using the hashtags #WOW2024 and #PowerBI, and tag @MMarie, @shan_gsd, @KerryKolosko. Also make sure to fill out the Submission Tracker so that we can count you as a participant this week in order to track our participation throughout the year.
Solution
Coming soon.