2024 Week 42 | Power BI: Scatter Plot Dynamic Colour Density

Introduction

Scatter plots are one of the most useful chart types for data exploration. You can use them to identify relationships between two or more variables, find clusters of data, and encode data in several ways to communicate insights. They are especially useful when working with large datasets. In this WoW challenge, our goal is to use marker color to find the clusters where the “carat” and “price” values are most concentrated.

Requirements

  1. Import data from
    https://raw.githubusercontent.com/selva86/datasets/master/diamonds.csv
  2. For the carat and price columns, set summarization to “Don't summarize”.
  3. Create a scatter plot with carat on the X axis and price on the Y axis.
  4. Configure the scatter plot to identify the range of “carat” and “price” values where the data is most concentrated (tip: configure the transparency of the points).
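To see why the transparency tip works, here is a minimal Python sketch. It assumes markers are blended with standard "over" alpha compositing, which is an assumption about rendering and not something this challenge specifies. The function name `stacked_opacity` is made up for illustration. Opacity here is 1 minus the transparency fraction, so 95% transparency means an opacity of 0.05.

```python
def stacked_opacity(alpha: float, n: int) -> float:
    """Combined opacity of n identical markers drawn on top of each other.

    alpha is the opacity of one marker (0 = invisible, 1 = solid).
    Assumes standard 'over' alpha blending: each extra marker covers
    a fraction alpha of whatever still shows through.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - alpha) ** n


# Compare a lone point with increasingly crowded spots.
for n in (1, 5, 20, 100):
    print(n, round(stacked_opacity(0.05, n), 3))
```

A single very transparent point is barely visible, while a spot with many overlapping points approaches a solid color. That contrast is what lets the marker color act as a density indicator.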

TIP: Depending on the distribution of the data, you may need to turn off high-density sampling, which selectively chooses among overlapping points (see the advanced settings of the scatter plot). In the second chart, because of high-density sampling, the algorithm chooses non-overlapping points, thus masking the true distribution.
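If you want to cross-check what the chart shows without depending on any sampling, one option is to count points per grid cell outside Power BI. The sketch below is only an illustration: `densest_bin` is a hypothetical helper, the bin widths are arbitrary choices, and the sample rows are made up (they are not values from diamonds.csv).

```python
import math
from collections import Counter


def densest_bin(points, x_width, y_width):
    """Return ((x_low, x_high), (y_low, y_high), count) for the fullest grid cell.

    points is an iterable of (x, y) pairs; x_width and y_width are the
    cell sizes along each axis.
    """
    counts = Counter(
        (math.floor(x / x_width), math.floor(y / y_width)) for x, y in points
    )
    if not counts:
        raise ValueError("points must not be empty")
    (ix, iy), count = counts.most_common(1)[0]
    x_range = (ix * x_width, (ix + 1) * x_width)
    y_range = (iy * y_width, (iy + 1) * y_width)
    return x_range, y_range, count


# Made-up (carat, price) pairs for illustration only.
sample = [
    (0.31, 540), (0.32, 560), (0.30, 610), (0.33, 700),
    (1.01, 4800), (1.52, 9900),
]
print(densest_bin(sample, x_width=0.25, y_width=1000))
```

To run this against the real data, you could read the downloaded CSV with `csv.DictReader` and pass `(float(row["carat"]), float(row["price"]))` pairs as `points`. The densest cell should roughly match the darkest region of the scatter plot once high-density sampling is turned off.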

  1. For more tips and the solution, check out:
    1. Dynamically Changing the Color Transparency in Power BI | Sandeep Pawar (pawarbi.github.io)
    2. High-density scatter charts in Power BI – Power BI | Microsoft Learn

Dataset

This week’s dataset is downloadable from https://raw.githubusercontent.com/selva86/datasets/master/diamonds.csv

Share

After you finish your workout, share on Twitter using the hashtags #WOW2024 and #PowerBI, and tag @MMarie, @shan_gsd, @KerryKolosko, @PawarBI. Also make sure to fill out the Submission Tracker so that we can count you as a participant this week and track participation throughout the year.

Solution
