Visualizing Mean, Median, Mode and Standard Deviation, all together

NIBEDITA (NS)
4 min read · Oct 4, 2024


Hey, guys! Welcome to the 6th article in our Data Science and Analytics Series. Today we won’t solve a problem. Instead, we’ll visualize the Mean, Median, Mode and Standard Deviation using Python. This is the 2nd article on visualizing data in our series.

Today, we don’t have any sample data. We’ll generate our own data and visualize it.

Great! If you want, you can also watch the Video of this Visualization Step-by-step: 👇

Video: How to Visualize Mean, Median, Mode and Standard Deviation in Python

Before visualizing, our first step is to import all the necessary libraries. Here, we’ll import NumPy for generating data, Python’s built-in statistics module for calculating basic statistics, the stats module from SciPy for finding the mode, and Matplotlib for creating our plots.

Alright! Let’s get into it and start writing our code now.

So, let’s import our required libraries.

import numpy as np
import statistics as st
from scipy import stats
import matplotlib.pyplot as plt

Great! Now, we’ll generate our data using NumPy’s random.normal() method.

data = np.random.normal(50, 15, 100)
data

# Output:
array([77.84034586, 63.61076827, 50.05799708, 51.7076173 , 59.24958042,
23.51997514, 10.54280433, 44.21175473, 61.26905135, 68.99763631,
44.06561244, 56.84306375, 54.16079879, 51.58023607, 62.82149705,
56.3260037 , 41.77964388, 60.46394582, 40.88899742, 32.29432526,
53.75689231, 38.34444024, 60.13565411, 42.48499258, 46.57875768,
59.93981781, 57.2281684 , 49.4595047 , 56.76133072, 36.38659096,
55.26674737, 56.33925831, 36.84115103, 50.10214928, 74.26654909,
37.84775444, 53.14363925, 66.67358034, 50.39016715, 49.34676685,
51.17626633, 50.29158009, 20.48776699, 71.74993416, 49.40125608,
42.4625136 , 77.83724988, 80.42739908, 76.47892892, 32.71490197,
51.9198954 , 55.16418251, 38.61708791, 53.99685913, 51.93527014,
67.21953975, 49.51009701, 29.88840667, 39.91931725, 74.63304199,
35.31836408, 54.69640204, 48.66799816, 48.62042227, 41.34877846,
49.37191327, 44.56321766, 59.23245967, 53.46125848, 41.52937801,
45.39339701, 33.21696925, 69.68149089, 53.42870082, 63.61906899,
54.16198165, 46.32931798, 42.0988858 , 32.53843158, 30.62387111,
32.08206116, 61.71842088, 50.85683766, 40.56481588, 48.89087303,
47.75223865, 57.34026571, 70.13899741, 70.00159767, 40.1493401 ,
44.53812029, 43.71172317, 26.10512358, 36.53290379, 59.31540856,
48.11504779, 60.25513662, 33.19965716, 44.06735668, 39.77894382])

This basically means that we’re drawing 100 random values from a Normal Distribution with a Mean of 50 and a Standard Deviation of 15. Since the values are random, your output will differ from the one shown here.
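If you’d like to get the same values on every run, one option is to use a seeded generator. Here’s a minimal sketch using NumPy’s default_rng. The seed 42 and the names rng and data_seeded are arbitrary choices for illustration; they are not part of the code in this article.

```python
import numpy as np

# A seeded generator returns the same "random" values on every run.
# The seed 42 is an arbitrary choice for illustration.
rng = np.random.default_rng(42)

# loc = mean, scale = standard deviation, size = number of values
data_seeded = rng.normal(loc=50, scale=15, size=100)

# The sample statistics land near 50 and 15, but not exactly on them,
# because 100 values is a fairly small sample.
print(data_seeded.mean(), data_seeded.std(ddof=1))
```

Change the seed and you get a different, but again repeatable, dataset.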

Okay, now we’ll calculate our statistical measures.

meanVal = st.mean(data)
medianVal = st.median(data)
modeVal = stats.mode(data).mode
stdVal = st.stdev(data)

If you’re wondering why I didn’t use statistics for the mode as well, here’s the reason. Our generated dataset consists of unique values, which means every value is tied as a mode. How statistics.mode() handles such a tie depends on your Python version: before Python 3.8 it raises a StatisticsError, and from 3.8 onward it returns the first value it encounters. For simplicity and to ensure smooth operation, I have used the stats.mode() function from SciPy, which returns the smallest of the tied values. And if you’re wondering about the extra .mode at the end, it’s there to pull out just the mode value, because stats.mode() returns the mode along with its frequency.

stats.mode(data)

# Output: ModeResult(mode=10.542804326118677, count=1)

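See, here we’re getting the mode value and its frequency as well. A count of 1 tells us that no value repeats, so this “mode” is simply the smallest value in our data. I hope this much is clear.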
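With continuous data like ours, the mode is really just a tie-break. The small sketch below shows this using only Python’s built-in statistics module (Python 3.8 or newer assumed), along with one common workaround: grouping the values into bins first. The toy list values and the 10-unit bin width are made up for illustration.

```python
import statistics as st

# Toy data where every value is unique, like our generated dataset.
values = [52.3, 47.1, 58.9, 41.6, 55.4, 49.8, 53.7]

# Python 3.8+: on a tie, mode() returns the first value it encounters.
first_mode = st.mode(values)
print(first_mode)            # 52.3

# multimode() returns every value that shares the highest count.
all_modes = st.multimode(values)
print(len(all_modes))        # 7, i.e. every value is a "mode"

# Workaround: group values into 10-unit bins, then take the modal bin.
binned = [int(v // 10) * 10 for v in values]   # 52.3 -> 50, 47.1 -> 40
modal_bin = st.mode(binned)
print(modal_bin)             # 50
```

The modal bin tells you where the values pile up, which is usually what we want from a mode on continuous data.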

Fine, now let’s jump into the main part of our article, Visualization.

plt.figure(figsize=(12, 6))
plt.hist(data, bins = 15, color = 'skyblue', alpha = 0.7,
         edgecolor = 'black', label = 'Data Distribution')

plt.axvline(meanVal, color = 'red', linewidth = 2,
            label = f'Mean: {meanVal:.2f}')
plt.axvline(medianVal, color = 'green', linewidth = 2,
            label = f'Median: {medianVal:.2f}')
plt.axvline(modeVal, color = 'blue', linewidth = 2,
            label = f'Mode: {modeVal:.2f}')
# Two purple lines at mean - std and mean + std; only the first gets a legend entry
plt.axvline(meanVal - stdVal, color = 'purple', linewidth = 2,
            label = f'Standard Deviation: {stdVal:.2f}')
plt.axvline(meanVal + stdVal, color = 'purple', linewidth = 2)


plt.title("Visualization of Mean, Median, Mode and Standard Deviation")
plt.xlabel("Data Values")
plt.ylabel("Frequency")
plt.legend()
plt.show()

Let me first explain what exactly I have done here. First, we created a figure of size 12 by 6 inches. The size isn’t fixed: you can set it to anything you want, or just skip this line if you like the default size.

Next, we created a histogram to visualize the distribution of our data. The bins argument defines the number of bins in the histogram. I have set it to 15, but you might wanna experiment with different bin counts depending on your data distribution to get the best visual results. I want the bars to be a slightly transparent sky blue, which is why I’ve set alpha to 0.7. If you don’t like it, you can adjust it according to your preferences. I also want the edgecolor to be black for clarity. If you like another color, you can change it as well. Finally, I have added the label 'Data Distribution' so the histogram shows up in the legend.
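If you’d rather not guess the bin count, NumPy can suggest one. The sketch below compares our fixed 15 bins with two of NumPy’s built-in rules, 'sturges' and 'auto', using np.histogram_bin_edges. The seeded sample_data is a stand-in for the article’s data, and the names are my own.

```python
import numpy as np

# Stand-in data: 100 values from a normal distribution (mean 50, std 15).
rng = np.random.default_rng(0)
sample_data = rng.normal(50, 15, 100)

bin_counts = {}
for rule in (15, "sturges", "auto"):
    # The edges mark the bin boundaries, so there is one more edge than bins.
    edges = np.histogram_bin_edges(sample_data, bins=rule)
    bin_counts[rule] = len(edges) - 1
    print(rule, "->", bin_counts[rule], "bins")
```

plt.hist() accepts the same rule names, so something like plt.hist(data, bins = 'auto') should work if you want Matplotlib to pick the bins for you.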

Next, we’ve added 5 vertical lines to the axes. The 1st line is for the Mean, the 2nd line is for the Median and the 3rd line is for the Mode. The 4th and 5th lines sit one Standard Deviation below and above the Mean, which shows the spread of the data.
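To put a number on that spread, we can count how many values fall between the two Standard Deviation lines. For normally distributed data this is roughly 68%. Here’s a minimal sketch, again with seeded stand-in data and names of my own choosing:

```python
import numpy as np

# Stand-in data: 100 values from a normal distribution (mean 50, std 15).
rng = np.random.default_rng(1)
sample_data = rng.normal(50, 15, 100)

mean_val = sample_data.mean()
std_val = sample_data.std(ddof=1)   # sample standard deviation, like st.stdev()

# True for every value between (mean - std) and (mean + std).
inside = (sample_data >= mean_val - std_val) & (sample_data <= mean_val + std_val)

# The mean of a boolean array is the share of True values.
share_inside = inside.mean()
print(f"{share_inside:.0%} of values lie within one standard deviation of the mean")
```

With only 100 values the share won’t be exactly 68%, but it should land in that neighbourhood.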

Here, I have used the plt.axvline() function. In our previous visualization, we used plt.axhline(), which draws a horizontal line across the axes. Similarly, axvline() draws a vertical line.

We’ve then added the final labels to our plot: the title, the x-axis label, and the y-axis label. The legend() call adds a legend to identify each line. And finally, the show() function displays the plot.

Figure: Visualization of Mean, Median, Mode and Standard Deviation

Here’s our final outcome. The vertical lines clearly show where the Mean, Median, Mode, and Standard Deviation fall within our data.

You can modify our visualization in many ways. Try doing it by yourself as well.
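As one possible variation, here’s a sketch that shades the region within one Standard Deviation of the Mean using axvspan(), instead of drawing two separate lines. The function name plot_with_band and the seeded data are hypothetical and only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_with_band(values, bins=15):
    """Histogram with the mean as a line and mean ± 1 std as a shaded band."""
    mean_val = np.mean(values)
    std_val = np.std(values, ddof=1)   # sample standard deviation

    fig, ax = plt.subplots(figsize=(12, 6))
    ax.hist(values, bins=bins, color="skyblue", alpha=0.7, edgecolor="black")
    ax.axvline(mean_val, color="red", linewidth=2, label=f"Mean: {mean_val:.2f}")
    # axvspan() fills the vertical strip between two x positions.
    ax.axvspan(mean_val - std_val, mean_val + std_val, color="purple",
               alpha=0.15, label=f"Mean ± 1 SD ({std_val:.2f})")
    ax.set_xlabel("Data Values")
    ax.set_ylabel("Frequency")
    ax.legend()
    return fig, ax

# Stand-in data: 100 values from a normal distribution (mean 50, std 15).
rng = np.random.default_rng(2)
fig, ax = plot_with_band(rng.normal(50, 15, 100))
```

Call plt.show() afterwards to display the figure. The shaded band makes it a bit easier to see how much of the histogram sits close to the Mean.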

That’s it for today’s article! I hope this breakdown helps you understand how to visualize these important statistical measures. We’ll solve more real-world data science and analytics problems in future articles and also visualize the data to understand the concepts better.

Thanks for reading! 😊
