Central tendency

Under what circumstances is the median a better measure of central tendency than the mean?

The preferred measure of central tendency often depends on the shape of the distribution. Of the three measures of central tendency, the mean is the most heavily influenced by outliers and skewness.

In a symmetrical distribution with a single peak, the mean, median, and mode are all equal. In these cases, the mean is often the preferred measure of central tendency.

[Figure: symmetrical distribution, where mean = median = mode]

For distributions that have outliers or are skewed, the median is often the preferred measure of central tendency because the median is more resistant to outliers than the mean. Below you will see how the direction of skewness impacts the order of the mean, median, and mode. Note that the mean is pulled in the direction of the skewness [i.e., the direction of the tail].

[Figure: distribution skewed to the left, where mean < median < mode]

[Figure: distribution skewed to the right, where mode < median < mean]

Learning Outcomes

Recognize, describe, and calculate the measures of the center of data: mean, median, and mode.

By now, everyone should know how to calculate mean, median and mode. They each give us a measure of Central Tendency [i.e. where the center of our data falls], but often give different answers. So how do we know when to use each? Here are some general rules:

The mean is the most frequently used measure of central tendency and is generally considered the best one. However, there are some situations where either the median or the mode is preferred.
The median is the preferred measure of central tendency when:
1. There are a few extreme scores in the distribution of the data. [NOTE: Remember that a single outlier can have a great effect on the mean.]
2. There are some missing or undetermined values in your data.
3. There is an open-ended distribution. [For example, if you have a data field which measures number of children and your options are 0, 1, 2, 3, 4, 5 or “6 or more,” then the “6 or more” field is open ended and makes calculating the mean impossible, since we do not know exact values for this field.]
4. You have data measured on an ordinal scale.
The mode is the preferred measure when data are measured on a nominal [and sometimes even ordinal] scale.
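
The open-ended case in rule 3 can be sketched in a few lines. The household counts below are hypothetical and are not taken from this article. The sketch only shows that the median category can still be located by counting up to the middle observation, as long as that observation does not fall in the open-ended category, whereas the mean would need the exact values hidden inside "6 or more".

```python
# Hypothetical survey counts: number of children -> number of households.
# The "6 or more" category is open ended, so its exact values are unknown.
counts = {"0": 12, "1": 18, "2": 25, "3": 9, "4": 4, "5": 2, "6 or more": 3}


def median_category(freq):
    """Return the category that contains the middle observation.

    Assumes the dict is in ascending category order. For an even total,
    this simplification picks the lower of the two middle observations.
    """
    total = sum(freq.values())
    middle = (total + 1) // 2      # 1-based position of the middle observation
    running = 0
    for category, n in freq.items():
        running += n               # cumulative frequency so far
        if running >= middle:
            return category
    raise ValueError("empty frequency table")


print(median_category(counts))     # 2
```

With 73 households in total, the 37th observation is the middle one, and it falls in the "2" category.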

Published on July 30, 2020 by Pritha Bhandari. Revised on June 9, 2022.

Measures of central tendency help you find the middle, or the average, of a dataset. The 3 most common measures of central tendency are the mode, median, and mean.

Mode: the most frequent value.
Median: the middle number in an ordered dataset.
Mean: the sum of all values divided by the total number of values.

In addition to central tendency, the variability and distribution of your dataset are important to understand when performing descriptive statistics.

Distributions and central tendency

A dataset is a distribution of n scores or values.

Normal distribution

In a normal distribution, data is symmetrically distributed with no skew. Most values cluster around a central region, with values tapering off as they go further away from the center. The mean, mode and median are exactly the same in a normal distribution.

Example: Normal distribution
You survey a sample in your local community on the number of books they read in the last year.

A histogram of your data shows the frequency of responses for each possible number of books. From looking at the chart, you see that there is a normal distribution.

The mean, median and mode are all equal; the central tendency of this dataset is 8.

Skewed distributions

In skewed distributions, more values fall on one side of the center than the other, and the mean, median and mode all differ from each other. One side has a longer, more spread out tail with fewer scores than the other. The direction of this tail tells you the side of the skew.

In a positively skewed distribution, there’s a cluster of lower scores and a spread out tail on the right. In a negatively skewed distribution, there’s a cluster of higher scores and a spread out tail on the left.

In this histogram, your distribution is skewed to the right, and the central tendency of your dataset is on the lower end of possible scores.

In a positively skewed distribution, mode < median < mean.

In this histogram, your distribution is skewed to the left, and the central tendency of your dataset is towards the higher end of possible scores.

In a negatively skewed distribution, mean < median < mode.
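
The two orderings can be checked on a small example. The scores below are hypothetical, chosen only so that the tail is easy to see; the left-skewed list is the mirror image of the right-skewed one. This is a sketch using Python's standard `statistics` module, not a general proof of the ordering.

```python
from statistics import mean, median, mode

# Hypothetical scores, chosen only to illustrate the ordering.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]   # long tail on the right
left_skewed = [16 - x for x in right_skewed]      # mirror image: long tail on the left

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, "mode:", mode(data), "median:", median(data), "mean:", mean(data))
```

For the right-skewed list the mode is 2, the median is 3 and the mean is 4.6, so mode < median < mean. The mirrored list reverses the order.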

Mode

The mode is the most frequently occurring value in the dataset. It’s possible to have no mode, one mode, or more than one mode.

To find the mode, sort your dataset numerically or categorically and select the response that occurs most frequently.

Example: Finding the mode
In a survey, you ask 9 participants whether they identify as conservative, moderate, or liberal.

To find the mode, sort your data by category and find which response was chosen most frequently.

To make it easier, you can create a frequency table to count up the values for each category.

Political ideology	Frequency
Conservative
Moderate
Liberal

Mode: Liberal

The mode is easily seen in a bar graph because it is the value with the highest bar.
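
A frequency table like this can be built with `collections.Counter`. The nine responses below are an assumption made for illustration: only the number of participants and the resulting mode come from the example.

```python
from collections import Counter

# Hypothetical responses from 9 participants. The individual counts are an
# assumption; only the total (9) and the mode (Liberal) come from the text.
responses = ["Liberal", "Moderate", "Conservative", "Liberal", "Moderate",
             "Liberal", "Conservative", "Moderate", "Liberal"]

frequency = Counter(responses)                       # category -> count
top_category, top_count = frequency.most_common(1)[0]

print(frequency)
print("Mode:", top_category)
```

`most_common(1)` returns the single category with the highest count, which corresponds to the tallest bar in a bar graph.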

When to use the mode

The mode is most applicable to data from a nominal level of measurement. Nominal data is classified into mutually exclusive categories, so the mode tells you the most popular category.

For continuous variables or ratio levels of measurement, the mode may not be a helpful measure of central tendency. That’s because there are many more possible values than there are in a nominal or ordinal level of measurement. It’s unlikely for a value to repeat in a ratio level of measurement.

Example: Ratio data with no mode
You collect data on reaction times in a computer task, and your dataset contains values that are all different from each other.

Participant	1	2	3	4	5	6	7	8	9
Reaction time [milliseconds]	267	345	421	324	401	312	382	298	303

In this dataset, there is no mode, because each value occurs only once.
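
The "no mode" convention can be made explicit in code. The helper below is a hypothetical sketch: it returns an empty list when no value repeats, which matches the convention used here. Python's built-in `statistics.multimode` behaves differently and would return every value for this dataset.

```python
from collections import Counter

reaction_times = [267, 345, 421, 324, 401, 312, 382, 298, 303]  # milliseconds


def modes(values):
    """Return every value tied for the highest count, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []                  # no value repeats, so there is no mode
    return [v for v, c in counts.items() if c == highest]


print(modes(reaction_times))       # []
print(modes([1, 2, 2, 3, 3]))      # [2, 3]
```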

Median

The median of a dataset is the value that’s exactly in the middle when it is ordered from low to high.

Example: Finding the median
You measure the reaction times of 7 participants on a computer task and categorize them into 3 groups: slow, medium or fast.

Participant	1	2	3	4	5	6	7
Speed	Medium	Slow	Fast	Fast	Medium	Fast	Slow

To find the median, you first order all values from low to high. Then, you find the value in the middle of the ordered dataset—in this case, the value in the 4th position.

Ordered dataset	Slow	Slow	Medium	Medium	Fast	Fast	Fast

Median: Medium
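
Ordinal categories have no numeric value, so sorting them needs an explicit rank order. The sketch below assumes the ranking Slow < Medium < Fast and uses the seven speeds from the example. The `RANK` mapping is a name introduced here for illustration.

```python
# Rank order for the ordinal categories, from low to high (an assumed encoding).
RANK = {"Slow": 0, "Medium": 1, "Fast": 2}

speeds = ["Medium", "Slow", "Fast", "Fast", "Medium", "Fast", "Slow"]

ordered = sorted(speeds, key=RANK.__getitem__)
middle_index = len(ordered) // 2     # 0-based index of the 4th of 7 values

print(ordered)
print("Median:", ordered[middle_index])   # Median: Medium
```

Picking a single middle index works only because the number of values is odd. With an even count of ordinal values, the two middle categories cannot be averaged.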

In larger datasets, it’s easier to use simple formulas to figure out the position of the middle value in the distribution. You use different methods to find the median of a dataset depending on whether the total number of values is even or odd.

Median of an odd-numbered dataset

For an odd-numbered dataset, find the value that lies at the $\frac{n+1}{2}$ position, where n is the number of values in the dataset.

Example
You measure the reaction times in milliseconds of 5 participants and order the dataset.

Reaction time [milliseconds]	287	298	345	365	380

The middle position is calculated using $\frac{n+1}{2}$, where n = 5, which gives $\frac{5+1}{2} = 3$.

That means the median is the 3rd value in your ordered dataset.

Median: 345 milliseconds
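
The position rule for an odd-numbered dataset translates directly into code. The only adjustment in this sketch is that the rule counts positions from 1, while Python lists are indexed from 0.

```python
reaction_times = [287, 298, 345, 365, 380]    # already ordered, in milliseconds

n = len(reaction_times)
position = (n + 1) // 2                       # 1-based middle position; exact because n is odd
median_value = reaction_times[position - 1]   # shift to Python's 0-based indexing

print(position, median_value)                 # 3 345
```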

Median of an even-numbered dataset

For an even-numbered dataset, find the two values in the middle of the dataset: the values at the $\frac{n}{2}$ and $\frac{n}{2}+1$ positions. Then, find their mean.

Example
You measure the reaction times of 6 participants and order the dataset.

Reaction time [milliseconds]	287	298	345	357	365	380

The middle positions are calculated using $\frac{n}{2}$ and $\frac{n}{2}+1$, where n = 6, which gives positions 3 and 4.

That means the middle values are the 3rd value, which is 345, and the 4th value, which is 357.

To get the median, take the mean of the 2 middle values by adding them together and dividing by 2.

Median: 351 milliseconds
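
Both position rules can be combined into one function. This is a minimal sketch rather than a replacement for the standard library: `statistics.median` already does the same job, and it is imported here only to cross-check the result.

```python
from statistics import median as stdlib_median


def median(values):
    """Median via the position rules in the text: (n + 1) / 2 for odd n,
    and the mean of positions n / 2 and n / 2 + 1 for even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]     # single middle value
    lower = ordered[n // 2 - 1]              # position n / 2 (1-based)
    upper = ordered[n // 2]                  # position n / 2 + 1 (1-based)
    return (lower + upper) / 2


even_times = [287, 298, 345, 357, 365, 380]
print(median(even_times))                    # 351.0
print(median(even_times) == stdlib_median(even_times))
```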

Mean

The arithmetic mean of a dataset [which is different from the geometric mean] is the sum of all values divided by the total number of values. It’s the most commonly used measure of central tendency because all values are used in the calculation.

Example: Finding the mean

Participant	1	2	3	4	5
Reaction time [milliseconds]	287	345	365	298	380

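First you add up all the values:

$\sum x = 287 + 345 + 365 + 298 + 380 = 1675$

Then you calculate the mean using the formula $\bar{x} = \frac{\sum x}{n}$.

There are 5 values in the dataset, so n = 5, and $\bar{x} = \frac{1675}{5} = 335$.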

Mean [x̄]: 335 milliseconds
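
The same calculation in code, as a short sketch that keeps the sum and the count as separate steps so that they map onto the two parts of the formula:

```python
reaction_times = [287, 345, 365, 298, 380]   # milliseconds

total = sum(reaction_times)                  # sum of all values
n = len(reaction_times)                      # number of values
mean = total / n

print(total, n, mean)                        # 1675 5 335.0
```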

Outlier effect on the mean

An outlier is a value that differs significantly from the others in a dataset. Because all values are used to calculate the mean, a single extreme outlier can increase or decrease it substantially.

Example: Mean with an outlier
In this dataset, we swap out one value with an extreme outlier.

Participant	1	2	3	4	5
Reaction time [milliseconds]	832	345	365	298	380

Due to the outlier, the mean [x̄] becomes much higher, even though all the other numbers in the dataset stay the same.

Mean: 444 milliseconds
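
Comparing the mean with the median on both versions of the dataset shows why the median is described as more resistant to outliers. The sketch below uses the two datasets from the examples and Python's `statistics` module.

```python
from statistics import mean, median

original = [287, 345, 365, 298, 380]         # milliseconds
with_outlier = [832, 345, 365, 298, 380]     # 287 replaced by 832

for label, data in [("original", original), ("with outlier", with_outlier)]:
    print(label, "mean:", mean(data), "median:", median(data))
```

The mean rises from 335 to 444 milliseconds, a shift of 109, while the median only moves from 345 to 365.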

Population versus sample mean

A dataset contains values from a sample or a population. A population is the entire group that you are interested in researching, while a sample is only a subset of that population.

While data from a sample can help you make estimates about a population, only full population data can give you the complete picture.

In statistics, the notation of a sample mean and a population mean and their formulas are different. But the procedures for calculating the population and sample means are the same.

Sample mean formula
The sample mean is written as M or x̄ [pronounced x-bar]. For calculating the mean of a sample, use this formula:

$\bar{x} = \frac{\sum x}{n}$

x̄: sample mean
∑x: sum of all values in the sample dataset
n: number of values in the sample dataset

Population mean formula
The population mean is written as μ [the Greek letter mu]. For calculating the mean of a population, use this formula:

$\mu = \frac{\sum X}{N}$

μ: population mean
∑X: sum of all values in the population dataset
N: number of values in the population dataset

The 3 main measures of central tendency are best used in combination with each other because they have complementary strengths and limitations. But sometimes only 1 or 2 of them are applicable to your dataset, depending on the level of measurement of the variable.

The mode can be used for any level of measurement, but it’s most meaningful for nominal and ordinal levels.
The median can only be used on data that can be ordered – that is, from ordinal, interval and ratio levels of measurement.
The mean can only be used on interval and ratio levels of measurement because it requires equal spacing between adjacent values or scores in the scale.

Level of measurement	Examples	Measure of central tendency
Nominal	Ethnicity, Political ideology	Mode
Ordinal	Level of anxiety, Income bracket	Mode, Median
Interval and ratio	Reaction time, Test score, Temperature	Mode, Median, Mean
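
The relationship between level of measurement and applicable measures can be written as a simple lookup. The dictionary and function names below are hypothetical, introduced only for this sketch. The contents follow the rules stated above.

```python
# Measures that apply at each level of measurement, as listed in the text.
APPLICABLE = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean"],
}


def applicable_measures(level):
    """Return the measures of central tendency usable at the given level."""
    try:
        return APPLICABLE[level.lower()]
    except KeyError:
        raise ValueError(f"unknown level of measurement: {level!r}") from None


print(applicable_measures("Ordinal"))        # ['mode', 'median']
```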

To decide which measures of central tendency to use, you should also consider the distribution of your dataset.

For normally distributed data, all three measures of central tendency will give you the same answer so they can all be used.

In skewed distributions, the median is often the best measure because it is far less affected than the mean by extreme outliers or asymmetric distributions of scores. The mean and mode can vary in skewed distributions.


Bhandari, P. [2022, June 09]. Central Tendency | Understanding the Mean, Median & Mode. Scribbr. Retrieved October 31, 2022, from //www.scribbr.com/statistics/central-tendency/
