Z Scores if You Don't Know the Standard Deviation

A z-score measures the distance between a data point and the mean using standard deviations. Z-scores can be positive or negative. The sign tells you whether the observation is above or below the mean. For example, a z-score of +2 indicates that the data point falls two standard deviations above the mean, while a -2 signifies it is two standard deviations below the mean. A z-score of zero equals the mean. Statisticians also refer to z-scores as standard scores, and I'll use those terms interchangeably.

Standardizing the raw data by transforming them into z-scores provides the following benefits:

  • Understand where a data point fits into a distribution.
  • Compare observations between different variables.
  • Identify outliers.
  • Calculate probabilities and percentiles using the standard normal distribution.

In this post, I cover all these uses for z-scores, along with using z-tables and z-score calculators, and I show you how to do it all in Excel.

How to Find a Z-score

To calculate z-scores, take the raw measurements, subtract the mean, and divide by the standard deviation.

The formula for finding z-scores is the following:

Z = \frac{X - \mu}{\sigma}

X represents the data point of interest. Mu (μ) and sigma (σ) represent the mean and standard deviation for the population from which you drew your sample. Alternatively, use the sample mean and standard deviation when you do not know the population values.
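As a minimal sketch, the formula translates directly into a few lines of Python. The function name `z_score` and the example numbers are illustrative assumptions, not part of the original post.

```python
def z_score(x: float, mean: float, std_dev: float) -> float:
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Hypothetical values: an observation of 130 from a population with mean 100 and SD 15.
print(z_score(130, 100, 15))  # 2.0
```

Pass the population mean and standard deviation when you know them, or the sample estimates when you don't.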

Z-scores follow the distribution of the original data. Consequently, when the original data follow the normal distribution, so do the corresponding z-scores. Specifically, the z-scores follow the standard normal distribution, which has a mean of 0 and a standard deviation of 1. However, skewed data will produce z-scores that are similarly skewed.
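To see why skew survives standardization, here is a small sketch using made-up, right-skewed numbers. The `skewness` helper is a simple moment-based measure written for this illustration. Standardizing only shifts and rescales the data, so the mean becomes 0 and the standard deviation becomes 1, but the shape is unchanged.

```python
import math
import statistics


def skewness(values):
    """Moment-based skewness: the average cubed z-score (using the population SD)."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])


raw = [1, 2, 2, 3, 3, 3, 4, 5, 9, 15]  # hypothetical right-skewed data
m, sd = statistics.fmean(raw), statistics.pstdev(raw)
z = [(v - m) / sd for v in raw]

# The z-scores have mean ~0 and SD ~1, yet the same skewness as the raw data.
print(math.isclose(statistics.pstdev(z), 1.0))
print(math.isclose(skewness(raw), skewness(z), rel_tol=1e-9))
```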

In this post, I include graphs of z-scores using the standard normal distribution because they bring the concepts to life. Additionally, z-scores are most valuable when your data are normally distributed. However, be aware that when your data are nonnormal, the z-scores are also nonnormal, and the interpretations might not be valid.

Learn how to identify the distribution of your data!

Related posts: The Mean in Statistics and Standard Deviation

Using Z-scores to Understand How an Observation Fits into a Distribution

Z-scores help you understand where a specific observation falls within a distribution. Sometimes the raw test scores are not informative. For example, SAT, ACT, and GRE scores do not have real-world interpretations on their own. An SAT score of 1340 is not fundamentally meaningful. Many psychological metrics are simply sums or averages of responses to a survey. For these cases, you need to know how an individual score compares to the entire distribution of scores. For example, if your standard score for any of these tests is a +2, that's far above the mean. Now that's helpful!

In other cases, the measurement units are meaningful, but you want to see the relative standing. For example, if a baby weighs 5 kilograms, you might wonder how her weight compares to others. For a one-month-old baby girl, that equates to a z-score of 0.74. She weighs more than average, but not by a full standard deviation. Now you understand where she fits in with her cohort!

In all these cases, you're using standard scores to compare an observation to the average. You're placing that value within an entire distribution.

When your data are normally distributed, you can graph z-scores on the standard normal distribution, which is a particular form of the normal distribution. The mean occurs at the peak with a z-score of zero. Above-average z-scores are on the right half of the distribution and below-average values are on the left. The graph below shows where the baby's z-score of 0.74 fits in the population.

image of the standard normal distribution.

Analysts often convert standard scores to percentiles, which I cover later in this post.

Related post: Understanding the Normal Distribution

Using Standard Scores to Compare Different Types of Variables

Z-scores allow you to take data points drawn from populations with different means and standard deviations and place them on a common scale. This standard scale lets you compare observations for different types of variables that would otherwise be difficult to compare. That's why z-scores are also known as standard scores, and the process of transforming raw data to z-scores is called standardization. It lets you compare data points across variables that have different distributions.

In other words, you can compare apples to oranges. Isn't statistics grand!

Imagine we literally need to compare apples to oranges. Specifically, we'll compare their weights. We have a 110-gram apple and a 100-gram orange.

By comparing the raw values, it's easy to see the apple weighs slightly more than the orange. However, let's compare their z-scores. To do this, we need to know the means and standard deviations for the populations of apples and oranges. Assume that apples and oranges follow a normal distribution with the following properties:

                       Apples    Oranges
Mean weight (grams)    100       140
Standard deviation     15        25

Let's calculate the z-scores for our apple and orange!

Apple = (110-100) / 15 = 0.667

Orange = (100-140) / 25 = -1.6

The apple's positive z-score (0.667) signifies that it is heavier than the average apple. It's not an extreme value, but it is above the mean. Conversely, the orange has a markedly negative z-score (-1.6). It's well below the mean weight for oranges. I've positioned these standard scores in the standard normal distribution below.

Graph of a standard normal distribution that compares apples to oranges using a Z-score.

Our apple is a bit heavier than average, while the orange is puny! Using z-scores, we learned where each fruit falls within its distribution and how they compare.
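The apple and orange calculation can be reproduced with a short Python sketch. The means, standard deviations, and weights come from the example above; the dictionary layout and variable names are just one convenient way to organize them.

```python
# Population parameters and observed weights from the example (grams).
fruits = {
    "apple": {"mean": 100, "sd": 15, "weight": 110},
    "orange": {"mean": 140, "sd": 25, "weight": 100},
}

# Standardize each fruit against its own population.
z_scores = {
    name: (f["weight"] - f["mean"]) / f["sd"] for name, f in fruits.items()
}

for name, z in z_scores.items():
    print(f"{name}: z = {z:.3f}")
# apple: z = 0.667
# orange: z = -1.600
```

Although the apple is only 10 grams heavier in raw terms, the two fruits sit more than two standard deviations apart on the common scale.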

Using Z-scores to Detect Outliers

Z-scores can quantify the unusualness of an observation. Raw data values that are far from the average are unusual and potential outliers. Consequently, we're looking for high absolute z-scores.

The standard cutoff values for finding outliers are z-scores of +/-3 or more extreme. The standard normal distribution plot below displays the distribution of z-scores. Z-scores beyond the cutoff are so unusual you can hardly see the shading under the curve.

Distribution of Z-scores for finding outliers.

In populations that follow a normal distribution, z-score values outside +/-3 have a probability of 0.0027 (2 * 0.00135), approximately one in 370 observations. However, if your data don't follow a normal distribution, this approach might not be correct.
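If you want to check those tail probabilities yourself, here is a minimal sketch using Python's standard library (`statistics.NormalDist`). It assumes the data are normally distributed, as the paragraph above does.

```python
from statistics import NormalDist

std_normal = NormalDist()       # mean 0, standard deviation 1

one_tail = std_normal.cdf(-3)   # P(Z < -3)
two_tails = 2 * one_tail        # P(|Z| > 3), by symmetry

print(f"one tail:   {one_tail:.5f}")        # 0.00135
print(f"both tails: {two_tails:.4f}")       # 0.0027
print(f"about 1 in {1 / two_tails:.0f}")    # about 1 in 370
```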

For the example dataset, I display the raw data points and their z-scores. I circled an observation that is a potential outlier.

Datasheet that displays Z-scores to identify outliers.

Caution: Z-scores can be misleading in small datasets because the maximum z-score is limited to (n−1) / √n.

Samples with 10 or fewer data points cannot have z-scores that exceed the cutoff value of +/-3.
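A quick sketch confirms that limit. The helper name `max_z` is an assumption for illustration, and the bound applies when z-scores are computed with the sample standard deviation.

```python
import math


def max_z(n: int) -> float:
    """Largest possible |z| in a sample of size n: (n - 1) / sqrt(n)."""
    return (n - 1) / math.sqrt(n)


for n in (5, 10, 11, 20):
    print(n, round(max_z(n), 3))
# 5 1.789
# 10 2.846
# 11 3.015
# 20 4.249

# Smallest sample size where a z-score can exceed the cutoff of 3.
smallest_n = next(n for n in range(2, 100) if max_z(n) > 3)
print(smallest_n)  # 11
```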

Additionally, an outlier's presence throws off the z-scores because it inflates the mean and standard deviation. Notice how all z-scores are negative except the outlier's value. If we calculated z-scores without the outlier, they'd be different! If your dataset contains outliers, z-values appear to be less extreme (i.e., closer to zero).
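The following sketch illustrates that masking effect with made-up numbers (they are not the dataset from the post). It computes z-scores twice: once with the suspect value included in the mean and standard deviation, and once with it excluded.

```python
import statistics


def z_scores(values, reference=None):
    """Z-scores of `values` using the sample mean and SD of `reference` (default: values)."""
    ref = values if reference is None else reference
    m, sd = statistics.mean(ref), statistics.stdev(ref)
    return [(v - m) / sd for v in values]


data = [10, 11, 12, 11, 10, 12, 11, 50]  # hypothetical; 50 is the suspect value

with_outlier = z_scores(data)                          # mean and SD include 50
without_outlier = z_scores(data, reference=data[:-1])  # mean and SD exclude 50

print([round(z, 2) for z in with_outlier])
print([round(z, 2) for z in without_outlier])
```

In the first list, every value except the suspect one has a negative z-score, and the suspect value itself stays below 3. In the second list, its z-score is far more extreme.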

Related post: Five Ways to Find Outliers

Using Z-tables to Calculate Probabilities and Percentiles

The standard normal distribution is a probability distribution. Consequently, if you have only the mean and standard deviation, and you can reasonably assume your data follow the normal distribution (at least approximately), you can easily use z-scores to calculate probabilities and percentiles. Typically, you'll use online calculators, Excel, or statistical software for these calculations. We'll get to that.

But first I'll show you the old-fashioned way of doing that by hand using z-tables.

Let's go back to the z-score for our apple (0.667) from before. We'll use it to calculate its weight percentile. A percentile is the proportion of a population that falls below a value. Consequently, we need to find the area under the standard normal distribution curve corresponding to the range of z-scores less than 0.667. In the portion of the z-table below, I'll use the standard score that is closest to our apple, which is 0.65.

Photograph shows a portion of a table of standard scores (Z-scores).

Click here for a full Z-table and illustrated instructions for using it!

Related post: Understanding Probability Distributions and Probability Fundamentals

The Nuts and Bolts of Using Z-tables

Using these tables to calculate probabilities requires that you understand the properties of the normal distribution. While the tables provide an answer, it might not be the answer you need. However, by applying your knowledge of the normal distribution, you can find your answer!

For example, the table indicates that the area of the curve between -0.65 and +0.65 is 48.43%. Unfortunately, that's not what we want to know. We need to find the area that is less than a z-score of 0.65.

We know that the two halves of the normal distribution are symmetrical, which helps us solve our problem. The z-table tells us that the area for the range from -0.65 to +0.65 is 48.43%. Because of the symmetry, the interval from 0 to +0.65 must be half of that: 48.43/2 = 24.215%. Additionally, the area for all scores less than zero is half (50%) of the distribution.

Therefore, the area for all z-scores up to 0.65 = 50% + 24.215% = 74.215%

That's how you convert standard scores to percentiles. Our apple is at approximately the 74th percentile.
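Here is a small sketch of the same arithmetic, with a cross-check against the cumulative probability computed directly by Python's `statistics.NormalDist`. The table value 0.4843 is the one read off the z-table above; the variable names are illustrative.

```python
from statistics import NormalDist

table_area = 0.4843  # area between -0.65 and +0.65, read from the z-table

# Lower half of the distribution (50%) plus half of the symmetric central area.
percentile_from_table = 0.50 + table_area / 2

# Direct calculation of P(Z < 0.65) for comparison.
exact = NormalDist().cdf(0.65)

print(f"{percentile_from_table:.5f}")  # 0.74215
print(f"{exact:.5f}")                  # 0.74215
```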

If you want to calculate the probability for values falling between two standard scores, calculate the percentile for each z-score and then subtract the smaller from the larger.

For example, the probability of a z-score between 0.40 and 0.65 equals the difference between the percentiles for z = 0.65 and z = 0.40. We calculated the percentile for z = 0.65 above (74.215%). Using the same method, the percentile for z = 0.40 is 65.540%. Now we subtract the percentiles.

74.215% – 65.540% = 8.675%

The probability of an observation having a z-score between 0.40 and 0.65 is 8.675%.
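The same subtraction can be sketched in Python, assuming a standard normal distribution. Because it works from unrounded cumulative probabilities rather than table values, the result agrees with the hand calculation to about two decimal places of a percent.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# P(0.40 < Z < 0.65) = P(Z < 0.65) - P(Z < 0.40)
p_between = std_normal.cdf(0.65) - std_normal.cdf(0.40)

print(f"{p_between:.2%}")  # 8.67%
```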

Using just simple math and a z-table, you can easily find the probabilities that you need!

Alternatively, use the Empirical Rule to find probabilities for values in a normal distribution using ranges based on standard deviations.
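As a rough illustration, the Empirical Rule's familiar 68%, 95%, and 99.7% figures fall out of the same percentile subtraction. This sketch assumes a normal distribution; the dictionary name `within` is arbitrary.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Probability of falling within k standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, p in within.items():
    print(f"within +/-{k} SD: {p:.4f}")
# within +/-1 SD: 0.6827
# within +/-2 SD: 0.9545
# within +/-3 SD: 0.9973
```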

Related post: Percentiles: Interpretations and Calculations

Using Z-score Calculators

In this day and age, you'll probably use software and online z-score calculators for these probability calculations. Statistical software produced the probability distribution plot below. It displays the apple's percentile with a graphical representation of the area under the standard normal distribution curve. Graphing is a great way to get an intuitive feel for what you're calculating using standard scores.

The percentile is a tad different because we used the z-score of 0.65 in the table while the software uses the more precise value of 0.667.

A probability distribution plot that graphically displays a percentile using a Z-score.

Alternatively, you can enter z-scores into calculators, like this one.

If you enter the z-score value of 0.667, the left-tail p-value matches the shaded region in the probability plot above (0.7476). The right-tail value (0.2524) is the probability of all values above our z-score, which is equivalent to the unshaded region in the graph. Unsurprisingly, those values add to 1 because you're covering the entire distribution.

How to Find Z-scores in Excel

You can calculate z-scores and their probabilities in Excel. Let's work through an example. We'll return to our apple example and start by computing standard scores for values in a dataset. I have all the data and formulas in this Excel file: Z-scores.

To find z-scores using Excel, you'll need to either calculate the sample mean and standard deviation or use population reference values. In this example, I use the sample estimates. If you need to use population values supplied to you, enter them into the spreadsheet rather than computing them.

My apple weight data are in cells A2:A21.

To calculate the mean and standard deviation, I use the following Excel functions:

  • Mean: =AVERAGE(A2:A21)
  • Standard deviation (sample): =STDEV.S(A2:A21)

Then, in column B, I use the following Excel formula to calculate the z-scores:

=(A2-A$24)/A$26

Cell A24 is where I have the mean, and A26 has the standard deviation. This formula takes a data value in column A, subtracts the mean, and then divides by the standard deviation.

I copied that formula for all rows from B2:B21 and Excel displays z-scores for all data points.
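For readers working outside Excel, here is a rough Python equivalent of the spreadsheet steps. The weights are hypothetical stand-ins because the actual A2:A21 values are not listed in this post. `statistics.stdev` is the sample standard deviation, which corresponds to STDEV.S.

```python
import statistics

# Hypothetical apple weights in grams (NOT the post's actual spreadsheet data).
weights = [92, 105, 110, 87, 99, 101, 118, 95, 103, 108]

mean = statistics.mean(weights)   # Excel: =AVERAGE(...)
sd = statistics.stdev(weights)    # Excel: =STDEV.S(...), sample standard deviation

# Excel: =(A2-A$24)/A$26 copied down the column.
z_scores = [(w - mean) / sd for w in weights]

for w, z in zip(weights, z_scores):
    print(f"{w:>4}  {z:+.3f}")
```

Z-scores computed from a sample's own mean and standard deviation always average to zero and have a standard deviation of 1, which makes a handy sanity check.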

Using Excel to Calculate Probabilities for Standard Scores

Next, I use Excel's NORM.S.DIST function to calculate the probabilities associated with z-scores. I work with the standard score from our apple example, 0.667.

The NORM.S.DIST(Z, Cumulative) function provides either the cumulative distribution function (TRUE) or the probability density function (FALSE) for the z-score you specify. The probability density function is the height value in the z-table earlier in this post, and it corresponds to the y-axis value on a probability distribution plot for the z-score. We'll use the cumulative function, which calculates the cumulative probability for all z-scores less than the value we specify.

In the function, we need to specify the z-value (0.667) and use the TRUE parameter to obtain the cumulative probability.

I'll enter the following:

=NORM.S.DIST(0.667,TRUE)

Excel displays 0.747613933, matching the output in the probability distribution plot above.

If you want to find the probability for values greater than the z-score, remember that the values above and below it must sum to 1. Therefore, subtract from 1 to calculate probabilities for larger values:

=1 - NORM.S.DIST(0.667,TRUE)

Excel displays 0.252386067.
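For comparison, here is a sketch of a Python stand-in for the Excel function. The wrapper `norm_s_dist` is a hypothetical helper written for this illustration (it is not a library function); it delegates to `statistics.NormalDist` from the standard library.

```python
from statistics import NormalDist


def norm_s_dist(z: float, cumulative: bool) -> float:
    """Rough stand-in for Excel's NORM.S.DIST(z, cumulative)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(z) if cumulative else std_normal.pdf(z)


below = norm_s_dist(0.667, True)     # probability of z-scores below 0.667
above = 1 - below                    # probability of z-scores above 0.667
height = norm_s_dist(0.667, False)   # curve height at z = 0.667 (a density, not a probability)

print(f"{below:.4f} {above:.4f}")  # 0.7476 0.2524
```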

Here's what my spreadsheet looks like.

Excel spreadsheet that calculates z-scores and uses them to find probabilities.

Source: https://statisticsbyjim.com/basics/z-score/
