8 Validity and Reliability

When can we believe and trust the results of a study?

When can we expect to get the same results if we repeat a study?

These questions are about validity and reliability.

8.1 Validity

We need to know what went on during the research process in order to accept the results, whether they are stated in words or shown in graphs. Even if the study has validity, people (especially those in the media) might distort results from data with misleading charts and graphs. (One of us just spent over an hour looking at some pretty awful charts. Oops.) Anyway, let’s talk about validity.

We will focus on two types of validity when deciding whether or not we can trust the results from a study: internal validity and external validity.

8.1.1 External Validity

External validity is achieved when a sample gives a true representation of the population.

External validity is the certainty that our methods and our presence in no way jeopardize our ability to use the random sample as a true representative of the population. We need to make sure we did not influence the results in any way, whether by the way we collected data or by our presence. We also need to make sure that what we did in a specific place can be generalized to the entire population we wish to study. For example, if we run a test in a free clinic, can we generalize to all patients? If we asked students on the 8th Floor in the C building, can we generalize to all FIT students?

8.1.2 Internal Validity

Internal validity is the certainty that the observations in our sample are accurate measures of the characteristics we set out to measure.

For internal validity, we need to make sure that we obtained HONEST, ACCURATE, and RELIABLE information from our sample. Imagine we are doing a study about the ages of a certain group. We should be able to get the actual ages. Believe it or not, people may lie about their age. Or, someone may actually forget how old they are (it happens)!

In order to make sure we have internal validity we could ask to see driver’s licenses or passports.

8.1.3 Examples of Validity

For the following two examples, answers are given below.

Example 8.1 (Creating Potions)

Professor Snape wanted to know his students’ ability to create potions. So he went over to each student and measured their efforts. Some students may have become nervous and their ability to produce potions could have been influenced by Snape’s presence. (I mean… did you see Neville’s face?!) So, his study may not reflect the actual potion-making ability of the students, because of the violation of ___________ validity.

Example 8.2 (Gamma Radiation of Cosmic Tesseract)

Dr. Banner and Mr. Stark were studying a cosmic tesseract and its levels of gamma radiation. They decided to gather information on the concentration of gamma radiation in all cosmic cubes and to precisely weigh each cube in their sample. If they were given a faulty scale, then there is a violation of ___________ validity.

Answers

Creating Potions

Answer: External
Reason: Professor Snape failed to make sure he did not influence the results in any way by his presence. (External validity: We need to make sure we did not influence the results in any way, whether by the way we collected data or by our presence.)

Gamma Radiation of Cosmic Tesseract

Answer: Internal
Reason: If the scale was faulty, then the weights recorded for our sample were not HONEST, ACCURATE, and RELIABLE information.

8.2 Reliability

We want studies to not just be valid but also reliable.

Reliability is different from validity! A measurement may be valid but not reliable, or reliable but not valid.

Reliability is another term for consistency.

A study is reliable if doing it a second time would give the same results.

Note that this is different from accuracy. Reliability just means the results are consistent. It says nothing about accuracy.

Imagine a colleague brought a scale into our department at FIT in 2021. A certain mathematics professor decides to alter the scale so that it shows a weight five pounds lighter than the true weight! If anyone weighs themselves a couple of times in one hour, the scale is reliable, since it shows the same weight every time that individual steps on it. It is not accurate, since it shows five pounds less than the actual weight.
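The altered scale can be sketched in a few lines of Python. This is only an illustration: the true weight of 150 pounds and the names `TRUE_WEIGHT`, `OFFSET`, and `tampered_scale` are made up for the example, and the scale is assumed to have no random error at all.

```python
import statistics

TRUE_WEIGHT = 150.0  # hypothetical true weight, in pounds
OFFSET = -5.0        # the scale was altered to read five pounds light


def tampered_scale(true_weight):
    """Return the reading of a scale that always shows five pounds less."""
    return true_weight + OFFSET


# Step on the scale five times within the hour.
readings = [tampered_scale(TRUE_WEIGHT) for _ in range(5)]

# Reliability: how much do repeated readings differ from each other?
spread = statistics.pstdev(readings)            # 0.0, every reading is identical

# Accuracy: how far is the typical reading from the true weight?
bias = statistics.mean(readings) - TRUE_WEIGHT  # -5.0, consistently wrong

print("readings:", readings)
print("spread:", spread)
print("bias:", bias)
```

A spread of zero is what makes the scale reliable, and a bias of five pounds is what makes it inaccurate. The two numbers are computed separately, which is why one can be good while the other is bad.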

We also want to give an example of a bullseye because that is the most common way to visualize these terms. Seriously… if you google these terms and look at the image results, most are going to show bullseyes.

Where do you expect points to be on a bullseye if we have both reliability and validity?

What might the points on a bullseye look like when the shots are reliable but not valid (so they miss the target)?

What about not reliable but valid?

And not reliable and not valid?
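If you want to check your answers, here is one way to experiment with the four cases in Python. It is a rough sketch under the usual bullseye picture, where "valid" is taken to mean the shots are centered on the target and "reliable" is taken to mean the shots land close to each other. The function names, the offset of 3, and the spreads of 0.2 and 2.0 are all invented for the illustration.

```python
import math
import random


def shoot(n, offset, spread, rng):
    """Simulate n shots at a target whose bullseye is at (0, 0).

    offset: how far to the right of the bullseye the shots are aimed.
    spread: standard deviation of each shot around the aim point.
    """
    return [(rng.gauss(offset, spread), rng.gauss(0.0, spread)) for _ in range(n)]


def summarize(shots):
    """Return (miss, scatter) for a list of (x, y) shots."""
    mean_x = sum(x for x, _ in shots) / len(shots)
    mean_y = sum(y for _, y in shots) / len(shots)
    # miss: distance from the center of the shots to the bullseye
    miss = math.hypot(mean_x, mean_y)
    # scatter: average distance of each shot from the center of the shots
    scatter = sum(math.hypot(x - mean_x, y - mean_y) for x, y in shots) / len(shots)
    return miss, scatter


rng = random.Random(0)  # fixed seed so the run is repeatable
cases = {
    "reliable and valid": shoot(1000, 0.0, 0.2, rng),
    "reliable, not valid": shoot(1000, 3.0, 0.2, rng),
    "valid, not reliable": shoot(1000, 0.0, 2.0, rng),
    "neither": shoot(1000, 3.0, 2.0, rng),
}
results = {name: summarize(shots) for name, shots in cases.items()}

for name, (miss, scatter) in results.items():
    print(f"{name:22s} miss={miss:.2f}  scatter={scatter:.2f}")
```

A small miss corresponds to validity and a small scatter corresponds to reliability. Try changing `offset` and `spread` and see which of the four pictures you get.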