# Descriptive Statistics Study Guide

*Odecir Gocking, Brittney Hornbuckle, Samantha Pitt, Emily Reiter, Anna Woody, and Elena Boersma*

**Descriptive Statistics**

Descriptive statistics allow researchers to describe and summarize quantitative data and help with understanding research evidence (Polit & Beck, 2017).

**Four Levels of Measurement**

**Nominal Measurement**: assigns numbers or symbols to represent the categories of a variable (Polit & Beck, 2017). The numbers carry no quantitative meaning; they only identify the category, e.g., coding the genders in a sample as 1 = male, 2 = female (Polit & Beck, 2017).

**Ordinal Measurement**: sorts people or objects by their relative rank on an attribute, e.g., a person's mobility could be coded as independent (1), one-person assist (2), two-person assist (3), or mechanical lift (4) (Polit & Beck, 2017). Ordinal measurement tells us the rank order of the levels of the attribute, but not how far apart the levels are (Polit & Beck, 2017).

**Interval Measurement**: ranks people or objects on an attribute and also specifies the distance between the ranks, e.g., a temperature of 80 degrees Fahrenheit is 20 degrees warmer than 60 degrees Fahrenheit (Polit & Beck, 2017).

**Ratio Measurement**: has a meaningful absolute zero, so all arithmetic operations, including ratios, are allowable (Polit & Beck, 2017). This is the highest level of measurement; interval and ratio measures are often grouped together as continuous (interval/ratio scale) variables (Polit & Beck, 2017).
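The difference between interval and ratio measurement can be sketched with a short calculation. The example below reuses the 80 and 60 degree temperatures from above; the 200 lb and 100 lb weights and all variable names are assumptions made for illustration. It shows that a ratio of two temperatures changes when the unit changes (no true zero), while a ratio of two weights does not.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert a Fahrenheit temperature to Celsius."""
    return (deg_f - 32) * 5 / 9

# Interval scale: differences are meaningful, ratios are not.
warm_f, cool_f = 80.0, 60.0
diff_f = warm_f - cool_f  # 20 degrees Fahrenheit
ratio_f = warm_f / cool_f
ratio_c = fahrenheit_to_celsius(warm_f) / fahrenheit_to_celsius(cool_f)

# Ratio scale: weight has a true zero, so ratios survive a change of units.
LB_TO_KG = 0.45359237
heavy_lb, light_lb = 200.0, 100.0  # hypothetical weights
ratio_lb = heavy_lb / light_lb
ratio_kg = (heavy_lb * LB_TO_KG) / (light_lb * LB_TO_KG)

print(f"temperature ratio in F: {ratio_f:.2f}, in C: {ratio_c:.2f}")
print(f"weight ratio in lb: {ratio_lb:.2f}, in kg: {ratio_kg:.2f}")
```

Because the two temperature ratios disagree, a statement such as "80 degrees is one third hotter than 60 degrees" is not meaningful, whereas "200 lb is twice as heavy as 100 lb" holds in any unit.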

**Critiquing Descriptive Statistics**

Descriptive statistics are an important part of biometric analysis for the understanding of further statistical evaluations.

· Medical data are based on a collection of data from individual cases or objects

· The goal of a scientific study must always be clearly defined

· If the data are of good quality, conclusions can be drawn

· How the values of a variable are defined determines the level of measurement of the variable

**Frequency Distributions**

Frequency distributions systematically organize numeric values from lowest to highest, together with a count of how often each value occurred, so the researcher can see the overall pattern of the data (Polit & Beck, 2017).

Histograms (bar graphs) and frequency polygons (dot-and-line graphs) are graphs used to display interval- and ratio-level data.
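As a minimal sketch of how a frequency distribution is built, the example below counts a set of hypothetical pain scores (the data, the `frequency_distribution` helper, and the text histogram are all invented for illustration) and lists each value from lowest to highest with its count and percentage.

```python
from collections import Counter

# Hypothetical pain scores (0-10) for 12 patients; invented for illustration.
scores = [3, 5, 4, 5, 6, 5, 4, 7, 5, 6, 4, 3]

def frequency_distribution(values):
    """Return (value, count, percent) rows ordered from lowest to highest value."""
    counts = Counter(values)
    total = len(values)
    return [
        (value, counts[value], 100 * counts[value] / total)
        for value in sorted(counts)
    ]

table = frequency_distribution(scores)
print("score count percent")
for value, count, percent in table:
    # The row of asterisks is a crude text histogram of the same counts.
    print(f"{value:>5} {count:>5} {percent:>6.1f}%  {'*' * count}")
```

Reading the asterisks from top to bottom gives a rough picture of the shape of the distribution, which is what a histogram or frequency polygon shows graphically.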

**Shapes of Distribution**

1) Symmetric- the graph can be folded in half and the two sides mirror each other.

2) Skewed- the peak of the graph is off center (asymmetric); one tail is longer than the other.

a. Positive skew- the longer tail of the graph points to the right

b. Negative skew- the longer tail of the graph points to the left

3) Modality- the number of peaks in the displayed data

a. Unimodal- one data peak

b. Bimodal- two data peaks

c. Multimodal- two or more data peaks

4) Normal- a symmetric, unimodal, bell-shaped distribution with a single peak in the center.

**Central Tendency**: indexes of central tendency describe the center of a distribution, i.e., its typical or average value. The three indexes below give the same value in a symmetric, unimodal distribution and different values in a skewed distribution, as previously defined (Polit & Beck, 2017).

**Mode**: the most frequently occurring value in a distribution.

**Median**: the middle point of the data, with half of the values above it and half below.

**Mean**: the sum of all values divided by the number of values.
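The three indexes can be computed with Python's standard `statistics` module. The sketch below uses hypothetical length-of-stay data (invented for illustration) in which one long stay produces a positive skew, so the mode, median, and mean come out different.

```python
import statistics

# Hypothetical lengths of stay in days; the single 20-day stay creates a positive skew.
length_of_stay = [2, 3, 3, 3, 4, 4, 5, 6, 20]

mode = statistics.mode(length_of_stay)      # most frequently occurring value
median = statistics.median(length_of_stay)  # middle value of the sorted data
mean = statistics.mean(length_of_stay)      # sum of the values divided by their count

print(f"mode = {mode}, median = {median}, mean = {mean:.2f}")

# In this positively skewed sample the mean is pulled toward the long right tail.
print("mean > median > mode:", mean > median > mode)
```

The extreme value moves the mean the most and the mode not at all, which is why the median is often reported for skewed data.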

**Risk Indexes**: The risk index is the result of a risk assessment. It is calculated from two component indexes:

· The likelihood index shows the probability of a risk event occurring

· The impact index shows the impact a risk event has

· The risk index is a composite of the likelihood index and the impact index

· A risk assessment can be used to calculate the risk index

**Variability**: the spread or dispersion of the data, from narrow to wide or somewhere in between (Polit & Beck, 2017).

**Range**: the highest score “minus the lowest score in a distribution” (Polit & Beck, 2017, p. 362). Because it depends on only the two most extreme scores, the range is often not an accurate representation of the data and is mostly used to give an overall description of the data (Polit & Beck, 2017).

**Standard Deviation** (SD): tells the researcher how far measurements are spread out from the average (mean) of the data. An SD can be narrow, meaning the values lie close to the mean in either direction, or wide, meaning the values lie far from the mean in either direction (Khan Academy, 2019). Additionally, the SD can be used to interpret individual scores by showing how far a score lies from the mean (Polit & Beck, 2017).

**Variance**: the value of the SD before the square root is taken, i.e., the SD squared; it is used in statistical testing (Polit & Beck, 2017).
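The three indexes of variability can be illustrated with a small worked example. The scores below are hypothetical, and the sketch assumes the sample formulas (dividing by n − 1), which is what `statistics.variance` and `statistics.stdev` compute; `statistics.pvariance` and `statistics.pstdev` divide by n instead. The `z_score` helper is an illustrative name showing one way an SD can be used to interpret an individual score.

```python
import statistics

# Hypothetical test scores, invented for illustration.
scores = [4, 8, 6, 5, 7]

score_range = max(scores) - min(scores)  # highest score minus lowest score
variance = statistics.variance(scores)   # sample variance (divides by n - 1)
sd = statistics.stdev(scores)            # square root of the sample variance

def z_score(value, data):
    """Express a single score as the number of SDs it lies from the mean."""
    return (value - statistics.mean(data)) / statistics.stdev(data)

print(f"range = {score_range}")
print(f"variance = {variance:.2f}")
print(f"SD = {sd:.2f}")
print(f"a score of 8 lies {z_score(8, scores):.2f} SDs above the mean")
```

Squaring the printed SD reproduces the variance, which is the relationship described in the definition above.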

**Bivariate Descriptive Statistics**

Bivariate descriptive statistics describe the relationship between two variables using crosstabs tables and correlation indexes (Polit & Beck, 2017).

**Crosstabs Tables**

Crosstabs tables cross-tabulate the frequencies of two variables. For example, to examine whether males or females are more likely to be heavy smokers, light smokers, or nonsmokers, the number of female nonsmokers is placed in one cell of the table and the number of male nonsmokers in another (Polit & Beck, 2017). The counts of light and heavy smokers for each gender are placed in the remaining cells, and each count is also presented as a percentage (Polit & Beck, 2017). Comparing the percentages category by category shows, for instance, whether women were more likely than men to be nonsmokers and less likely than men to be heavy smokers, and therefore which gender includes the larger share of heavy smokers (Polit & Beck, 2017).
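The smoking example can be sketched as a small crosstab calculation. The counts below are invented for illustration and are not taken from Polit and Beck (2017); the `crosstab` helper is a hypothetical name. Each cell holds a count and the percentage of that gender falling in the category.

```python
from collections import Counter

# Hypothetical (gender, smoking status) observations, invented for illustration.
observations = (
    [("female", "nonsmoker")] * 3 + [("female", "light")] * 1 + [("female", "heavy")] * 1
    + [("male", "nonsmoker")] * 2 + [("male", "light")] * 1 + [("male", "heavy")] * 2
)

def crosstab(pairs, rows, columns):
    """Return {(row, column): (count, percent of row total)} for every cell."""
    cell_counts = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    table = {}
    for row in rows:
        for column in columns:
            count = cell_counts[(row, column)]  # Counter gives 0 for empty cells
            percent = 100 * count / row_totals[row] if row_totals[row] else 0.0
            table[(row, column)] = (count, percent)
    return table

genders = ["female", "male"]
statuses = ["nonsmoker", "light", "heavy"]
table = crosstab(observations, genders, statuses)

for gender in genders:
    cells = ", ".join(
        f"{status}: {table[(gender, status)][0]} ({table[(gender, status)][1]:.0f}%)"
        for status in statuses
    )
    print(f"{gender:>6} | {cells}")
```

Percentages are computed within each gender so that groups of different sizes can be compared fairly; in this made-up sample the share of heavy smokers is larger among males.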

**Correlation**

Relationships between two variables are described through correlation procedures, with the question usually being *to what extent are two variables related to each other?* (Polit & Beck, 2017). For example, to what degree are anxiety scores and blood pressure readings related? The correlation between two variables can be graphed as a scatter plot, which is a coordinate graph (Polit & Beck, 2017). The two variables are laid out at right angles: one variable is labeled “X” and is scaled on the horizontal axis, and the other is labeled “Y” and is scaled on the vertical axis (Polit & Beck, 2017). Each participant is plotted as a single point located where that participant's “X” value and “Y” value meet. If the points slope from the lower left corner up to the right, the relationship is said to be positive (Polit & Beck, 2017). A **positive correlation** occurs when high values on one variable are associated with high values on the other variable, while a **negative relationship** is one in which high values on one variable are related to low values on the other (Polit & Beck, 2017).

When a relationship is perfect, it is possible to predict the value of one variable by knowing the value of the second (Polit & Beck, 2017). For example, if everyone who is 5 ft 9 in. tall weighed two hundred pounds, and only those people weighed two hundred pounds, we would automatically know the height of everyone who weighs two hundred pounds. When the relationship is not perfect, one can estimate the degree of correlation by seeing how closely the points on the graph cluster around a straight line. If they cluster closely around the line, the correlation is strong; if the points are scattered all over the graph, the relationship is weak or nonexistent (Polit & Beck, 2017).

The most widely used correlation statistic is the *product-moment correlation coefficient*, which is also called *Pearson’s r* (Polit & Beck, 2017). This coefficient is computed with variables measured on an interval or ratio scale (Polit & Beck, 2017). *Spearman’s rho* is a correlation index for ordinal-level data (Polit & Beck, 2017).
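Both coefficients can be sketched in a few lines of plain Python. The anxiety and blood pressure values are hypothetical, and the helper names (`pearson_r`, `average_ranks`, `spearman_rho`) are illustrative. The sketch computes Spearman's rho as Pearson's r applied to the ranks of the data, with tied values sharing an average rank, which is one common way to compute it.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

def average_ranks(values):
    """Rank values from 1 (lowest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of the data."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical anxiety scores and systolic blood pressure readings.
anxiety = [10, 20, 30, 40, 50]
systolic = [110, 118, 121, 135, 160]

r = pearson_r(anxiety, systolic)
rho = spearman_rho(anxiety, systolic)
print(f"Pearson's r = {r:.2f}, Spearman's rho = {rho:.2f}")
```

In this made-up sample blood pressure always rises with anxiety but not along a perfectly straight line, so rho reflects the perfect rank ordering while r falls somewhat below 1.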

**Conclusion**

Clinicians are required to keep up with current information that could help them provide patients with quality care. As mentioned by Harvey (2018), descriptive statistics are used to summarize sample characteristics, describe key variables, and document methodical formulas. Descriptive statistics help readers understand quantitative research evidence, describe the study sample, and communicate the values of key outcome variables.

**References**

Harvey, E. (2018). *Statistics for Nursing: A Practical Approach*. Jones & Bartlett Learning. Retrieved from:

Khan Academy. (2019). *Calculating Standard Deviation Step by Step.* Retrieved from https://www.khanacademy.org/math/probability/data-distributions-a1/summarizing-spread-distributions/a/calculating-standard-deviation-step-by-step

Polit, D. F., & Beck, C. T. (2017). *Nursing Research: Generating and Assessing Evidence for Nursing Practice.* Philadelphia: Wolters Kluwer.