PUB 550 Discuss the impact of data type on test selection
The choice of a statistical test depends on the scientific question that needs to be answered. Several steps should be taken before a statistical test is selected.
- First, determine the null and alternative hypotheses, as well as the research study design.
- The test and the level of significance must be specified before the public health study is performed. A decision should also be made about whether the test will be two-tailed or one-tailed. A two-tailed test assumes no particular direction for the expected difference. A one-tailed test is used when there is a clear expected direction for the difference.
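To make the one-tailed versus two-tailed distinction concrete, here is a minimal sketch that converts a z statistic into p-values. The function name `z_test_p_values` and the example value z = 1.8 are illustrative assumptions, not part of any particular study.

```python
from statistics import NormalDist

def z_test_p_values(z):
    """Return two-tailed and one-tailed p-values for a z statistic."""
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    upper = 1 - std_normal.cdf(z)        # H1: true value is greater than the null value
    lower = std_normal.cdf(z)            # H1: true value is less than the null value
    two_tailed = 2 * min(upper, lower)   # H1: true value differs in either direction
    return {"two_tailed": two_tailed, "upper": upper, "lower": lower}

# Hypothetical z statistic, chosen only for illustration
p = z_test_p_values(1.8)
for name, value in p.items():
    print(f"{name}: {value:.4f}")
```

With this example value, the upper one-tailed p-value falls below 0.05 while the two-tailed p-value does not, which is one reason the choice should be fixed before the data are analyzed.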
The type and distribution of the data influence the selection of a statistical test. Various types of data can be collected in a public health study, including binary, categorical, and continuous data. Binary data have two possible outcomes. Hypothesis tests that assess proportions require binary data and allow one to use the sample data to make inferences about population proportions. If categorical endpoints have a meaningful order, the endpoint is described as ordinal. For continuous data, the test should suit the distribution of the data; for example, tests that compare two groups, such as the t-test, are appropriate when the data are normally distributed. Mishra et al. (2019) suggest that it is important to make an appropriate selection, because a wrong selection may create serious problems such as type I error and incorrect interpretation.
Mishra P, Pandey CM, Singh U, Keshri A, Sabaretnam M. Selection of appropriate statistical methods for data analysis. Ann Card Anaesth. 2019 Jul-Sep;22(3):297-301. doi: 10.4103/aca.ACA_248_18. PMID: 31274493; PMCID: PMC6639881.
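The link between data type and test choice described above can be sketched as a small lookup. This is a simplified illustration for comparing two independent groups, using commonly taught pairings; the names `COMMON_TESTS` and `suggest_test` are hypothetical, and a real analysis would also weigh sample size and study design.

```python
# Simplified, hypothetical lookup: outcome data type -> commonly used test
# for comparing two independent groups. Not an exhaustive decision rule.
COMMON_TESTS = {
    ("binary", None): "chi-square test or z-test for two proportions",
    ("nominal", None): "chi-square test of independence",
    ("ordinal", None): "Mann-Whitney U test",
    ("continuous", True): "independent-samples t-test",
    ("continuous", False): "Mann-Whitney U test",
}

def suggest_test(data_type, normally_distributed=None):
    """Suggest a test for comparing two independent groups."""
    # Normality only matters for continuous outcomes in this sketch
    normality = normally_distributed if data_type == "continuous" else None
    key = (data_type, normality)
    if key not in COMMON_TESTS:
        raise ValueError(
            f"no suggestion for {data_type!r} with normality={normally_distributed!r}"
        )
    return COMMON_TESTS[key]

print(suggest_test("binary"))
print(suggest_test("continuous", normally_distributed=True))
print(suggest_test("continuous", normally_distributed=False))
```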
You can perform statistical tests on data that have been collected in a statistically valid manner either through an experiment, or through observations made using probability sampling methods.
For a statistical test to be valid, your sample size needs to be large enough to approximate the true distribution of the population being studied.
To determine which statistical test to use, you need to know:
- whether your data meets certain assumptions.
- the types of variables that you’re dealing with.
The major steps in basic data analysis are: cleaning data, coding and conducting descriptive analyses; calculating estimates (with confidence intervals); calculating measures of association (with confidence intervals); and statistical testing. In earlier issues of FOCUS we discussed data cleaning, coding and descriptive analysis, as well as calculation of estimates (risk and odds) and measures of association (risk ratios and odds ratios). In this issue, we will discuss confidence intervals and p-values, and introduce some basic statistical tests, including chi square and ANOVA. Before starting any data analysis, it is important to know what types of variables you are working with.
Parab S, Bhalerao S. Choosing statistical test. Int J Ayurveda Res. 2010 Jul;1(3):187-91. doi: 10.4103/0974-7788.72494. PMID: 21170214; PMCID: PMC2996580.
It is imperative to make sure your data are accurate before running statistical tests. When analyzing public health data, the first crucial step is to check three statistical assumptions: independence of observations, homogeneity of variance, and normality of data. When “these assumptions are violated then the test may not be valid” (Sivarajah 2020). Parametric tests can be used once these three assumptions are met. The three families of parametric tests are regression tests, comparison tests, and correlation tests. Regression tests are often used to see whether a change in one variable is associated with a change in another. A comparison test is used to see whether groups differ, typically by comparing their means. A correlation test is used to see whether there is a relationship between two variables. Each method can be used where the data call for it, but if the assumptions are not met, the results of these tests may be inaccurate. “Basic knowledge” of the topic being tested and “proper consultation from statistical experts” also help ensure accurate statistics (Ann Card Anaesth 2019).
Annals of Cardiac Anaesthesia. “Selection of Appropriate Statistical Methods for Data Analysis.” (2019 22 July-Sep)
Nayak, Barun K & Hazra, Avijit. Indian Journal of Ophthalmology. “How to Choose the Right Statistical Test?” (2011 Mar-Apr)
Sivarajah, Sivakar. Towards Data Science. “Statistical Testing: How to Select the Best Test for Your Data?” (2020 10 Aug)
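As an illustration of the correlation tests mentioned above, the following minimal sketch computes the Pearson correlation coefficient from its definition. The function name `pearson_r` and the two small data series are made up for illustration; the coefficient is only meaningful as a parametric measure when the assumptions listed above are reasonable.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = mean(x), mean(y)
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired measurements, for illustration only
exposure = [1, 2, 3, 4, 5]
outcome = [2.1, 3.9, 6.2, 7.8, 10.1]
r = pearson_r(exposure, outcome)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or no linear relationship.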
An analysis plan is a document that guides how you progress from raw data to the final report. It describes where you are starting (data sources and data sets), how you will look at and analyze the data, and where you need to finish (final report). It lays out the key components of the analysis in a logical sequence and provides a guide to follow during the actual analysis.
Step 1: Write your hypotheses and plan your research design
To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.
Writing statistical hypotheses. The goal of research is often to investigate a relationship between variables within a population. You start with a prediction and use statistical analysis to test that prediction.
Step 2: Collect data from a sample
In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample. Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures. You should aim for a sample that is representative of the population.
Step 3: Summarize your data with descriptive statistics
Once you’ve collected all of your data, you can inspect them and calculate descriptive statistics that summarize them.
Inspect your data
There are various ways to inspect your data, including the following:
- Organizing data from each variable in frequency distribution tables.
- Displaying data from a key variable in a bar chart to view the distribution of responses.
- Visualizing the relationship between two variables using a scatter plot.
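The first inspection method above, a frequency distribution table, can be sketched with the Python standard library. The survey responses below are invented for illustration.

```python
from collections import Counter

# Hypothetical categorical responses, for illustration only
responses = ["never", "sometimes", "often", "sometimes", "never", "sometimes"]

counts = Counter(responses)
total = sum(counts.values())

# One row per category: (category, count, relative frequency)
table = [(category, n, n / total) for category, n in counts.most_common()]

print(f"{'category':<12}{'count':>6}{'share':>8}")
for category, n, share in table:
    print(f"{category:<12}{n:>6}{share:>8.2f}")
```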
Step 4: Test hypotheses or make estimates with inferential statistics
A number that describes a sample is called a statistic, while a number describing a population is called a parameter. Using inferential statistics, you can make conclusions about population parameters based on sample statistics.
Researchers often use two main methods (simultaneously) to make inferences in statistics.
- Estimation: calculating population parameters based on sample statistics.
- Hypothesis testing: a formal process for testing research predictions about the population using samples.
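As a sketch of estimation, the following computes a sample proportion with an approximate confidence interval. It uses the normal-approximation (Wald) interval, which is one common choice and works best with reasonably large samples; the function name `proportion_ci` and the counts are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Sample proportion with a normal-approximation (Wald) confidence interval."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    # Critical value from the standard normal, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# Hypothetical sample: 40 of 200 respondents report the outcome
p_hat, low, high = proportion_ci(40, 200)
print(f"estimate = {p_hat:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```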
Step 5: Interpret your results
The final step of statistical analysis is interpreting your results.
In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.
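The decision rule described above can be written as a short sketch. The function name `decide` is hypothetical, and the handling of a p-value exactly equal to the significance level follows one common convention (reject when p is less than or equal to alpha); some texts use a strict inequality instead.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with a significance level chosen before the study."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    # Convention assumed here: reject the null hypothesis when p <= alpha
    if p_value <= alpha:
        return "statistically significant"
    return "not statistically significant"

print(decide(0.03))  # below the default level of 0.05
print(decide(0.20))  # above the default level of 0.05
```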