# MHA-FP 5017 Assessment 3 Predicting an Outcome Using Regression Models

### Overview

Perform multiple regression on the relationship between hospital costs and patient age, risk factors, and patient satisfaction scores, and then generate a prediction to support a health care decision. Write a 3-4-page analysis of the results in a Word document and insert the test results into this document. Note: You are strongly encouraged to complete the assessments in this course in the order they are presented.

Regression is an important statistical technique for determining the relationship between an outcome (dependent variable) and predictors (independent variables). Multiple regression evaluates the relative predictive contribution of each independent variable on a dependent variable. The regression model can then be used for predicting an outcome at various levels of the independent variables. For this assessment, you will perform multiple regression and generate a prediction to support a health care decision.

**A Sample Answer For the Assignment:** MHA-FP 5017 Assessment 3 Predicting an Outcome Using Regression Models

**Introduction**

Regression analysis refers to a set of statistical methods for estimating the relationship between a dependent variable and one or more independent variables. It can be applied to assess the strength of the relationship between variables and to model the relationship that may be expected between independent and dependent variables in the future. Regression analysis has several variations, including simple linear, multiple linear, and nonlinear regression; the simple linear and multiple linear models are the most common (Kumari & Yadav, 2018). Nonlinear regression is usually applied to complicated data sets in which the independent and dependent variables show a nonlinear relationship (Aggarwal & Ranganathan, 2017). Regression analysis has numerous applications, including research and financial analysis. The purpose of this assignment is to predict an outcome using a regression model fitted to the dataset provided.

Before conducting regression analysis, it is necessary to understand its assumptions. The independent variables are treated as non-random (fixed) values. The residuals (error terms) have a mean of zero; the dependent variable is linearly related to the independent variables through the intercept and slope coefficients; the variance of the residuals is constant across all observations; and the residuals are uncorrelated across observations (Montgomery et al., 2021). In addition, the residuals are assumed to follow a normal distribution.
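Two of these assumptions can be checked numerically once residuals are available. The sketch below is illustrative only: the residual values are hypothetical (the original dataset is not reproduced here), and the Durbin-Watson statistic is one common check for correlated residuals rather than a method named in the assignment.

```python
from statistics import mean


def durbin_watson(residuals):
    """Durbin-Watson statistic; values near 2 suggest uncorrelated residuals."""
    # Sum of squared differences between consecutive residuals.
    num = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    # Sum of squared residuals.
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical residuals (observed cost minus predicted cost), for illustration only.
example_residuals = [120.5, -80.2, 45.0, -60.3, 10.0, -35.0]

residual_mean = mean(example_residuals)
dw = durbin_watson(example_residuals)
print(f"mean residual = {residual_mean:.3f}, Durbin-Watson = {dw:.3f}")
```

The statistic ranges from 0 to 4. Values well below 2 point toward positive correlation between consecutive residuals, and values well above 2 point toward negative correlation.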

**Regression analysis**

From the information given, the dependent variable is hospital costs, while the independent variables include patient age, risk factors, and patient satisfaction scores. Both the independent and dependent variables are continuous.

Table 1: Descriptive Statistics

| Variable | Mean | Std. Deviation | N |
|---|---|---|---|
| cost | 14906.51 | 2614.346 | 185 |
| age | 73.25 | 6.430 | 185 |
| risk | 5.69 | 2.777 | 185 |
| satisfaction | 50.02 | 28.919 | 185 |

Table 1 presents the descriptive statistics for the dependent and independent variables. The means of cost, age, risk, and satisfaction are $14,906.51, 73.25 years, 5.69, and 50.02, respectively. The sample size is 185.

Table 2: Correlations

| Statistic | Variable | cost | age | risk | satisfaction |
|---|---|---|---|---|---|
| Pearson Correlation | cost | 1.000 | .279 | .199 | -.071 |
| | age | .279 | 1.000 | .152 | .094 |
| | risk | .199 | .152 | 1.000 | .037 |
| | satisfaction | -.071 | .094 | .037 | 1.000 |
| Sig. (1-tailed) | cost | . | .000 | .003 | .169 |
| | age | .000 | . | .019 | .101 |
| | risk | .003 | .019 | . | .307 |
| | satisfaction | .169 | .101 | .307 | . |
| N | cost | 185 | 185 | 185 | 185 |
| | age | 185 | 185 | 185 | 185 |
| | risk | 185 | 185 | 185 | 185 |
| | satisfaction | 185 | 185 | 185 | 185 |

Table 2 shows the correlations among the dependent and independent variables. There is a weak positive correlation between cost and age (r = .279), and it is statistically significant (Sig. reported as .000, i.e., p < .001). The correlation between cost and risk is also weak and positive (r = .199, p = .003). Finally, the correlation between cost and satisfaction is very weak and negative (r = -.071) and is not statistically significant (p = .169).
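The one-tailed significance values in Table 2 can be approximately reproduced from each correlation coefficient and the sample size. The sketch below converts r to a t statistic with n - 2 degrees of freedom and then, as a simplifying assumption, uses the standard normal distribution in place of the t distribution (reasonable with 183 degrees of freedom, but not exact). The function names are illustrative.

```python
import math

N = 185  # sample size reported in Table 2


def corr_t_stat(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def one_tailed_p_normal_approx(t):
    """Upper-tail probability of |t| under a standard normal approximation."""
    return 0.5 * math.erfc(abs(t) / math.sqrt(2))


# Correlations of each predictor with cost, taken from Table 2.
for name, r in [("age", 0.279), ("risk", 0.199), ("satisfaction", -0.071)]:
    t = corr_t_stat(r, N)
    p = one_tailed_p_normal_approx(t)
    print(f"{name}: r = {r:+.3f}, t = {t:.3f}, approx. one-tailed p = {p:.4f}")
```

Because of the normal approximation and the rounding of r to three decimals, the printed values should be close to, but not necessarily identical to, the SPSS output.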

Table 3: Model Summary (the last five columns are the Change Statistics)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | Sig. F Change |
|---|---|---|---|---|---|---|---|---|---|
| 1 | .336^{a} | .113 | .098 | 2482.429 | .113 | 7.692 | 3 | 181 | .000 |

a. Predictors: (Constant), satisfaction, risk, age

From Table 3, R Square is 0.113, so the model explains about 11.3% of the variance in hospital costs (adjusted R Square = 0.098). This is a small-to-medium effect size, and most of the variance in cost remains unexplained by the three predictors. The overall model is nonetheless statistically significant, F(3, 181) = 7.692, with Sig. F Change reported as .000 (p < .001). Because the analysis was conducted at the 0.05 significance level (95% confidence), the null hypothesis is rejected whenever the significance value is below 0.05; here we reject the null hypothesis that none of the predictors is related to cost.
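As a consistency check, the adjusted R Square and the F statistic in Table 3 can be recomputed from R Square, the sample size, and the number of predictors using the standard formulas. This is a minimal sketch with illustrative function names; small differences from the SPSS output are expected because R Square is rounded to three decimals.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R Square for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def overall_f(r2, n, k):
    """F statistic for the overall model, with k and n - k - 1 degrees of freedom."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Values reported in Table 3: R Square, sample size, number of predictors.
R2, N, K = 0.113, 185, 3

print(f"adjusted R Square = {adjusted_r2(R2, N, K):.3f}")
print(f"F({K}, {N - K - 1}) = {overall_f(R2, N, K):.3f}")
```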

Table 4: Coefficients^{a}

| Model | Variable | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. | 95.0% CI for B: Lower Bound | 95.0% CI for B: Upper Bound |
|---|---|---|---|---|---|---|---|---|
| 1 | (Constant) | 6652.176 | 2096.818 | | 3.173 | .002 | 2514.825 | 10789.527 |
| | age | 107.036 | 28.911 | .263 | 3.702 | .000 | 49.990 | 164.082 |
| | risk | 153.557 | 66.685 | .163 | 2.303 | .022 | 21.978 | 285.136 |
| | satisfaction | -9.195 | 6.358 | -.102 | -1.446 | .150 | -21.740 | 3.351 |

a. Dependent Variable: cost

Table 4 reports the unstandardized coefficients (B) for the constant and for each independent variable, from which the regression equation can be formulated. The constant is the intercept: just as Y = C when x = 0 in the straight-line equation Y = Mx + C, the predicted cost is 6652.176 when age, risk, and satisfaction are all zero. With three predictors, the line equation generalizes to Y = C + M1(x1) + M2(x2) + M3(x3), where each M is the unstandardized coefficient of the corresponding independent variable.

Therefore, we find that:

Cost = 6652.176 + 107.036 (age) + 153.557 (risk) – 9.195 (satisfaction)

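The regression equation above can be used to predict cost from a patient's values on the independent variables. Each coefficient gives the expected change in cost for a one-unit increase in that variable while the other independent variables are held constant: cost rises by about $107.04 per additional year of age and by about $153.56 per one-unit increase in risk, and falls by about $9.20 per additional satisfaction point. In Table 4, age (p < .001) and risk (p = .022) are statistically significant predictors, whereas satisfaction is not (p = .150; its 95% confidence interval, -21.740 to 3.351, includes zero).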
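To generate a prediction, the regression equation above can be evaluated for a given patient. The sketch below uses the coefficients from Table 4; the patient values (age 73, risk 6, satisfaction 50) are hypothetical and chosen only for illustration, and `predict_cost` is an illustrative name.

```python
# Unstandardized coefficients from Table 4.
COEF = {
    "intercept": 6652.176,
    "age": 107.036,
    "risk": 153.557,
    "satisfaction": -9.195,
}


def predict_cost(age, risk, satisfaction):
    """Predicted hospital cost from the fitted regression equation."""
    return (
        COEF["intercept"]
        + COEF["age"] * age
        + COEF["risk"] * risk
        + COEF["satisfaction"] * satisfaction
    )


# Hypothetical patient, for illustration only.
example = predict_cost(age=73, risk=6, satisfaction=50)
print(f"predicted cost for the example patient: ${example:,.2f}")

# Sanity check: at the sample means from Table 1, the prediction should be
# close to the mean cost (small differences come from rounding).
at_means = predict_cost(age=73.25, risk=5.69, satisfaction=50.02)
print(f"predicted cost at the sample means: ${at_means:,.2f}")
```

Predictions are most trustworthy for values within the range of the observed data. Given the low R Square, individual predictions carry considerable uncertainty.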

**Conclusion**

Regression analysis can be applied to assess the strength of the relationship between variables and to model the relationship that may be expected between independent and dependent variables. In this analysis, the overall model was statistically significant but explained only about 11% of the variance in hospital costs. Patient age and risk factors were significant positive predictors of cost, whereas patient satisfaction scores were not a significant predictor.

**References**

Aggarwal, R., & Ranganathan, P. (2017). Common pitfalls in statistical analysis: Linear regression analysis. *Perspectives in Clinical Research*, *8*(2), 100. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5384397/

Kumari, K., & Yadav, S. (2018). Linear regression analysis study. *Journal of the Practice of Cardiovascular Sciences*, *4*(1), 33-36. https://www.j-pcs.org/article.asp?issn=2395-5414;year=2018;volume=4;issue=1;spage=33;epage=36;aulast=Kumari

Montgomery, D. C., Peck, E. A., & Vining, G. G. (2021). *Introduction to linear regression analysis*. John Wiley & Sons. http://sutlib2.sut.ac.th/sut_contents/H133678.pdf