ANCOVA (analysis of covariance) is a statistical method that includes covariates in the analysis, which adjusts for auxiliary variables and increases the precision of comparisons between groups. By adjusting the group means for the effect of the covariates, ANCOVA helps ensure that observed differences between groups reflect the treatment or intervention under study rather than extraneous factors. This gives more accurate comparisons between groups and more reliable conclusions about the relationships between variables. In this article, we will take a closer look at ANCOVA and implement it in Python.
ANCOVA compares the means of two or more groups while adjusting for the effect of one or more continuous variables, called covariates. It is similar to ANOVA (analysis of variance), but it also allows continuous covariates to be included in the model. It is therefore a valuable tool for assessing the impact of covariates on group means and making more accurate comparisons between groups.
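As a sketch, a one-way ANCOVA with a single covariate is commonly written as the linear model below. The symbols are assumptions introduced here for illustration and are not taken from the article: y_ij is the response of observation j in group i, mu is the overall level, tau_i is the effect of group i, beta is the common slope of the covariate x, x-bar is the overall covariate mean, and epsilon_ij is the random error.

```latex
y_{ij} = \mu + \tau_i + \beta \, (x_{ij} - \bar{x}) + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Setting beta to zero recovers the usual one-way ANOVA model, which is why ANCOVA can be read as ANOVA plus a regression adjustment.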
Consider the following scenario: you are conducting a study to evaluate the efficacy of a new blood pressure-lowering drug. You collect blood pressure data for a group of people who take the drug and a group that does not, as well as the age of each participant. You can use ANCOVA to compare the means of the two groups on a dependent variable (blood pressure) while adjusting for the effect of a covariate (age) on the group means. This lets you determine whether the drug lowers blood pressure after taking into account any age differences between the groups.
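The scenario above might be set up roughly as follows. This is only a sketch on simulated data: the column names (blood_pressure, age, treated), the sample size, and the assumed drug effect of -8 mmHg are all hypothetical and chosen purely for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Simulated (not real) data: every number here is made up for illustration.
rng = np.random.default_rng(0)
n = 200
age = rng.uniform(30, 70, size=n)            # covariate
treated = rng.integers(0, 2, size=n)         # 1 = took the drug, 0 = did not
true_drug_effect = -8.0                      # assumed effect in mmHg
blood_pressure = (
    110 + 0.5 * age + true_drug_effect * treated + rng.normal(0, 5, size=n)
)

trial = pd.DataFrame(
    {"blood_pressure": blood_pressure, "age": age, "treated": treated}
)

# ANCOVA: treatment group is the factor, age is the covariate.
bp_model = ols("blood_pressure ~ C(treated) + age", data=trial).fit()

# Coefficient of the treated group = age-adjusted difference in mean blood pressure.
drug_effect = bp_model.params["C(treated)[T.1]"]
print(f"Age-adjusted drug effect: {drug_effect:.2f} mmHg")
```

Because the data are simulated with a known effect, the estimate should land close to the assumed value; with real data the true effect is of course unknown.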
Consider the following ANCOVA performed in Python using the statsmodels module:
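    import pandas as pd
    from statsmodels.formula.api import ols

    # Example data: a response, a three-level group factor, and a continuous covariate
    df = pd.DataFrame({
        'dependent_variable': [8, 7, 9, 11, 10, 12, 14, 13, 15, 16],
        'group': ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"],
        'covariate': [20, 30, 40, 30, 40, 50, 40, 50, 60, 70],
    })

    # Fit the ANCOVA as a linear model: group effect adjusted for the covariate
    model = ols('dependent_variable ~ group + covariate', data=df).fit()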
ANCOVA can be performed with Python's statsmodels module. The steps are:
Import pandas and statsmodels.api
Define the data for the ANCOVA
Fit the ANCOVA model
Print the model summary
Here is a demonstration of performing ANCOVA with the statsmodels library:
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # Define the data for the ANCOVA
    df = pd.DataFrame({
        'dependent_variable': [8, 7, 9, 11, 10, 12, 14, 13, 15, 16],
        'group': ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"],
        'covariate': [20, 30, 40, 30, 40, 50, 40, 50, 60, 70],
    })

    # Perform the ANCOVA
    model = ols('dependent_variable ~ group + covariate', data=df).fit()

    # Print the summary of the model
    print(model.summary())
                                OLS Regression Results
    ==============================================================================
    Dep. Variable:     dependent_variable   R-squared:                       0.939
    Model:                            OLS   Adj. R-squared:                  0.909
    Method:                 Least Squares   F-statistic:                     31.00
    Date:                Fri, 09 Dec 2022   Prob (F-statistic):           0.000476
    Time:                        09:52:28   Log-Likelihood:                -10.724
    No. Observations:                  10   AIC:                             29.45
    Df Residuals:                       6   BIC:                             30.66
    Df Model:                           3
    Covariance Type:            nonrobust
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    Intercept      6.0000      1.054      5.692      0.001       3.421       8.579
    group[T.B]     2.3333      0.805      2.898      0.027       0.363       4.303
    group[T.C]     4.8333      1.032      4.684      0.003       2.308       7.358
    covariate      0.0667      0.030      2.191      0.071      -0.008       0.141
    ==============================================================================
    Omnibus:                        2.800   Durbin-Watson:                   2.783
    Prob(Omnibus):                  0.247   Jarque-Bera (JB):                1.590
    Skew:                          -0.754   Prob(JB):                        0.452
    Kurtosis:                       1.759   Cond. No.                         201.
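The output lists the estimated coefficients of the group and covariate terms, along with their p-values and 95% confidence intervals. Group A is the reference level, so group[T.B] (2.33, p = 0.027) and group[T.C] (4.83, p = 0.003) are the covariate-adjusted differences between groups B and C and group A. The covariate coefficient (0.0667, p = 0.071) is the estimated change in the dependent variable per unit of the covariate, and it is not significant at the 5% level here. These values can be used to compare group means while accounting for the covariate and to assess the importance of the group and covariate terms in the model.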
Overall, the statsmodels module provides Python users with a powerful and adaptable tool for performing ANCOVA. It makes it easy to create, test, analyze and understand ANCOVA models and their outputs.
In conclusion, ANCOVA extends ANOVA (analysis of variance) by including one or more continuous covariates in the model, so that group means are compared after adjusting for those covariates. This makes between-group comparisons more accurate. ANCOVA is widely used in research fields such as psychology, biology, and economics to evaluate the impact of covariates on group means and to draw more precise conclusions about the relationships between variables.