Linear regression and correlation are widely used statistical techniques for analyzing the relationship between two variables. In pharmaceutical and biomedical research, these tools help researchers identify patterns, predict outcomes, and evaluate the strength and direction of associations. Correlation measures the degree of relationship between two variables, while linear regression models that relationship mathematically.
1. Correlation
Correlation indicates how strongly two variables are related. It does not imply cause and effect but only shows whether variables move together in a predictable pattern.
Types of Correlation
- Positive correlation: Both variables increase together.
- Negative correlation: One variable increases as the other decreases.
- No correlation: No predictable relationship exists.
Pearson’s Correlation Coefficient (r)
Pearson’s r measures the strength and direction of linear correlation between two continuous variables.
Formula
r = Σ[(X − X̄)(Y − Ȳ)] / √[Σ(X − X̄)² × Σ(Y − Ȳ)²]
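The formula translates almost directly into code. The sketch below is a minimal illustration using a small, made-up data set; the variable names and values are purely illustrative.

```python
import numpy as np

# Hypothetical example data (not from any real study)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # e.g. drug dose
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # e.g. blood concentration

# Pearson's r computed directly from the formula above
x_dev = x - x.mean()
y_dev = y - y.mean()
r = np.sum(x_dev * y_dev) / np.sqrt(np.sum(x_dev**2) * np.sum(y_dev**2))
print(f"r = {r:.3f}")   # close to +1: strong positive linear relationship
```

In practice, `scipy.stats.pearsonr(x, y)` returns the same coefficient together with a p-value for testing whether the correlation differs from zero.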
Interpretation of r
| Value of r | Interpretation |
|---|---|
| +1 | Perfect positive correlation |
| 0 | No correlation |
| −1 | Perfect negative correlation |
The closer r is to +1 or −1, the stronger the linear relationship.
Assumptions of Pearson Correlation
- Both variables are continuous.
- Data is normally distributed.
- A linear relationship exists between variables.
- No significant outliers.
2. Linear Regression
Linear regression is used to model the relationship between a dependent variable (Y) and an independent variable (X) using a straight-line equation. It helps predict the value of Y based on X.
Regression Equation
Y = a + bX
- a = intercept (value of Y when X = 0)
- b = slope (change in Y for every unit change in X)
Formula for Slope (b)
b = Σ[(X − X̄)(Y − Ȳ)] / Σ(X − X̄)²
Formula for Intercept (a)
a = Ȳ − bX̄
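These two formulas can be applied directly. A minimal sketch, reusing the same illustrative data as in the correlation example above:

```python
import numpy as np

# Same illustrative data as in the correlation sketch
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Slope: b = Σ[(X − X̄)(Y − Ȳ)] / Σ(X − X̄)²
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)

# Intercept: a = Ȳ − bX̄
a = y.mean() - b * x.mean()

print(f"Y = {a:.2f} + {b:.2f}X")
```

For comparison, `numpy.polyfit(x, y, 1)` fits the same least-squares line and returns the slope and intercept (in that order).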
Interpretation of Regression Line
- The slope indicates the direction and magnitude of change.
- The intercept is the predicted value of Y when X = 0, i.e. the baseline level of Y.
- A regression model helps make predictions based on the observed relationship.
Differences Between Correlation and Regression
| Correlation | Regression |
|---|---|
| Measures the strength and direction of the relationship. | Describes the relationship with an equation and predicts Y from X. |
| Treats both variables symmetrically; does not imply causation. | Distinguishes a dependent (Y) and an independent (X) variable; used for prediction and estimation. |
| Summarized by the coefficient r. | Summarized by the equation Y = a + bX. |
Example (Illustration)
A researcher studies whether drug dosage (X) affects blood concentration levels (Y). By performing correlation, the strength of the relationship can be evaluated. Using linear regression, the researcher can predict blood concentration for a given dose.
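A rough sketch of how that prediction might look in code, using made-up dose and concentration values (the numbers and units are illustrative only, not real pharmacokinetic data):

```python
import numpy as np

# Hypothetical dose (mg) and blood concentration (µg/mL) pairs
dose = np.array([10, 20, 30, 40, 50], dtype=float)
conc = np.array([1.8, 3.6, 5.1, 7.2, 8.9])

# Fit the least-squares line and predict the concentration for a new dose
b, a = np.polyfit(dose, conc, 1)          # slope, intercept
new_dose = 35.0
predicted = a + b * new_dose
print(f"Predicted concentration at {new_dose} mg: {predicted:.2f} µg/mL")
```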
Assumptions of Linear Regression
- Linearity between X and Y.
- Normal distribution of residuals (a residual check is sketched after this list).
- Homoscedasticity (constant variance of errors).
- Independence of observations.
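These assumptions are usually examined on the residuals after fitting the model. A minimal sketch, again with made-up data; a real analysis would rely on residual plots and a larger sample rather than a single test on five points.

```python
import numpy as np
from scipy import stats

# Illustrative data and fitted line (as in the earlier sketches)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b, a = np.polyfit(x, y, 1)

# Residuals: observed minus fitted values
residuals = y - (a + b * x)

# Shapiro-Wilk test as a rough check on normality of residuals
stat, p_value = stats.shapiro(residuals)
print(f"Shapiro-Wilk p-value: {p_value:.3f}")  # large p: no evidence against normality

# Plotting residuals against x (or against fitted values) is the usual informal
# check for linearity and homoscedasticity; a random, even scatter is desired.
```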
Applications in Research
- Predicting drug response based on dose.
- Estimating risk factors in epidemiology.
- Analyzing laboratory calibration data.
- Studying the effect of time on drug concentration.
Advantages
- Helps predict outcomes.
- Identifies strength and direction of relationships.
- Useful in clinical, pharmaceutical, and epidemiological studies.
Limitations
- Not suitable for non-linear relationships.
- Sensitive to outliers.
- Requires normality of residuals.