Comprehensive Biostatistics and Computer Notes – Units 1 to 4

Detailed Notes Pdf – BIOSTATISTICS AND COMPUTER APPLICATIONS

Short Notes

✅ UNIT 1

1. Definition & Calculation of Mean (Direct Method, Shortcut Method and Step Deviation Method)

The mean, also known as the arithmetic mean or average, is a measure of the central tendency of a dataset or a population. It represents the sum of all values divided by the number of values.

In simpler terms, the mean is a way to describe the “typical” value in a dataset. It’s a single value that represents the entire dataset, and it’s often used to understand the general trend or pattern in the data.

The mean is sensitive to extreme values (outliers) in the data, and it’s not always the best representation of the data, especially when the data is skewed or has outliers. In such cases, other measures like the median or mode might be more appropriate.

Formula

Mean = (Σx) / n

Where:

Σx = Sum of all values
n = Number of values

Example

Numbers: 2, 4, 6, 8, 10

Mean = (2 + 4 + 6 + 8 + 10) / 5

Mean = 30 / 5 = 6

Direct Method

The direct method involves adding up all the values in a dataset and dividing by the number of values.

Formula

Mean = (Σx) / n

Where:

Σx = Sum of all values
n = Number of values

Example

Find the mean of 2, 4, 6, 8, 10

Σx = 2 + 4 + 6 + 8 + 10 = 30

n = 5

Mean = 30 / 5 = 6
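
The direct method translates almost word for word into code. The following is a minimal Python sketch; the function name mean_direct is an illustrative choice, not something defined in these notes.

```python
def mean_direct(values):
    """Direct method: sum of all values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


data = [2, 4, 6, 8, 10]
print(mean_direct(data))  # 6.0
```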

Shortcut Method

The shortcut method (also called the assumed mean method) involves choosing a convenient value as an assumed mean, finding the deviation of each value from it, and then correcting the assumed mean by the average deviation. It gives the same result as the direct method but keeps the numbers small.

Formula

Mean = A + (Σd / n)

Where:

A = Assumed mean
d = x – A (deviation of each value from A)
Σd = Sum of the deviations
n = Number of values

Example

Find the mean of 2, 4, 6, 8, 10 using assumed mean A = 4

Deviations: -2, 0, 2, 4, 6

Σd = 10

n = 5

Mean = 4 + (10 / 5)

= 4 + 2

= 6

Step Deviation Method

The step deviation method involves finding the deviation of each value from an assumed mean, dividing each deviation by a common factor h (usually the class width), finding the mean of these step deviations, and finally multiplying by h and adding the assumed mean.

Formula

Mean = A + (Σd′ / n) × h

Where:

A = Assumed mean
h = Common factor (step size)
d′ = (x – A) / h (step deviation of each value)
Σd′ = Sum of step deviations
n = Number of values

Example

Find the mean of 2, 4, 6, 8, 10

Assumed Mean A = 4, Common Factor h = 2

Step deviations:

(2 – 4) / 2 = -1
(4 – 4) / 2 = 0
(6 – 4) / 2 = 1
(8 – 4) / 2 = 2
(10 – 4) / 2 = 3

Σd′ = 5

Mean = 4 + (5 / 5) × 2

Mean = 6
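
The deviation-based calculation can be sketched in Python as follows. The function name and the optional common factor h are illustrative assumptions; with h = 1 the deviations are simply x – A, and with h > 1 each deviation is divided by h and the average is scaled back afterwards.

```python
def mean_by_deviations(values, assumed_mean, h=1):
    """Mean = A + (sum of scaled deviations / n) * h.

    h is a hypothetical 'common factor' parameter; h=1 means plain
    deviations from the assumed mean.
    """
    scaled_deviations = [(x - assumed_mean) / h for x in values]
    return assumed_mean + (sum(scaled_deviations) / len(values)) * h


data = [2, 4, 6, 8, 10]
print(mean_by_deviations(data, assumed_mean=5))       # 6.0
print(mean_by_deviations(data, assumed_mean=4, h=2))  # 6.0
```

Whatever assumed mean is chosen, the result should agree with the direct method; only the size of the intermediate numbers changes.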

2. Mode and Median

In biostatistics, the median is the middlemost value in the ordered list of observations, and the mode is the most frequently occurring value.

Mode

The mode is the most frequently occurring value in a dataset.

Steps to Calculate Mode

Arrange the data in ascending or descending order.
Count the frequency of each value.
Identify the value with the highest frequency.
The value with the highest frequency is the mode.

Example

Data: 1, 2, 3, 4, 4, 4, 5, 6

Mode = 4

Bimodal Distribution Example

Data: 1, 2, 2, 3, 3, 3, 4, 4, 4

Modes = 3 and 4
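
The counting steps above can be sketched in Python with collections.Counter. The function name modes is an illustrative choice; it returns a list so that bimodal data yield both values.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)          # frequency of each value
    highest = max(counts.values())    # the highest frequency
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 3, 4, 4, 4, 5, 6]))     # [4]
print(modes([1, 2, 2, 3, 3, 3, 4, 4, 4]))  # [3, 4]
```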

Median

The median is the middle value when the dataset is arranged in ascending or descending order.

Steps to Calculate Median

Arrange data in ascending order.
Count the number of observations (n).
If n is odd, median is the middle value.
If n is even, median is the average of the two middle values.

Example (Odd)

Data: 1, 3, 5, 7, 9

Median = 5

Example (Even)

Data: 1, 3, 5, 7, 9, 11

Median = (5 + 7) / 2

Median = 6
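
The odd and even cases can be handled in one short function. This is a minimal sketch; the name median_value is illustrative (Python's standard library also provides statistics.median).

```python
def median_value(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_value([1, 3, 5, 7, 9]))      # 5
print(median_value([1, 3, 5, 7, 9, 11]))  # 6.0
```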

3. Individual Observation

An individual observation is a single data point or measurement collected from a single subject, participant, or unit of analysis in a study or experiment.

Examples include:

Participant ID
Age
Gender
Weight
Survey responses

Individual observations are the building blocks of statistical analysis and are used to calculate summary statistics, perform hypothesis testing, and model relationships between variables.

4. Discrete Observation

A discrete observation is a type of individual observation that can only take on specific, distinct, and countable values.

Examples:

Gender
Color
Number of children
Yes/No responses
Income categories

Discrete observations are commonly analyzed using:

Frequency counts
Percentages
Chi-square tests

5. Continuous Observation

A continuous observation can take any value within a certain range or interval.

Examples:

Height
Weight
Blood Pressure
Temperature
Time
Distance

Characteristics:

Can take any value within a range
Measured on a continuous scale
Can include decimal values
Analyzed using mean, standard deviation and regression analysis

✅ UNIT 2

1. Tabulation of Data and Graphical Presentation of Frequency Distribution

Tabulation of Data

Tabulation involves organizing and summarizing data into tables.

Common Tables:

Frequency Tables
Contingency Tables
Descriptive Statistics Tables

Graphical Presentation of Frequency Distribution

Common Graphs:

Histogram
Bar Chart
Pie Chart
Box Plot

Advantages:

Helps identify trends and patterns
Easy interpretation of data
Supports decision-making

2. Line Frequency

Line frequency (Frequency Polygon) is a graphical representation of frequency distribution using connected lines.

Characteristics:

X-axis represents values or ranges.
Y-axis represents frequencies.
Points are connected by straight lines.

Uses:

Displaying distribution shape
Identifying modes and outliers
Comparing groups

3. Histogram (Equal and Unequal Class Intervals)

A histogram is a graphical representation of a frequency distribution using bars.

Equal Class Intervals

All class intervals have the same width.

Example:

Class Interval	Frequency
1–10	5
11–20	8
21–30	12
31–40	10
41–50	7

Unequal Class Intervals

Class intervals have different widths.

Example:

Class Interval	Frequency
0–5	3
6–15	8
16–30	15
31–50	12
51–100	5
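
With unequal class intervals, bar heights are usually drawn as frequency density (frequency divided by class width) so that the area of each bar, not its height, reflects the frequency. The sketch below applies this to the table above, assuming the classes are inclusive whole-number ranges, so that width = upper limit – lower limit + 1.

```python
# (lower limit, upper limit, frequency) taken from the table above
classes = [(0, 5, 3), (6, 15, 8), (16, 30, 15), (31, 50, 12), (51, 100, 5)]


def frequency_density(lower, upper, frequency):
    """Frequency per unit of class width (assumes inclusive integer limits)."""
    width = upper - lower + 1
    return frequency / width


densities = [frequency_density(lo, hi, f) for lo, hi, f in classes]
for (lo, hi, f), d in zip(classes, densities):
    print(f"{lo}-{hi}: frequency {f}, density {d:.2f}")
```

Under this assumption the class 16–30 has the tallest bar, even though 31–50 has nearly as many observations, because 31–50 is wider.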

4. Inclusive Data and Mid Value

Inclusive Data

Inclusive data uses class intervals in which both the lower limit and the upper limit belong to the class, so consecutive classes do not share a limit.

Example:

0–4, 5–9

Mid Value

The midpoint of a class interval.

Formula:

Mid Value = (Upper Limit + Lower Limit) / 2

Example:

Class Interval 0–5

Mid Value = 2.5

5. Frequency Polygon

A frequency polygon is constructed by:

Plotting class mid-values on the x-axis.
Plotting frequencies on the y-axis.
Connecting points with straight lines.
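
The construction steps can be sketched as a small helper that turns a frequency table into the points to be joined. The class limits and frequencies below are hypothetical illustration data, and the function name is an assumption.

```python
def polygon_points(classes):
    """Return (mid-value, frequency) pairs for a list of (lower, upper, frequency)."""
    return [((lower + upper) / 2, frequency) for lower, upper, frequency in classes]


# Hypothetical frequency table: (lower limit, upper limit, frequency)
example_classes = [(0, 10, 5), (10, 20, 8), (20, 30, 12)]
points = polygon_points(example_classes)
print(points)  # [(5.0, 5), (15.0, 8), (25.0, 12)]
```

Plotting these points and joining them with straight lines (for example with a plotting library) gives the frequency polygon.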

Uses:

Visualizing distribution
Comparing variables
Identifying patterns

6. Frequency Curve

A frequency curve is a smooth graphical representation of a frequency distribution.

Uses:

Identifying skewness
Detecting peaks and troughs
Understanding spread of data

7. Cumulative Frequency Curve (Ogive)

A cumulative frequency curve shows cumulative frequencies against upper class limits.

Applications:

Finding median
Determining percentiles
Identifying distribution characteristics
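
A short sketch of how the ogive points are obtained, using the frequencies from the equal class interval table in the histogram section. The variable names are illustrative; the median class is taken as the first class whose cumulative frequency reaches half the total.

```python
from itertools import accumulate

upper_limits = [10, 20, 30, 40, 50]
frequencies = [5, 8, 12, 10, 7]

# Running total of frequencies: one cumulative value per class
cumulative = list(accumulate(frequencies))
ogive_points = list(zip(upper_limits, cumulative))
print(ogive_points)  # [(10, 5), (20, 13), (30, 25), (40, 35), (50, 42)]

# Locate the class that contains the median (the N/2-th observation)
half_total = cumulative[-1] / 2
median_class_index = next(i for i, cf in enumerate(cumulative) if cf >= half_total)
print(upper_limits[median_class_index])  # 30
```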

✅ UNIT 3

1. Probability

Probability is a measure of the likelihood of an event occurring.

Range:

0 = Impossible Event
1 = Certain Event

Formula:

P(A) = Number of Favorable Outcomes / Total Number of Possible Outcomes
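
As a small illustration of the formula, the sketch below computes the probability of rolling an even number with a fair six-sided die. The function name is an assumption, and the formula applies only when all outcomes are equally likely.

```python
from fractions import Fraction


def classical_probability(favorable, total):
    """P(A) = favorable outcomes / total outcomes, for equally likely outcomes."""
    if total <= 0 or not 0 <= favorable <= total:
        raise ValueError("need 0 <= favorable <= total and total > 0")
    return Fraction(favorable, total)


# Even faces of a die: 2, 4, 6 -> 3 favorable outcomes out of 6
p_even = classical_probability(3, 6)
print(p_even)         # 1/2
print(float(p_even))  # 0.5
```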

Applications:

Statistics
Engineering
Finance
Insurance
Medicine

2. Definition of Probability

Probability is a number between 0 and 1 representing the chance of occurrence of an event.

Formula:

P(A) = Favorable Outcomes / Total Outcomes

Properties of Probability

Non-Negativity
Normalization
Countable Additivity
Monotonicity
Subadditivity
Conditional Probability
Independence
Multiplication Rule
Law of Total Probability
Bayes’ Theorem

3. Binomial Distribution

The binomial distribution models the number of successes in a fixed number of independent trials.

Parameters:

n = Number of trials
p = Probability of success

Properties

Mean = np
Variance = np(1-p)
Standard Deviation = √(np(1-p))

Applications:

Coin Tossing
Quality Control
Medical Research
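
The properties above can be checked numerically. The sketch below uses the standard binomial probability formula P(X = k) = C(n, k) × p^k × (1-p)^(n-k), which is not written out in these notes; the example values n = 10 and p = 0.5 (ten tosses of a fair coin) are assumptions for illustration.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5
mean = n * p                  # np
variance = n * p * (1 - p)    # np(1-p)
std_dev = sqrt(variance)      # square root of np(1-p)

print(binomial_pmf(5, n, p))  # probability of exactly 5 heads
print(mean, variance)         # 5.0 2.5
```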

4. Normal Distribution

The normal distribution (Gaussian Distribution) is a symmetrical bell-shaped probability distribution.

Characteristics

Mean = Median = Mode
Symmetrical around mean
Continuous distribution

Important Percentages

About 68% of values lie within 1 SD of the mean
About 95% of values lie within 2 SD of the mean
About 99.7% of values lie within 3 SD of the mean
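
These percentages can be reproduced from the standard normal distribution using the error function, since P(|Z| < k) = erf(k / √2). A minimal sketch (the function name is illustrative):

```python
from math import erf, sqrt


def prob_within_k_sd(k):
    """Probability that a normal variable lies within k SD of its mean."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within_k_sd(k) * 100:.1f}%")
```

The three values come out at roughly 68.3%, 95.4% and 99.7%, which is where the rounded figures in the list come from.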

Applications:

Human height
IQ scores
Measurement errors
Medical statistics

5. Poisson Distribution

The Poisson distribution models the number of events occurring within a fixed interval.

Parameter:

λ (Lambda)

Formula

P(X = k) = (e^-λ × λ^k) / k!

Properties

Mean = λ
Variance = λ
Mode = largest whole number not exceeding λ (when λ is a whole number, both λ and λ – 1 are modes)

Applications:

Disease occurrence
Customer arrivals
Manufacturing defects
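
The formula above can be evaluated directly. In this sketch the rate λ = 2 (for example, an average of two events per interval) is an assumed illustration value, and the sums are truncated at 50 terms, which is more than enough for such a small λ.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) = e^(-lam) * lam^k / k!"""
    return exp(-lam) * lam**k / factorial(k)


lam = 2  # assumed average number of events per interval
probabilities = [poisson_pmf(k, lam) for k in range(50)]

print(f"P(X = 0) = {probabilities[0]:.4f}")
print(f"P(X = 2) = {probabilities[2]:.4f}")

# Mean and variance computed from the probabilities; both should be close to lam
mean = sum(k * p for k, p in enumerate(probabilities))
variance = sum((k - mean) ** 2 * p for k, p in enumerate(probabilities))
print(round(mean, 6), round(variance, 6))
```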

6. Properties and Problems of Poisson Distribution

Properties:

Mean = λ
Variance = λ
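Memoryless waiting times between events (in a Poisson process)
Additive property (the sum of independent Poisson variables is also Poisson, with λ = λ1 + λ2)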
Limiting case of Binomial Distribution
Approximation to rare event probability

Practice Problems:

Manufacturing defects problem
Call center arrival problem
Disease occurrence problem
Quality control problem
Banking customer arrival problem

✅ UNIT 4

1. Parametric Tests

Parametric tests assume that the data follow a specific distribution, usually the normal distribution.

Examples:

T-Test
ANOVA
Regression Analysis
F-Test

Assumptions

Normality
Equal Variances
Independence

2. T-Test (One Sample, Unpaired/Pooled and Paired)

The T-Test determines whether a significant difference exists between means.

Types

One-Sample T-Test

Compares sample mean with population mean.

Unpaired/Pooled T-Test

Compares means of two independent groups.

Paired T-Test

Compares means of two related groups.

Applications:

Clinical studies
Educational research
Experimental analysis
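
As a rough illustration, the one-sample statistic is t = (sample mean – hypothesized mean) / (s / √n), and the paired test is the same calculation applied to the within-pair differences. The sketch below uses only the standard library; the data values and function names are hypothetical, and a real analysis would also look up the p-value (for example with a statistics package).

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """t statistic for comparing a sample mean with a hypothesized mean mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)  # stdev uses the n - 1 denominator
    return (mean(sample) - mu0) / standard_error


def paired_t(before, after):
    """Paired t statistic: a one-sample test on the differences against 0."""
    differences = [a - b for b, a in zip(before, after)]
    return one_sample_t(differences, 0)


# Hypothetical measurements
t_one = one_sample_t([5.1, 4.9, 5.6, 5.8, 6.0], mu0=5)
t_paired = paired_t(before=[10, 12, 9, 11], after=[12, 14, 10, 14])
print(round(t_one, 3), round(t_paired, 3))
```

The statistic is then compared with the critical t value for n – 1 degrees of freedom.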

3. ANOVA (One-Way and Two-Way)

ANOVA compares means of multiple groups.

One-Way ANOVA

One independent variable.

Example:

Comparing scores of three classes.

Two-Way ANOVA

Two independent variables.

Example:

Comparing scores by gender and age group.

Assumptions

Normality
Equal Variances
Independence
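
For one-way ANOVA, the test statistic is F = (between-group mean square) / (within-group mean square). The following is a minimal sketch with hypothetical scores for three classes; the function name is an assumption, and the p-value lookup is left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA on a list of groups of observations."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)

    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores of three classes
f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_value, 2))  # 13.0
```

Here the between-group sum of squares is 26 (2 degrees of freedom) and the within-group sum of squares is 6 (6 degrees of freedom), giving F = 13 / 1 = 13.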

4. Least Significant Difference (LSD)

LSD is a post-hoc test used after ANOVA to determine which groups differ significantly.

Formula

LSD = t × √(2 × MSE / n)

Where:

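t = Critical t value at the chosen significance level, using the error degrees of freedom from the ANOVA
MSE = Mean Square Error (within-group mean square from the ANOVA)
n = Sample size of each group (assumed equal)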
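
A sketch of the LSD calculation follows. All numbers are hypothetical: MSE = 1 and n = 3 per group are assumed ANOVA results, and t = 2.447 is the two-tailed 5% critical value for 6 degrees of freedom read from a t-table. Two group means are declared different when their absolute difference exceeds the LSD.

```python
from math import sqrt


def least_significant_difference(t_critical, mse, n):
    """LSD = t * sqrt(2 * MSE / n) for groups of equal size n."""
    return t_critical * sqrt(2 * mse / n)


lsd = least_significant_difference(t_critical=2.447, mse=1.0, n=3)
print(round(lsd, 3))

# Hypothetical group means compared pairwise against the LSD
group_means = {"A": 2.0, "B": 3.0, "C": 6.0}
a_vs_b_differs = abs(group_means["A"] - group_means["B"]) > lsd
a_vs_c_differs = abs(group_means["A"] - group_means["C"]) > lsd
print(a_vs_b_differs, a_vs_c_differs)  # False True
```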

Non-Parametric Tests

Non-parametric tests do not require normal distribution assumptions.

Examples:

Wilcoxon Rank-Sum Test
Mann-Whitney U Test
Kruskal-Wallis Test
Friedman Test
Chi-Square Test
Sign Test

1. Wilcoxon Rank-Sum Test

Used to compare two independent groups.

Assumptions:

Independent observations
Ordinal or continuous data

2. Mann-Whitney U Test

Alternative to Independent Samples T-Test; equivalent to the Wilcoxon Rank-Sum Test.

Used when normality assumptions are not met.
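
The U statistic can be computed by comparing every value in one group with every value in the other. The sketch below counts, for each pair, whether the first group's value is larger (ties count as one half); the data and the function name are hypothetical, and critical values or p-values would come from tables or a statistics package.

```python
def mann_whitney_u(x, y):
    """Return (U for x, U for y) by direct pairwise comparison."""
    u_x = sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )
    u_y = len(x) * len(y) - u_x  # the two U values always sum to n1 * n2
    return u_x, u_y


# Hypothetical groups with no overlap: every x is smaller than every y
u_x, u_y = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(u_x, u_y)       # 0.0 9.0
print(min(u_x, u_y))  # 0.0
```

The smaller of the two U values is usually the one compared with the critical value.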

3. Kruskal-Wallis Test

Used to compare three or more independent groups.

Alternative to One-Way ANOVA.

4. Friedman Test

Used to compare three or more related groups.

Alternative to Repeated Measures ANOVA.

Assumptions:

Related observations
Ordinal or continuous data
