Comprehensive Biostatistics and Computer Notes – Units 1 to 4


Short Notes

✅ UNIT 1

1. Definition & Calculation of Mean (Direct Method, Shortcut Method and Step Deviation Method)

The mean, also known as the arithmetic mean or average, is a measure of the central tendency of a dataset or a population. It represents the sum of all values divided by the number of values.

In simpler terms, the mean is a way to describe the “typical” value in a dataset. It’s a single value that represents the entire dataset, and it’s often used to understand the general trend or pattern in the data.

The mean is sensitive to extreme values (outliers) in the data, and it’s not always the best representation of the data, especially when the data is skewed or has outliers. In such cases, other measures like the median or mode might be more appropriate.

Formula

Mean = (Σx) / n

Where:

  • Σx = Sum of all values
  • n = Number of values

Example

Numbers: 2, 4, 6, 8, 10

Mean = (2 + 4 + 6 + 8 + 10) / 5

Mean = 30 / 5 = 6

Direct Method

The direct method involves adding up all the values in a dataset and dividing by the number of values.

Formula

Mean = (Σx) / n

Where:

  • Σx = Sum of all values
  • n = Number of values

Example

Find the mean of 2, 4, 6, 8, 10

Σx = 2 + 4 + 6 + 8 + 10 = 30

n = 5

Mean = 30 / 5 = 6
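The direct method maps onto a few lines of Python. This is a minimal sketch; the function name direct_mean is only illustrative and does not come from the notes.

```python
def direct_mean(values):
    """Arithmetic mean by the direct method: sum of values / number of values."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)


data = [2, 4, 6, 8, 10]
print(direct_mean(data))  # 6.0
```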

Shortcut Method

The shortcut (assumed mean) method picks a convenient value A as an assumed mean, takes the deviation of each value from A, and corrects A by the average of those deviations. It gives the same answer as the direct method while keeping the arithmetic small.

Formula

Mean = A + (Σd / n)

Where:

  • A = Assumed mean
  • d = x – A (deviation of each value from A)
  • Σd = Sum of the deviations
  • n = Number of values

Example

Find the mean of 2, 4, 6, 8, 10 using assumed mean A = 5

Deviations:

  • (2 – 5) = -3
  • (4 – 5) = -1
  • (6 – 5) = 1
  • (8 – 5) = 3
  • (10 – 5) = 5

Σd = 5

n = 5

Mean = 5 + (5 / 5)

= 5 + 1

= 6

Step Deviation Method

The step deviation method takes the deviation of each value from an assumed mean, divides it by a common factor h (usually the class width or the common difference between values), finds the mean of these reduced deviations, and then scales the result back by h before adding it to the assumed mean.

Formula

Mean = A + (Σd′ / n) × h

Where:

  • A = Assumed mean
  • h = Common factor (step)
  • d′ = (x – A) / h
  • Σd′ = Sum of the step deviations
  • n = Number of values

Example

Find the mean of 2, 4, 6, 8, 10 using Assumed Mean = 4 and h = 2

Step deviations:

  • (2 – 4) / 2 = -1
  • (4 – 4) / 2 = 0
  • (6 – 4) / 2 = 1
  • (8 – 4) / 2 = 2
  • (10 – 4) / 2 = 3

Σd′ = 5

Mean = 4 + (5 / 5) × 2

Mean = 6
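The deviation-based calculation can be sketched as below. The function name and the step parameter (the common factor h) are assumptions made for illustration; with step = 1 the code reduces to Mean = A + (Σd / n).

```python
def assumed_mean_method(values, assumed_mean, step=1):
    """Mean = A + (sum of d' / n) * h, where d' = (x - A) / h.

    With step (h) = 1 this is the plain formula Mean = A + (sum of d / n).
    """
    n = len(values)
    scaled_deviations = [(x - assumed_mean) / step for x in values]
    return assumed_mean + (sum(scaled_deviations) / n) * step


data = [2, 4, 6, 8, 10]
print(assumed_mean_method(data, assumed_mean=5))          # 6.0
print(assumed_mean_method(data, assumed_mean=4, step=2))  # 6.0
```

Whatever assumed mean is chosen, the result matches the direct method; a good choice only makes the hand arithmetic easier.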


2. Mode and Median

In biostatistics, the median is the middlemost value in the ordered list of observations, and the mode is the most frequently occurring value.

Mode

The mode is the most frequently occurring value in a dataset.

Steps to Calculate Mode

  1. Arrange the data in ascending or descending order.
  2. Count the frequency of each value.
  3. Identify the value with the highest frequency; this value is the mode.
  4. If two or more values share the highest frequency, the data have more than one mode.

Example

Data: 1, 2, 3, 4, 4, 4, 5, 6

Mode = 4

Bimodal Distribution Example

Data: 1, 2, 2, 3, 3, 3, 4, 4, 4

Modes = 3 and 4
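The counting steps can be sketched in Python with collections.Counter. The helper name modes is illustrative; it returns every value that shares the highest frequency, so it also covers the bimodal case.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the highest frequency, in sorted order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 3, 4, 4, 4, 5, 6]))     # [4]
print(modes([1, 2, 2, 3, 3, 3, 4, 4, 4]))  # [3, 4]
```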

Median

The median is the middle value when the dataset is arranged in ascending or descending order.

Steps to Calculate Median

  1. Arrange the data in ascending order.
  2. Count the number of observations (n).
  3. If n is odd, the median is the middle value.
  4. If n is even, the median is the average of the two middle values.

Example (Odd)

Data: 1, 3, 5, 7, 9

Median = 5

Example (Even)

Data: 1, 3, 5, 7, 9, 11

Median = (5 + 7) / 2

Median = 6
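The odd and even cases can be sketched as follows. The function name median is illustrative; Python's standard library offers statistics.median for the same purpose.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 3, 5, 7, 9]))      # 5
print(median([1, 3, 5, 7, 9, 11]))  # 6.0
```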


3. Individual Observation

An individual observation is a single data point or measurement collected from a single subject, participant, or unit of analysis in a study or experiment.

Examples include:

  • Participant ID
  • Age
  • Gender
  • Weight
  • Survey responses

Individual observations are the building blocks of statistical analysis and are used to calculate summary statistics, perform hypothesis testing, and model relationships between variables.


4. Discrete Observation

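A discrete observation is a type of individual observation that can only take on specific, distinct, and countable values. Counts such as the number of children are discrete in the strict sense; categorical variables such as gender or colour are likewise recorded as distinct categories and are summarized in the same way.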

Examples:

  • Gender
  • Color
  • Number of children
  • Yes/No responses
  • Income categories

Discrete observations are commonly analyzed using:

  • Frequency counts
  • Percentages
  • Chi-square tests

5. Continuous Observation

A continuous observation can take any value within a certain range or interval.

Examples:

  • Height
  • Weight
  • Blood Pressure
  • Temperature
  • Time
  • Distance

Characteristics:

  • Can take any value within a range
  • Measured on a continuous scale
  • Can include decimal values
  • Analyzed using mean, standard deviation and regression analysis

✅ UNIT 2

1. Tabulation of Data and Graphical Presentation of Frequency Distribution

Tabulation of Data

Tabulation involves organizing and summarizing data into tables.

Common Tables:

  • Frequency Tables
  • Contingency Tables
  • Descriptive Statistics Tables

Graphical Presentation of Frequency Distribution

Common Graphs:

  • Histogram
  • Bar Chart
  • Pie Chart
  • Box Plot

Advantages:

  • Helps identify trends and patterns
  • Easy interpretation of data
  • Supports decision-making

2. Line Frequency

Line frequency (Frequency Polygon) is a graphical representation of frequency distribution using connected lines.

Characteristics:

  • X-axis represents values or ranges.
  • Y-axis represents frequencies.
  • Points are connected by straight lines.

Uses:

  • Displaying distribution shape
  • Identifying modes and outliers
  • Comparing groups

3. Histogram (Equal and Unequal Class Intervals)

A histogram is a graphical representation of a frequency distribution using bars.

Equal Class Intervals

All class intervals have the same width.

Example:

Class Interval    Frequency
0–10              5
11–20             8
21–30             12
31–40             10
41–50             7

Unequal Class Intervals

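Class intervals have different widths. In this case the height of each bar should show the frequency density (frequency ÷ class width) so that the area of the bar, not its height, is proportional to the frequency.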

Example:

Class Interval    Frequency
0–5               3
6–15              8
16–30             15
31–50             12
51–100            5
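For unequal class intervals, bar heights are normally drawn as frequency density (frequency ÷ class width). The sketch below applies this to the unequal-interval example. It assumes the class limits are inclusive whole numbers, so each class is widened by half a unit on both sides to get its boundaries; that assumption and the function name are not from the notes.

```python
def frequency_densities(classes, gap=1):
    """Return frequency / class width for each (lower, upper, frequency) class.

    Assumes inclusive limits separated by `gap`, so the class boundaries are
    lower - gap/2 and upper + gap/2.
    """
    densities = []
    for lower, upper, frequency in classes:
        width = (upper + gap / 2) - (lower - gap / 2)
        densities.append(frequency / width)
    return densities


unequal_classes = [(0, 5, 3), (6, 15, 8), (16, 30, 15), (31, 50, 12), (51, 100, 5)]
print(frequency_densities(unequal_classes))  # [0.5, 0.8, 1.0, 0.6, 0.1]
```

Although the 16–30 class has the highest frequency in both views, the densities show that the wide 51–100 class is far sparser than its raw frequency of 5 suggests.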

4. Inclusive Data and Mid Value

Inclusive Data

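In inclusive data (inclusive class intervals), both the lower limit and the upper limit of a class belong to that class, so consecutive classes do not share a limit.

Example:

0–4, 5–9

By contrast, classes such as 0–5, 5–10 are exclusive: the upper limit 5 is counted in the next class.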

Mid Value

The midpoint of a class interval.

Formula:

Mid Value = (Upper Limit + Lower Limit) / 2

Example:

Class Interval 0–5

Mid Value = 2.5
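The mid-value formula is a one-line calculation. In the sketch below, the function name and the list of intervals after the first are hypothetical and serve only as an illustration.

```python
def mid_value(lower_limit, upper_limit):
    """Mid value of a class interval: (upper limit + lower limit) / 2."""
    return (upper_limit + lower_limit) / 2


print(mid_value(0, 5))  # 2.5

# Hypothetical set of intervals; mid values like these are the x-coordinates
# used when plotting a frequency polygon.
intervals = [(0, 5), (5, 10), (10, 15)]
print([mid_value(lower, upper) for lower, upper in intervals])  # [2.5, 7.5, 12.5]
```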


5. Frequency Polygon

A frequency polygon is constructed by:

  1. Plotting class mid-values on the x-axis.
  2. Plotting frequencies on the y-axis.
  3. Connecting points with straight lines.

Uses:

  • Visualizing distribution
  • Comparing variables
  • Identifying patterns

6. Frequency Curve

A frequency curve is a smooth graphical representation of a frequency distribution.

Uses:

  • Identifying skewness
  • Detecting peaks and troughs
  • Understanding spread of data

7. Cumulative Frequency Curve (Ogive)

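A cumulative frequency curve shows cumulative frequencies plotted against upper class limits (a "less than" ogive). The median can be read from it as the x-value at which the cumulative frequency reaches half of the total.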

Applications:

  • Finding median
  • Determining percentiles
  • Identifying distribution characteristics
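The points of a "less than" ogive can be computed with a running total. The sketch below reuses the frequencies 5, 8, 12, 10, 7 from the equal-interval histogram example and locates the class containing the median; the variable names are illustrative.

```python
from itertools import accumulate

upper_limits = [10, 20, 30, 40, 50]
frequencies = [5, 8, 12, 10, 7]

# Running total of frequencies gives the cumulative frequency for each class.
cumulative = list(accumulate(frequencies))
ogive_points = list(zip(upper_limits, cumulative))
print(ogive_points)  # [(10, 5), (20, 13), (30, 25), (40, 35), (50, 42)]

# The median lies in the first class whose cumulative frequency reaches N / 2.
half_total = cumulative[-1] / 2
median_class_index = next(i for i, cf in enumerate(cumulative) if cf >= half_total)
print(half_total, upper_limits[median_class_index])  # 21.0 30
```

Here N / 2 = 21 is first reached in the class ending at 30, so the median falls in the 21–30 class.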

✅ UNIT 3

1. Probability

Probability is a measure of the likelihood of an event occurring.

Range:

  • 0 = Impossible Event
  • 1 = Certain Event

Formula:

P(A) = Number of Favorable Outcomes / Total Number of Possible Outcomes

Applications:

  • Statistics
  • Engineering
  • Finance
  • Insurance
  • Medicine

2. Definition of Probability

Probability is a number between 0 and 1 representing the chance of occurrence of an event.

Formula:

P(A) = Favorable Outcomes / Total Outcomes
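As an illustration of the formula, the sketch below computes the probability of rolling an even number with a fair six-sided die. It assumes all outcomes are equally likely; the die example and the function name are not part of the notes.

```python
from fractions import Fraction


def classical_probability(event, sample_space):
    """P(A) = favorable outcomes / total outcomes, for equally likely outcomes."""
    favorable = set(event) & set(sample_space)
    return Fraction(len(favorable), len(sample_space))


die = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6}

p_even = classical_probability(even, die)
print(p_even)      # 1/2
print(1 - p_even)  # 1/2  (probability of the complementary event)
```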

Properties of Probability

  1. Non-Negativity
  2. Normalization
  3. Countable Additivity
  4. Monotonicity
  5. Subadditivity
  6. Conditional Probability
  7. Independence
  8. Multiplication Rule
  9. Law of Total Probability
  10. Bayes’ Theorem

3. Binomial Distribution

The binomial distribution models the number of successes in a fixed number of independent trials, where each trial has only two outcomes (success or failure) and the same probability of success.

Parameters:

  • n = Number of trials
  • p = Probability of success

Properties

  • Mean = np
  • Variance = np(1 – p)
  • Standard Deviation = √(np(1 – p))

Applications:

  • Coin Tossing
  • Quality Control
  • Medical Research
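A short sketch for 10 tosses of a fair coin follows. The probability formula P(X = k) = C(n, k) × p^k × (1 – p)^(n – k) is the standard binomial formula; it is not stated in the notes above, and the function name is illustrative.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5  # 10 tosses of a fair coin

mean = n * p
variance = n * p * (1 - p)
standard_deviation = sqrt(variance)

print(mean, variance)         # 5.0 2.5
print(binomial_pmf(5, n, p))  # 0.24609375 (probability of exactly 5 heads)
```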

4. Normal Distribution

The normal distribution (Gaussian Distribution) is a symmetrical bell-shaped probability distribution.

Characteristics

  • Mean = Median = Mode
  • Symmetrical around mean
  • Continuous distribution

Important Percentages

  • About 68% of values lie within 1 SD of the mean
  • About 95% of values lie within 2 SD of the mean
  • About 99.7% of values lie within 3 SD of the mean
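These percentages can be checked numerically. The sketch below uses the standard identity P(|Z| < k) = erf(k / √2) for a standard normal variable Z; the identity and the function name are not taken from the notes.

```python
from math import erf, sqrt


def coverage_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return erf(k / sqrt(2))


for k in (1, 2, 3):
    print(k, round(coverage_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```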

Applications:

  • Human height
  • IQ scores
  • Measurement errors
  • Medical statistics

5. Poisson Distribution

The Poisson distribution models the number of events occurring within a fixed interval.

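Parameter:

λ (Lambda) = average number of events per interval

Formula

P(X = k) = (e^(–λ) × λ^k) / k!

Where k = 0, 1, 2, … is the number of events observed in the interval.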

Properties

  • Mean = λ
  • Variance = λ
  • Mode = largest whole number not exceeding λ (when λ is a whole number, both λ – 1 and λ are modes)

Applications:

  • Disease occurrence
  • Customer arrivals
  • Manufacturing defects
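The formula can be evaluated directly. The sketch below uses a hypothetical rate of λ = 2 events per interval; the function name is illustrative.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) = e^(-lam) * lam^k / k!"""
    return exp(-lam) * lam**k / factorial(k)


lam = 2  # hypothetical average of 2 events per interval
for k in range(5):
    print(k, round(poisson_pmf(k, lam), 4))
# 0 0.1353
# 1 0.2707
# 2 0.2707
# 3 0.1804
# 4 0.0902
```

With a whole-number λ, the two counts λ – 1 and λ are equally likely, which is why k = 1 and k = 2 share the highest probability here.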

6. Properties and Problems of Poisson Distribution

Properties:

  1. Mean = λ
  2. Variance = λ
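  3. Independence: counts in non-overlapping intervals are independent, and the waiting time between events is memoryless
  4. Summation (additive) property: the sum of independent Poisson variables with parameters λ1 and λ2 is Poisson with parameter λ1 + λ2
  5. Limiting case of the Binomial Distribution as n becomes large and p becomes small, with np = λ held fixed
  6. Approximation to the probability of rare events over many trials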
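The limiting-case property can be seen numerically: holding np = λ fixed, the binomial probability approaches the Poisson probability as n grows. The values λ = 2 and k = 3 below are hypothetical, and both function names are illustrative.

```python
from math import comb, exp, factorial


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


lam, k = 2, 3
gaps = []
for n in (10, 100, 1000):
    p = lam / n  # keep np = lam fixed while n grows
    gap = abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
    gaps.append(gap)
    print(n, round(binomial_pmf(k, n, p), 4), round(poisson_pmf(k, lam), 4))
```

The gap between the two probabilities shrinks at each step, which is why the Poisson distribution is used for rare events over many trials.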

Practice Problems:

  1. Manufacturing defects problem
  2. Call center arrival problem
  3. Disease occurrence problem
  4. Quality control problem
  5. Banking customer arrival problem

✅ UNIT 4

1. Parametric Tests

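Parametric tests assume that the data follow a specific distribution, usually the normal distribution, and draw conclusions about its parameters such as the mean and variance.

Examples:

  • T-Test
  • ANOVA
  • Regression Analysis
  • F-Test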

Assumptions

  • Normality
  • Equal Variances
  • Independence

2. T-Test (One Sample, Unpaired/Pooled and Paired)

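The T-Test determines whether a significant difference exists between means. It is typically used when the population standard deviation is unknown.

Types

One-Sample T-Test

Compares a sample mean with a known or hypothesized population mean.

Unpaired/Pooled T-Test

Compares the means of two independent groups, assuming the two groups have equal variances.

Paired T-Test

Compares the means of two related groups, such as measurements on the same subjects before and after a treatment.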
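The three test statistics can be sketched with the standard library alone. The formulas are the usual textbook ones and are not written out in the notes; the data and function names are hypothetical. The resulting t value is compared with a critical value from a t table at the stated degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev, variance


def one_sample_t(sample, population_mean):
    """t = (sample mean - population mean) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(sample)
    return (mean(sample) - population_mean) / (stdev(sample) / sqrt(n))


def pooled_t(group_a, group_b):
    """Unpaired t with pooled variance, with n_a + n_b - 2 degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance * (1 / n_a + 1 / n_b))


def paired_t(before, after):
    """Paired t: a one-sample t-test on the differences, tested against zero."""
    differences = [a - b for b, a in zip(before, after)]
    return one_sample_t(differences, 0)


# Hypothetical data, for illustration only.
print(round(one_sample_t([5, 6, 7, 8, 9], 5), 3))                # 2.828
print(round(pooled_t([5, 6, 7, 8, 9], [3, 4, 5, 6, 7]), 3))      # 2.0
print(round(paired_t([10, 12, 14, 16], [12, 13, 17, 18]), 3))    # 4.899
```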

Applications:

  • Clinical studies
  • Educational research
  • Experimental analysis

3. ANOVA (One-Way and Two-Way)

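ANOVA (Analysis of Variance) compares the means of three or more groups by comparing the variation between groups with the variation within groups.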

One-Way ANOVA

One independent variable.

Example:

Comparing scores of three classes.

Two-Way ANOVA

Two independent variables.

Example:

Comparing scores by gender and age group.

Assumptions

  • Normality
  • Equal Variances
  • Independence
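A one-way ANOVA reduces to an F ratio: the mean square between groups divided by the mean square within groups. The sketch below uses hypothetical scores for three classes; the function name is illustrative and the formulas are the standard ones, not taken from the notes.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)  # this is the Mean Square Error (MSE)
    return ms_between / ms_within, k - 1, n_total - k


# Hypothetical scores of three classes.
classes = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_value, df_between, df_within = one_way_anova_f(classes)
print(f_value, df_between, df_within)  # 12.0 2 6
```

The F value is then compared with the critical F for (2, 6) degrees of freedom from an F table.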

4. Least Significant Difference (LSD)

LSD is a post-hoc test used after ANOVA to determine which groups differ significantly.

Formula

LSD = t × √(2 × MSE / n)

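Where:

  • t = Critical t value at the chosen significance level and the error degrees of freedom
  • MSE = Mean Square Error (within-group mean square from the ANOVA table)
  • n = Number of observations per group (equal group sizes)

Two group means differ significantly when the absolute difference between them is greater than the LSD.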
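The LSD formula can be applied as below. The group means, MSE, and group size are hypothetical; the critical value 2.447 is the two-tailed t value at the 5% level for 6 degrees of freedom, taken from a standard t table. Pairs of means whose absolute difference exceeds the LSD are treated as significantly different.

```python
from itertools import combinations
from math import sqrt


def least_significant_difference(t_critical, mse, n_per_group):
    """LSD = t * sqrt(2 * MSE / n)."""
    return t_critical * sqrt(2 * mse / n_per_group)


# Hypothetical ANOVA results: MSE = 1.0, 3 observations per group, 6 error df.
lsd = least_significant_difference(t_critical=2.447, mse=1.0, n_per_group=3)
print(round(lsd, 3))  # 1.998

group_means = {"A": 5.0, "B": 6.0, "C": 9.0}  # hypothetical group means
significant_pairs = [
    (g1, g2)
    for g1, g2 in combinations(group_means, 2)
    if abs(group_means[g1] - group_means[g2]) > lsd
]
print(significant_pairs)  # [('A', 'C'), ('B', 'C')]
```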

Non-Parametric Tests

Non-parametric tests do not require normal distribution assumptions.

Examples:

  • Wilcoxon Rank-Sum Test
  • Mann-Whitney U Test
  • Kruskal-Wallis Test
  • Friedman Test
  • Chi-Square Test
  • Sign Test

1. Wilcoxon Rank-Sum Test

Used to compare two independent groups.

Assumptions:

  • Independent observations
  • Ordinal or continuous data

2. Mann-Whitney U Test

Alternative to the Independent Samples T-Test, and equivalent to the Wilcoxon Rank-Sum Test (the two always lead to the same conclusion).

Used when normality assumptions are not met.
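The U statistic is obtained by ranking the two groups together. The sketch below gives tied values the average of their ranks; the data and function names are hypothetical, and the resulting U is compared with a critical value from a Mann-Whitney table.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Positions i..j-1 hold equal values, i.e. ranks i+1..j; take their average.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in values]


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U statistics."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical measurements from two independent groups.
print(mann_whitney_u([3, 5, 8, 9], [1, 2, 4, 6]))  # 3.0
```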


3. Kruskal-Wallis Test

Used to compare three or more independent groups.

Alternative to One-Way ANOVA.
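The test statistic H is built from the rank sums of the groups: H = 12 / (N(N + 1)) × Σ(R² / n) – 3(N + 1). The sketch below assumes there are no tied values, so no tie correction is applied; the data and function name are hypothetical.

```python
def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), assuming no ties."""
    all_values = sorted(x for group in groups for x in group)
    if len(set(all_values)) != len(all_values):
        raise ValueError("this sketch assumes no tied values")
    rank_of = {value: rank for rank, value in enumerate(all_values, start=1)}

    n_total = len(all_values)
    rank_term = sum(sum(rank_of[x] for x in group) ** 2 / len(group) for group in groups)
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


# Hypothetical measurements from three independent groups.
h_value = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h_value, 2))  # 7.2
```

H is compared with the chi-square critical value at (number of groups – 1) degrees of freedom.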


4. Friedman Test

Used to compare three or more related groups.

Alternative to Repeated Measures ANOVA.

Assumptions:

  • Related observations
  • Ordinal or continuous data
