Description
2.1 Graphical Techniques & Descriptive Stats
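- Box Plot: Visualizes the five-number summary (Min, Q1, Median, Q3, Max) and flags outliers, conventionally points lying more than 1.5 × IQR beyond Q1 or Q3.
- Skewness: Measure of asymmetry. Negative skew means a longer left tail; positive skew means a longer right tail.
- Kurtosis: Measure of tail heaviness, often loosely called “peakedness”: Leptokurtic (heavier tails than normal), Mesokurtic (normal-like), Platykurtic (lighter tails).
- Descriptive Stats: Summary of data using Central Tendency (Mean, Median, Mode) and Dispersion (Range, Variance, Standard Deviation).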
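The summaries above can be computed directly. The following is a minimal sketch using only Python's standard library; the sample `data` and the helper names (`five_number_summary`, `tukey_outliers`, `central_moment`, `skewness`, `excess_kurtosis`) are illustrative assumptions, and the skewness and kurtosis use the simple population-moment definitions rather than any bias-corrected variant.

```python
import statistics as st

# Hypothetical sample; 30 is included on purpose as an outlier.
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]

def five_number_summary(xs):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, med, q3 = st.quantiles(xs, n=4, method="inclusive")
    return min(xs), q1, med, q3, max(xs)

def tukey_outliers(xs):
    """Points more than 1.5 * IQR beyond Q1 or Q3 (the usual box-plot rule)."""
    _, q1, _, q3, _ = five_number_summary(xs)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in xs if x < low or x > high]

def central_moment(xs, order):
    """Population central moment of the given order."""
    mean = st.fmean(xs)
    return sum((x - mean) ** order for x in xs) / len(xs)

def skewness(xs):
    """Moment skewness: positive for a long right tail, negative for a long left tail."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5

def excess_kurtosis(xs):
    """Moment kurtosis minus 3, so a normal distribution scores about 0."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3

print("five-number summary:", five_number_summary(data))
print("outliers:", tukey_outliers(data))
print("mean, median, mode:", st.fmean(data), st.median(data), st.mode(data))
print("range, variance, stdev:",
      max(data) - min(data), st.variance(data), st.stdev(data))
print("skewness, excess kurtosis:", skewness(data), excess_kurtosis(data))
```

In this sample the single large value pulls the mean above the median and produces positive skewness, which is the pattern a box plot would show as a point beyond the upper whisker.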
2.2 Correlation, Regression & Data Cleaning
- Correlation ($r$): Strength and direction of a linear relationship between two variables, ranging from -1 to +1.
- Regression: Predicting a dependent variable ($Y$) from one or more independent variables ($X$).
- Data Cleaning: Process of fixing or removing corrupt, inaccurate, or badly formatted records in a dataset.
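To make correlation and regression concrete, here is a small sketch that computes Pearson's $r$ and a least-squares line from their definitions. The function names (`pearson_r`, `fit_line`) and the data are assumptions chosen for illustration, and the sketch covers only the single-predictor case.

```python
import math

def _sums(xs, ys):
    """Shared building blocks: the means and the sums of squares and cross-products."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return mx, my, sxy, sxx, syy

def pearson_r(xs, ys):
    """Pearson correlation coefficient, always between -1 and +1."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for Y = intercept + slope * X."""
    mx, my, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical, already-cleaned data that lies close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print("r =", pearson_r(xs, ys))
print("Y =", intercept, "+", slope, "* X")
print("prediction at X = 6:", intercept + slope * 6)
```

The same sums drive both quantities: $r$ rescales the cross-product sum by both spreads, while the slope rescales it by the spread of $X$ alone.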
2.3 Imputation Techniques
- Mean/Median Imputation: Replacing missing numeric values with the mean or the median of the observed values.
- Mode Imputation: Replacing missing categorical values with the most frequent category.
- K-NN Imputation: Replacing missing values using the values of the k most similar (nearest) data points.
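The three techniques above can be sketched in a few lines. This is an illustrative implementation, not a library API: `impute_constant` and `knn_impute` are hypothetical names, missing values are assumed to be encoded as `None`, and the k-NN version assumes every column other than the one being filled is complete and numeric.

```python
import math
import statistics as st

def impute_constant(values, strategy="mean"):
    """Fill None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = {"mean": st.fmean, "median": st.median, "mode": st.mode}[strategy](observed)
    return [fill if v is None else v for v in values]

def knn_impute(rows, col, k=2):
    """Fill None in column `col` with the mean of that column over the k nearest
    complete rows, using Euclidean distance on the remaining columns."""
    def features(row):
        return [v for i, v in enumerate(row) if i != col]

    complete = [r for r in rows if r[col] is not None]
    filled = []
    for row in rows:
        new_row = list(row)
        if row[col] is None:
            nearest = sorted(
                complete, key=lambda other: math.dist(features(row), features(other))
            )[:k]
            new_row[col] = st.fmean([n[col] for n in nearest])
        filled.append(new_row)
    return filled

print(impute_constant([1, None, 3], "mean"))
print(impute_constant([1, None, 3, 100], "median"))
print(impute_constant(["a", "b", None, "a"], "mode"))

# Hypothetical rows of (feature, target); the third row is missing its target.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, None], [10.0, 100.0]]
print(knn_impute(rows, col=1, k=2))
```

Note how the median is less affected than the mean by the extreme value 100, and how k-NN ignores the distant row entirely when filling the gap.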
2.4 ANOVA & Chi-Square
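- ANOVA (Analysis of Variance): Testing whether the means of three or more groups differ significantly, by comparing between-group variance to within-group variance.
- Chi-Square ($\chi^2$): Testing hypotheses about categorical variables: Goodness of Fit (one variable against an expected distribution) or Independence (association between two variables).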
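Both tests reduce to a single statistic that is then compared with a critical value. The sketch below computes the one-way ANOVA $F$ statistic and the chi-square statistic for independence from their definitions; the function names and the data are assumptions, and p-values are left out because the standard library has no $F$ or $\chi^2$ distribution (a table or a statistics package would supply them).

```python
import statistics as st

def anova_f(groups):
    """One-way ANOVA F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (st.fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - st.fmean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / total  # count expected under independence
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Hypothetical measurements for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print("F =", anova_f(groups))

# Hypothetical 2 x 2 table of counts (rows: group, columns: outcome).
table = [[10, 20], [20, 10]]
print("chi-square, dof =", chi_square_independence(table))
```

A large $F$ means the group means are spread out relative to the noise inside each group; a large $\chi^2$ means the observed counts sit far from what independence would predict.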
2.5 Scatter Diagram
- Visualization: Plotting individual data points on $X$-$Y$ axes.
- Pattern Recognition: Used to visually identify clusters, trends, and outliers.
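A scatter diagram is just a mapping from each $(x, y)$ pair to a position on a grid. As a dependency-free illustration, the sketch below draws one with text characters; `ascii_scatter` is a hypothetical helper, the points are made up, and the function assumes the data has at least two distinct x values and two distinct y values. In practice a plotting library would be used instead.

```python
def ascii_scatter(points, width=30, height=10):
    """Render (x, y) points as rows of text, with the largest y on the top row."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        # Scale each coordinate to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return ["".join(line) for line in grid]

# Hypothetical data: an upward trend plus one point far from the rest.
points = [(1, 2), (2, 3), (3, 5), (4, 6), (5, 8), (6, 9), (7, 1)]
for line in ascii_scatter(points):
    print("|" + line)
```

Even in this crude rendering the rising trend is visible as a diagonal band, and the last point stands apart from it in the bottom-right corner.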
2.6 Estimation & Hypothesis Testing
- Estimation: Using sample data to estimate a population parameter, either as a single value (Point) or as a range (Interval).
- Null Hypothesis ($H_0$): Assumption of “no effect” or “no difference.”
- P-Value: Probability of observing a result at least as extreme as the one obtained, assuming $H_0$ is true; reject $H_0$ when it is at or below Alpha ($\alpha$).
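The ideas above fit together in a one-sample test. The sketch below assumes the simplest textbook setting, a two-sided z-test with a known population standard deviation; the function names, the sample, and the values of `mu0` and `sigma` are illustrative assumptions.

```python
import math
from statistics import NormalDist, fmean

def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test of H0: population mean == mu0 (sigma known)."""
    standard_error = sigma / math.sqrt(len(sample))
    z = (fmean(sample) - mu0) / standard_error
    # Probability of a result at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value <= alpha

def confidence_interval(sample, sigma, level=0.95):
    """Interval estimate for the population mean (sigma known)."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_crit * sigma / math.sqrt(len(sample))
    point_estimate = fmean(sample)
    return point_estimate - half_width, point_estimate + half_width

# Hypothetical sample of nine measurements.
sample = [52, 51, 53, 50, 54, 52, 51, 53, 52]

z, p_value, reject = z_test(sample, mu0=50, sigma=3)
print("point estimate:", fmean(sample))
print("95% interval:", confidence_interval(sample, sigma=3))
print("z =", z, "p =", p_value, "reject H0:", reject)
```

Here the sample mean is the point estimate, the interval expresses its uncertainty, and the p-value is compared with $\alpha = 0.05$ to decide whether to reject $H_0$.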
2.7 Sampling Distributions & Counting
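- Central Limit Theorem (CLT): As sample size increases, the distribution of the sample mean approaches a normal distribution, whatever the shape of the population (provided its variance is finite).
- Standard Error: The standard deviation of a sampling distribution; for the sample mean it is $\sigma/\sqrt{n}$.
- Counting: Permutations (order matters) and Combinations (order doesn’t matter).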
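A short simulation makes the sampling-distribution ideas tangible. The sketch below draws repeated samples from a skewed Exponential(1) population (mean 1, standard deviation 1) and compares the spread of the sample means with $\sigma/\sqrt{n}$; the population, sample size, replicate count, and seed are all arbitrary choices for illustration. The counting formulas are shown with `math.perm` and `math.comb`.

```python
import math
import random
import statistics as st

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_means(n, replicates=2000):
    """Means of many samples of size n drawn from an Exponential(1) population."""
    return [
        st.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]

n = 30
means = sample_means(n)
empirical_se = st.stdev(means)       # spread of the simulated sampling distribution
theoretical_se = 1.0 / math.sqrt(n)  # sigma / sqrt(n) with sigma = 1

print("mean of sample means:", st.fmean(means))
print("empirical SE:", empirical_se, "theoretical SE:", theoretical_se)

# Counting: choosing 3 items out of 5.
print("permutations (order matters):", math.perm(5, 3))          # 60
print("combinations (order doesn't matter):", math.comb(5, 3))   # 10
```

Although the population is strongly right-skewed, the sample means cluster symmetrically around 1 with a spread close to the theoretical standard error.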
2.8 Probability & Distributions
- Probability: Likelihood of an event occurring (0 to 1).
- Binomial: Discrete distribution for the number of successes in a fixed number of independent success/failure trials.
- Poisson: Discrete distribution for the number of events occurring in a fixed interval of time/space.
- Normal (Gaussian): Bell-shaped curve defined by mean ($\mu$) and standard deviation ($\sigma$).
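The three distributions can be evaluated from their formulas. The following sketch writes the binomial and Poisson probability mass functions by hand and uses `statistics.NormalDist` for the normal case; the function names and the parameter values are assumptions for illustration.

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events in an interval when the average count is lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Binomial: 2 heads in 4 fair coin flips.
print("binomial:", binomial_pmf(2, 4, 0.5))

# Poisson: no events in an interval that averages 2 events.
print("poisson:", poisson_pmf(0, 2))

# Normal: share of values within one standard deviation of the mean.
standard_normal = NormalDist(mu=0, sigma=1)
within_one_sigma = standard_normal.cdf(1) - standard_normal.cdf(-1)
print("normal, within 1 sigma:", within_one_sigma)

# Every probability lies in [0, 1] and a full pmf sums to 1.
print("binomial total:", sum(binomial_pmf(k, 4, 0.5) for k in range(5)))
```

The last line is a quick sanity check worth running on any hand-written distribution: the probabilities over all possible outcomes should add up to 1.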