Dr. Ji Son

Variability

Table of Contents

Section 1: Introduction
Descriptive Statistics vs. Inferential Statistics

25m 31s

Intro
0:00
Roadmap
0:10
Roadmap
0:11
Statistics
0:35
Statistics
0:36
Let's Think About High School Science
1:12
Measurement and Find Patterns (Mathematical Formula)
1:13
Statistics = Math of Distributions
4:58
Distributions
4:59
Problematic… but also GREAT
5:58
Statistics
7:33
How is It Different from Other Specializations in Mathematics?
7:34
Statistics is Fundamental in Natural and Social Sciences
7:53
Two Skills of Statistics
8:20
Description (Exploration)
8:21
Inference
9:13
Descriptive Statistics vs. Inferential Statistics: Apply to Distributions
9:58
Descriptive Statistics
9:59
Inferential Statistics
11:05
Populations vs. Samples
12:19
Populations vs. Samples: Is it the Truth?
12:20
Populations vs. Samples: Pros & Cons
13:36
Populations vs. Samples: Descriptive Values
16:12
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:10
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:11
Example 1: Descriptive Statistics vs. Inferential Statistics
19:09
Example 2: Descriptive Statistics vs. Inferential Statistics
20:47
Example 3: Sample, Parameter, Population, and Statistic
21:40
Example 4: Sample, Parameter, Population, and Statistic
23:28
Section 2: About Samples: Cases, Variables, Measurements
About Samples: Cases, Variables, Measurements

32m 14s

Intro
0:00
Data
0:09
Data, Cases, Variables, and Values
0:10
Rows, Columns, and Cells
2:03
Example: Aircraft
3:52
How Do We Get Data?
5:38
Research: Question and Hypothesis
5:39
Research Design
7:11
Measurement
7:29
Research Analysis
8:33
Research Conclusion
9:30
Types of Variables
10:03
Discrete Variables
10:04
Continuous Variables
12:07
Types of Measurements
14:17
Types of Measurements
14:18
Types of Measurements (Scales)
17:22
Nominal
17:23
Ordinal
19:11
Interval
21:33
Ratio
24:24
Example 1: Cases, Variables, Measurements
25:20
Example 2: Which Scale of Measurement is Used?
26:55
Example 3: What Kind of a Scale of Measurement is This?
27:26
Example 4: Discrete vs. Continuous Variables.
30:31
Section 3: Visualizing Distributions
Introduction to Excel

8m 9s

Intro
0:00
Before Visualizing Distribution
0:10
Excel
0:11
Excel: Organization
0:45
Workbook
0:46
Column x Rows
1:50
Tools: Menu Bar, Standard Toolbar, and Formula Bar
3:00
Excel + Data
6:07
Excel and Data
6:08
Frequency Distributions in Excel

39m 10s

Intro
0:00
Roadmap
0:08
Data in Excel and Frequency Distributions
0:09
Raw Data to Frequency Tables
0:42
Raw Data to Frequency Tables
0:43
Frequency Tables: Using Formulas and Pivot Tables
1:28
Example 1: Number of Births
7:17
Example 2: Age Distribution
20:41
Example 3: Height Distribution
27:45
Example 4: Height Distribution of Males
32:19
Frequency Distributions and Features

25m 29s

Intro
0:00
Roadmap
0:10
Data in Excel, Frequency Distributions, and Features of Frequency Distributions
0:11
Example #1
1:35
Uniform
1:36
Example #2
2:58
Unimodal, Skewed Right, and Asymmetric
2:59
Example #3
6:29
Bimodal
6:30
Example #4a
8:29
Symmetric, Unimodal, and Normal
8:30
Point of Inflection and Standard Deviation
11:13
Example #4b
12:43
Normal Distribution
12:44
Summary
13:56
Uniform, Skewed, Bimodal, and Normal
13:57
Sketch Problem 1: Driver's License
17:34
Sketch Problem 2: Life Expectancy
20:01
Sketch Problem 3: Telephone Numbers
22:01
Sketch Problem 4: Length of Time Used to Complete a Final Exam
23:43
Dotplots and Histograms in Excel

42m 42s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Previously
1:02
Data, Frequency Table, and Visualization
1:03
Dotplots
1:22
Dotplots Excel Example
1:23
Dotplots: Pros and Cons
7:22
Pros and Cons of Dotplots
7:23
Dotplots Excel Example Cont.
9:07
Histograms
12:47
Histograms Overview
12:48
Example of Histograms
15:29
Histograms: Pros and Cons
31:39
Pros
31:40
Cons
32:31
Frequency vs. Relative Frequency
32:53
Frequency
32:54
Relative Frequency
33:36
Example 1: Dotplots vs. Histograms
34:36
Example 2: Age of Pennies Dotplot
36:21
Example 3: Histogram of Mammal Speeds
38:27
Example 4: Histogram of Life Expectancy
40:30
Stemplots

12m 23s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
What Sets Stemplots Apart?
0:46
Data Sets, Dotplots, Histograms, and Stemplots
0:47
Example 1: What Do Stemplots Look Like?
1:58
Example 2: Back-to-Back Stemplots
5:00
Example 3: Quiz Grade Stemplot
7:46
Example 4: Quiz Grade & Afterschool Tutoring Stemplot
9:56
Bar Graphs

22m 49s

Intro
0:00
Roadmap
0:05
Roadmap
0:08
Review of Frequency Distributions
0:44
Y-axis and X-axis
0:45
Types of Frequency Visualizations Covered so Far
2:16
Introduction to Bar Graphs
4:07
Example 1: Bar Graph
5:32
Example 1: Bar Graph
5:33
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:07
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:08
Example 2: Create a Frequency Visualization for Gender
14:02
Example 3: Cases, Variables, and Frequency Visualization
16:34
Example 4: What Kind of Graphs are Shown Below?
19:29
Section 4: Summarizing Distributions
Central Tendency: Mean, Median, Mode

38m 50s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
Central Tendency 1
0:56
Way to Summarize a Distribution of Scores
0:57
Mode
1:32
Median
2:02
Mean
2:36
Central Tendency 2
3:47
Mode
3:48
Median
4:20
Mean
5:25
Summation Symbol
6:11
Summation Symbol
6:12
Population vs. Sample
10:46
Population vs. Sample
10:47
Excel Examples
15:08
Finding Mode, Median, and Mean in Excel
15:09
Median vs. Mean
21:45
Effect of Outliers
21:46
Relationship Between Parameter and Statistic
22:44
Type of Measurements
24:00
Which Distributions to Use With
24:55
Example 1: Mean
25:30
Example 2: Using Summation Symbol
29:50
Example 3: Average Calorie Count
32:50
Example 4: Creating an Example Set
35:46
Variability

42m 40s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Variability (or Spread)
0:45
Variability (or Spread)
0:46
Things to Think About
5:45
Things to Think About
5:46
Range, Quartiles and Interquartile Range
6:37
Range
6:38
Interquartile Range
8:42
Interquartile Range Example
10:58
Interquartile Range Example
10:59
Variance and Standard Deviation
12:27
Deviations
12:28
Sum of Squares
14:35
Variance
16:55
Standard Deviation
17:44
Sum of Squares (SS)
18:34
Sum of Squares (SS)
18:35
Population vs. Sample SD
22:00
Population vs. Sample SD
22:01
Population vs. Sample
23:20
Mean
23:21
SD
23:51
Example 1: Find the Mean and Standard Deviation of the Variable Friends in the Excel File
27:21
Example 2: Find the Mean and Standard Deviation of the Tagged Photos in the Excel File
35:25
Example 3: Sum of Squares
38:58
Example 4: Standard Deviation
41:48
Five Number Summary & Boxplots

57m 15s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Summarizing Distributions
0:37
Shape, Center, and Spread
0:38
5 Number Summary
1:14
Boxplot: Visualizing 5 Number Summary
3:37
Boxplot: Visualizing 5 Number Summary
3:38
Boxplots on Excel
9:01
Using 'Stocks' and Using Stacked Columns
9:02
Boxplots on Excel Example
10:14
When are Boxplots Useful?
32:14
Pros
32:15
Cons
32:59
How to Determine Outlier Status
33:24
Rule of Thumb: Upper Limit
33:25
Rule of Thumb: Lower Limit
34:16
Signal Outliers in an Excel Data File Using Conditional Formatting
34:52
Modified Boxplot
48:38
Modified Boxplot
48:39
Example 1: Percentage Values & Lower and Upper Whisker
49:10
Example 2: Boxplot
50:10
Example 3: Estimating IQR From Boxplot
53:46
Example 4: Boxplot and Missing Whisker
54:35
Shape: Calculating Skewness & Kurtosis

41m 51s

Intro
0:00
Roadmap
0:16
Roadmap
0:17
Skewness Concept
1:09
Skewness Concept
1:10
Calculating Skewness
3:26
Calculating Skewness
3:27
Interpreting Skewness
7:36
Interpreting Skewness
7:37
Excel Example
8:49
Kurtosis Concept
20:29
Kurtosis Concept
20:30
Calculating Kurtosis
24:17
Calculating Kurtosis
24:18
Interpreting Kurtosis
29:01
Leptokurtic
29:35
Mesokurtic
30:10
Platykurtic
31:06
Excel Example
32:04
Example 1: Shape of Distribution
38:28
Example 2: Shape of Distribution
39:29
Example 3: Shape of Distribution
40:14
Example 4: Kurtosis
41:10
Normal Distribution

34m 33s

Intro
0:00
Roadmap
0:13
Roadmap
0:14
What is a Normal Distribution
0:44
The Normal Distribution As a Theoretical Model
0:45
Possible Range of Probabilities
3:05
Possible Range of Probabilities
3:06
What is a Normal Distribution
5:07
Can Be Described By
5:08
Properties
5:49
'Same' Shape: Illusion of Different Shape!
7:35
'Same' Shape: Illusion of Different Shape!
7:36
Types of Problems
13:45
Example: Distribution of SAT Scores
13:46
Shape Analogy
19:48
Shape Analogy
19:49
Example 1: The Standard Normal Distribution and Z-Scores
22:34
Example 2: The Standard Normal Distribution and Z-Scores
25:54
Example 3: Sketching and Normal Distribution
28:55
Example 4: Sketching and Normal Distribution
32:32
Standard Normal Distributions & Z-Scores

41m 44s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
A Family of Distributions
0:28
Infinite Set of Distributions
0:29
Transforming Normal Distributions to 'Standard' Normal Distribution
1:04
Normal Distribution vs. Standard Normal Distribution
2:58
Normal Distribution vs. Standard Normal Distribution
2:59
Z-Score, Raw Score, Mean, & SD
4:08
Z-Score, Raw Score, Mean, & SD
4:09
Weird Z-Scores
9:40
Weird Z-Scores
9:41
Excel
16:45
For Normal Distributions
16:46
For Standard Normal Distributions
19:11
Excel Example
20:24
Types of Problems
25:18
Percentage Problem: P(x)
25:19
Raw Score and Z-Score Problems
26:28
Standard Deviation Problems
27:01
Shape Analogy
27:44
Shape Analogy
27:45
Example 1: Deaths Due to Heart Disease vs. Deaths Due to Cancer
28:24
Example 2: Heights of Male College Students
33:15
Example 3: Mean and Standard Deviation
37:14
Example 4: Finding Percentage of Values in a Standard Normal Distribution
37:49
Normal Distribution: PDF vs. CDF

55m 44s

Intro
0:00
Roadmap
0:15
Roadmap
0:16
Frequency vs. Cumulative Frequency
0:56
Frequency vs. Cumulative Frequency
0:57
Frequency vs. Cumulative Frequency
4:32
Frequency vs. Cumulative Frequency Cont.
4:33
Calculus in Brief
6:21
Derivative-Integral Continuum
6:22
PDF
10:08
PDF for Standard Normal Distribution
10:09
PDF for Normal Distribution
14:32
Integral of PDF = CDF
21:27
Integral of PDF = CDF
21:28
Example 1: Cumulative Frequency Graph
23:31
Example 2: Mean, Standard Deviation, and Probability
24:43
Example 3: Mean and Standard Deviation
35:50
Example 4: Age of Cars
49:32
Section 5: Linear Regression
Scatterplots

47m 19s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
Previous Visualizations
0:30
Frequency Distributions
0:31
Compare & Contrast
2:26
Frequency Distributions Vs. Scatterplots
2:27
Summary Values
4:53
Shape
4:54
Center & Trend
6:41
Spread & Strength
8:22
Univariate & Bivariate
10:25
Example Scatterplot
10:48
Shape, Trend, and Strength
10:49
Positive and Negative Association
14:05
Positive and Negative Association
14:06
Linearity, Strength, and Consistency
18:30
Linearity
18:31
Strength
19:14
Consistency
20:40
Summarizing a Scatterplot
22:58
Summarizing a Scatterplot
22:59
Example 1: Gapminder.org, Income x Life Expectancy
26:32
Example 2: Gapminder.org, Income x Infant Mortality
36:12
Example 3: Trend and Strength of Variables
40:14
Example 4: Trend, Strength and Shape for Scatterplots
43:27
Regression

32m 2s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Linear Equations
0:34
Linear Equations: y = mx + b
0:35
Rough Line
5:16
Rough Line
5:17
Regression - A 'Center' Line
7:41
Reasons for Summarizing with a Regression Line
7:42
Predictor and Response Variable
10:04
Goal of Regression
12:29
Goal of Regression
12:30
Prediction
14:50
Example: Servings of Milk Per Year Shown By Age
14:51
Interpolation
17:06
Extrapolation
17:58
Error in Prediction
20:34
Prediction Error
20:35
Residual
21:40
Example 1: Residual
23:34
Example 2: Large and Negative Residual
26:30
Example 3: Positive Residual
28:13
Example 4: Interpret Regression Line & Extrapolate
29:40
Least Squares Regression

56m 36s

Intro
0:00
Roadmap
0:13
Roadmap
0:14
Best Fit
0:47
Best Fit
0:48
Sum of Squared Errors (SSE)
1:50
Sum of Squared Errors (SSE)
1:51
Why Squared?
3:38
Why Squared?
3:39
Quantitative Properties of Regression Line
4:51
Quantitative Properties of Regression Line
4:52
So How do we Find Such a Line?
6:49
SSEs of Different Line Equations & Lowest SSE
6:50
Carl Gauss' Method
8:01
How Do We Find Slope (b1)
11:00
How Do We Find Slope (b1)
11:01
How Do We Find Intercept
15:11
How Do We Find Intercept
15:12
Example 1: Which of These Equations Fit the Above Data Best?
17:18
Example 2: Find the Regression Line for These Data Points and Interpret It
26:31
Example 3: Summarize the Scatterplot and Find the Regression Line.
34:31
Example 4: Examine the Mean of Residuals
43:52
Correlation

43m 58s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Summarizing a Scatterplot Quantitatively
0:47
Shape
0:48
Trend
1:11
Strength: Correlation (r)
1:45
Correlation Coefficient ( r )
2:30
Correlation Coefficient ( r )
2:31
Trees vs. Forest
11:59
Trees vs. Forest
12:00
Calculating r
15:07
Average Product of z-scores for x and y
15:08
Relationship between Correlation and Slope
21:10
Relationship between Correlation and Slope
21:11
Example 1: Find the Correlation between Grams of Fat and Cost
24:11
Example 2: Relationship between r and b1
30:24
Example 3: Find the Regression Line
33:35
Example 4: Find the Correlation Coefficient for this Set of Data
37:37
Correlation: r vs. r-squared

52m 52s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
R-squared
0:44
What is the Meaning of It? Why Squared?
0:45
Parsing Sum of Squares (Parsing Variability)
2:25
SST = SSR + SSE
2:26
What is SST and SSE?
7:46
What is SST and SSE?
7:47
r-squared
18:33
Coefficient of Determination
18:34
If the Correlation is Strong…
20:25
If the Correlation is Strong…
20:26
If the Correlation is Weak…
22:36
If the Correlation is Weak…
22:37
Example 1: Find r-squared for this Set of Data
23:56
Example 2: What Does it Mean that the Simple Linear Regression is a 'Model' of Variance?
33:54
Example 3: Why Does r-squared Only Range from 0 to 1
37:29
Example 4: Find the r-squared for This Set of Data
39:55
Transformations of Data

27m 8s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Why Transform?
0:26
Why Transform?
0:27
Shape-preserving vs. Shape-changing Transformations
5:14
Shape-preserving = Linear Transformations
5:15
Shape-changing Transformations = Non-linear Transformations
6:20
Common Shape-Preserving Transformations
7:08
Common Shape-Preserving Transformations
7:09
Common Shape-Changing Transformations
8:59
Powers
9:00
Logarithms
9:39
Change Just One Variable? Both?
10:38
Log-log Transformations
10:39
Log Transformations
14:38
Example 1: Create, Graph, and Transform the Data Set
15:19
Example 2: Create, Graph, and Transform the Data Set
20:08
Example 3: What Kind of Model would You Choose for this Data?
22:44
Example 4: Transformation of Data
25:46
Section 6: Collecting Data in an Experiment
Sampling & Bias

54m 44s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Descriptive vs. Inferential Statistics
1:04
Descriptive Statistics: Data Exploration
1:05
Example
2:03
To tackle Generalization…
4:31
Generalization
4:32
Sampling
6:06
'Good' Sample
6:40
Defining Samples and Populations
8:55
Population
8:56
Sample
11:16
Why Use Sampling?
13:09
Why Use Sampling?
13:10
Goal of Sampling: Avoiding Bias
15:04
What is Bias?
15:05
Where does Bias Come from: Sampling Bias
17:53
Where does Bias Come from: Response Bias
18:27
Sampling Bias: Bias from 'Bad' Sampling Methods
19:34
Size Bias
19:35
Voluntary Response Bias
21:13
Convenience Sample
22:22
Judgment Sample
23:58
Inadequate Sample Frame
25:40
Response Bias: Bias from 'Bad' Data Collection Methods
28:00
Nonresponse Bias
29:31
Questionnaire Bias
31:10
Incorrect Response or Measurement Bias
37:32
Example 1: What Kind of Biases?
40:29
Example 2: What Biases Might Arise?
44:46
Example 3: What Kind of Biases?
48:34
Example 4: What Kind of Biases?
51:43
Sampling Methods

14m 25s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Biased vs. Unbiased Sampling Methods
0:32
Biased Sampling
0:33
Unbiased Sampling
1:13
Probability Sampling Methods
2:31
Simple Random
2:54
Stratified Random Sampling
4:06
Cluster Sampling
5:24
Two-stage Sampling
6:22
Systematic Sampling
7:25
Example 1: Which Type(s) of Sampling was this?
8:33
Example 2: Describe How to Take a Two-Stage Sample from this Book
10:16
Example 3: Sampling Methods
11:58
Example 4: Cluster Sample Plan
12:48
Research Design

53m 54s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Descriptive vs. Inferential Statistics
0:51
Descriptive Statistics: Data Exploration
0:52
Inferential Statistics
1:02
Variables and Relationships
1:44
Variables
1:45
Relationships
2:49
Not Every Type of Study is an Experiment…
4:16
Category I - Descriptive Study
4:54
Category II - Correlational Study
5:50
Category III - Experimental, Quasi-experimental, Non-experimental
6:33
Category III
7:42
Experimental, Quasi-experimental, and Non-experimental
7:43
Why CAN'T the Other Strategies Determine Causation?
10:18
Third-variable Problem
10:19
Directionality Problem
15:49
What Makes Experiments Special?
17:54
Manipulation
17:55
Control (and Comparison)
21:58
Methods of Control
26:38
Holding Constant
26:39
Matching
29:11
Random Assignment
31:48
Experiment Terminology
34:09
'True' Experiment vs. Study
34:10
Independent Variable (IV)
35:16
Dependent Variable (DV)
35:45
Factors
36:07
Treatment Conditions
36:23
Levels
37:43
Confounds or Extraneous Variables
38:04
Blind
38:38
Blind Experiments
38:39
Double-blind Experiments
39:29
How Categories Relate to Statistics
41:35
Category I - Descriptive Study
41:36
Category II - Correlational Study
42:05
Category III - Experimental, Quasi-experimental, Non-experimental
42:43
Example 1: Research Design
43:50
Example 2: Research Design
47:37
Example 3: Research Design
50:12
Example 4: Research Design
52:00
Between and Within Treatment Variability

41m 31s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Experimental Designs
0:51
Experimental Designs: Manipulation & Control
0:52
Two Types of Variability
2:09
Between Treatment Variability
2:10
Within Treatment Variability
3:31
Updated Goal of Experimental Design
5:47
Updated Goal of Experimental Design
5:48
Example: Drugs and Driving
6:56
Example: Drugs and Driving
6:57
Different Types of Random Assignment
11:27
All Experiments
11:28
Completely Random Design
12:02
Randomized Block Design
13:19
Randomized Block Design
15:48
Matched Pairs Design
15:49
Repeated Measures Design
19:47
Between-subject Variable vs. Within-subject Variable
22:43
Completely Randomized Design
22:44
Repeated Measures Design
25:03
Example 1: Design a Completely Random, Matched Pair, and Repeated Measures Experiment
26:16
Example 2: Block Design
31:41
Example 3: Completely Randomized Designs
35:11
Example 4: Completely Random, Matched Pairs, or Repeated Measures Experiments?
39:01
Section 7: Review of Probability Axioms
Sample Spaces

37m 52s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
Why is Probability Involved in Statistics
0:48
Probability
0:49
Can People Tell the Difference between Cheap and Gourmet Coffee?
2:08
Taste Test with Coffee Drinkers
3:37
If No One can Actually Taste the Difference
3:38
If Everyone can Actually Taste the Difference
5:36
Creating a Probability Model
7:09
Creating a Probability Model
7:10
D'Alembert vs. Necker
9:41
D'Alembert vs. Necker
9:42
Problem with D'Alembert's Model
13:29
Problem with D'Alembert's Model
13:30
Covering Entire Sample Space
15:08
Fundamental Principle of Counting
15:09
Where Do Probabilities Come From?
22:54
Observed Data, Symmetry, and Subjective Estimates
22:55
Checking whether Model Matches Real World
24:27
Law of Large Numbers
24:28
Example 1: Law of Large Numbers
27:46
Example 2: Possible Outcomes
30:43
Example 3: Brands of Coffee and Taste
33:25
Example 4: How Many Different Treatments are there?
35:33
Addition Rule for Disjoint Events

20m 29s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Disjoint Events
0:41
Disjoint Events
0:42
Meaning of 'or'
2:39
In Regular Life
2:40
In Math/Statistics/Computer Science
3:10
Addition Rule for Disjoint Events
3:55
If A and B are Disjoint: P (A and B)
3:56
If A and B are Disjoint: P (A or B)
5:15
General Addition Rule
5:41
General Addition Rule
5:42
Generalized Addition Rule
8:31
If A and B are not Disjoint: P (A or B)
8:32
Example 1: Which of These are Mutually Exclusive?
10:50
Example 2: What is the Probability that You will Have a Combination of One Heads and Two Tails?
12:57
Example 3: Engagement Party
15:17
Example 4: Home Owner's Insurance
18:30
Conditional Probability

57m 19s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
'or' vs. 'and' vs. Conditional Probability
1:07
'or' vs. 'and' vs. Conditional Probability
1:08
'and' vs. Conditional Probability
5:57
P (M or L)
5:58
P (M and L)
8:41
P (M|L)
11:04
P (L|M)
12:24
Tree Diagram
15:02
Tree Diagram
15:03
Defining Conditional Probability
22:42
Defining Conditional Probability
22:43
Common Contexts for Conditional Probability
30:56
Medical Testing: Positive Predictive Value
30:57
Medical Testing: Sensitivity
33:03
Statistical Tests
34:27
Example 1: Drug and Disease
36:41
Example 2: Marbles and Conditional Probability
40:04
Example 3: Cards and Conditional Probability
45:59
Example 4: Votes and Conditional Probability
50:21
Independent Events

24m 27s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Independent Events & Conditional Probability
0:26
Non-independent Events
0:27
Independent Events
2:00
Non-independent and Independent Events
3:08
Non-independent and Independent Events
3:09
Defining Independent Events
5:52
Defining Independent Events
5:53
Multiplication Rule
7:29
Previously…
7:30
But with Independent Events
8:53
Example 1: Which of These Pairs of Events are Independent?
11:12
Example 2: Health Insurance and Probability
15:12
Example 3: Independent Events
17:42
Example 4: Independent Events
20:03
Section 8: Probability Distributions
Introduction to Probability Distributions

56m 45s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Sampling vs. Probability
0:57
Sampling
0:58
Missing
1:30
What is Missing?
3:06
Insight: Probability Distributions
5:26
Insight: Probability Distributions
5:27
What is a Probability Distribution?
7:29
From Sample Spaces to Probability Distributions
8:44
Sample Space
8:45
Probability Distribution of the Sum of Two Dice
11:16
The Random Variable
17:43
The Random Variable
17:44
Expected Value
21:52
Expected Value
21:53
Example 1: Probability Distributions
28:45
Example 2: Probability Distributions
35:30
Example 3: Probability Distributions
43:37
Example 4: Probability Distributions
47:20
Expected Value & Variance of Probability Distributions

53m 41s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Discrete vs. Continuous Random Variables
1:04
Discrete vs. Continuous Random Variables
1:05
Mean and Variance Review
4:44
Mean: Sample, Population, and Probability Distribution
4:45
Variance: Sample, Population, and Probability Distribution
9:12
Example Situation
14:10
Example Situation
14:11
Some Special Cases…
16:13
Some Special Cases…
16:14
Linear Transformations
19:22
Linear Transformations
19:23
What Happens to Mean and Variance of the Probability Distribution?
20:12
n Independent Values of X
25:38
n Independent Values of X
25:39
Compare These Two Situations
30:56
Compare These Two Situations
30:57
Two Random Variables, X and Y
32:02
Two Random Variables, X and Y
32:03
Example 1: Expected Value & Variance of Probability Distributions
35:35
Example 2: Expected Values & Standard Deviation
44:17
Example 3: Expected Winnings and Standard Deviation
48:18
Binomial Distribution

55m 15s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Discrete Probability Distributions
1:42
Discrete Probability Distributions
1:43
Binomial Distribution
2:36
Binomial Distribution
2:37
Multiplicative Rule Review
6:54
Multiplicative Rule Review
6:55
How Many Outcomes with k 'Successes'
10:23
Adults and Bachelor's Degree: Manual List of Outcomes
10:24
P (X=k)
19:37
Putting Together # of Outcomes with the Multiplicative Rule
19:38
Expected Value and Standard Deviation in a Binomial Distribution
25:22
Expected Value and Standard Deviation in a Binomial Distribution
25:23
Example 1: Coin Toss
33:42
Example 2: College Graduates
38:03
Example 3: Types of Blood and Probability
45:39
Example 4: Expected Number and Standard Deviation
51:11
Section 9: Sampling Distributions of Statistics
Introduction to Sampling Distributions

48m 17s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Probability Distributions vs. Sampling Distributions
0:55
Probability Distributions vs. Sampling Distributions
0:56
Same Logic
3:55
Logic of Probability Distribution
3:56
Example: Rolling Two Dice
6:56
Simulating Samples
9:53
To Come Up with Probability Distributions
9:54
In Sampling Distributions
11:12
Connecting Sampling and Research Methods with Sampling Distributions
12:11
Connecting Sampling and Research Methods with Sampling Distributions
12:12
Simulating a Sampling Distribution
14:14
Experimental Design: Regular Sleep vs. Less Sleep
14:15
Logic of Sampling Distributions
23:08
Logic of Sampling Distributions
23:09
General Method of Simulating Sampling Distributions
25:38
General Method of Simulating Sampling Distributions
25:39
Questions that Remain
28:45
Questions that Remain
28:46
Example 1: Mean and Standard Error of Sampling Distribution
30:57
Example 2: What is the Best Way to Describe Sampling Distributions?
37:12
Example 3: Matching Sampling Distributions
38:21
Example 4: Mean and Standard Error of Sampling Distribution
41:51
Sampling Distribution of the Mean

1h 8m 48s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Special Case of General Method for Simulating a Sampling Distribution
1:53
Special Case of General Method for Simulating a Sampling Distribution
1:54
Computer Simulation
3:43
Using Simulations to See Principles behind Shape of SDoM
15:50
Using Simulations to See Principles behind Shape of SDoM
15:51
Conditions
17:38
Using Simulations to See Principles behind Center (Mean) of SDoM
20:15
Using Simulations to See Principles behind Center (Mean) of SDoM
20:16
Conditions: Does n Matter?
21:31
Conditions: Does Number of Simulation Matter?
24:37
Using Simulations to See Principles behind Standard Deviation of SDoM
27:13
Using Simulations to See Principles behind Standard Deviation of SDoM
27:14
Conditions: Does n Matter?
34:45
Conditions: Does Number of Simulation Matter?
36:24
Central Limit Theorem
37:13
SHAPE
38:08
CENTER
39:34
SPREAD
39:52
Comparing Population, Sample, and SDoM
43:10
Comparing Population, Sample, and SDoM
43:11
Answering the 'Questions that Remain'
48:24
What Happens When We Don't Know What the Population Looks Like?
48:25
Can We Have Sampling Distributions for Summary Statistics Other than the Mean?
49:42
How Do We Know whether a Sample is Sufficiently Unlikely?
53:36
Do We Always Have to Simulate a Large Number of Samples in Order to get a Sampling Distribution?
54:40
Example 1: Mean Batting Average
55:25
Example 2: Mean Sampling Distribution and Standard Error
59:07
Example 3: Sampling Distribution of the Mean
1:01:04
Sampling Distribution of Sample Proportions

54m 37s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Intro to Sampling Distribution of Sample Proportions (SDoSP)
0:51
Categorical Data (Examples)
0:52
Wish to Estimate Proportion of Population from Sample…
2:00
Notation
3:34
Population Proportion and Sample Proportion Notations
3:35
What's the Difference?
9:19
SDoM vs. SDoSP: Type of Data
9:20
SDoM vs. SDoSP: Shape
11:24
SDoM vs. SDoSP: Center
12:30
SDoM vs. SDoSP: Spread
15:34
Binomial Distribution vs. Sampling Distribution of Sample Proportions
19:14
Binomial Distribution vs. SDoSP: Type of Data
19:17
Binomial Distribution vs. SDoSP: Shape
21:07
Binomial Distribution vs. SDoSP: Center
21:43
Binomial Distribution vs. SDoSP: Spread
24:08
Example 1: Sampling Distribution of Sample Proportions
26:07
Example 2: Sampling Distribution of Sample Proportions
37:58
Example 3: Sampling Distribution of Sample Proportions
44:42
Example 4: Sampling Distribution of Sample Proportions
45:57
Section 10: Inferential Statistics
Introduction to Confidence Intervals

42m 53s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Inferential Statistics
0:50
Inferential Statistics
0:51
Two Problems with This Picture…
3:20
Two Problems with This Picture…
3:21
Solution: Confidence Intervals (CI)
4:59
Solution: Hypothesis Testing (HT)
5:49
Which Parameters are Known?
6:45
Which Parameters are Known?
6:46
Confidence Interval - Goal
7:56
When We Don't Know μ but Know σ
7:57
When We Don't Know
18:27
When We Don't Know μ nor σ
18:28
Example 1: Confidence Intervals
26:18
Example 2: Confidence Intervals
29:46
Example 3: Confidence Intervals
32:18
Example 4: Confidence Intervals
38:31
t Distributions

1h 2m 6s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
When to Use z vs. t?
1:07
When to Use z vs. t?
1:08
What is z and t?
3:02
z-score and t-score: Commonality
3:03
z-score and t-score: Formulas
3:34
z-score and t-score: Difference
5:22
Why not z? (Why t?)
7:24
Why not z? (Why t?)
7:25
But Don't Worry!
15:13
Gosset and t-distributions
15:14
Rules of t Distributions
17:05
t-distributions are More Normal as n Gets Bigger
17:06
t-distributions are a Family of Distributions
18:55
Degrees of Freedom (df)
20:02
Degrees of Freedom (df)
20:03
t Family of Distributions
24:07
t Family of Distributions: df = 2, 4, and 60
24:08
df = 60
29:16
df = 2
29:59
How to Find It?
31:01
'Student's t-distribution' or 't-distribution'
31:02
Excel Example
33:06
Example 1: Which Distribution Do You Use? Z or t?
45:26
Example 2: Friends on Facebook
47:41
Example 3: t Distributions
52:15
Example 4: t Distributions, Confidence Interval, and Mean
55:59
Introduction to Hypothesis Testing

1h 6m 33s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Issues to Overcome in Inferential Statistics
1:35
Issues to Overcome in Inferential Statistics
1:36
What Happens When We Don't Know What the Population Looks Like?
2:57
How Do We Know whether a Sample is Sufficiently Unlikely?
3:43
Hypothesizing a Population
6:44
Hypothesizing a Population
6:45
Null Hypothesis
8:07
Alternative Hypothesis
8:56
Hypotheses
11:58
Hypotheses
11:59
Errors in Hypothesis Testing
14:22
Errors in Hypothesis Testing
14:23
Steps of Hypothesis Testing
21:15
Steps of Hypothesis Testing
21:16
Single Sample HT (When Sigma Available)
26:08
Example: Average Facebook Friends
26:09
Step 1
27:08
Step 2
27:58
Step 3
28:17
Step 4
32:18
Single Sample HT (When Sigma Not Available)
36:33
Example: Average Facebook Friends
36:34
Step 1: Hypothesis Testing
36:58
Step 2: Significance Level
37:25
Step 3: Decision Stage
37:40
Step 4: Sample
41:36
Sigma and p-value
45:04
Sigma and p-value
45:05
One-Tailed vs. Two-Tailed Hypotheses
45:51
Example 1: Hypothesis Testing
48:37
Example 2: Heights of Women in the US
57:43
Example 3: Select the Best Way to Complete This Sentence
1:03:23
Confidence Intervals for the Difference of Two Independent Means

55m 14s

Intro
0:00
Roadmap
0:14
Roadmap
0:15
One Mean vs. Two Means
1:17
One Mean vs. Two Means
1:18
Notation
2:41
A Sample! A Set!
2:42
Mean of X, Mean of Y, and Difference of Two Means
3:56
SE of X
4:34
SE of Y
6:28
Sampling Distribution of the Difference between Two Means (SDoD)
7:48
Sampling Distribution of the Difference between Two Means (SDoD)
7:49
Rules of the SDoD (similar to CLT!)
15:00
Mean for the SDoD Null Hypothesis
15:01
Standard Error
17:39
When can We Construct a CI for the Difference between Two Means?
21:28
Three Conditions
21:29
Finding CI
23:56
One Mean CI
23:57
Two Means CI
25:45
Finding t
29:16
Finding t
29:17
Interpreting CI
30:25
Interpreting CI
30:26
Better Estimate of s (s pool)
34:15
Better Estimate of s (s pool)
34:16
Example 1: Confidence Intervals
42:32
Example 2: SE of the Difference
52:36
Hypothesis Testing for the Difference of Two Independent Means

50m

Intro
0:00
Roadmap
0:06
Roadmap
0:07
The Goal of Hypothesis Testing
0:56
One Sample and Two Samples
0:57
Sampling Distribution of the Difference between Two Means (SDoD)
3:42
Sampling Distribution of the Difference between Two Means (SDoD)
3:43
Rules of the SDoD (Similar to CLT!)
6:46
Shape
6:47
Mean for the Null Hypothesis
7:26
Standard Error for Independent Samples (When Variance is Homogenous)
8:18
Standard Error for Independent Samples (When Variance is not Homogenous)
9:25
Same Conditions for HT as for CI
10:08
Three Conditions
10:09
Steps of Hypothesis Testing
11:04
Steps of Hypothesis Testing
11:05
Formulas that Go with Steps of Hypothesis Testing
13:21
Step 1
13:25
Step 2
14:18
Step 3
15:00
Step 4
16:57
Example 1: Hypothesis Testing for the Difference of Two Independent Means
18:47
Example 2: Hypothesis Testing for the Difference of Two Independent Means
33:55
Example 3: Hypothesis Testing for the Difference of Two Independent Means
44:22
Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

1h 14m 11s

Intro
0:00
Roadmap
0:09
Roadmap
0:10
The Goal of Hypothesis Testing
1:27
One Sample and Two Samples
1:28
Independent Samples vs. Paired Samples
3:16
Independent Samples vs. Paired Samples
3:17
Which is Which?
5:20
Independent SAMPLES vs. Independent VARIABLES
7:43
Independent SAMPLES vs. Independent VARIABLES
7:44
T-tests Always…
10:48
T-tests Always…
10:49
Notation for Paired Samples
12:59
Notation for Paired Samples
13:00
Steps of Hypothesis Testing for Paired Samples
16:13
Steps of Hypothesis Testing for Paired Samples
16:14
Rules of the SDoD (Adding on Paired Samples)
18:03
Shape
18:04
Mean for the Null Hypothesis
18:31
Standard Error for Independent Samples (When Variance is Homogenous)
19:25
Standard Error for Paired Samples
20:39
Formulas that go with Steps of Hypothesis Testing
22:59
Formulas that go with Steps of Hypothesis Testing
23:00
Confidence Intervals for Paired Samples
30:32
Confidence Intervals for Paired Samples
30:33
Example 1: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
32:28
Example 2: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
44:02
Example 3: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
52:23
Type I and Type II Errors

31m 27s

Intro
0:00
Roadmap
0:18
Roadmap
0:19
Errors and Relationship to HT and the Sample Statistic?
1:11
Errors and Relationship to HT and the Sample Statistic?
1:12
Instead of a Box…Distributions!
7:00
One Sample t-test: Friends on Facebook
7:01
Two Sample t-test: Friends on Facebook
13:46
Usually, Lots of Overlap between Null and Alternative Distributions
16:59
Overlap between Null and Alternative Distributions
17:00
How Distributions and 'Box' Fit Together
22:45
How Distributions and 'Box' Fit Together
22:46
Example 1: Types of Errors
25:54
Example 2: Types of Errors
27:30
Example 3: What is the Danger of the Type I Error?
29:38
Effect Size & Power

44m 41s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Distance between Distributions: Sample t
0:49
Distance between Distributions: Sample t
0:50
Problem with Distance in Terms of Standard Error
2:56
Problem with Distance in Terms of Standard Error
2:57
Test Statistic (t) vs. Effect Size (d or g)
4:38
Test Statistic (t) vs. Effect Size (d or g)
4:39
Rules of Effect Size
6:09
Rules of Effect Size
6:10
Why Do We Need Effect Size?
8:21
Tells You the Practical Significance
8:22
HT can be Deceiving…
10:25
Important Note
10:42
What is Power?
11:20
What is Power?
11:21
Why Do We Need Power?
14:19
Conditional Probability and Power
14:20
Power is:
16:27
Can We Calculate Power?
19:00
Can We Calculate Power?
19:01
How Does Alpha Affect Power?
20:36
How Does Alpha Affect Power?
20:37
How Does Effect Size Affect Power?
25:38
How Does Effect Size Affect Power?
25:39
How Does Variability and Sample Size Affect Power?
27:56
How Does Variability and Sample Size Affect Power?
27:57
How Do We Increase Power?
32:47
Increasing Power
32:48
Example 1: Effect Size & Power
35:40
Example 2: Effect Size & Power
37:38
Example 3: Effect Size & Power
40:55
Section 11: Analysis of Variance
F-distributions

24m 46s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
Z- & T-statistic and Their Distribution
0:34
Z- & T-statistic and Their Distribution
0:35
F-statistic
4:55
The F Ratio (the Variance Ratio)
4:56
F-distribution
12:29
F-distribution
12:30
s and p-value
15:00
s and p-value
15:01
Example 1: Why Does F-distribution Stop At 0 But Go On Until Infinity?
18:33
Example 2: F-distributions
19:29
Example 3: F-distributions and Heights
21:29
ANOVA with Independent Samples

1h 9m 25s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
The Limitations of t-tests
1:12
The Limitations of t-tests
1:13
Two Major Limitations of Many t-tests
3:26
Two Major Limitations of Many t-tests
3:27
Ronald Fisher's Solution… F-test! New Null Hypothesis
4:43
Ronald Fisher's Solution… F-test! New Null Hypothesis (Omnibus Test - One Test to Rule Them All!)
4:44
Analysis of Variance (ANoVA) Notation
7:47
Analysis of Variance (ANoVA) Notation
7:48
Partitioning (Analyzing) Variance
9:58
Total Variance
9:59
Within-group Variation
14:00
Between-group Variation
16:22
Time out: Review Variance & SS
17:05
Time out: Review Variance & SS
17:06
F-statistic
19:22
The F Ratio (the Variance Ratio)
19:23
S²bet = SSbet / dfbet
22:13
What is This?
22:14
How Many Means?
23:20
So What is the dfbet?
23:38
So What is SSbet?
24:15
S²w = SSw / dfw
26:05
What is This?
26:06
How Many Means?
27:20
So What is the dfw?
27:36
So What is SSw?
28:18
Chart of Independent Samples ANOVA
29:25
Chart of Independent Samples ANOVA
29:26
Example 1: Who Uploads More Photos: Unknown Ethnicity, Latino, Asian, Black, or White Facebook Users?
35:52
Hypotheses
35:53
Significance Level
39:40
Decision Stage
40:05
Calculate Samples' Statistic and p-Value
44:10
Reject or Fail to Reject H0
55:54
Example 2: ANOVA with Independent Samples
58:21
Repeated Measures ANOVA

1h 15m 13s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
The Limitations of t-tests
0:36
Who Uploads more Pictures and Which Photo-Type is Most Frequently Used on Facebook?
0:37
ANOVA (F-test) to the Rescue!
5:49
Omnibus Hypothesis
5:50
Analyze Variance
7:27
Independent Samples vs. Repeated Measures
9:12
Same Start
9:13
Independent Samples ANOVA
10:43
Repeated Measures ANOVA
12:00
Independent Samples ANOVA
16:00
Same Start: All the Variance Around Grand Mean
16:01
Independent Samples
16:23
Repeated Measures ANOVA
18:18
Same Start: All the Variance Around Grand Mean
18:19
Repeated Measures
18:33
Repeated Measures F-statistic
21:22
The F Ratio (The Variance Ratio)
21:23
S²bet = SSbet / dfbet
23:07
What is This?
23:08
How Many Means?
23:39
So What is the dfbet?
23:54
So What is SSbet?
24:32
S² resid = SS resid / df resid
25:46
What is This?
25:47
So What is SS resid?
26:44
So What is the df resid?
27:36
SS subj and df subj
28:11
What is This?
28:12
How Many Subject Means?
29:43
So What is df subj?
30:01
So What is SS subj?
30:09
SS total and df total
31:42
What is This?
31:43
What is the Total Number of Data Points?
32:02
So What is df total?
32:34
So What is SS total?
32:47
Chart of Repeated Measures ANOVA
33:19
Chart of Repeated Measures ANOVA: F and Between-samples Variability
33:20
Chart of Repeated Measures ANOVA: Total Variability, Within-subject (case) Variability, Residual Variability
35:50
Example 1: Which is More Prevalent on Facebook: Tagged, Uploaded, Mobile, or Profile Photos?
40:25
Hypotheses
40:26
Significance Level
41:46
Decision Stage
42:09
Calculate Samples' Statistic and p-Value
46:18
Reject or Fail to Reject H0
57:55
Example 2: Repeated Measures ANOVA
58:57
Example 3: What's the Problem with a Bunch of Tiny t-tests?
1:13:59
Section 12: Chi-square Test
Chi-Square Goodness-of-Fit Test

58m 23s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Where Does the Chi-Square Test Belong?
0:50
Where Does the Chi-Square Test Belong?
0:51
A New Twist on HT: Goodness-of-Fit
7:23
HT in General
7:24
Goodness-of-Fit HT
8:26
Hypotheses about Proportions
12:17
Null Hypothesis
12:18
Alternative Hypothesis
13:23
Example
14:38
Chi-Square Statistic
17:52
Chi-Square Statistic
17:53
Chi-Square Distributions
24:31
Chi-Square Distributions
24:32
Conditions for Chi-Square
28:58
Condition 1
28:59
Condition 2
30:20
Condition 3
30:32
Condition 4
31:47
Example 1: Chi-Square Goodness-of-Fit Test
32:23
Example 2: Chi-Square Goodness-of-Fit Test
44:34
Example 3: Which of These Statements Describe Properties of the Chi-Square Goodness-of-Fit Test?
56:06
Chi-Square Test of Homogeneity

51m 36s

Intro
0:00
Roadmap
0:09
Roadmap
0:10
Goodness-of-Fit vs. Homogeneity
1:13
Goodness-of-Fit HT
1:14
Homogeneity
2:00
Analogy
2:38
Hypotheses About Proportions
5:00
Null Hypothesis
5:01
Alternative Hypothesis
6:11
Example
6:33
Chi-Square Statistic
10:12
Same as Goodness-of-Fit Test
10:13
Set Up Data
12:28
Setting Up Data Example
12:29
Expected Frequency
16:53
Expected Frequency
16:54
Chi-Square Distributions & df
19:26
Chi-Square Distributions & df
19:27
Conditions for Test of Homogeneity
20:54
Condition 1
20:55
Condition 2
21:39
Condition 3
22:05
Condition 4
22:23
Example 1: Chi-Square Test of Homogeneity
22:52
Example 2: Chi-Square Test of Homogeneity
32:10
Section 13: Overview of Statistics
Overview of Statistics

18m 11s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
The Statistical Tests (HT) We've Covered
0:28
The Statistical Tests (HT) We've Covered
0:29
Organizing the Tests We've Covered…
1:08
One Sample: Continuous DV and Categorical DV
1:09
Two Samples: Continuous DV and Categorical DV
5:41
More Than Two Samples: Continuous DV and Categorical DV
8:21
The Following Data: OK Cupid
10:10
The Following Data: OK Cupid
10:11
Example 1: Weird-MySpace-Angle Profile Photo
10:38
Example 2: Geniuses
12:30
Example 3: Promiscuous iPhone Users
13:37
Example 4: Women, Aging, and Messaging
16:07

Lecture Comments (8)

0 answers

Post by tonetvideo on July 29, 2022

This is beyond perfection. He uses simple language to explain. Bravo.

0 answers

Post by Thomas Lyles on January 25, 2021

this lesson is crap.  yikes.
I especially like how there are no responses to any of the comments in the discussion forum.
Could you maybe get some decent teachers?
Shame.

0 answers

Post by sepehr zarrin on October 18, 2013

You should've used more examples.....

0 answers

Post by Manoj Joseph on May 1, 2013

do i need to learn basic of alograthim to understand the transformation formula?

0 answers

Post by Manoj Joseph on May 1, 2013

its bit more complex.On top of that the video is taking time in buffering and I am suffering

0 answers

Post by Kambiz Khosrowshahi on March 27, 2013

My apologies about previous comment (frustrated). You actually explained everything quite well, I still wish you had more examples...

0 answers

Post by Jeff Keith on January 22, 2013

You should have more examples these are hard to understand.

0 answers

Post by Tomer Eiges on March 27, 2012

At 9:24 you spelled wear as "where"

Variability

  • Intro 0:00
  • Roadmap 0:05
    • Roadmap
  • Variability (or Spread) 0:45
    • Variability (or Spread)
  • Things to Think About 5:45
    • Things to Think About
  • Range, Quartiles and Interquartile Range 6:37
    • Range
    • Interquartile Range
  • Interquartile Range Example 10:58
    • Interquartile Range Example
  • Variance and Standard Deviation 12:27
    • Deviations
    • Sum of Squares
    • Variance
    • Standard Deviation
  • Sum of Squares (SS) 18:34
    • Sum of Squares (SS)
  • Population vs. Sample SD 22:00
    • Population vs. Sample SD
  • Population vs. Sample 23:20
    • Mean
    • SD
  • Example 1: Find the Mean and Standard Deviation of the Variable Friends in the Excel File 27:21
  • Example 2: Find the Mean and Standard Deviation of the Tagged Photos in the Excel File 35:25
  • Example 3: Sum of Squares 38:58
  • Example 4: Standard Deviation 41:48

Transcription: Variability

Hi and welcome to www.educator.com.0000

Today we are going to be talking about variability.0002

We are going to start off with just a conceptual introduction to the different kinds of ways that you could measure variability.0008

Then we are going to be talking about range, quartiles, and interquartile range.0014

We are going to be talking about variance and standard deviation.0019

In particular, we are going to focus a little bit on the concept of sum of squares.0023

We are going to be talking about population standard deviation versus sample standard deviation and talk about the differences in their formulas.0031

We are going to calculate standard deviation in Excel.0041

Let us get started.0044

Let us think about a conceptual way of thinking about variability.0048

There are a lot of different ways that you could actually think about variability.0055

For instance, let me give you this example.0059

Let us say this x right here, shown in each of these, is President Barack Obama.0061

Let us say that this is the president and these are different groups of people that are standing within a formal event.0074

Here we see the Secret Service and this is how far each of them is from him.0088

Here we see the supreme court justices and they are scattered around him.0093

Here are his cabinet members that he has appointed and they are scattered around him.0100

Here are the Tea Party senators.0105

Let us just say that they are the senators that do not like the president as much.0108

They seem to be huddled over here.0114

Which of these groups of people are most spread out from the president?0119

Which of these groups of people are closest to him?0126

Who is closest to the president?0129

Can we describe that with a number?0133

There are a couple of ways that you might want to think about it.0137

One way might be to just look at the farthest person away from the president in each of these sets.0139

Maybe for this it is this guy or this guy and get that distance, maybe that is the distance that you need.0151

For this, it is maybe this guy or this guy.0158

Maybe here it is that guy over there.0162

Maybe here it is this guy, maybe that guy, they seem pretty distant.0166

I think that guy is a little bit farther.0172

Just looking at the farthest person in the group, that is one way of looking at it.0174

In that case, it does not matter how many people in the group you have.0179

This group has fewer people than this group, but it would not matter if we are just looking at the one farthest guy in the group.0184

That is one way of looking at it.0193

Another way of looking at it is creating a little boundary and saying how many people are in that boundary.0194

Maybe we have this little square around the president and we just look at how many people are in that square.0203

Maybe for here if we draw a square like that, how many people fall in that square?0208

If that was our measure we would say this group is the closest to the president. Right?0226

Here we have 1 person in this square and none of the other groups have any people in the square.0236

Maybe we could look at different types of squares and see if that changes anything.0239

That may be one way of doing it.0245

Another way of doing it might be to find the area of the border.0247

That is another way of doing it.0260

That one does not seem to be a very good model because it would mean that these people are the closest to the president, but this is an odd group.0270

They are close to each other but not necessarily close to the president.0280

Should that matter in a measure of variability?0285

That is another thing to think about.0289

The one that probably comes to your mind is the idea of the average distance of all these guys away from the president.0291

Who has the closest average distance?0303

We also would not need to worry about how many are in the group because we divide by the number of people in the group.0309

It actually would not matter if they are close to each other or not, we just care about the distance to the president.0316

These are different ways that you could think about variability.0325

Notice that they are all ways of sticking a number on this concept of variability but you might come up with different numbers.0328

You might come up with different definitions for what it means to be spread out versus very close.0337

There are some things to think about, should we be measuring how far they are from the center or how far they are from each other?0347

Center is going to be an important concept in variability, so should we measure it from the median, mode, or mean?0357

Does it matter if this group has few or many members?0366

Should that be taken into account?0369

Does it matter what direction away from the president or from that center point if it is to the right or to the left, up or down?0372

What about consistent clustering?0380

Should that matter?0382

Those are some things to think about when we think about a measure of variability.0383

There are lots of different kinds of measures of variability.0388

We are going to be talking about two classes of them that are going to address these questions in different ways.0391

The first class of measures that we want to think about is range, quartiles, and interquartile range.0400

This is the idea of just taking the one farthest guy or the one closest guy by looking at that person.0406

Usually, these measures of variability are used with median.0416

It is usually measuring the spread around the median.0422

One of the reasons that this is going to be the case is that when we look at range, quartiles, and interquartile range, what we are doing is taking our distribution and cutting it up.0426

Either cutting it in half, which would be at the median, the middle point.0439

Or cutting it up into quartiles, right?0444

Which would be cutting it into quarters instead of halves.0447

That is the idea.0452

That is why we are going to be using median as their measure of central tendency.0454

When we think about range, you do not need a central tendency at all.0461

What you need is the minimum value and the maximum value and the distance in between.0466

You could think of it as the maximum value in the set of x then subtract the minimum value in the set of x.0473

If you have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 as your distribution, you take 10 – 1 and your range is 9.0485

The problem with that measure of variability is that, even though it is very simple and intuitive, it is highly susceptible to outliers.0493

If we change our set to something like 1, 2, 3, 4, 5, 6, 7, 8, 9, 100, all of a sudden it will be 100 – 1 and our range will be 99.0504

Just by changing one of our numbers we could drastically change the range.0516
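
To make the outlier problem concrete, here is a minimal Python sketch of the range calculation on the two sets above. Python is not used in the lecture, and the function name value_range is made up for this illustration.

```python
def value_range(values):
    """Range = maximum value minus minimum value."""
    return max(values) - min(values)


without_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # only the last value changed

print(value_range(without_outlier))  # 10 - 1 = 9
print(value_range(with_outlier))     # 100 - 1 = 99
```

Only one of the ten values changed, yet the range went from 9 to 99.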

Interquartile range is going to be less susceptible to those outliers, but before we get into0523

how to calculate interquartile range, we have to divide the data into quartiles.0529

Let us just look at a simple example.0537

Here what we would need to do is divide this data into quartiles first.0549

Since there is an even number of values, the median falls between 5 and 6, at 5.5, and to divide further into quartiles we cut at 3 and at 8.0556

Here is the first quartile, second quartile, third, and fourth.0570

Because of that, these borders actually have special little names.0575

These borders are called Q1, Q2, and Q3, just to indicate that they are the borders of the quartiles.0579

First you divide the data into quartiles and then, in order to get the interquartile range, you are lopping off these guys on the ends.0590

It is like the end of bread or cucumber, just like chopping it off, casting it aside.0602

Just in case that there are some extreme outliers.0608

Here what we do is we take Q3 – Q1.0612

In this case, it would be 8 – 3 and the interquartile range would be 5.0620

Here the interquartile range gives you the idea that 50% of the numbers fall into this range because that is two quartiles.0626

That is 50% right there.0638

That is why it is a nice measure.0640

It is more robust than the actual range because it is less susceptible to outliers.0642

It is still intuitive and you can see that a nice 50% of all the numbers fall in this range.0648

That is interquartile range, pretty easy.0656
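
The quartile steps above can be sketched in Python roughly as follows. This is only an illustration: the function name quartiles is made up, and it assumes the "median of each half" rule that matches the cuts at 3 and 8 here. Other textbooks and software use slightly different quartile rules and can give slightly different answers.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-each-half rule.

    Assumption: with an odd count, the middle value is left out of both
    halves. Needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the median
    upper = data[-half:]   # values above the median
    return median(lower), median(data), median(upper)


q1, q2, q3 = quartiles([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
iqr = q3 - q1
print(q1, q2, q3, iqr)  # Q1 = 3, Q2 = 5.5, Q3 = 8, so IQR = 8 - 3 = 5

# Swapping the 10 for an outlier of 100 leaves the IQR unchanged.
o1, _, o3 = quartiles([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(o3 - o1)  # still 5
```

Compare that with the range, which jumped from 9 to 99 for the same change.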

Let us do an example.0661

Here let us say that there are these ages and we want to know what the interquartile range of these values is.0663

First, it helps to separate them by quartiles.0671

There are 3, 6, 10 numbers here, because of that here is the mid point.0675

The median also called Q2 that is 30.0687

Here is Q1 and here is Q3.0696

In order to find interquartile range, sometimes called IQR, it is Q3 – Q1.0709

In this case it would be 38 – 20.0721

Interquartile range is 18.0727

Within 18 here we could just draw that distance of 18.0730

In that distance, 50% of our numbers fall in there, between 20 and 38.0737

We are going to be talking about variance and standard deviation.0749

When we talk about variance and standard deviation, it is more like in that conceptual example,0753

that distance away from the president, where we are looking at the actual distance.0759

In statistics, what we call distance away from the mean, the president in this case, is a deviation.0766

What we might want to do is get the average deviation but there is going to be a little bit of an issue.0776

When we get the deviations from the mean, remember the mean is the value at the middle.0784

The mean is actually in the middle of all the other values.0793

Some of the values are going to be greater than the mean and some of the values are less than the mean.0798

When we add all of those up, the formula looks like this: the summation sign, and we take each value in our distribution, x sub i, and subtract out the mean.0804

Get that distance away from the mean, that deviation from the mean.0820

When we add all those up, where i goes from 1 all the way to n, however many we have in our sample,0824

we basically get 0 because sometimes the value is greater than the mean and sometimes the value is less than the mean.0832

When it is greater the number is greater than 0.0841

When it is less, the number is less than 0.0845

When you add up that whole bunch of positive and negative numbers, they cancel out and you end up with 0.0848

That is the problem because when you get 0 as your sum and you divide by whatever your n is,0854

no matter what n is it is going to be 0 because 0 divided by anything is 0.0861

This is not going to work for us.0868

That is not going to be good to have every single average deviation being 0.0870

That is not useful.0877
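
Here is a small Python sketch of that cancellation. The five values are a made-up sample, chosen only so that the mean comes out to a whole number.

```python
from statistics import mean

values = [2, 4, 4, 5, 10]   # hypothetical sample, not from the lecture
x_bar = mean(values)        # (2 + 4 + 4 + 5 + 10) / 5 = 5

# Deviation of each value from the mean: x_i - x_bar
deviations = [x - x_bar for x in values]
print(deviations)           # -3, -1, -1, 0 and 5

# The negative and positive deviations cancel out.
total = sum(deviations)
print(total)                # zero
print(total / len(values))  # so the "average deviation" is zero too
```

With data where the mean is not a whole number, floating-point rounding can leave a tiny nonzero remainder, but mathematically the sum is exactly 0.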

What do we do?0878

Here we are going to sum the squared deviations.0880

Instead of just summing up all the deviations, we are going to square each deviation and then sum those up.0883

Whenever you square a nonzero number, you get a positive number.0890

The sum of squares is never going to be negative.0895

You will get many advantages out of doing this squaring business and we will learn more about some of those advantages later.0899

Let us talk about how to write this in notation.0905

Here we have that same idea, that same deviation idea where we are looking at distances away from the mean,0909

but we are going to square each of those distances.0919

i = 1 to n.0925

Just a word about this summing notation: basically, when you have the summing notation, whatever is here,0929

you need to do this first and then sum up everything in here.0939

Sometimes what people do is sum up all of the x sub i first and then subtract out x bar.0945

But we are not summing the values, we are summing the squared deviation.0956

You have got to get the squared deviations first.0964

Each value is going to have a distance, and each of those distances needs to be squared, and then you need to add them up.0967

This would not be equal to 0 unless all your values are identical, so that every value equals the mean.0977

Otherwise, it is always greater than 0.0985

This is going to be called sum of squares and that is often shown by using the term ss.0989

If it is the sum of squares of the sample, sometimes you will see this notation where it has a little x down there.0998

If it is the sum of squares of the population, which you will probably never have, it will be ss sub upper case X.1006

We could look at the average squared distance from the mean, average squared deviation.1017

You will do that simply by dividing by the number of values you have.1026

When we have the variance of the sample, that is going to be called s², that is going to be the variance.1030

I will write it in blue, right?1040

That is the variance of a sample.1041

That is just going to be ss ÷ n.1044

The problem with variance is that it is not in the same units as your mean because we have squared all the distances.1051

In order to bring it back to the same units as the mean, so it is easier for comparison,1060

what we are going to do is get the standard deviation by just taking the square root of each side.1065

Standard deviation is just s and that is going to be just the square root of variance.1071

Standard deviation is now roughly the average distance from the mean, instead of the average squared distance away from the mean.1085

This is going to be for samples, but in order to write variance for the population they use the lower case sigma.1094

For variance it will be lower case σ² and for standard deviation it will be just lower case σ.1105

I will show you in a little bit how to do that.1111
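
The chain from deviations to sum of squares to variance to standard deviation can be sketched in a few lines of Python. This is just an illustration with a made-up five-value sample, dividing by n as described so far.

```python
from math import sqrt

values = [2, 4, 4, 5, 10]          # hypothetical sample, not from the lecture
n = len(values)
x_bar = sum(values) / n            # mean = 5.0

# Square each deviation FIRST, then add them up.
squared_deviations = [(x - x_bar) ** 2 for x in values]  # 9, 1, 1, 0, 25
ss = sum(squared_deviations)       # sum of squares = 36.0

variance = ss / n                  # average squared deviation = 7.2
sd = sqrt(variance)                # back in the original units, about 2.68

print(ss, variance, round(sd, 2))
```

Notice that the variance (7.2) is in squared units, while the standard deviation (about 2.68) is back on the same scale as the values themselves.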

Let us take a little bit of time to talk about sum of squares in depth.1117

Before that, there is a little typo on this page, I’m just going to correct that so that it will be smooth when we get down here.1123

Let us start from the beginning, sum of squares is always this sum of squared distances away from the mean of the sample.1136

The mean of the sample is x bar, that is how we denote it.1145

That is the symbol for it.1150

The sum of squared distances away from the mean is going to be smaller than the sum of squares from any other point.1153

You can pick any other number, but the mean will give you the smallest sum of squares.1160

Any other number will give you a bigger sum of squares.1167
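
You can check this claim numerically. The sketch below uses a made-up sample and a hypothetical helper, sum_of_squares, that measures squared distances from any center you pick.

```python
def sum_of_squares(values, center):
    """Sum of squared distances from an arbitrary center point."""
    return sum((x - center) ** 2 for x in values)


values = [2, 4, 4, 5, 10]            # hypothetical sample
x_bar = sum(values) / len(values)    # 5.0

for center in (3, 4, x_bar, 6, 7):
    print(center, sum_of_squares(values, center))
# Centers 3 and 7 give 56, centers 4 and 6 give 41, and the mean gives 36.
```

The mean gives 36, and every other center gives something bigger. The farther the center is from the mean, the bigger the sum of squares gets.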

Here is the problem, the sample mean is rarely ever the actual population mean.1171

Because of that, the population mean is one of those other points.1179

If we had the real sum of squares from the population mean, we would actually get a bigger sum of squares than we actually have.1185

That is the problem.1193

Here is why: because we have a sum of squares that is a little bit too small,1195

our sample standard deviation is going to tend to be a little bit smaller than our population standard deviation.1201

That is an issue.1210

We are always under shooting the population standard deviation.1211

To correct for this, we are going to divide the sum of squares from our sample by a slightly smaller number than we currently do.1215

Right now, to get s or standard deviation, we take the square root of sum of squares ÷ n.1227

That is what we do right now.1237

This will help us approximate the actual population.1239

Here we are going to need to divide by a slightly smaller number1246

because when we divide by a smaller number, then our resulting answer is slightly bigger.1252

Dividing by 5 we are going to get a bigger answer than if you divide by 8.1259

Because of that we are going to use that.1268

Instead, in order to approximate the population standard deviation, what we are going to do is use ss ÷ (n – 1) under the square root.1272

Dividing by this slightly smaller number gives us a slightly bigger estimate of the population standard deviation.1293

Why n – 1? Why not n – 0.5 or n – 2?1301

There is a proof that you could look up online for what is called Bessel's correction, and it is a really elegant proof if you have time to look it up.1307

That is my spiel on sum of squares but we will come back to this because it is a pretty important idea.1315
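
To see the effect of dividing by n – 1 instead of n, here is a short Python sketch on a made-up sample. It also compares against Python's standard statistics module, where pstdev divides by n and stdev divides by n – 1. The variable names are just for this illustration.

```python
from math import sqrt
from statistics import pstdev, stdev

values = [2, 4, 4, 5, 10]                   # hypothetical sample
n = len(values)
x_bar = sum(values) / n                     # 5.0
ss = sum((x - x_bar) ** 2 for x in values)  # 36.0

sd_divide_by_n = sqrt(ss / n)               # sqrt(7.2), about 2.68
sd_divide_by_n_minus_1 = sqrt(ss / (n - 1)) # sqrt(9.0) = 3.0

print(sd_divide_by_n, pstdev(values))        # same value, divides by n
print(sd_divide_by_n_minus_1, stdev(values)) # same value, divides by n - 1
```

Dividing by the smaller number gives the bigger answer, which is the nudge upward that the correction is meant to provide. The difference shrinks as n gets larger.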

Let us talk about the difference between population standard deviation and sample standard deviation.1323

We always want to make inferences from the sample to the population, that is what we would like to do.1330

Our sample distribution is denoted by lower case x and our population distribution is denoted by upper case X.1337

In order to make that leap, we are going from sample statistics to population parameter.1346

We are going to be estimating things like estimating mu from x bar, that is estimating the mean of the population from the mean of the sample.1364

We are going to estimate σ, the standard deviation of a population, from s, which is the standard deviation of the sample.1375

Sigma is our new notation; notice that for population parameters we are using Greek letters and here, for sample statistics, we are using regular Roman letters.1388

Let us talk about the formulas for these.1403

When we talk about mean, it is mu in this case and x bar in this case.1407

Here we talk about adding up all of the lower case x and dividing by lower case n.1414

Here we add up all of the values in our upper case X and divide by upper case N, just superficial changes.1421

When we talk about standard deviation, here we are going to be talking about lower case σ or talking about s.1433

Let us actually write down this formula.1445

You could write it as √(sum of squares ÷ n), that is one way to do it.1448

One thing you could do is think about double clicking on this.1455

Just double click on it.1463

Then what you would get is you would see the whole shebang inside.1466

Hopefully I could try.1472

Sum of squares means give me all the deviations, distances, away from x bar, and square all of those.1474

If you want, you could put i = 1 all the way up to n on the summation, and then divide by n.1485

If we want to actually use this to estimate that, we will divide by n – 1.1505

This is upper case S and I’m going to denote that by using a little bar there.1513

In order to have this estimation, we would use lower case s.1520

In this case, what we would do is divide our sum of squares by n – 1.1534

That is our way of estimating from s to σ.1540

That is our estimate.1544

When we talk about the population standard deviation, it is still ss ÷ n but it is upper case S this time.1547

When we double click on ss and see what is inside of it, we unpack that, here is what it looks like.1559

It is the sum of (X sub i – mu)², divided by N, all under the square root.1569

Here are all of these formulas.1581

We have formulas for standard deviation of the sample, standard deviation of the population, but we also have this new idea.1592

This is in between this one and this one.1595

It is a way of going from sample information to estimating a population standard deviation.1600

Usually, we do not calculate sigma directly because we do not have every single value for the population.1611

Usually, we calculate small s which is going to be the estimated standard deviation and1619

we hardly use this one as well because we do not really care about the standard deviation in just our sample.1628

We want to know the standard deviation for the population.1635
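
For reference, the three standard deviation formulas described in this part of the lecture can be written out in LaTeX as below. The symbols follow the lecture's spoken notation (upper case S for the sample itself, lower case s for the estimate, σ for the population); the exact typesetting on the slides may differ.

```latex
% Standard deviation of the sample itself (divide by n)
S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}

% Estimate of the population standard deviation from a sample (divide by n - 1)
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}

% Standard deviation of the population (all N values, population mean \mu)
\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}
```

In each case the numerator under the square root is a sum of squares; only the center (x bar or mu) and the divisor change.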

Let us go on to our examples.1644

Here is example 1.1646

It says find the mean and standard deviation of the variable friends in the Excel file.1648

If you get the Excel file that you can download, go ahead and click on friends.1655

We are going to be finding the standard deviation for the variable friends.1662

What would be nice is if we could do everything in Excel but before we do that I just want to make sure you understand how standard deviation works.1671

Because of that I’m going to have you do it manually first.1680

In order to do that, go ahead and go to data, find the variable friends, click on that column1684

and I’m just going to copy that whole column and paste that right in here.1693

Here I have my entire distribution of friends.1702

I’m going to say Excel calculate the mean for us.1707

I’m going to use the function average and select all this nice data right here, click enter.1712

That is our mean.1725

That mean is not going to change for anybody because mean is just the mean of the entire distribution.1729

I’m just going to put our pointer there and I’m going to say whatever the mean is on top of me,1737

that is the mean and I’m just going to paste that all the way down.1742

This whole column should have the same mean.1749

The reason I’m doing that is because that is going to make it easier for us to calculate the squared deviations.1754

We could just use the locked version of mean too.1762

Let us get our squared deviation.1767

Deviation just means the distance from each value to the mean, and then we square it: (x – x bar)².1771

In order to do the square we put in the caret and 2.1782

We hit enter and here is our squared deviation.1789

I’m just going to drag that formula all the way down.1794

Here we have a whole bunch of squared deviations.1800

We have to sum up all those squared deviations.1804

Here I’m just going to put in ss because that is what we are going to get and in order to get ss, we just add up this whole column.1809

In order to get variance, or S², what we need to do is take ss ÷ n.1829

I’m going to take ss ÷ n.1844

I know here that my n is 100 but if you did not know for some reason, you could use the function count1849

and just ask it count how many values there are.1855

Not counta, just count; count how many values there are.1858

It should be 100.1864

Indeed it is a hundred because it moved the decimal point 2 over.1868

Now we could get standard deviation or S.1873

In order to get that, we just square root our variance.1879

Excel has a function called square root (sqrt) and I’m just going to square root my variance.1883

Here I get a standard deviation of 428.64.1892

I made you do all that just so that you would understand how to calculate standard deviation.1898

Excel has a nice handy way for you to do it.1906

Here I’m going to calculate upper case S automatically.1908

Here we are looking at just upper case S; in order to calculate it we would use stdevp, because that is the one where you divide by n.1916

I’m finding the standard deviation of all my squared values, that is wrong.1953

I should be finding the standard deviation of my actual data, right?1956

In this method, you actually do not need any of this.1965

I will just make you go through it so you would learn.1969

When we calculate upper case S automatically using stdevp, you will see that we get the exact same standard deviation1971

and we do not have to do any of that mean calculating, or calculating the sum of squares or variance, or anything like that.1981
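
If you wanted to mirror these spreadsheet steps outside of Excel, a minimal Python sketch might look like this. The data values are made up for illustration and are not the friends column from the downloadable file; `statistics.pstdev` plays the role that stdevp plays in the spreadsheet, since both divide by n.

```python
import math
import statistics

# Hypothetical values standing in for the "friends" column (not the lecture's data).
friends = [120, 250, 310, 95, 480, 150, 600, 210]

n = len(friends)                                    # like COUNT
mean = sum(friends) / n                             # like AVERAGE
squared_devs = [(x - mean) ** 2 for x in friends]   # (x - x bar)^2 for each value
ss = sum(squared_devs)                              # sum of squares
variance_n = ss / n                                 # S squared: divide by n
sd_manual = math.sqrt(variance_n)                   # upper case S

# One-step version, comparable to stdevp (also divides by n).
sd_auto = statistics.pstdev(friends)

print(sd_manual, sd_auto)
```

The two printed numbers should agree up to floating-point rounding, which is the same check the lecture makes between the manual column and stdevp.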

There is even a way Excel will calculate for you little s, the estimate of the population standard deviation from the sample.1989

That is the one that you will be most likely using.2004

Because of that, I think that might be a good one for us to do.2007

Sum of squares is going to be the same thing.2010

I’m just going to copy all of this.2017

The sum of squares is going to be the same thing but variance is going to be a little bit different now.2019

Instead of dividing by n, we are going to be dividing by n – 1.2029

I’m going to put in 99 instead of 100.2036

Square rooting, that works the same way, square root of my variance.2043

Notice that when we divide by n – 1, my standard deviation is slightly bigger than it would have been when we just divided by n.2053

Let us calculate little s automatically.2074

Excel always assumes that is probably what you will be wanting to do.2077

It made stdev the default formula, and that one is going to divide by n – 1.2084

We see that those two are the same values, a shortcut.2102

You see when you automatically calculate it with Excel, you are not going to need to calculate mean2107

or the sum of squares but it is nice to know where those things come from.2117
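
The same comparison can be sketched for little s. This is only an illustration with made-up values (not the lecture's data); it assumes `statistics.stdev`, which divides by n – 1, as the counterpart of the spreadsheet's stdev.

```python
import math
import statistics

# Hypothetical sample values (not the lecture's data).
sample = [120, 250, 310, 95, 480, 150, 600, 210]

n = len(sample)
mean = sum(sample) / n
ss = sum((x - mean) ** 2 for x in sample)   # same sum of squares as before

variance_est = ss / (n - 1)                 # little s squared: divide by n - 1
s_manual = math.sqrt(variance_est)          # little s

# One-step version that also divides by n - 1.
s_auto = statistics.stdev(sample)

print(s_manual, s_auto)
```

Only the denominator changed from the divide-by-n version; the sum of squares and the square root step are untouched.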

We did that already.2124

For example 2, let us find the mean and standard deviation of the variable tagged photos in the Excel file.2129

If you click over on data, let us go ahead and grab the tagged photos values in that variable column and paste it right in here.2137

It is just easier than going back and forth.2151

Let us find the mean in this sample.2154

I typed in average and I wanted to average all of this then I’m just going to say whatever is above me that is the same mean.2161

Copy and paste it all the way down, everybody else has the same mean.2182

I’m just going to get my squared deviation.2188

It is (my first value – the mean)².2193

I’m going to copy and paste that all the way down.2203

Let us get the sum of squares.2213

In order to do that we just find the sum of all these squared deviations.2216

In order to find variance, let us just do little s², because that is the one you will be using for the most part, right?2229

For our little s², we take this sum of squares and we divide it by n – 1.2240

We could use count, count all of that – 1.2249

All of this is in my denominator and hit enter.2269

That is my variance.2279

What is my standard deviation?2282

My little s, my estimated standard deviation.2286

All I have to do is square root my variance and that is what I got.2289

Let us check our answers by using the automatic Excel version.2296

Here we will put in stdev, I want to put in our actual data, our actual values.2305

This is the real distribution that we are working with here.2318

Excel does it nice and quickly for us.2325

We do not need all of this stuff.2328

In the future, we will just be using this automatic version but I do want you to know where that comes from.2330

Let us go on to example 3.2340

The average number of calories in a frozen yogurt is 250, with an estimated population standard deviation of 30.2342

If 24 frozen yogurts from popular chains were sampled, what would be their ss or sum of squares?2349

Here we know that we do not need the actual values and the means in order to find sum of squares.2358

Because we have some of the other pieces and we could just fill out what is missing and figure out what is missing.2365

We know that they have given us the estimated population standard deviation.2373

That is little s.2378

In order to get little s, we know that they added up all of the (x sub i – the mean)², divided by n – 1, and took the square root of that.2382

We know that is what they did.2404

Another way of writing that is square root of ss / n – 1.2405

Let us fill in what we have.2414

We know that the standard deviation ends up being 30; this s is 30.2417

What we are trying to find out is this.2428

We do not have that ss.2431

But we do have n – 1 because n is 24.2439

24 – 1 is 23.2444

From that, and only that information, we could figure out ss, even though they have also given us this mean of 250.2448

It is sort of a red herring; you do not actually need it in this problem.2458

I’m going to use a little piece of my Excel as a calculator, and here I know I need to square 30, so 30².2464

I could just multiply that by 23.2486

I will get 20,700.2491

My ss is 20,700.2496

I did not actually need all my values from the distribution nor my mean.2504
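
The algebra in example 3 can be checked in a few lines. This is a small sketch, assuming the rearrangement ss = s² × (n – 1), which follows from s = square root of ss ÷ (n – 1); the variable names are only for illustration.

```python
import math

s = 30   # estimated population standard deviation (little s), from the problem
n = 24   # number of frozen yogurts sampled

# Rearranging s = sqrt(ss / (n - 1)) gives ss = s^2 * (n - 1).
ss = s ** 2 * (n - 1)
print(ss)  # prints 20700

# Plugging ss back in should recover the original s.
recovered_s = math.sqrt(ss / (n - 1))
print(recovered_s)
```

The mean of 250 never appears in the calculation, which matches the point that it is not needed here.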

Last question, example 4.2512

This is a conceptual question, hopefully this will test you on concepts.2515

When we divide by n – 1, rather than by n, what effect does this have on the resulting standard deviation?2521

n – 1 is a smaller number than n, right?2529

Dividing by a smaller number will result in a bigger answer.2532

The resulting standard deviation, little s, will be a little bit greater than upper case S.2536

This one divides by n and this one divides by n -1.2544
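
To see the effect on actual numbers, here is a short sketch with a small made-up data set (not from the lecture). It assumes `statistics.pstdev` for the divide-by-n version and `statistics.stdev` for the divide-by-(n – 1) version.

```python
import math
import statistics

# Hypothetical values, chosen only so the arithmetic is easy to follow.
values = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(values)

big_s = statistics.pstdev(values)    # divides by n
little_s = statistics.stdev(values)  # divides by n - 1

print(big_s, little_s)

# Both share the same sum of squares, so their ratio depends only on n.
ratio = little_s / big_s
expected_ratio = math.sqrt(n / (n - 1))
print(ratio, expected_ratio)
```

Because the ratio is the square root of n ÷ (n – 1), the gap between the two shrinks as the sample gets larger.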

That is it for variability.2556

Thanks for using www.educator.com.2558
