Dr. Ji Son

Introduction to Sampling Distributions


Table of Contents

Section 1: Introduction
Descriptive Statistics vs. Inferential Statistics

25m 31s

Intro
0:00
Roadmap
0:10
Roadmap
0:11
Statistics
0:35
Statistics
0:36
Let's Think About High School Science
1:12
Measurement and Find Patterns (Mathematical Formula)
1:13
Statistics = Math of Distributions
4:58
Distributions
4:59
Problematic… but also GREAT
5:58
Statistics
7:33
How is It Different from Other Specializations in Mathematics?
7:34
Statistics is Fundamental in Natural and Social Sciences
7:53
Two Skills of Statistics
8:20
Description (Exploration)
8:21
Inference
9:13
Descriptive Statistics vs. Inferential Statistics: Apply to Distributions
9:58
Descriptive Statistics
9:59
Inferential Statistics
11:05
Populations vs. Samples
12:19
Populations vs. Samples: Is it the Truth?
12:20
Populations vs. Samples: Pros & Cons
13:36
Populations vs. Samples: Descriptive Values
16:12
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:10
Putting Together Descriptive/Inferential Stats & Populations/Samples
17:11
Example 1: Descriptive Statistics vs. Inferential Statistics
19:09
Example 2: Descriptive Statistics vs. Inferential Statistics
20:47
Example 3: Sample, Parameter, Population, and Statistic
21:40
Example 4: Sample, Parameter, Population, and Statistic
23:28
Section 2: About Samples: Cases, Variables, Measurements
About Samples: Cases, Variables, Measurements

32m 14s

Intro
0:00
Data
0:09
Data, Cases, Variables, and Values
0:10
Rows, Columns, and Cells
2:03
Example: Aircraft
3:52
How Do We Get Data?
5:38
Research: Question and Hypothesis
5:39
Research Design
7:11
Measurement
7:29
Research Analysis
8:33
Research Conclusion
9:30
Types of Variables
10:03
Discrete Variables
10:04
Continuous Variables
12:07
Types of Measurements
14:17
Types of Measurements
14:18
Types of Measurements (Scales)
17:22
Nominal
17:23
Ordinal
19:11
Interval
21:33
Ratio
24:24
Example 1: Cases, Variables, Measurements
25:20
Example 2: Which Scale of Measurement is Used?
26:55
Example 3: What Kind of a Scale of Measurement is This?
27:26
Example 4: Discrete vs. Continuous Variables
30:31
Section 3: Visualizing Distributions
Introduction to Excel

8m 9s

Intro
0:00
Before Visualizing Distribution
0:10
Excel
0:11
Excel: Organization
0:45
Workbook
0:46
Column x Rows
1:50
Tools: Menu Bar, Standard Toolbar, and Formula Bar
3:00
Excel + Data
6:07
Excel and Data
6:08
Frequency Distributions in Excel

39m 10s

Intro
0:00
Roadmap
0:08
Data in Excel and Frequency Distributions
0:09
Raw Data to Frequency Tables
0:42
Raw Data to Frequency Tables
0:43
Frequency Tables: Using Formulas and Pivot Tables
1:28
Example 1: Number of Births
7:17
Example 2: Age Distribution
20:41
Example 3: Height Distribution
27:45
Example 4: Height Distribution of Males
32:19
Frequency Distributions and Features

25m 29s

Intro
0:00
Roadmap
0:10
Data in Excel, Frequency Distributions, and Features of Frequency Distributions
0:11
Example #1
1:35
Uniform
1:36
Example #2
2:58
Unimodal, Skewed Right, and Asymmetric
2:59
Example #3
6:29
Bimodal
6:30
Example #4a
8:29
Symmetric, Unimodal, and Normal
8:30
Point of Inflection and Standard Deviation
11:13
Example #4b
12:43
Normal Distribution
12:44
Summary
13:56
Uniform, Skewed, Bimodal, and Normal
13:57
Sketch Problem 1: Driver's License
17:34
Sketch Problem 2: Life Expectancy
20:01
Sketch Problem 3: Telephone Numbers
22:01
Sketch Problem 4: Length of Time Used to Complete a Final Exam
23:43
Dotplots and Histograms in Excel

42m 42s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Previously
1:02
Data, Frequency Table, and Visualization
1:03
Dotplots
1:22
Dotplots Excel Example
1:23
Dotplots: Pros and Cons
7:22
Pros and Cons of Dotplots
7:23
Dotplots Excel Example Cont.
9:07
Histograms
12:47
Histograms Overview
12:48
Example of Histograms
15:29
Histograms: Pros and Cons
31:39
Pros
31:40
Cons
32:31
Frequency vs. Relative Frequency
32:53
Frequency
32:54
Relative Frequency
33:36
Example 1: Dotplots vs. Histograms
34:36
Example 2: Age of Pennies Dotplot
36:21
Example 3: Histogram of Mammal Speeds
38:27
Example 4: Histogram of Life Expectancy
40:30
Stemplots

12m 23s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
What Sets Stemplots Apart?
0:46
Data Sets, Dotplots, Histograms, and Stemplots
0:47
Example 1: What Do Stemplots Look Like?
1:58
Example 2: Back-to-Back Stemplots
5:00
Example 3: Quiz Grade Stemplot
7:46
Example 4: Quiz Grade & Afterschool Tutoring Stemplot
9:56
Bar Graphs

22m 49s

Intro
0:00
Roadmap
0:05
Roadmap
0:08
Review of Frequency Distributions
0:44
Y-axis and X-axis
0:45
Types of Frequency Visualizations Covered so Far
2:16
Introduction to Bar Graphs
4:07
Example 1: Bar Graph
5:32
Example 1: Bar Graph
5:33
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:07
Do Shapes, Center, and Spread of Distributions Apply to Bar Graphs?
11:08
Example 2: Create a Frequency Visualization for Gender
14:02
Example 3: Cases, Variables, and Frequency Visualization
16:34
Example 4: What Kind of Graphs are Shown Below?
19:29
Section 4: Summarizing Distributions
Central Tendency: Mean, Median, Mode

38m 50s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
Central Tendency 1
0:56
Way to Summarize a Distribution of Scores
0:57
Mode
1:32
Median
2:02
Mean
2:36
Central Tendency 2
3:47
Mode
3:48
Median
4:20
Mean
5:25
Summation Symbol
6:11
Summation Symbol
6:12
Population vs. Sample
10:46
Population vs. Sample
10:47
Excel Examples
15:08
Finding Mode, Median, and Mean in Excel
15:09
Median vs. Mean
21:45
Effect of Outliers
21:46
Relationship Between Parameter and Statistic
22:44
Type of Measurements
24:00
Which Distributions to Use With
24:55
Example 1: Mean
25:30
Example 2: Using Summation Symbol
29:50
Example 3: Average Calorie Count
32:50
Example 4: Creating an Example Set
35:46
Variability

42m 40s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Variability (or Spread)
0:45
Variability (or Spread)
0:46
Things to Think About
5:45
Things to Think About
5:46
Range, Quartiles and Interquartile Range
6:37
Range
6:38
Interquartile Range
8:42
Interquartile Range Example
10:58
Interquartile Range Example
10:59
Variance and Standard Deviation
12:27
Deviations
12:28
Sum of Squares
14:35
Variance
16:55
Standard Deviation
17:44
Sum of Squares (SS)
18:34
Sum of Squares (SS)
18:35
Population vs. Sample SD
22:00
Population vs. Sample SD
22:01
Population vs. Sample
23:20
Mean
23:21
SD
23:51
Example 1: Find the Mean and Standard Deviation of the Variable Friends in the Excel File
27:21
Example 2: Find the Mean and Standard Deviation of the Tagged Photos in the Excel File
35:25
Example 3: Sum of Squares
38:58
Example 4: Standard Deviation
41:48
Five Number Summary & Boxplots

57m 15s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Summarizing Distributions
0:37
Shape, Center, and Spread
0:38
5 Number Summary
1:14
Boxplot: Visualizing 5 Number Summary
3:37
Boxplot: Visualizing 5 Number Summary
3:38
Boxplots on Excel
9:01
Using 'Stocks' and Using Stacked Columns
9:02
Boxplots on Excel Example
10:14
When are Boxplots Useful?
32:14
Pros
32:15
Cons
32:59
How to Determine Outlier Status
33:24
Rule of Thumb: Upper Limit
33:25
Rule of Thumb: Lower Limit
34:16
Signal Outliers in an Excel Data File Using Conditional Formatting
34:52
Modified Boxplot
48:38
Modified Boxplot
48:39
Example 1: Percentage Values & Lower and Upper Whisker
49:10
Example 2: Boxplot
50:10
Example 3: Estimating IQR From Boxplot
53:46
Example 4: Boxplot and Missing Whisker
54:35
Shape: Calculating Skewness & Kurtosis

41m 51s

Intro
0:00
Roadmap
0:16
Roadmap
0:17
Skewness Concept
1:09
Skewness Concept
1:10
Calculating Skewness
3:26
Calculating Skewness
3:27
Interpreting Skewness
7:36
Interpreting Skewness
7:37
Excel Example
8:49
Kurtosis Concept
20:29
Kurtosis Concept
20:30
Calculating Kurtosis
24:17
Calculating Kurtosis
24:18
Interpreting Kurtosis
29:01
Leptokurtic
29:35
Mesokurtic
30:10
Platykurtic
31:06
Excel Example
32:04
Example 1: Shape of Distribution
38:28
Example 2: Shape of Distribution
39:29
Example 3: Shape of Distribution
40:14
Example 4: Kurtosis
41:10
Normal Distribution

34m 33s

Intro
0:00
Roadmap
0:13
Roadmap
0:14
What is a Normal Distribution
0:44
The Normal Distribution As a Theoretical Model
0:45
Possible Range of Probabilities
3:05
Possible Range of Probabilities
3:06
What is a Normal Distribution
5:07
Can Be Described By
5:08
Properties
5:49
'Same' Shape: Illusion of Different Shape!
7:35
'Same' Shape: Illusion of Different Shape!
7:36
Types of Problems
13:45
Example: Distribution of SAT Scores
13:46
Shape Analogy
19:48
Shape Analogy
19:49
Example 1: The Standard Normal Distribution and Z-Scores
22:34
Example 2: The Standard Normal Distribution and Z-Scores
25:54
Example 3: Sketching a Normal Distribution
28:55
Example 4: Sketching a Normal Distribution
32:32
Standard Normal Distributions & Z-Scores

41m 44s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
A Family of Distributions
0:28
Infinite Set of Distributions
0:29
Transforming Normal Distributions to 'Standard' Normal Distribution
1:04
Normal Distribution vs. Standard Normal Distribution
2:58
Normal Distribution vs. Standard Normal Distribution
2:59
Z-Score, Raw Score, Mean, & SD
4:08
Z-Score, Raw Score, Mean, & SD
4:09
Weird Z-Scores
9:40
Weird Z-Scores
9:41
Excel
16:45
For Normal Distributions
16:46
For Standard Normal Distributions
19:11
Excel Example
20:24
Types of Problems
25:18
Percentage Problem: P(x)
25:19
Raw Score and Z-Score Problems
26:28
Standard Deviation Problems
27:01
Shape Analogy
27:44
Shape Analogy
27:45
Example 1: Deaths Due to Heart Disease vs. Deaths Due to Cancer
28:24
Example 2: Heights of Male College Students
33:15
Example 3: Mean and Standard Deviation
37:14
Example 4: Finding Percentage of Values in a Standard Normal Distribution
37:49
Normal Distribution: PDF vs. CDF

55m 44s

Intro
0:00
Roadmap
0:15
Roadmap
0:16
Frequency vs. Cumulative Frequency
0:56
Frequency vs. Cumulative Frequency
0:57
Frequency vs. Cumulative Frequency
4:32
Frequency vs. Cumulative Frequency Cont.
4:33
Calculus in Brief
6:21
Derivative-Integral Continuum
6:22
PDF
10:08
PDF for Standard Normal Distribution
10:09
PDF for Normal Distribution
14:32
Integral of PDF = CDF
21:27
Integral of PDF = CDF
21:28
Example 1: Cumulative Frequency Graph
23:31
Example 2: Mean, Standard Deviation, and Probability
24:43
Example 3: Mean and Standard Deviation
35:50
Example 4: Age of Cars
49:32
Section 5: Linear Regression
Scatterplots

47m 19s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
Previous Visualizations
0:30
Frequency Distributions
0:31
Compare & Contrast
2:26
Frequency Distributions Vs. Scatterplots
2:27
Summary Values
4:53
Shape
4:54
Center & Trend
6:41
Spread & Strength
8:22
Univariate & Bivariate
10:25
Example Scatterplot
10:48
Shape, Trend, and Strength
10:49
Positive and Negative Association
14:05
Positive and Negative Association
14:06
Linearity, Strength, and Consistency
18:30
Linearity
18:31
Strength
19:14
Consistency
20:40
Summarizing a Scatterplot
22:58
Summarizing a Scatterplot
22:59
Example 1: Gapminder.org, Income x Life Expectancy
26:32
Example 2: Gapminder.org, Income x Infant Mortality
36:12
Example 3: Trend and Strength of Variables
40:14
Example 4: Trend, Strength and Shape for Scatterplots
43:27
Regression

32m 2s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Linear Equations
0:34
Linear Equations: y = mx + b
0:35
Rough Line
5:16
Rough Line
5:17
Regression - A 'Center' Line
7:41
Reasons for Summarizing with a Regression Line
7:42
Predictor and Response Variable
10:04
Goal of Regression
12:29
Goal of Regression
12:30
Prediction
14:50
Example: Servings of Milk Per Year Shown By Age
14:51
Interpolation
17:06
Extrapolation
17:58
Error in Prediction
20:34
Prediction Error
20:35
Residual
21:40
Example 1: Residual
23:34
Example 2: Large and Negative Residual
26:30
Example 3: Positive Residual
28:13
Example 4: Interpret Regression Line & Extrapolate
29:40
Least Squares Regression

56m 36s

Intro
0:00
Roadmap
0:13
Roadmap
0:14
Best Fit
0:47
Best Fit
0:48
Sum of Squared Errors (SSE)
1:50
Sum of Squared Errors (SSE)
1:51
Why Squared?
3:38
Why Squared?
3:39
Quantitative Properties of Regression Line
4:51
Quantitative Properties of Regression Line
4:52
So How do we Find Such a Line?
6:49
SSEs of Different Line Equations & Lowest SSE
6:50
Carl Gauss' Method
8:01
How Do We Find Slope (b1)
11:00
How Do We Find Slope (b1)
11:01
How Do We Find Intercept
15:11
How Do We Find Intercept
15:12
Example 1: Which of These Equations Fit the Above Data Best?
17:18
Example 2: Find the Regression Line for These Data Points and Interpret It
26:31
Example 3: Summarize the Scatterplot and Find the Regression Line.
34:31
Example 4: Examine the Mean of Residuals
43:52
Correlation

43m 58s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Summarizing a Scatterplot Quantitatively
0:47
Shape
0:48
Trend
1:11
Strength: Correlation (r)
1:45
Correlation Coefficient ( r )
2:30
Correlation Coefficient ( r )
2:31
Trees vs. Forest
11:59
Trees vs. Forest
12:00
Calculating r
15:07
Average Product of z-scores for x and y
15:08
Relationship between Correlation and Slope
21:10
Relationship between Correlation and Slope
21:11
Example 1: Find the Correlation between Grams of Fat and Cost
24:11
Example 2: Relationship between r and b1
30:24
Example 3: Find the Regression Line
33:35
Example 4: Find the Correlation Coefficient for this Set of Data
37:37
Correlation: r vs. r-squared

52m 52s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
R-squared
0:44
What is the Meaning of It? Why Squared?
0:45
Parsing Sum of Squares (Parsing Variability)
2:25
SST = SSR + SSE
2:26
What is SST and SSE?
7:46
What is SST and SSE?
7:47
r-squared
18:33
Coefficient of Determination
18:34
If the Correlation is Strong…
20:25
If the Correlation is Strong…
20:26
If the Correlation is Weak…
22:36
If the Correlation is Weak…
22:37
Example 1: Find r-squared for this Set of Data
23:56
Example 2: What Does it Mean that the Simple Linear Regression is a 'Model' of Variance?
33:54
Example 3: Why Does r-squared Only Range from 0 to 1
37:29
Example 4: Find the r-squared for This Set of Data
39:55
Transformations of Data

27m 8s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Why Transform?
0:26
Why Transform?
0:27
Shape-preserving vs. Shape-changing Transformations
5:14
Shape-preserving = Linear Transformations
5:15
Shape-changing Transformations = Non-linear Transformations
6:20
Common Shape-Preserving Transformations
7:08
Common Shape-Preserving Transformations
7:09
Common Shape-Changing Transformations
8:59
Powers
9:00
Logarithms
9:39
Change Just One Variable? Both?
10:38
Log-log Transformations
10:39
Log Transformations
14:38
Example 1: Create, Graph, and Transform the Data Set
15:19
Example 2: Create, Graph, and Transform the Data Set
20:08
Example 3: What Kind of Model would You Choose for this Data?
22:44
Example 4: Transformation of Data
25:46
Section 6: Collecting Data in an Experiment
Sampling & Bias

54m 44s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Descriptive vs. Inferential Statistics
1:04
Descriptive Statistics: Data Exploration
1:05
Example
2:03
To tackle Generalization…
4:31
Generalization
4:32
Sampling
6:06
'Good' Sample
6:40
Defining Samples and Populations
8:55
Population
8:56
Sample
11:16
Why Use Sampling?
13:09
Why Use Sampling?
13:10
Goal of Sampling: Avoiding Bias
15:04
What is Bias?
15:05
Where does Bias Come from: Sampling Bias
17:53
Where does Bias Come from: Response Bias
18:27
Sampling Bias: Bias from Bad Sampling Methods
19:34
Size Bias
19:35
Voluntary Response Bias
21:13
Convenience Sample
22:22
Judgment Sample
23:58
Inadequate Sample Frame
25:40
Response Bias: Bias from 'Bad' Data Collection Methods
28:00
Nonresponse Bias
29:31
Questionnaire Bias
31:10
Incorrect Response or Measurement Bias
37:32
Example 1: What Kind of Biases?
40:29
Example 2: What Biases Might Arise?
44:46
Example 3: What Kind of Biases?
48:34
Example 4: What Kind of Biases?
51:43
Sampling Methods

14m 25s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Biased vs. Unbiased Sampling Methods
0:32
Biased Sampling
0:33
Unbiased Sampling
1:13
Probability Sampling Methods
2:31
Simple Random
2:54
Stratified Random Sampling
4:06
Cluster Sampling
5:24
Two-staged Sampling
6:22
Systematic Sampling
7:25
Example 1: Which Type(s) of Sampling was this?
8:33
Example 2: Describe How to Take a Two-Stage Sample from this Book
10:16
Example 3: Sampling Methods
11:58
Example 4: Cluster Sample Plan
12:48
Research Design

53m 54s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Descriptive vs. Inferential Statistics
0:51
Descriptive Statistics: Data Exploration
0:52
Inferential Statistics
1:02
Variables and Relationships
1:44
Variables
1:45
Relationships
2:49
Not Every Type of Study is an Experiment…
4:16
Category I - Descriptive Study
4:54
Category II - Correlational Study
5:50
Category III - Experimental, Quasi-experimental, Non-experimental
6:33
Category III
7:42
Experimental, Quasi-experimental, and Non-experimental
7:43
Why CAN'T the Other Strategies Determine Causation?
10:18
Third-variable Problem
10:19
Directionality Problem
15:49
What Makes Experiments Special?
17:54
Manipulation
17:55
Control (and Comparison)
21:58
Methods of Control
26:38
Holding Constant
26:39
Matching
29:11
Random Assignment
31:48
Experiment Terminology
34:09
'true' Experiment vs. Study
34:10
Independent Variable (IV)
35:16
Dependent Variable (DV)
35:45
Factors
36:07
Treatment Conditions
36:23
Levels
37:43
Confounds or Extraneous Variables
38:04
Blind
38:38
Blind Experiments
38:39
Double-blind Experiments
39:29
How Categories Relate to Statistics
41:35
Category I - Descriptive Study
41:36
Category II - Correlational Study
42:05
Category III - Experimental, Quasi-experimental, Non-experimental
42:43
Example 1: Research Design
43:50
Example 2: Research Design
47:37
Example 3: Research Design
50:12
Example 4: Research Design
52:00
Between and Within Treatment Variability

41m 31s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Experimental Designs
0:51
Experimental Designs: Manipulation & Control
0:52
Two Types of Variability
2:09
Between Treatment Variability
2:10
Within Treatment Variability
3:31
Updated Goal of Experimental Design
5:47
Updated Goal of Experimental Design
5:48
Example: Drugs and Driving
6:56
Example: Drugs and Driving
6:57
Different Types of Random Assignment
11:27
All Experiments
11:28
Completely Random Design
12:02
Randomized Block Design
13:19
Randomized Block Design
15:48
Matched Pairs Design
15:49
Repeated Measures Design
19:47
Between-subject Variable vs. Within-subject Variable
22:43
Completely Randomized Design
22:44
Repeated Measures Design
25:03
Example 1: Design a Completely Random, Matched Pair, and Repeated Measures Experiment
26:16
Example 2: Block Design
31:41
Example 3: Completely Randomized Designs
35:11
Example 4: Completely Random, Matched Pairs, or Repeated Measures Experiments?
39:01
Section 7: Review of Probability Axioms
Sample Spaces

37m 52s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
Why is Probability Involved in Statistics
0:48
Probability
0:49
Can People Tell the Difference between Cheap and Gourmet Coffee?
2:08
Taste Test with Coffee Drinkers
3:37
If No One can Actually Taste the Difference
3:38
If Everyone can Actually Taste the Difference
5:36
Creating a Probability Model
7:09
Creating a Probability Model
7:10
D'Alembert vs. Necker
9:41
D'Alembert vs. Necker
9:42
Problem with D'Alembert's Model
13:29
Problem with D'Alembert's Model
13:30
Covering Entire Sample Space
15:08
Fundamental Principle of Counting
15:09
Where Do Probabilities Come From?
22:54
Observed Data, Symmetry, and Subjective Estimates
22:55
Checking whether Model Matches Real World
24:27
Law of Large Numbers
24:28
Example 1: Law of Large Numbers
27:46
Example 2: Possible Outcomes
30:43
Example 3: Brands of Coffee and Taste
33:25
Example 4: How Many Different Treatments are there?
35:33
Addition Rule for Disjoint Events

20m 29s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Disjoint Events
0:41
Disjoint Events
0:42
Meaning of 'or'
2:39
In Regular Life
2:40
In Math/Statistics/Computer Science
3:10
Addition Rule for Disjoint Events
3:55
If A and B are Disjoint: P (A and B)
3:56
If A and B are Disjoint: P (A or B)
5:15
General Addition Rule
5:41
General Addition Rule
5:42
Generalized Addition Rule
8:31
If A and B are not Disjoint: P (A or B)
8:32
Example 1: Which of These are Mutually Exclusive?
10:50
Example 2: What is the Probability that You will Have a Combination of One Heads and Two Tails?
12:57
Example 3: Engagement Party
15:17
Example 4: Home Owner's Insurance
18:30
Conditional Probability

57m 19s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
'or' vs. 'and' vs. Conditional Probability
1:07
'or' vs. 'and' vs. Conditional Probability
1:08
'and' vs. Conditional Probability
5:57
P (M or L)
5:58
P (M and L)
8:41
P (M|L)
11:04
P (L|M)
12:24
Tree Diagram
15:02
Tree Diagram
15:03
Defining Conditional Probability
22:42
Defining Conditional Probability
22:43
Common Contexts for Conditional Probability
30:56
Medical Testing: Positive Predictive Value
30:57
Medical Testing: Sensitivity
33:03
Statistical Tests
34:27
Example 1: Drug and Disease
36:41
Example 2: Marbles and Conditional Probability
40:04
Example 3: Cards and Conditional Probability
45:59
Example 4: Votes and Conditional Probability
50:21
Independent Events

24m 27s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Independent Events & Conditional Probability
0:26
Non-independent Events
0:27
Independent Events
2:00
Non-independent and Independent Events
3:08
Non-independent and Independent Events
3:09
Defining Independent Events
5:52
Defining Independent Events
5:53
Multiplication Rule
7:29
Previously…
7:30
But with Independent Events
8:53
Example 1: Which of These Pairs of Events are Independent?
11:12
Example 2: Health Insurance and Probability
15:12
Example 3: Independent Events
17:42
Example 4: Independent Events
20:03
Section 8: Probability Distributions
Introduction to Probability Distributions

56m 45s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Sampling vs. Probability
0:57
Sampling
0:58
Missing
1:30
What is Missing?
3:06
Insight: Probability Distributions
5:26
Insight: Probability Distributions
5:27
What is a Probability Distribution?
7:29
From Sample Spaces to Probability Distributions
8:44
Sample Space
8:45
Probability Distribution of the Sum of Two Dice
11:16
The Random Variable
17:43
The Random Variable
17:44
Expected Value
21:52
Expected Value
21:53
Example 1: Probability Distributions
28:45
Example 2: Probability Distributions
35:30
Example 3: Probability Distributions
43:37
Example 4: Probability Distributions
47:20
Expected Value & Variance of Probability Distributions

53m 41s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Discrete vs. Continuous Random Variables
1:04
Discrete vs. Continuous Random Variables
1:05
Mean and Variance Review
4:44
Mean: Sample, Population, and Probability Distribution
4:45
Variance: Sample, Population, and Probability Distribution
9:12
Example Situation
14:10
Example Situation
14:11
Some Special Cases…
16:13
Some Special Cases…
16:14
Linear Transformations
19:22
Linear Transformations
19:23
What Happens to Mean and Variance of the Probability Distribution?
20:12
n Independent Values of X
25:38
n Independent Values of X
25:39
Compare These Two Situations
30:56
Compare These Two Situations
30:57
Two Random Variables, X and Y
32:02
Two Random Variables, X and Y
32:03
Example 1: Expected Value & Variance of Probability Distributions
35:35
Example 2: Expected Values & Standard Deviation
44:17
Example 3: Expected Winnings and Standard Deviation
48:18
Binomial Distribution

55m 15s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Discrete Probability Distributions
1:42
Discrete Probability Distributions
1:43
Binomial Distribution
2:36
Binomial Distribution
2:37
Multiplicative Rule Review
6:54
Multiplicative Rule Review
6:55
How Many Outcomes with k 'Successes'
10:23
Adults and Bachelor's Degree: Manual List of Outcomes
10:24
P (X=k)
19:37
Putting Together # of Outcomes with the Multiplicative Rule
19:38
Expected Value and Standard Deviation in a Binomial Distribution
25:22
Expected Value and Standard Deviation in a Binomial Distribution
25:23
Example 1: Coin Toss
33:42
Example 2: College Graduates
38:03
Example 3: Types of Blood and Probability
45:39
Example 4: Expected Number and Standard Deviation
51:11
Section 9: Sampling Distributions of Statistics
Introduction to Sampling Distributions

48m 17s

Intro
0:00
Roadmap
0:08
Roadmap
0:09
Probability Distributions vs. Sampling Distributions
0:55
Probability Distributions vs. Sampling Distributions
0:56
Same Logic
3:55
Logic of Probability Distribution
3:56
Example: Rolling Two Dice
6:56
Simulating Samples
9:53
To Come Up with Probability Distributions
9:54
In Sampling Distributions
11:12
Connecting Sampling and Research Methods with Sampling Distributions
12:11
Connecting Sampling and Research Methods with Sampling Distributions
12:12
Simulating a Sampling Distribution
14:14
Experimental Design: Regular Sleep vs. Less Sleep
14:15
Logic of Sampling Distributions
23:08
Logic of Sampling Distributions
23:09
General Method of Simulating Sampling Distributions
25:38
General Method of Simulating Sampling Distributions
25:39
Questions that Remain
28:45
Questions that Remain
28:46
Example 1: Mean and Standard Error of Sampling Distribution
30:57
Example 2: What is the Best Way to Describe Sampling Distributions?
37:12
Example 3: Matching Sampling Distributions
38:21
Example 4: Mean and Standard Error of Sampling Distribution
41:51
Sampling Distribution of the Mean

1h 8m 48s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Special Case of General Method for Simulating a Sampling Distribution
1:53
Special Case of General Method for Simulating a Sampling Distribution
1:54
Computer Simulation
3:43
Using Simulations to See Principles behind Shape of SDoM
15:50
Using Simulations to See Principles behind Shape of SDoM
15:51
Conditions
17:38
Using Simulations to See Principles behind Center (Mean) of SDoM
20:15
Using Simulations to See Principles behind Center (Mean) of SDoM
20:16
Conditions: Does n Matter?
21:31
Conditions: Does Number of Simulation Matter?
24:37
Using Simulations to See Principles behind Standard Deviation of SDoM
27:13
Using Simulations to See Principles behind Standard Deviation of SDoM
27:14
Conditions: Does n Matter?
34:45
Conditions: Does Number of Simulation Matter?
36:24
Central Limit Theorem
37:13
SHAPE
38:08
CENTER
39:34
SPREAD
39:52
Comparing Population, Sample, and SDoM
43:10
Comparing Population, Sample, and SDoM
43:11
Answering the 'Questions that Remain'
48:24
What Happens When We Don't Know What the Population Looks Like?
48:25
Can We Have Sampling Distributions for Summary Statistics Other than the Mean?
49:42
How Do We Know whether a Sample is Sufficiently Unlikely?
53:36
Do We Always Have to Simulate a Large Number of Samples in Order to get a Sampling Distribution?
54:40
Example 1: Mean Batting Average
55:25
Example 2: Mean Sampling Distribution and Standard Error
59:07
Example 3: Sampling Distribution of the Mean
1:01:04
Sampling Distribution of Sample Proportions

54m 37s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Intro to Sampling Distribution of Sample Proportions (SDoSP)
0:51
Categorical Data (Examples)
0:52
Wish to Estimate Proportion of Population from Sample…
2:00
Notation
3:34
Population Proportion and Sample Proportion Notations
3:35
What's the Difference?
9:19
SDoM vs. SDoSP: Type of Data
9:20
SDoM vs. SDoSP: Shape
11:24
SDoM vs. SDoSP: Center
12:30
SDoM vs. SDoSP: Spread
15:34
Binomial Distribution vs. Sampling Distribution of Sample Proportions
19:14
Binomial Distribution vs. SDoSP: Type of Data
19:17
Binomial Distribution vs. SDoSP: Shape
21:07
Binomial Distribution vs. SDoSP: Center
21:43
Binomial Distribution vs. SDoSP: Spread
24:08
Example 1: Sampling Distribution of Sample Proportions
26:07
Example 2: Sampling Distribution of Sample Proportions
37:58
Example 3: Sampling Distribution of Sample Proportions
44:42
Example 4: Sampling Distribution of Sample Proportions
45:57
Section 10: Inferential Statistics
Introduction to Confidence Intervals

42m 53s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Inferential Statistics
0:50
Inferential Statistics
0:51
Two Problems with This Picture…
3:20
Two Problems with This Picture…
3:21
Solution: Confidence Intervals (CI)
4:59
Solution: Hypothesis Testing (HT)
5:49
Which Parameters are Known?
6:45
Which Parameters are Known?
6:46
Confidence Interval - Goal
7:56
When We Don't Know μ but Know σ
7:57
When We Don't Know
18:27
When We Don't Know μ nor σ
18:28
Example 1: Confidence Intervals
26:18
Example 2: Confidence Intervals
29:46
Example 3: Confidence Intervals
32:18
Example 4: Confidence Intervals
38:31
t Distributions

1h 2m 6s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
When to Use z vs. t?
1:07
When to Use z vs. t?
1:08
What is z and t?
3:02
z-score and t-score: Commonality
3:03
z-score and t-score: Formulas
3:34
z-score and t-score: Difference
5:22
Why not z? (Why t?)
7:24
Why not z? (Why t?)
7:25
But Don't Worry!
15:13
Gossett and t-distributions
15:14
Rules of t Distributions
17:05
t-distributions are More Normal as n Gets Bigger
17:06
t-distributions are a Family of Distributions
18:55
Degrees of Freedom (df)
20:02
Degrees of Freedom (df)
20:03
t Family of Distributions
24:07
t Family of Distributions : df = 2 , 4, and 60
24:08
df = 60
29:16
df = 2
29:59
How to Find It?
31:01
'Student's t-distribution' or 't-distribution'
31:02
Excel Example
33:06
Example 1: Which Distribution Do You Use? Z or t?
45:26
Example 2: Friends on Facebook
47:41
Example 3: t Distributions
52:15
Example 4: t Distributions, Confidence Interval, and Mean
55:59
Introduction to Hypothesis Testing

1h 6m 33s

Intro
0:00
Roadmap
0:06
Roadmap
0:07
Issues to Overcome in Inferential Statistics
1:35
Issues to Overcome in Inferential Statistics
1:36
What Happens When We Don't Know What the Population Looks Like?
2:57
How Do We Know whether a sample is Sufficiently Unlikely
3:43
Hypothesizing a Population
6:44
Hypothesizing a Population
6:45
Null Hypothesis
8:07
Alternative Hypothesis
8:56
Hypotheses
11:58
Hypotheses
11:59
Errors in Hypothesis Testing
14:22
Errors in Hypothesis Testing
14:23
Steps of Hypothesis Testing
21:15
Steps of Hypothesis Testing
21:16
Single Sample HT (When Sigma Available)
26:08
Example: Average Facebook Friends
26:09
Step 1
27:08
Step 2
27:58
Step 3
28:17
Step 4
32:18
Single Sample HT (When Sigma Not Available)
36:33
Example: Average Facebook Friends
36:34
Step 1: Hypothesis Testing
36:58
Step 2: Significance Level
37:25
Step 3: Decision Stage
37:40
Step 4: Sample
41:36
Sigma and p-value
45:04
Sigma and p-value
45:05
One-tailed vs. Two-tailed Hypotheses
45:51
Example 1: Hypothesis Testing
48:37
Example 2: Heights of Women in the US
57:43
Example 3: Select the Best Way to Complete This Sentence
1:03:23
Confidence Intervals for the Difference of Two Independent Means

55m 14s

Intro
0:00
Roadmap
0:14
Roadmap
0:15
One Mean vs. Two Means
1:17
One Mean vs. Two Means
1:18
Notation
2:41
A Sample! A Set!
2:42
Mean of X, Mean of Y, and Difference of Two Means
3:56
SE of X
4:34
SE of Y
6:28
Sampling Distribution of the Difference between Two Means (SDoD)
7:48
Sampling Distribution of the Difference between Two Means (SDoD)
7:49
Rules of the SDoD (similar to CLT!)
15:00
Mean for the SDoD Null Hypothesis
15:01
Standard Error
17:39
When can We Construct a CI for the Difference between Two Means?
21:28
Three Conditions
21:29
Finding CI
23:56
One Mean CI
23:57
Two Means CI
25:45
Finding t
29:16
Finding t
29:17
Interpreting CI
30:25
Interpreting CI
30:26
Better Estimate of s (s pool)
34:15
Better Estimate of s (s pool)
34:16
Example 1: Confidence Intervals
42:32
Example 2: SE of the Difference
52:36
Hypothesis Testing for the Difference of Two Independent Means

50m

Intro
0:00
Roadmap
0:06
Roadmap
0:07
The Goal of Hypothesis Testing
0:56
One Sample and Two Samples
0:57
Sampling Distribution of the Difference between Two Means (SDoD)
3:42
Sampling Distribution of the Difference between Two Means (SDoD)
3:43
Rules of the SDoD (Similar to CLT!)
6:46
Shape
6:47
Mean for the Null Hypothesis
7:26
Standard Error for Independent Samples (When Variance is Homogenous)
8:18
Standard Error for Independent Samples (When Variance is not Homogenous)
9:25
Same Conditions for HT as for CI
10:08
Three Conditions
10:09
Steps of Hypothesis Testing
11:04
Steps of Hypothesis Testing
11:05
Formulas that Go with Steps of Hypothesis Testing
13:21
Step 1
13:25
Step 2
14:18
Step 3
15:00
Step 4
16:57
Example 1: Hypothesis Testing for the Difference of Two Independent Means
18:47
Example 2: Hypothesis Testing for the Difference of Two Independent Means
33:55
Example 3: Hypothesis Testing for the Difference of Two Independent Means
44:22
Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means

1h 14m 11s

Intro
0:00
Roadmap
0:09
Roadmap
0:10
The Goal of Hypothesis Testing
1:27
One Sample and Two Samples
1:28
Independent Samples vs. Paired Samples
3:16
Independent Samples vs. Paired Samples
3:17
Which is Which?
5:20
Independent SAMPLES vs. Independent VARIABLES
7:43
Independent SAMPLES vs. Independent VARIABLES
7:44
T-tests Always…
10:48
T-tests Always…
10:49
Notation for Paired Samples
12:59
Notation for Paired Samples
13:00
Steps of Hypothesis Testing for Paired Samples
16:13
Steps of Hypothesis Testing for Paired Samples
16:14
Rules of the SDoD (Adding on Paired Samples)
18:03
Shape
18:04
Mean for the Null Hypothesis
18:31
Standard Error for Independent Samples (When Variance is Homogenous)
19:25
Standard Error for Paired Samples
20:39
Formulas that go with Steps of Hypothesis Testing
22:59
Formulas that go with Steps of Hypothesis Testing
23:00
Confidence Intervals for Paired Samples
30:32
Confidence Intervals for Paired Samples
30:33
Example 1: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
32:28
Example 2: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
44:02
Example 3: Confidence Intervals & Hypothesis Testing for the Difference of Two Paired Means
52:23
Type I and Type II Errors

31m 27s

Intro
0:00
Roadmap
0:18
Roadmap
0:19
Errors and Relationship to HT and the Sample Statistic?
1:11
Errors and Relationship to HT and the Sample Statistic?
1:12
Instead of a Box…Distributions!
7:00
One Sample t-test: Friends on Facebook
7:01
Two Sample t-test: Friends on Facebook
13:46
Usually, Lots of Overlap between Null and Alternative Distributions
16:59
Overlap between Null and Alternative Distributions
17:00
How Distributions and 'Box' Fit Together
22:45
How Distributions and 'Box' Fit Together
22:46
Example 1: Types of Errors
25:54
Example 2: Types of Errors
27:30
Example 3: What is the Danger of the Type I Error?
29:38
Effect Size & Power

44m 41s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Distance between Distributions: Sample t
0:49
Distance between Distributions: Sample t
0:50
Problem with Distance in Terms of Standard Error
2:56
Problem with Distance in Terms of Standard Error
2:57
Test Statistic (t) vs. Effect Size (d or g)
4:38
Test Statistic (t) vs. Effect Size (d or g)
4:39
Rules of Effect Size
6:09
Rules of Effect Size
6:10
Why Do We Need Effect Size?
8:21
Tells You the Practical Significance
8:22
HT can be Deceiving…
10:25
Important Note
10:42
What is Power?
11:20
What is Power?
11:21
Why Do We Need Power?
14:19
Conditional Probability and Power
14:20
Power is:
16:27
Can We Calculate Power?
19:00
Can We Calculate Power?
19:01
How Does Alpha Affect Power?
20:36
How Does Alpha Affect Power?
20:37
How Does Effect Size Affect Power?
25:38
How Does Effect Size Affect Power?
25:39
How Do Variability and Sample Size Affect Power?
27:56
How Do Variability and Sample Size Affect Power?
27:57
How Do We Increase Power?
32:47
Increasing Power
32:48
Example 1: Effect Size & Power
35:40
Example 2: Effect Size & Power
37:38
Example 3: Effect Size & Power
40:55
Section 11: Analysis of Variance
F-distributions

24m 46s

Intro
0:00
Roadmap
0:04
Roadmap
0:05
Z- & T-statistic and Their Distribution
0:34
Z- & T-statistic and Their Distribution
0:35
F-statistic
4:55
The F Ratio (the Variance Ratio)
4:56
F-distribution
12:29
F-distribution
12:30
s and p-value
15:00
s and p-value
15:01
Example 1: Why Does F-distribution Stop At 0 But Go On Until Infinity?
18:33
Example 2: F-distributions
19:29
Example 3: F-distributions and Heights
21:29
ANOVA with Independent Samples

1h 9m 25s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
The Limitations of t-tests
1:12
The Limitations of t-tests
1:13
Two Major Limitations of Many t-tests
3:26
Two Major Limitations of Many t-tests
3:27
Ronald Fisher's Solution… F-test! New Null Hypothesis
4:43
Ronald Fisher's Solution… F-test! New Null Hypothesis (Omnibus Test - One Test to Rule Them All!)
4:44
Analysis of Variance (ANOVA) Notation
7:47
Analysis of Variance (ANOVA) Notation
7:48
Partitioning (Analyzing) Variance
9:58
Total Variance
9:59
Within-group Variation
14:00
Between-group Variation
16:22
Time out: Review Variance & SS
17:05
Time out: Review Variance & SS
17:06
F-statistic
19:22
The F Ratio (the Variance Ratio)
19:23
S²bet = SSbet / dfbet
22:13
What is This?
22:14
How Many Means?
23:20
So What is the dfbet?
23:38
So What is SSbet?
24:15
S²w = SSw / dfw
26:05
What is This?
26:06
How Many Means?
27:20
So What is the dfw?
27:36
So What is SSw?
28:18
Chart of Independent Samples ANOVA
29:25
Chart of Independent Samples ANOVA
29:26
Example 1: Who Uploads More Photos: Unknown Ethnicity, Latino, Asian, Black, or White Facebook Users?
35:52
Hypotheses
35:53
Significance Level
39:40
Decision Stage
40:05
Calculate Samples' Statistic and p-Value
44:10
Reject or Fail to Reject H0
55:54
Example 2: ANOVA with Independent Samples
58:21
Repeated Measures ANOVA

1h 15m 13s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
The Limitations of t-tests
0:36
Who Uploads more Pictures and Which Photo-Type is Most Frequently Used on Facebook?
0:37
ANOVA (F-test) to the Rescue!
5:49
Omnibus Hypothesis
5:50
Analyze Variance
7:27
Independent Samples vs. Repeated Measures
9:12
Same Start
9:13
Independent Samples ANOVA
10:43
Repeated Measures ANOVA
12:00
Independent Samples ANOVA
16:00
Same Start: All the Variance Around Grand Mean
16:01
Independent Samples
16:23
Repeated Measures ANOVA
18:18
Same Start: All the Variance Around Grand Mean
18:19
Repeated Measures
18:33
Repeated Measures F-statistic
21:22
The F Ratio (The Variance Ratio)
21:23
S²bet = SSbet / dfbet
23:07
What is This?
23:08
How Many Means?
23:39
So What is the dfbet?
23:54
So What is SSbet?
24:32
S² resid = SS resid / df resid
25:46
What is This?
25:47
So What is SS resid?
26:44
So What is the df resid?
27:36
SS subj and df subj
28:11
What is This?
28:12
How Many Subject Means?
29:43
So What is df subj?
30:01
So What is SS subj?
30:09
SS total and df total
31:42
What is This?
31:43
What is the Total Number of Data Points?
32:02
So What is df total?
32:34
So What is SS total?
32:47
Chart of Repeated Measures ANOVA
33:19
Chart of Repeated Measures ANOVA: F and Between-samples Variability
33:20
Chart of Repeated Measures ANOVA: Total Variability, Within-subject (case) Variability, Residual Variability
35:50
Example 1: Which is More Prevalent on Facebook: Tagged, Uploaded, Mobile, or Profile Photos?
40:25
Hypotheses
40:26
Significance Level
41:46
Decision Stage
42:09
Calculate Samples' Statistic and p-Value
46:18
Reject or Fail to Reject H0
57:55
Example 2: Repeated Measures ANOVA
58:57
Example 3: What's the Problem with a Bunch of Tiny t-tests?
1:13:59
Section 12: Chi-square Test
Chi-Square Goodness-of-Fit Test

58m 23s

Intro
0:00
Roadmap
0:05
Roadmap
0:06
Where Does the Chi-Square Test Belong?
0:50
Where Does the Chi-Square Test Belong?
0:51
A New Twist on HT: Goodness-of-Fit
7:23
HT in General
7:24
Goodness-of-Fit HT
8:26
Hypotheses about Proportions
12:17
Null Hypothesis
12:18
Alternative Hypothesis
13:23
Example
14:38
Chi-Square Statistic
17:52
Chi-Square Statistic
17:53
Chi-Square Distributions
24:31
Chi-Square Distributions
24:32
Conditions for Chi-Square
28:58
Condition 1
28:59
Condition 2
30:20
Condition 3
30:32
Condition 4
31:47
Example 1: Chi-Square Goodness-of-Fit Test
32:23
Example 2: Chi-Square Goodness-of-Fit Test
44:34
Example 3: Which of These Statements Describe Properties of the Chi-Square Goodness-of-Fit Test?
56:06
Chi-Square Test of Homogeneity

51m 36s

Intro
0:00
Roadmap
0:09
Roadmap
0:10
Goodness-of-Fit vs. Homogeneity
1:13
Goodness-of-Fit HT
1:14
Homogeneity
2:00
Analogy
2:38
Hypotheses About Proportions
5:00
Null Hypothesis
5:01
Alternative Hypothesis
6:11
Example
6:33
Chi-Square Statistic
10:12
Same as Goodness-of-Fit Test
10:13
Set Up Data
12:28
Setting Up Data Example
12:29
Expected Frequency
16:53
Expected Frequency
16:54
Chi-Square Distributions & df
19:26
Chi-Square Distributions & df
19:27
Conditions for Test of Homogeneity
20:54
Condition 1
20:55
Condition 2
21:39
Condition 3
22:05
Condition 4
22:23
Example 1: Chi-Square Test of Homogeneity
22:52
Example 2: Chi-Square Test of Homogeneity
32:10
Section 13: Overview of Statistics
Overview of Statistics

18m 11s

Intro
0:00
Roadmap
0:07
Roadmap
0:08
The Statistical Tests (HT) We've Covered
0:28
The Statistical Tests (HT) We've Covered
0:29
Organizing the Tests We've Covered…
1:08
One Sample: Continuous DV and Categorical DV
1:09
Two Samples: Continuous DV and Categorical DV
5:41
More Than Two Samples: Continuous DV and Categorical DV
8:21
The Following Data: OK Cupid
10:10
The Following Data: OK Cupid
10:11
Example 1: Weird-MySpace-Angle Profile Photo
10:38
Example 2: Geniuses
12:30
Example 3: Promiscuous iPhone Users
13:37
Example 4: Women, Aging, and Messaging
16:07

Lecture Comments (5)

0 answers

Post by Professor Son on October 16, 2014

Sorry folks! I realize that example 1 is introducing concepts that weren't addressed in the lecture! These are concepts addressed in the next lecture (Sampling Distribution of the Mean).

1 answer

Last reply by: Professor Son
Thu Oct 16, 2014 1:05 AM

Post by Matt F on December 11, 2012

Hi Dr Ji,
If we know what the population 'looks like' why do we bother calculating sample distributions? Why not just get our statistics from the population directly and save ourselves making assumptions about what is actually happening in the population?
Thanks.

1 answer

Last reply by: Professor Son
Thu Oct 16, 2014 1:13 AM

Post by James Ulatowski on January 1, 2012

Lost me on the standard deviations of the samples. They are NOT equal to the population and it has been difficult to find information on such small population and small sample statistics, with replacement. It is logical that with replacement the population appears to be infinite or large. So, I will try looking at solution based only on small sample with large population. I did calculator simulation to get much smaller std dev. than the population.

Introduction to Sampling Distributions

Lecture Slides are screen-captured images of important points in the lecture. Students can download and print out these lecture slide images to do practice problems as well as take notes while watching the lecture.

  • Intro 0:00
  • Roadmap 0:08
    • Roadmap
  • Probability Distributions vs. Sampling Distributions 0:55
    • Probability Distributions vs. Sampling Distributions
  • Same Logic 3:55
    • Logic of Probability Distribution
    • Example: Rolling Two Dice
  • Simulating Samples 9:53
    • To Come Up with Probability Distributions
    • In Sampling Distributions
  • Connecting Sampling and Research Methods with Sampling Distributions 12:11
    • Connecting Sampling and Research Methods with Sampling Distributions
  • Simulating a Sampling Distribution 14:14
    • Experimental Design: Regular Sleep vs. Less Sleep
  • Logic of Sampling Distributions 23:08
    • Logic of Sampling Distributions
  • General Method of Simulating Sampling Distributions 25:38
    • General Method of Simulating Sampling Distributions
  • Questions that Remain 28:45
    • Questions that Remain
  • Example 1: Mean and Standard Error of Sampling Distribution 30:57
  • Example 2: What is the Best Way to Describe Sampling Distributions? 37:12
  • Example 3: Matching Sampling Distributions 38:21
  • Example 4: Mean and Standard Error of Sampling Distribution 41:51

Transcription: Introduction to Sampling Distributions

Welcome to educator.com.

We are going to be introducing the concept of sampling distributions.

Here is the roadmap. We have been talking about probability distributions, which is what we call distributions of probabilities over discrete outcomes. We are going on to sampling distributions, which is what we call distributions of outcomes that are continuous. For now we can apply the same fundamental logic from probability distributions directly to sampling distributions and treat them roughly similarly, but I do want to connect sampling and research methods, topics we have covered before, with sampling distributions. Finally, we are going to talk about actually generating sampling distributions.

First let us talk about the difference between probability and sampling distributions.

In probability distributions we are always looking at discrete outcomes, a finite or countable number of outcomes. For example, in binomial distributions, if you have 10 trials, you have 0 to 10 successes as possible outcomes, and that is 11 outcomes. They are discrete, finite, countable, no problem. In probability distributions, what we are looking for is the probabilities of those discrete outcomes. We get a list of all these probabilities, and we can actually make a list because there is a finite number of them.

Now let us talk about sampling distributions. Sampling distributions are roughly the same idea: you have the sample space, you want to know how likely each outcome is, the probability of each outcome, and the set of all those probabilities is called the sampling distribution. Here is the big difference: instead of discrete outcomes, we are talking about continuous outcomes. Before, we asked: what is the probability that 2 out of 4 random people that you pick from the United States have a bachelor's degree? Now we might pick 4 college students at random, and we are not looking for a count like 2 anymore; that is discrete. We are looking for things like: what is the average GPA? There is an infinite number of average GPAs that you could potentially get. In that way it is not finite anymore; it is an infinite, uncountable number of outcomes.

Now that is problematic, because before we had a list of all the different probabilities. Can you list all these outcomes? It is impossible; they are infinite. By definition they are not listable, not something you can put in a table. That is an issue with sampling distributions, but sampling distributions get their power from other sources, so we do not have to worry as much about that. I do want to note, though, that this is a big difference between probability distributions and sampling distributions. But still we are going to be trying to answer things like: how do we find the expected value of these distributions? How do we find the probability of some outcome?

The same overall logic is going to apply for now, so let me just go over the logic of the original probability distributions.

Basically, we used probability rules on a known population; we had known populations, things like fair coins or the roll of two fair dice, something like that. From that we generate a probability distribution: a whole bunch of different values for the random variable X, and then the probabilities of those values. Once we have that, we have a sample, and it is actually from an unknown population. We get the sample, but is it from this kind of population or from that one? We do not know. What we do is take the sample, compare it to the probability distribution, and look at whether the sample is very likely or very unlikely. From that we judge whether the known and unknown populations are similar to each other or not. Is it likely, unlikely? That is roughly the idea.

We went over a couple of specific examples; one that we went over in great detail was the one with 2 dice, where the random variable is the sum of the two dice.

The logic of sampling distributions is roughly similar. We have some known population. We generate a sampling distribution this time instead of a probability distribution; we will talk about how to generate those later. We get samples from an unknown population, we compare, and we ask: is it likely or unlikely? Same underlying logic. The differences are going to be in the steps: how we generate the distribution is going to be different, and how we judge whether a sample is likely or unlikely is also going to be a little bit different in terms of the nitty-gritty of how we actually do it, but the concept is the same. We have this known stuff, we have this unknown stuff, and we compare the unknown stuff to the known stuff.

Let us go over that example that we know really well. The known population here was 2 fair dice, and from that we generated a probability distribution. Here we have the probability distribution, where the random variable X is the sum of the 2 dice. We generated all these probabilities for each sum. We have the sums of the two dice, we have the probability for each of those sums, and each of these sums is discrete and countable; there are 11 of them.

Now let us say we rolled 2 dice and we do not know whether they are fair dice or not; some shady guy gave them to us. If we roll something like 1-1, that is our sample, and then we can ask: what is the probability that X = 2? That is a pretty small probability; it looks like .025 or something on the graph. Because it is a small probability, we will say this is an unlikely sample. Let us say instead we got a sample that was something like 3-4. We might compare that to the probability where X = 7, and that is pretty likely; there is about a 16% chance. We would say this is likely. If we got this sample, we would probably say it is likely that it came from fair dice. If we had the first sample, we might say it is less likely. It is not that we stop there and say these dice are unfair, because there is still a chance that you could get that roll; one sample is just less likely and the other more likely. We are judging them relative to each other.

We are going to do something similar for sampling distributions, but some of these steps will differ, namely how we generate the distribution and how we judge the sample.
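
As a side note (my sketch, not part of the lecture, which reads these values off the slide's graph), here is a minimal Python snippet that builds the probability distribution of the sum of two fair dice from the probability rules and looks up the two samples just discussed:

```python
from fractions import Fraction
from itertools import product

# Probability distribution of X = sum of two fair dice,
# built by enumerating the 36 equally likely outcomes.
dist = {}
for a, b in product(range(1, 7), repeat=2):
    dist[a + b] = dist.get(a + b, Fraction(0)) + Fraction(1, 36)

print(float(dist[2]))  # ~0.028 -> rolling 1-1 is an unlikely sample
print(float(dist[7]))  # ~0.167 -> a sum of 7 (e.g., 3-4) is a likely sample
```

The exact values, about 0.028 and 0.167, line up with the ".025 or something" and "about 16%" read off the graph in the lecture.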

How do we get a sampling distribution? We know how to get a probability distribution. Probability distributions are really straightforward because we use the fundamental rules of probability. Sampling distributions are different because we cannot necessarily use those probability rules.

Before, with probability distributions, we had two options. One was to use the law of large numbers and sample many times; that is the case where we do not use the probability axioms, the rules of probability. We do not use those regularities; we just sample many times and generate the probability distribution that way. You could flip a coin hundreds and hundreds of times, or you can use the probability principles to come up with the probability distribution. Sampling many times takes a lot longer, but you have to know more in order to use the rules; those rules also serve as shortcuts.

For sampling distributions you also have two different methods to come up with the distribution. One of them applies here too: we can use the law of large numbers and sample many times. That is one way we can do it. Unfortunately, we cannot use the probability rules directly; there is no way to use them here. But we can use something that we are going to learn about later, called the central limit theorem. For today, we are going to focus on using the law of large numbers to sample many, many times.
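
To make the "sample many, many times" idea concrete, here is a minimal sketch (again mine; the course does this kind of simulation in Excel, and Python is used here only for illustration) that approximates the same two-dice distribution purely by simulation, the law-of-large-numbers route:

```python
import random

random.seed(0)
n_rolls = 100_000
counts = {s: 0 for s in range(2, 13)}

# Simulate many rolls; relative frequencies approximate the true probabilities.
for _ in range(n_rolls):
    counts[random.randint(1, 6) + random.randint(1, 6)] += 1

approx = {s: c / n_rolls for s, c in counts.items()}
print(round(approx[2], 3), round(approx[7], 3))  # close to 1/36 and 6/36
```

With enough rolls the simulated frequencies settle near the exact values from the probability rules, which is why the same simulation trick can stand in when no such rules are available.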

Before we go on to the nitty-gritty of actually generating the sampling distribution, here is how sampling distributions connect with the concepts of sampling, unbiased sampling, and research methods. Remember experimental methodology versus those other methodologies. The promise before was this: if you use random sampling (and by random sampling I mean unbiased sampling), and if you use experimental design, then you can draw a conclusion about causation; we promised that was the case. In order for this to work mathematically, the sampling distribution is sort of the engine that allows this promise to come true, because here is what the sampling distribution does. Imagine repeating your unbiased sampling and great experimental methodology over and over and over again. What kind of distribution would you see? Let us say you have this great experiment that you really want to do, to see if X changes Y. What would happen if we did that experiment over and over and over again? Over time, you would see the truth. You would see what might really emerge from that experiment. That is how sampling distributions are going to help us reap the fruits of this promise. Without sampling distributions, this promise cannot fully come true.

We are going to talk about actually simulating, generating a sampling distribution.0853

This is the idea of how to we go from that no population and generate, create a sampling distribution.0867

One way you could do is you could use what's called simulation.0884

You can use computer programs typically to literally take random samples over and over and over again and create the sampling distribution.0888

An example of a data set that you might do this with is something like this.0902

Here we have an experimental design we want to know if hamsters who get normal amount of sleep versus less sleep which will experience more stress?0908

Maybe we have hypothesis that you know having less sleep leads to greater stress.0921

We are going to look at the independent variable of sleep in the regular or less sleep.0929

We will wake the hamsters up and the dependent variable that we might look at is their stress hormone levels.0935

Let us say we tested these hamsters, here are 10 hamsters in our lab and 7 of them were in the regular sleep group0943

and 3 of them were randomly chosen for the less sleep group.0959

These are resulting levels of stress hormone, but we want to know is are these less sleep hamsters,0965

are these insomnia hamsters are they more stressed then you would expect by random chance.0970

That is what we expect.0984

The known population that we can think about is something like, these are randomly selected stress hormones.0987

Randomly selected hamsters, that these 3 is it is not that they are more stress it is just that by chance you might get these numbers together.1001

We might randomly select through a computer program 3 of the entire set of hamsters, so randomly select them and generate the sampling distribution.1016

And maybe one thing we might want to do is get their means of 3 hamsters at a time.1031

Pick 3 random hamsters, get their means and put it in my sampling distribution.1037

Get 3 random hamsters, get the mean and put it in my sampling distribution over and over again until1041

we get a whole bunch of means like a dot plot of a whole bunch of means.1047

The means might range from 25 all the way to 64.1059

The mean can be greater than or equal to or greater than 64 or less than or equal to 25.1064

It has to be somewhere between.1071

We have these less sleep hamsters, sleep deprived hamsters and so maybe will calculate their mean and so that is 55 + 55 + 64 / 3 = 58.1073

Here we have x bar = 58 and we want to know: is this likely or unlikely given the sampling distribution?1118

We want to ask likely or unlikely?1129

If it is likely, then we cannot separate whether this was due to less sleep or due to chance, but if it is unlikely we might say we do not think these were just randomly selected.1138

It is not that you randomly selected these hamsters.1149

It is unlikely that you randomly selected these hamsters.1156

It is more likely that something special has been done to them, and we know what that special thing is: they have been deprived of sleep.1162

In that way you can make some conclusions.1168

You cannot necessarily say for certain whether sleep causes hormone levels to rise.1171

It is not that you looked at the mechanism and saw the sleep causing hormones.1181

It is not necessarily that, but you can say whether this is a likely sample given random processes, or you can say it is unlikely given random processes.1187

That is really all we can know from this kind of logic.1197

This kind of logic will take us far.1204

Here we go, here are all our hamsters and here is our known sample of less sleep hamsters.1207

It is those guys but you know maybe it is just that they are randomly picked.1220

In order to generate this sampling distribution of the mean, here is what we did.1233

I am just summarizing in steps what we talked about before.1243

First, what we did was take a random sample of 3 hamsters, and here we will call that n because n is the size of the sample.1249

Then what we did was compute a summary statistic; we computed the mean, but we could have computed the standard deviation.1265

We could have computed the median, some summary statistic.1281

Number 3 is the important step, repeat steps 1 and 2.1291

You do this over and over again, that is what it means to simulate a sampling distribution, and the 4th step is to examine and plot the resulting sample statistics.1303

Here these are all means, that is the mean, that is the mean, that is the mean, it is a mean of 3 that have been selected.1323

If we repeat these two steps over and over again and we plot the results, then we will see the distribution of sample statistics.1339

That is why it is called a sampling distribution.1349
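To make these four steps concrete, here is a minimal Python sketch of the simulation; it is not something demonstrated in the lecture itself. The three sleep-deprived values (55, 55, 64) come from the example above, while the other seven stress-hormone numbers are made up purely for illustration.

```python
import random

# Hypothetical stress-hormone levels for the 10 hamsters in the lab.
# Only 55, 55, 64 (the sleep-deprived group) are given in the lecture;
# the other seven values are invented for this sketch.
stress_levels = [25, 31, 38, 42, 47, 50, 52, 55, 55, 64]

n = 3                 # step 1: sample size of 3 hamsters
repetitions = 10000   # step 3: how many times to repeat steps 1 and 2

sample_means = []
for _ in range(repetitions):
    sample = random.sample(stress_levels, n)   # step 1: pick 3 hamsters at random
    sample_means.append(sum(sample) / n)       # step 2: compute the summary statistic (the mean)

# Step 4: examine/plot the resulting sample statistics (here, just their range).
print(min(sample_means), max(sample_means))
```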

Once we have a sampling distribution, then you can see it has the expected value.1357

We can find the mean of this thing.1368

It is like a middle mean.1369

It is a mean of means.1371

We can find the standard deviation of this thing.1373

That is what we mean by expected values.1376

Okay, so now let us compare these two things.1381

We know the logic of probability distributions, now let us apply the same logic to sampling distribution.1392

Here we go: the known population is 2 fair dice, and the probability distribution is the same kind of thing, it is just small.1400

Here is a likely sample, here is an unlikely sample, and we know it is likely by looking at this probability distribution.1408

We know this is unlikely by looking it up in this probability distribution.1418

Pretty straightforward.1423

What about in terms of the hamster example?1425

The known population is all the hamsters in our study; not only that, but our assumed mechanism is random selection, just like here we assume it is 2 fair dice.1429

Here we think it is all the hamsters, with 3 chosen randomly, and we can simulate that process.1444

We could choose 3 randomly and find a sampling distribution and these are all means now.1457

Not only that but we could find a likely sample, which might be something like 47.1465

If we found a mean of 3 hamsters, so if our sleep deprived hamsters had a mean of 471474

in their stress hormone levels we might say this is very similar to chance.1486

It seems like they are not that different from just picking hamsters at random1493

but we might have an unlikely sample, and here in the example I showed you, we had 58, and 58 is over here.1500

And that might show us that is really unlikely that we would choose 3 hamsters at random and get such a high mean.1510

And so in that sense, we can start thinking this is likely to have come from random generation.1522

This is less likely to have come from random generation.1531

We talked about a very specific method of simulating sampling distributions: take 3 hamsters at a time, compute their means.1535

I am going to go through the same 4 steps, but I am going to say them in more general terms.1551

Take a random sample of size n from the population, whatever your population is, whatever your n size is, your sample size.1556

Then you compute a summary statistic and this could be the mean, median, and it could be mode, it could be variance, it could be a whole bunch of different things.1565

All the summary statistics that we have talked about earlier and it could be any one of those things.1580

Then you repeat 1 and 2 many times, and that is the simulation part, where we pretend to do this many, many times.1586

And that is why it is really helpful that we have computer programs that can help us do this many,1594

many times, so that we do not have to actually draw beads or patterns by hand.1599

Finally we want to display and examine the distribution of sample statistics.1603

We will have a whole bunch of means, or a whole bunch of variances, or a whole bunch of standard deviations, or interquartile ranges, whatever you want.1609

In that way, this is the general method of simulating the sample statistics.1624
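The same four steps can be written as one general-purpose routine in which the summary statistic is just a parameter. This is only a sketch of the idea, with made-up data and a function name of my own choosing.

```python
import random
import statistics

def simulate_sampling_distribution(population, n, statistic, repetitions=10000):
    """Steps 1-4 in general form: repeatedly draw a random sample of size n
    (without replacement) and collect whichever summary statistic you choose."""
    results = []
    for _ in range(repetitions):
        sample = random.sample(population, n)   # step 1: random sample of size n
        results.append(statistic(sample))       # step 2: compute the summary statistic
    return results                              # steps 3-4: repeat, then examine/plot

# Hypothetical population (the same invented hamster values as before).
population = [25, 31, 38, 42, 47, 50, 52, 55, 55, 64]

# Any summary statistic works: mean, median, variance, standard deviation...
means = simulate_sampling_distribution(population, 3, statistics.mean)
medians = simulate_sampling_distribution(population, 3, statistics.median)
```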

Once you have that simulation, this part is going from the known population, and usually the known population is the random process,1630

the default, because it is hard to generate non-random processes but it is really easy to generate random processes.1645

And then we get our sampling distribution.1653

This part is this area here simulating.1658

That is that part.1664

Now there is another part and that is the part where we now compare 2 samples.1674

Rather, we compare the sample to the sampling distribution and we decide: likely or unlikely.1684

That is another part that we haven't talked about here.1701

In order to do this, what you need to do is compute the summary statistics for your sample.1706

If you have a whole bunch of variances, you compute the variance for your sample and then you compare it to your sampling distribution.1712

And we make a call: is it likely or unlikely?1721
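As a rough sketch of that comparison step, continuing with the invented hamster numbers from earlier: compute the observed sample mean, then see how often pure random selection produces a mean at least as extreme. The 5% cutoff used here is only an illustrative choice, not a rule the lecture has established.

```python
import random

population = [25, 31, 38, 42, 47, 50, 52, 55, 55, 64]   # hypothetical values again
means = [sum(random.sample(population, 3)) / 3 for _ in range(10000)]

observed_mean = (55 + 55 + 64) / 3   # = 58, the sleep-deprived group's mean

# How often does random selection alone give a mean at least this large?
proportion_as_extreme = sum(m >= observed_mean for m in means) / len(means)

# 0.05 is just an illustrative cutoff for "sufficiently unlikely".
print("unlikely" if proportion_as_extreme < 0.05 else "likely")
```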

Even though we know all this stuff, there are still some nagging little questions that remain.1725

In these cases we know what the population looks like, like we have a group of hamsters1732

and we know what it looks like and we are generating these random processes.1741

What happens when we have no idea what the population looks like?1747

We do not have a nice list of 10 hamsters.1753

What if we want to know things like: what happens if you randomly draw from all high school students in the US?1755

We do not have all those numbers.1761

In order to do the sampling process, you would have to have the giant list of GPAs of all high school students in the US and pull randomly from them.1763

What if you do not have that population?1777

Also, what do sampling distributions for summary statistics other than the mean look like? So far we have only looked at the one for the mean.1779

We have a sort of answer to this question, but we want to know: can we just pick any one, or which one should we use?1793

The 3rd unanswered question is how to know whether a sample is sufficiently unlikely?1802

So far we have been just eyeballing it: we look at it and say that seems unlikely; 2 percent, that seems unlikely; 5%, that seems unlikely.1811

10% that seems more likely.1821

It seems like we are just making a judgment call but how do we know whether it is truly unlikely or just our opinion?1824

Do we always have to simulate a large number of samples in order to get a sampling distribution? Because that seems like it can be really hard to do.1833

These are the questions that remain, but we just went through the intro so these will be answered later on.1846

Let us get into example 1: consider the sampling distribution of the mean, so the summary statistic is the mean, just like the hamster example.1859

Consider the sampling distribution of the mean of a random sample of size n taken from a population of size N with mean mu and standard deviation sigma.1872

So what it is saying is: just use these as constants, pretend they have been given to you.1887

If N=n what are the mean and standard error of the sampling distribution?1894

Here I should show you that there is actually a new little bit of notation that you should know.1903

The mean of the sampling distribution of means looks like this.1910

It is a mu because it is the expected value, and expected values refer to theoretical populations.1915

It is an expected value of means, so here we would put a little x bar as the subscript.1923

It is a bunch of little sample means and same thing with standard error.1930

This should also say standard deviation.1939

Standard error is the special name for the standard deviation of the sampling distribution, because that phrase tends to be long and we use the concept over and over again.1943

We just call it standard error but it is really just the standard deviation of the sampling distribution.1951

Here it is the sampling distribution of means, so we call it sigma sub x bar.1959

If our entire population is size N and we take samples of size n, where basically N = n.1966

What would be the mean and standard error?1981

Well, the mean of my sampling distribution should be the same exactly as the mean1984

of my population because basically we are sampling the entire population.1995

Think about the standard deviation; remember how we calculate standard deviation.2010

We are getting the average deviations.2017

When we get the average deviation, what ends up happening is this: when you take the mean over and over again, that mean is going to be the same every single time.2021

Before, your population had whatever standard deviation it had; now it is going to be super tiny, because you are going to have essentially no spread.2036

You are just going to get the same mean every single time.2047

This is going to be super small, maybe close to nothing, because there is no spread of means.2053
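This intuition matches a standard formula that the lecture has not yet introduced: for simple random sampling without replacement, the standard error carries a finite-population correction, which drops to zero when n = N.

```latex
\mu_{\bar{x}} = \mu,
\qquad
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}
\quad\Longrightarrow\quad
\sigma_{\bar{x}} = 0 \text{ when } n = N .
```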

What if n is very small and N is very large so our population is enormous?2065

Our sample is small.2076

What about in that case?2083

What will our mean of the sampling distribution look like?2087

All we know is that we do not know for sure that it is going to be the same.2092

That is all we can say.2101

It might be close because if you take a whole bunch of these means it might be close but maybe we are going to be less sure.2102

What about if the N is very large and n is very small?2113

What would the spread of our sampling distribution look like?2130

In this case it might be smaller but it would not be necessarily super small.2134

Maybe smaller but not necessarily super small.2142

What about mu sub x bar?2156

Here we are not sure.2161

Is it going to be similar to the mean of the population?2163

Is it going to be different?2171

Let us think about taking one out.2173

Any one single mean might be very different from the population but think about taking a whole bunch of those means and get the mean of that.2176

When you get the mean, you sort of get the middle.2189

Here we have this giant population.2192

We take all these samples out.2195

We have all these samples and then we take the middle of that.2196

It should be the middle of the population.2200

Maybe we will take an educated guess and say that might be equal to mu too.2205

We are not saying that any single one of them is going to be equal to the mu but the average of the whole bunch of those means2211

may be equal to mu, because we are always moving toward the center.2226
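That educated guess can be backed up with a short calculation (a standard result, not derived in this lecture): by linearity of expectation, the mean of the sampling distribution of the mean equals the population mean.

```latex
\mu_{\bar{x}} = E[\bar{X}]
             = E\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
             = \frac{1}{n}\sum_{i=1}^{n} E[X_i]
             = \frac{1}{n}\cdot n\mu
             = \mu .
```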

Example 2, what might be the best way to describe sampling distributions?2230

Let us think about how to describe regular old distributions.2239

Remember shape, center, and spread?2246

Maybe that will apply to sampling distributions, shape, center, and spread.2252

We could actually find its shape; its center, for example mu sub x bar, or, if you use something else like the sampling distribution of standard deviations,2262

it might be mu sub sigma; and also its spread, for example the standard deviation of x bar, sigma sub x bar.2280

Maybe that is the way we could describe sampling distributions as well.2296
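As a small sketch of describing a simulated sampling distribution by center and spread (shape you would judge from a plot), again using the invented hamster values from earlier:

```python
import random
import statistics

population = [25, 31, 38, 42, 47, 50, 52, 55, 55, 64]   # hypothetical values
means = [sum(random.sample(population, 3)) / 3 for _ in range(10000)]

center = statistics.mean(means)    # mu sub x bar: the mean of the sample means
spread = statistics.pstdev(means)  # sigma sub x bar: the standard error
print(center, spread)              # shape: plot a histogram or dot plot of `means`
```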

Example 3: 3 very small populations are given, each with a mu of 30.2300

These are truly small populations: there are only 2 items in this population, only 5 in this one, and only 3 items in this one.2311

Match these to the corresponding sampling distributions of the sample mean, n = 2.2322

Assume replacement: after you draw an item out, you put it back before drawing again.2328

Let us look at this.2338

This one it seems to go from 10 all the way to 50 so you could have a mean of 10.2341

You could also have a mean of 50.2349

That is possible here.2352

Think about drawing a 10 both times, because every time you draw a 10 you will replace it.2354

You could get a mean of 10.2364

Here you could also get a mean of 10 and here you cannot get a mean of 10.2368

There is no mean of 10 here.2374

A cannot possibly go with C, but A could possibly go with A or B.2377

I should probably call this 1, 2, 3, just so that we do not get confused.2384

Can A give a mean of 15 or 20?2390

Can you put 10 and 50 together in any way and divide by 2 to get 15 or 20, or even 25 or 35?2401

No, there is no possible way 10 and 50 can be combined together and divided by 2 to give you 15.2414

But here we can have a mean of 10 or 50, and we can have a mean of 30, which is (10 + 50) ÷ 2 = 30.2423

Those are the only 3 means you could have, so I would say A goes with that one.2437

I would say B would go with this one because in B you could have a mean of 10 or 50 and you could have all these means in between.2445

If you got 10 and 20 and average them together that would be 15.2456

This one has those possibilities, these different possibilities that this one does not have.2461

Let us move onto this one.2474

Here we know that you cannot have a mean of 10 or 50 but you can have a mean of 20, 25, 30 and because of that we know that this one goes with this one.2478

We have also used the other ones.2487

Here we see that the things you have in your population limit the kinds of means that you will see in your sampling distribution of the mean.2494

Here it is the sampling distributions of the sample mean but same idea.2506
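Example 3's matching can be checked by brute force: with n = 2 and replacement, simply enumerate every possible pair of draws from each population (taking C to be 20, 30, 40) and list the distinct means. This is only a verification sketch, not part of the lecture.

```python
from itertools import product

# The three small populations from Example 3; C is read here as 20, 30, 40.
populations = {"A": [10, 50], "B": [10, 20, 30, 40, 50], "C": [20, 30, 40]}

for name, pop in populations.items():
    # n = 2 with replacement: every ordered pair of draws is possible.
    possible_means = sorted({(a + b) / 2 for a, b in product(pop, repeat=2)})
    print(name, possible_means)

# A -> 10, 30, 50
# B -> 10, 15, 20, 25, 30, 35, 40, 45, 50
# C -> 20, 25, 30, 35, 40
```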

Example 4, here are very small populations again A, B, and C.2511

Estimate the sampling distributions' means and compare them to the populations' means.2517

Which standard deviation is smaller the population or the corresponding sampling distribution?2522

I am just going to renumber these again and let us estimate these means.2528

Here the mean looks like 30 and here the mean might be a little bit greater than 30.2534

Here we know the mean because it is 30, 30, and 30.2558

What we see is that, even though n is small because it is only 2, the sampling distribution of the mean,2567

the expected value is very similar to the actual population mean.2593

Estimate them, compare them, very similar.2601

Which standard deviation is smaller, the population standard deviation or the corresponding sampling distribution?2608

It might be helpful if we find out what the population standard deviation would look like.2619

Here we have something like A, B, and C.2629

A is 10 and 50; B is 10, 20, 30, 40, 50; and C is just 20, 30, and 40.2635

Let us find the standard deviations of these populations.2648

The reason why we use the standard deviation of the population is because we want to divide by n rather than n - 1.2658

We could just put it all in blue.2668

I want to test them to make sure I could use these blank ones so that I can just copy and paste the process.2672

The one that has the greatest spread is A.2689

The middle population here has the middle spread, and this one has the least spread.2693

Just to give you an idea, this population's standard deviation is 20, but if you estimate this sampling distribution's spread, is it less than 20 or greater?2706

If you think about a standard deviation of 20, then another 20 would be out here, and usually within 3 standard deviations you have almost 99% of the data.2735

Here what we see is that within just 1 standard deviation you already have almost everybody in there.2757

I would say this standard deviation is smaller.2766

Here mu sub x bar is the same, but sigma sub x bar is smaller than sigma.2771

What about here?2783

Here the sigma would be something like 14 and if I go about 14 that would be like that.2786

14 would be that 1, 2, 3.2808

Even when I go out about 1 standard deviation I will basically cover the entire space.2821

Here, although the mu sub x bar is similar, our standard error is smaller than our standard deviation of the population.2827

Let us use that same logic for the last one, which has a standard deviation of approximately 8, and here let us go out about 8.2843

8 would be like that.2853

Again, we see that although the mu is similar, our standard deviation of the sampling distribution, or standard error, is less than sigma.2858

One thing we find is that typically the standard deviation of the corresponding sampling distribution, what we call the standard error, is smaller than the population standard deviation.2877
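That pattern agrees with the standard with-replacement formula for the standard error, which the lecture has not derived yet: dividing each population standard deviation by the square root of n = 2 gives a smaller number every time.

```latex
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}:
\qquad
\text{A: } \frac{20}{\sqrt{2}} \approx 14.1,\qquad
\text{B: } \frac{14.1}{\sqrt{2}} \approx 10.0,\qquad
\text{C: } \frac{8.2}{\sqrt{2}} \approx 5.8 .
```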

That is your introduction to sampling distributions.2885

Thanks for using www.educator.com.2895
