Business Statistics (QUAN 2600 | Weber State)
Table of Contents
[[#Chapter 1: Introduction to Statistics]]
[[#Chapter 2: Data Visualization]]
[[#Chapter 3: Numerical Measurements]]
[[#Chapter 4: Probability]]
[[#Chapter 5: Discrete Probability Distributions]]
[[#Chapter 6: Continuous Probability Distributions]]
[[#Chapter 7: Sampling and Sampling Distributions]]
[[#Chapter 8: Interval Estimation]]
[[#Chapter 9: Hypothesis Testing]]
[[#Key Formulas Summary]]
Chapter 1: Introduction to Statistics
Core Concepts
Statistics = Organizing disorganized data to understand and communicate information
Types of Statistics:
- Descriptive Statistics: Summary of data (tabular, graphical, numerical)
- Statistical Inference: Using sample data to make estimates about populations
Data and Variables
Data Sources:
- Experimental Data: Randomly assigned control/treatment groups (causal relationships)
- Observational Data: Non-experimental observations (surveys, studies)
- Existing Data: Internal records, government data, public databases
Key Terms:
- Element: Entity on which data are collected (IDs)
- Variable: Characteristic of interest (columns)
- Observation: Complete set of variables for an entity (rows)
- Population: All elements of interest
- Sample: Subset of population
- Census: Data collection for entire population
Scales of Measurement
Categorical (Qualitative):
- Nominal: Labels with no order (names, colors)
- Ordinal: Labels with meaningful order (rankings, grades)
Quantitative (Numerical):
- Interval: Numbers with fixed units, no true zero (temperature in Celsius)
- Ratio: Numbers with true zero (height, weight, income)
Analytics Types
- Descriptive: What happened in the past
- Predictive: Using models to forecast future
- Prescriptive: Optimal course of action
Chapter 2: Data Visualization
Frequency Distributions
Basic Concepts:
- Frequency: Number of observations in each category
- Relative Frequency: Frequency ÷ Total observations
- Percent Frequency: Relative frequency × 100
For Quantitative Data:
- Use 5-20 classes
- Class Width = Range ÷ Number of classes (round up)
- Cumulative Frequency: ≤ upper limit of class
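The class-width rule above can be sketched in Python. This is an illustrative sketch, not part of the course materials; the `frequency_table` helper and the sample data are invented for the example.

```python
import math
from collections import Counter

def frequency_table(data, num_classes=5):
    """Frequency, relative frequency, and percent frequency by class."""
    lo, hi = min(data), max(data)
    # Class width = range / number of classes, rounded up
    width = math.ceil((hi - lo) / num_classes)
    counts = Counter(min((x - lo) // width, num_classes - 1) for x in data)
    n = len(data)
    rows = []
    for k in range(num_classes):
        freq = counts.get(k, 0)
        rel = freq / n                 # relative frequency = frequency / total
        rows.append((lo + k * width, lo + (k + 1) * width, freq, rel, rel * 100))
    return rows

data = [52, 55, 61, 64, 64, 68, 71, 73, 75, 80, 84, 90]
for low, high, freq, rel, pct in frequency_table(data):
    print(f"[{low}, {high}): freq={freq}, rel={rel:.3f}, pct={pct:.1f}%")
```

The frequencies always sum to n and the relative frequencies to 1, which is a useful sanity check when building these tables by hand.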
Charts and Graphs
Categorical Data:
- Bar Charts: Fixed width bars, can be sorted
- Side-by-Side Bar Charts: Compare two variables
- Stacked Bar Charts: Variables stacked in bars
- Pie Charts: 50% = 180°, 1% = 3.6°
Quantitative Data:
- Dot Plots: Simple summary for small datasets
- Histograms: Connected rectangles showing distribution
- Stem-and-Leaf: Shows rank order and shape simultaneously
Two Variables:
- Crosstabulations: Tabular summary of two variables
- Scatter Diagrams: Relationship between quantitative variables
- Trendlines: Approximate relationship line
Distribution Shapes
- Symmetric: Bell-shaped, mean = median
- Positive Skew: Tail extends right, mean > median
- Negative Skew: Tail extends left, mean < median
Chapter 3: Numerical Measurements
Measures of Location
Mean
$$\bar{x} = \frac{\sum x_i}{n}$$
Weighted Mean
$$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$$
Geometric Mean
$$\bar{x}_g = \sqrt[n]{(x_1)(x_2) \cdots (x_n)}$$
- Used for growth rates
- Growth factor = 1 + return percentage
Median
- Middle value when data is ordered
- If n is even: average of two middle values
Mode
- Most frequently occurring value
- Can be bimodal or multimodal
Percentiles
- Quartiles (Q): 25% increments
- Deciles (D): 10% increments
- Five-Number Summary: Min, Q₁, Q₂, Q₃, Max
Z-Score (Standardized Value)
$$z_i = \frac{x_i - \bar{x}}{s}$$
- Measures standard deviations from mean
- |z| > 3 indicates outlier
Measures of Variability
Range
$$\text{Range} = \text{Max} - \text{Min}$$
Interquartile Range (IQR)
$$IQR = Q_3 - Q_1$$
Variance
Population Variance: $$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
Sample Variance: $$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$$
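The n − 1 versus N distinction maps directly onto Python's standard library; a quick check with made-up data (the values are illustrative only):

```python
import statistics

data = [4, 7, 6, 9, 4]   # mean = 6, sum of squared deviations = 18

# Sample variance divides by n - 1; population variance divides by N
print(statistics.variance(data))    # s^2 = 18 / 4 = 4.5
print(statistics.pvariance(data))   # sigma^2 = 18 / 5 = 3.6
```

Dividing by n − 1 makes the sample variance an unbiased estimator of the population variance.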
Standard Deviation
$$s = \sqrt{s^2} \qquad \sigma = \sqrt{\sigma^2}$$
Coefficient of Variation
$$CV = \left(\frac{s}{\bar{x}} \times 100\right)\%$$
Outlier Detection Methods
Z-Score Method: |z| > 3
IQR Method:
- Lower limit = Q₁ - 1.5(IQR)
- Upper limit = Q₃ + 1.5(IQR)
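Both detection rules can be sketched in Python with the standard library. The data are invented for the example, and `statistics.quantiles` interpolates, so its quartiles can differ slightly from hand methods:

```python
import statistics

def outliers(data):
    """Flag outliers by the z-score rule (|z| > 3) and the IQR fence rule."""
    mean, s = statistics.mean(data), statistics.stdev(data)
    z_out = [x for x in data if abs((x - mean) / s) > 3]

    q1, _, q3 = statistics.quantiles(data, n=4)   # Q1, Q2, Q3
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    iqr_out = [x for x in data if x < lower or x > upper]
    return z_out, iqr_out

data = [10, 12, 13, 13, 14, 15, 15, 16, 17, 95]
z_out, iqr_out = outliers(data)
print(z_out, iqr_out)
```

In this small sample the extreme value inflates s so much that its own z-score stays below 3, while the IQR fences still flag it; that is one reason the IQR method is preferred for small datasets.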
Distribution Rules
Chebyshev's Theorem
$$\text{Proportion within } z \text{ standard deviations} \ge 1 - \frac{1}{z^2}$$
- Applies to any distribution
- z must be > 1
Empirical Rule (Bell-Shaped Distributions)
- 68.27% within 1σ
- 95.45% within 2σ
- 99.73% within 3σ
Measures of Association
Covariance
$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n-1}$$
Correlation Coefficient
$$r_{xy} = \frac{s_{xy}}{s_x s_y}$$
- Range: -1 ≤ r ≤ 1
- +1: Perfect positive linear relationship
- -1: Perfect negative linear relationship
- 0: No linear relationship
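Computed directly from the definitions, with invented data (a minimal sketch, not course material):

```python
import statistics

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = statistics.mean(x), statistics.mean(y)

# Sample covariance: sum of cross-deviations over n - 1
s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)

# Correlation rescales covariance by the two standard deviations,
# which forces it into the interval [-1, 1]
r = s_xy / (statistics.stdev(x) * statistics.stdev(y))
print(s_xy, r)   # 1.5, about 0.775 (moderately strong positive)
```

Covariance depends on the units of x and y; correlation is unit-free, which is why it is the preferred measure for comparing relationships.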
Chapter 4: Probability
Basic Concepts
Probability Scale: 0 ≤ P(E) ≤ 1
- 0 = impossible
- 0.5 = equally likely
- 1 = certain
Sample Space: All possible outcomes
Sample Point: Single outcome
Event: Collection of sample points
Counting Rules
Multi-part Experiments: $$\text{Total Outcomes} = (n_1)(n_2)...(n_k)$$
Combinations (order doesn't matter): $$C_n^r = \frac{n!}{r!(n-r)!}$$
Permutations (order matters): $$P_n^r = \frac{n!}{(n-r)!}$$
With Replacement: $$x^y$$
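Python's math module implements the counting rules directly; a quick illustration with arbitrary numbers:

```python
import math

# Combinations: choose r of n when order doesn't matter
print(math.comb(10, 3))   # 120 = 10! / (3! * 7!)

# Permutations: arrange r of n when order matters
print(math.perm(10, 3))   # 720 = 10! / 7!

# Multi-part experiment: n1 * n2 * ... * nk total outcomes
print(6 * 6)              # 36 outcomes for rolling two dice

# With replacement: x^y (e.g., 4-digit PIN from 10 digits)
print(10 ** 4)            # 10000
```

Note that permutations always exceed combinations for the same n and r (by a factor of r!), since each combination can be ordered r! ways.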
Assigning Probabilities
Methods:
- Classical: Equal probability for all outcomes
- Relative Frequency: Based on historical data
- Subjective: Based on belief/judgment
Probability Relationships
Complement
$$P(A^c) = 1 - P(A)$$
Addition Law
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
For Mutually Exclusive Events: $$P(A \cup B) = P(A) + P(B)$$
Conditional Probability
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
Multiplication Law
$$P(A \cap B) = P(B) \cdot P(A \mid B)$$
For Independent Events: $$P(A \cap B) = P(A) \cdot P(B)$$
Chapter 5: Discrete Probability Distributions
Random Variables
Discrete Random Variable: Countable outcomes
- Can be finite or infinite
Probability Distribution f(x):
- f(x) ≥ 0 for all x
- Σf(x) = 1
Expected Value and Variance
Expected Value (Mean)
$$E(x) = \mu = \sum x \cdot f(x)$$
Variance
$$\text{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$$
Standard Deviation
$$\sigma = \sqrt{\sigma^2}$$
Bivariate Distributions
Linear Combination
$$E(ax + by) = aE(x) + bE(y)$$
Combined Variance
$$\text{Var}(ax + by) = a^2\text{Var}(x) + b^2\text{Var}(y) + 2ab\,\sigma_{xy}$$
Correlation Coefficient
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
Binomial Distribution
Properties:
- n identical trials
- Two outcomes per trial (success/failure)
- Constant probability p
- Independent trials
Binomial Probability
$$f(x) = \frac{n!}{x!(n-x)!} \, p^x (1-p)^{n-x}$$
Where:
- x = number of successes
- n = number of trials
- p = probability of success
Binomial Expected Value
$$E(x) = \mu = np$$
Binomial Variance
$$\text{Var}(x) = \sigma^2 = np(1-p)$$
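The binomial formulas translate directly into a few lines of Python (the `binom_pmf` helper and the numbers are illustrative, not course material):

```python
import math

def binom_pmf(x, n, p):
    """P(exactly x successes in n independent trials, success prob p)."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3
mean = n * p              # E(x) = np
var = n * p * (1 - p)     # Var(x) = np(1-p)

print(binom_pmf(3, n, p))   # P(x = 3) ≈ 0.2668
print(mean, var)            # 3.0  2.1
```

Summing the pmf over x = 0..n gives 1, as required of any probability distribution.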
Chapter 6: Continuous Probability Distributions
Continuous Distributions
Key Concepts:
- Probability Density Function: Total area = 1
- Probability = area under curve
- P(X = specific value) = 0
Uniform Distribution
Probability Density Function
$$f(x) = \frac{1}{b-a} \quad \text{for } a \le x \le b$$
Expected Value
$$E(x) = \frac{a+b}{2}$$
Variance
$$\text{Var}(x) = \frac{(b-a)^2}{12}$$
Normal Distribution
Probability Density Function
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Properties:
- μ = mean (also median and mode)
- σ = standard deviation (controls width)
- Symmetric, bell-shaped
Standard Normal Distribution
- μ = 0, σ = 1
- Denoted as Z
Converting to Standard Normal
$$z = \frac{x - \mu}{\sigma}$$
Empirical Rule
- 68.27% within 1σ
- 95.45% within 2σ
- 99.73% within 3σ
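Both the empirical rule and the z-conversion can be checked with `statistics.NormalDist` from the Python standard library (the μ = 100, σ = 15 example is invented):

```python
from statistics import NormalDist

z = NormalDist()   # standard normal: mu = 0, sigma = 1

# Empirical rule: P(-k < Z < k) for k = 1, 2, 3
for k in (1, 2, 3):
    pct = (z.cdf(k) - z.cdf(-k)) * 100
    print(f"within {k} sigma: {pct:.2f}%")   # 68.27%, 95.45%, 99.73%

# Converting a raw score to standard normal: z = (x - mu) / sigma
mu, sigma, x = 100, 15, 130
print((x - mu) / sigma)   # 2.0, i.e. two standard deviations above the mean
```

The same `cdf` call replaces a printed z-table for any probability lookup.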
Chapter 7: Sampling and Sampling Distributions
Sampling Concepts
Key Terms:
- Element: Entity on which data are collected
- Population (N): All elements of interest
- Sample (n): Subset of population
- Frame: List of elements for sampling
- Simple Random Sample: Each element has equal selection chance
Population Types:
- Finite: Can count all elements
- Infinite: Cannot count all elements (treat as infinite if n/N ≤ 0.05)
Point Estimation
Point Estimators:
- $\bar{x}$ estimates μ (population mean)
- s estimates σ (population standard deviation)
- $\bar{p}$ estimates p (population proportion)
Sample Mean
$$\bar{x} = \frac{\sum x_i}{n}$$
Sample Standard Deviation
$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$$
Sample Proportion
$$\bar{p} = \frac{x}{n}$$
Sampling Distribution of $\bar{x}$
Properties:
- E($\bar{x}$) = μ (unbiased estimator)
- When population is normal: $\bar{x}$ is normal for any n
- When population is not normal: $\bar{x}$ is approximately normal for large n (Central Limit Theorem)
Standard Error of Mean
Infinite Population: $$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Finite Population: $$\sigma_{\bar{x}} = \sqrt{\frac{N-n}{N-1}} \cdot \frac{\sigma}{\sqrt{n}}$$
Central Limit Theorem
- For large n (≥ 30), $\bar{x}$ is approximately normal
- For highly skewed data, use n ≥ 50
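The theorem is easy to see by simulation. A sketch using a highly skewed exponential population (chosen for illustration; any population with finite variance works):

```python
import random
import statistics

random.seed(1)

# Exponential population with rate 1: mu = 1, sigma = 1, strongly right-skewed
n = 50            # larger n, since the population is highly skewed
num_samples = 2000

# Distribution of the sample mean x-bar across many repeated samples
sample_means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
                for _ in range(num_samples)]

# x-bar centers on mu with standard error sigma / sqrt(n)
print(statistics.mean(sample_means))    # close to 1.0
print(statistics.stdev(sample_means))   # close to 1/sqrt(50) ≈ 0.141
```

A histogram of `sample_means` would look approximately bell-shaped even though the population itself is far from normal; that is the content of the theorem.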
Sampling Distribution of $\bar{p}$
Properties:
- E($\bar{p}$) = p (unbiased estimator)
- Approximately normal when np ≥ 5 and n(1-p) ≥ 5
Standard Error of Proportion
Infinite Population: $$\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}}$$
Finite Population: $$\sigma_{\bar{p}} = \sqrt{\frac{N-n}{N-1}} \cdot \sqrt{\frac{p(1-p)}{n}}$$
Chapter 8: Interval Estimation
Confidence Intervals
General Form: Point Estimator ± Margin of Error
Confidence Level (1-α):
- 90%: z₀.₀₅ = 1.645
- 95%: z₀.₀₂₅ = 1.96
- 99%: z₀.₀₀₅ = 2.576
Population Mean (σ Known)
Confidence Interval
$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
Margin of Error
$$E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$
Population Mean (σ Unknown)
Confidence Interval
$$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$$
Use t-distribution with df = n-1
Sample Size Determination
For Mean
$$n = \frac{(z_{\alpha/2})^2 \sigma^2}{E^2}$$
For Proportion
$$n = \frac{(z_{\alpha/2})^2 \, p^*(1-p^*)}{E^2}$$
Where p* is the planning value (use 0.5 if unknown, which yields the largest required sample size)
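A minimal Python sketch of both formulas, rounding up since n must be a whole number (the helper names and the example values are invented):

```python
import math

def n_for_mean(z, sigma, E):
    """Smallest n so the margin of error for the mean is at most E."""
    return math.ceil((z * sigma / E) ** 2)

def n_for_proportion(z, E, p_star=0.5):
    """p* = 0.5 is the conservative planning value (largest sample size)."""
    return math.ceil(z**2 * p_star * (1 - p_star) / E**2)

# 95% confidence (z = 1.96), sigma = 20, margin of error 5
print(n_for_mean(1.96, sigma=20, E=5))       # 62

# 95% confidence, margin of error 3 percentage points, p* = 0.5
print(n_for_proportion(1.96, E=0.03))        # 1068
```

Always round up (never to the nearest integer): rounding down would let the margin of error exceed E.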
Population Proportion
Confidence Interval
$$\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$
Margin of Error
$$E = z_{\alpha/2} \sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$$
Requirements: $n\bar{p} \ge 5$ and $n(1-\bar{p}) \ge 5$
Chapter 9: Hypothesis Testing
Hypothesis Structure
Null Hypothesis (H₀): Tentative assumption
- Contains =, ≤, or ≥
- Assumed true until evidence suggests otherwise
Alternative Hypothesis (Hₐ): Deviation from assumption
- Contains ≠, <, or >
- What we're trying to prove
Types of Errors
Type I Error (α):
- Rejecting H₀ when it's true
- Level of significance
- Can be controlled
Type II Error (β):
- Failing to reject H₀ when it's false
- Difficult to control
Hypothesis Testing Steps
- State Hypotheses (H₀ and Hₐ)
- Choose Significance Level (α)
- Calculate Test Statistic
- Find P-value
- Make Decision (Compare p-value to α)
Decision Rule:
- If p-value < α: Reject H₀
- If p-value ≥ α: Do not reject H₀
Test Statistics
Population Mean (σ Known)
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
P-value Calculation
One-tailed test: P(Z > z) or P(Z < z)
Two-tailed test: 2 × P(Z > |z|)
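The test statistic and p-value calculation can be sketched with `statistics.NormalDist`; the `z_test` helper and the sample numbers are invented for the example:

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, tails=2):
    """Test statistic and p-value for a mean test with sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    tail_area = 1 - NormalDist().cdf(abs(z))   # P(Z > |z|)
    return z, tails * tail_area

# H0: mu = 50 vs Ha: mu != 50, with sigma = 8, n = 64, x-bar = 52
z, p = z_test(x_bar=52, mu0=50, sigma=8, n=64, tails=2)
print(z, p)   # z = 2.0, p ≈ 0.0455

alpha = 0.05
print("reject H0" if p < alpha else "do not reject H0")   # reject H0
```

Since p = 0.0455 < α = 0.05, H₀ is rejected at the 5% level; at α = 0.01 the same sample would not reject it, which shows why α must be chosen before looking at the data.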
Test Types
- One-tailed: H₀ contains ≤ or ≥, Hₐ contains < or >
- Two-tailed: H₀ contains =, Hₐ contains ≠
Key Formulas Summary
Descriptive Statistics
| Measure | Formula |
|---|---|
| Sample Mean | $\bar{x} = \frac{\sum x_i}{n}$ |
| Sample Variance | $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$ |
| Sample Standard Deviation | $s = \sqrt{s^2}$ |
| Z-Score | $z_i = \frac{x_i - \bar{x}}{s}$ |
| Correlation | $r_{xy} = \frac{s_{xy}}{s_x s_y}$ |
Probability
| Concept | Formula |
|---|---|
| Combinations | $C_n^r = \frac{n!}{r!(n-r)!}$ |
| Permutations | $P_n^r = \frac{n!}{(n-r)!}$ |
| Addition Law | $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ |
| Conditional | $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ |
| Binomial | $f(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{n-x}$ |
Sampling Distributions
| Distribution | Standard Error |
|---|---|
| Mean (Infinite) | $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ |
| Mean (Finite) | $\sigma_{\bar{x}} = \sqrt{\frac{N-n}{N-1}} \cdot \frac{\sigma}{\sqrt{n}}$ |
| Proportion (Infinite) | $\sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}}$ |
| Proportion (Finite) | $\sigma_{\bar{p}} = \sqrt{\frac{N-n}{N-1}} \cdot \sqrt{\frac{p(1-p)}{n}}$ |
Confidence Intervals
| Parameter | Confidence Interval |
|---|---|
| Mean (σ known) | $\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$ |
| Mean (σ unknown) | $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}$ |
| Proportion | $\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$ |
Sample Size
| Parameter | Sample Size Formula |
|---|---|
| Mean | $n = \frac{(z_{\alpha/2})^2 \sigma^2}{E^2}$ |
| Proportion | $n = \frac{(z_{\alpha/2})^2 p^*(1-p^*)}{E^2}$ |
Hypothesis Testing
| Test | Test Statistic |
|---|---|
| Mean (σ known) | $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ |
| Mean (σ unknown) | $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ |
Common Z-Values
- 90% Confidence: z₀.₀₅ = 1.645
- 95% Confidence: z₀.₀₂₅ = 1.96
- 99% Confidence: z₀.₀₀₅ = 2.576
Important Notes
When to Use Z vs T
- Use Z when: σ is known, or n ≥ 30 with s
- Use T when: σ is unknown and n < 30
- Degrees of freedom: df = n - 1
Normal Distribution Conditions
- Sampling distribution of $\bar{x}$: Normal population OR n ≥ 30 (CLT)
- Sampling distribution of $\bar{p}$: np ≥ 5 AND n(1-p) ≥ 5
- Finite population correction: Use when n/N > 0.05
Key Concepts to Remember
- Unbiased estimator: E(estimator) = parameter
- Central Limit Theorem: n ≥ 30 for normal approximation
- Type I error: α = P(reject H₀ | H₀ true)
- P-value: Probability of observing test statistic or more extreme
- Confidence level: 1 - α