Translated from SimpRead - Original URL: www.cnblogs.com
The Complete Guide to MySQL Indexes: Supercharge Your Query Speed
Still struggling with slow database queries? A simple index can speed up your queries by tens or even hundreds of times! Today, I’ll walk you through the mysteries of MySQL indexing in the most beginner-friendly way possible.
From what indexes are, to how to design efficient ones, and all the way to real-world optimization techniques — this guide will turn you from a database novice into a query optimization expert!
1. What Is an Index? Why Does It Matter?
An Index Is Like a Dictionary’s Table of Contents
Imagine trying to find the word “programmer” in a 1000-page dictionary. How would you do it?
- Without a table of contents: You’d flip page by page starting from page 1 — possibly scanning 500 pages before finding it.
- With a table of contents: You go straight to the section for words starting with “程”, which points you directly to page 300 — instant success.
A database index works exactly like that table of contents — it helps us quickly locate data.
The Magic Effect of Indexes
| Scenario | No Index | With Index | Performance Gain |
|---|---|---|---|
| Querying 1 million records | Scans 1 million rows | Scans 3–4 rows | Over 250,000x faster |
| User login verification | 50ms | 1ms | 50x faster |
| Order search | 200ms | 5ms | 40x faster |
Real Example
-- Slow query without index
SELECT * FROM users WHERE email = 'john@example.com';
-- Execution time: 1.2 seconds (scanned 500,000 rows)
-- After adding index on email field
CREATE INDEX idx_email ON users(email);
SELECT * FROM users WHERE email = 'john@example.com';
-- Execution time: 0.01 seconds (directly located 1 row)
See the difference? Same query — 120 times faster!
2. Under the Hood: The Magic of B+ Trees
What Is a B+ Tree?
Don’t be intimidated by the name — B+ trees are actually quite intuitive. Think of them as an upside-down tree structure.
How B+ Tree Search Works
Let’s use a simple example:
-- Suppose we want to find user with id = 75
SELECT * FROM users WHERE id = 75;
Search Steps:
- Step 1: Start at the root node. Since 75 falls between 50 and 100, follow the middle branch.
- Step 2: Reach the leaf node and find the location of
id=75. - Step 3: Retrieve the full user record using this position.
This entire process takes only 3 disk I/O operations, while a full table scan could require tens of thousands!
Why Is B+ Tree So Fast?
| Feature | Advantage | Practical Impact |
|---|---|---|
| Multilevel balanced structure | Very low tree height | Fewer disk accesses |
| Linked leaf nodes | Supports range queries | Fast ORDER BY and pagination |
| Data stored only in leaves | Smaller internal nodes | More index data fits in memory |
3. Types of MySQL Indexes Explained
1. Primary Key Index
The primary key is a special kind of index — think of it like an ID card number.
-- Creating a primary key index
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT, -- Automatically creates PK index
name VARCHAR(50),
email VARCHAR(100)
);
-- Primary key lookup is blazing fast
SELECT * FROM users WHERE id = 12345; -- Milliseconds response
Characteristics of Primary Key Index:
- Unique and non-null
- Only one per table
- Best query performance
- Data physically stored in primary key order
2. Unique Index
-- Add unique index on email
CREATE UNIQUE INDEX idx_email ON users(email);
-- Inserting duplicate email fails
INSERT INTO users(name, email) VALUES('Zhang San', 'test@qq.com'); -- Success
INSERT INTO users(name, email) VALUES('Li Si', 'test@qq.com'); -- Fails, duplicate email
3. Normal Index (Regular Index)
Most commonly used type:
-- Add normal index on name
CREATE INDEX idx_name ON users(name);
-- Fast user lookup
SELECT * FROM users WHERE name = 'Zhang San';
4. Composite Index (Multi-column Index)
Combines multiple fields — more powerful:
-- Create composite index
CREATE INDEX idx_name_age_city ON users(name, age, city);
-- These queries can use the index:
SELECT * FROM users WHERE name = 'Zhang San'; -- ✓ Uses index
SELECT * FROM users WHERE name = 'Zhang San' AND age = 25; -- ✓ Uses index
SELECT * FROM users WHERE name = 'Zhang San' AND age = 25 AND city = 'Beijing'; -- ✓ Uses index
SELECT * FROM users WHERE age = 25; -- ✗ Cannot use index
SELECT * FROM users WHERE city = 'Beijing'; -- ✗ Cannot use index
Composite Index Rule: Leftmost Prefix Principle
-- Index: (name, age, city)
-- Effectively creates three indexes:
-- 1. (name)
-- 2. (name, age)
-- 3. (name, age, city)
4. Golden Rules for Index Design
Rule 1: Index Fields Used in WHERE Clauses
-- Frequent queries
SELECT * FROM orders WHERE user_id = 123;
SELECT * FROM orders WHERE status = 'paid';
SELECT * FROM orders WHERE create_time > '2024-01-01';
-- Should create these indexes
CREATE INDEX idx_user_id ON orders(user_id);
CREATE INDEX idx_status ON orders(status);
CREATE INDEX idx_create_time ON orders(create_time);
Rule 2: Index ORDER BY Fields
-- Frequently sorted by creation time
SELECT * FROM articles ORDER BY create_time DESC LIMIT 10;
-- Create index for lightning-fast sorting
CREATE INDEX idx_create_time ON articles(create_time);
Rule 3: Order Matters in Composite Indexes
-- Common query pattern
SELECT * FROM users WHERE city = 'Beijing' AND age > 25 ORDER BY create_time;
-- Place high-selectivity fields first
CREATE INDEX idx_city_age_create_time ON users(city, age, create_time);
Rule 4: Use Covering Indexes for Faster Queries
-- If only these fields are needed
SELECT id, name, email FROM users WHERE age = 25;
-- Create covering index — avoids going back to the table
CREATE INDEX idx_age_name_email ON users(age, name, email);
5. Practical Case Study: Optimizing an Order System
Scenario Description
Suppose we have an order table:
CREATE TABLE orders (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
user_id BIGINT NOT NULL,
order_no VARCHAR(50) NOT NULL,
status ENUM('pending', 'paid', 'shipped', 'completed', 'cancelled'),
total_amount DECIMAL(10,2),
create_time DATETIME,
update_time DATETIME
);
Common Query Scenarios & Optimization
Scenario 1: User Views Their Orders
-- Original slow query
SELECT * FROM orders WHERE user_id = 12345 ORDER BY create_time DESC;
-- Optimization
CREATE INDEX idx_user_id_create_time ON orders(user_id, create_time);
Optimization Results:
- Before: Scanned 500,000 rows, took 800ms
- After: Direct access, took just 5ms
Scenario 2: Query Orders by Status
-- Find pending payments after Jan 1, 2024
SELECT * FROM orders WHERE status = 'pending' AND create_time > '2024-01-01';
-- Optimization
CREATE INDEX idx_status_create_time ON orders(status, create_time);
Scenario 3: Exact Lookup by Order Number
-- Search by order number
SELECT * FROM orders WHERE order_no = 'ORD20240101001';
-- Optimization
CREATE UNIQUE INDEX idx_order_no ON orders(order_no);
Performance Comparison Before and After
| Query Type | Before (ms) | After (ms) | Speedup |
|---|---|---|---|
| User order query | 800 | 5 | 160x |
| Status filter | 1200 | 8 | 150x |
| Order number lookup | 600 | 2 | 300x |
6. Important Caveats: Avoid These Pitfalls
Pitfall 1: Don’t Index Small Tables
-- Bad practice: Adding index to a tiny dictionary table
CREATE TABLE dict_status (
id INT PRIMARY KEY,
name VARCHAR(20)
);
-- This table has only ~100 rows — indexing wastes space
Pitfall 2: Avoid Low-Cardinality Columns
-- Bad example: Gender has only two values (male/female)
CREATE INDEX idx_gender ON users(gender); -- Meaningless due to poor selectivity
Pitfall 3: More Indexes ≠ Better Performance
-- Bad example: Index every single column
CREATE INDEX idx_name ON users(name);
CREATE INDEX idx_age ON users(age);
CREATE INDEX idx_city ON users(city);
CREATE INDEX idx_phone ON users(phone);
CREATE INDEX idx_email ON users(email);
-- Too many indexes severely hurt INSERT/UPDATE performance
Pitfall 4: Field Order in Composite Indexes Matters
-- Wrong order
CREATE INDEX idx_age_name ON users(age, name);
SELECT * FROM users WHERE name = 'Zhang San'; -- Won't use index
-- Correct order
CREATE INDEX idx_name_age ON users(name, age);
SELECT * FROM users WHERE name = 'Zhang San'; -- Can use index
7. Advanced Index Optimization Tips
Tip 1: Use EXPLAIN to Analyze Queries
-- Check whether index is being used
EXPLAIN SELECT * FROM orders WHERE user_id = 12345;
Interpreting EXPLAIN Output:
| Field | Description | Good Value | Bad Value |
|---|---|---|---|
| type | Access type | const, eq_ref, ref | ALL, index |
| key | Index used | Specific index name | NULL |
| rows | Estimated rows scanned | As low as possible | Large number |
| Extra | Additional info | Using index | Using filesort |
Tip 2: Monitor Slow Queries```
– Enable slow query log
SET GLOBAL slow_query_log = ‘ON’;
SET GLOBAL long_query_time = 1; – Log queries taking longer than 1 second
– View slow query log
SHOW VARIABLES LIKE ‘slow_query_log_file’;
### Tip 3: Regularly Analyze Table Statistics
– Update table statistics to help the optimizer make better decisions
ANALYZE TABLE orders;
### Tip 4: Use Prefix Indexes Appropriately
– For very long string columns, use a prefix index
CREATE INDEX idx_title_prefix ON articles(title(20)); – Index only the first 20 characters
8. Advanced Index Features
--------
### 1. Functional Indexes (MySQL 8.0+)
– Create an index on a computed field
ALTER TABLE orders ADD INDEX idx_year ((YEAR(create_time)));
– This query can now use the index
SELECT * FROM orders WHERE YEAR(create_time) = 2024;
### 2. Descending Indexes (MySQL 8.0+)
– Create a descending index
CREATE INDEX idx_create_time_desc ON orders(create_time DESC);
– Faster for descending order sorting
SELECT * FROM orders ORDER BY create_time DESC LIMIT 10;
### 3. Invisible Indexes
– Create an invisible index (useful for testing)
CREATE INDEX idx_test ON orders(status) INVISIBLE;
– Make it visible after performance testing
ALTER INDEX idx_test VISIBLE;
9. Index Maintenance: Keep Indexes in Top Shape
----------------
### Regularly Check Index Usage
– View index usage statistics
SELECT
schema_name,
table_name,
index_name,
rows_selected,
rows_inserted,
rows_updated,
rows_deleted
FROM performance_schema.table_io_waits_summary_by_index_usage
WHERE schema_name = ‘your_database’;
### Remove Unused Indexes
– Find indexes that have never been used
SELECT
t.table_schema,
t.table_name,
t.index_name
FROM information_schema.statistics t
LEFT JOIN performance_schema.table_io_waits_summary_by_index_usage p
ON t.table_schema = p.object_schema
AND t.table_name = p.object_name
AND t.index_name = p.index_name
WHERE p.index_name IS NULL
AND t.table_schema = ‘your_database’
AND t.index_name != ‘PRIMARY’;
### Rebuild Fragmented Indexes
– Check index fragmentation level
SHOW TABLE STATUS WHERE name = ‘orders’;
– Rebuild the index
ALTER TABLE orders ENGINE=InnoDB;
10. Indexing Strategies in Real Projects
------------
### E-commerce System Index Design
– Products table
CREATE TABLE products (
id BIGINT PRIMARY KEY,
category_id INT,
name VARCHAR(200),
price DECIMAL(10,2),
stock INT,
status TINYINT,
create_time DATETIME,
-- Core indexes
INDEX idx_category_status_price (category_id, status, price),
INDEX idx_name (name),
INDEX idx_create_time (create_time)
);
– Orders table
CREATE TABLE orders (
id BIGINT PRIMARY KEY,
user_id BIGINT,
status TINYINT,
total_amount DECIMAL(10,2),
create_time DATETIME,
-- Core indexes
INDEX idx_user_id_create_time (user_id, create_time),
INDEX idx_status_create_time (status, create_time)
);
### Social System Index Design
– User follow table
CREATE TABLE user_follows (
id BIGINT PRIMARY KEY,
follower_id BIGINT, – Follower
following_id BIGINT, – Followed user
create_time DATETIME,
-- Core indexes
INDEX idx_follower_id (follower_id), -- Query who I'm following
INDEX idx_following_id (following_id), -- Query who follows me
UNIQUE KEY uk_follow (follower_id, following_id) -- Prevent duplicate follows
);
11. Performance Testing and Optimization Case Studies
------------
### Case 1: Optimizing User Login
**Scenario:** User login authentication
– Query before optimization
SELECT id, password_hash FROM users WHERE email = ‘user@example.com’;
– Performance test results
– Data size: 1 million users
– Query time: average 850ms
– Rows scanned: average 500,000 rows
**Optimization Plan:**
– 1. Create unique index on email
CREATE UNIQUE INDEX idx_email ON users(email);
– 2. Create covering index (to avoid table lookup)
CREATE INDEX idx_email_password ON users(email, password_hash);
**Optimization Results:**
<table><thead><tr><th>Metric</th><th>Before</th><th>After</th><th>Improvement</th></tr></thead><tbody><tr><td>Query Time</td><td>850ms</td><td>2ms</td><td>425x</td></tr><tr><td>Rows Scanned</td><td>500,000 rows</td><td>1 row</td><td>500,000x</td></tr><tr><td>CPU Usage</td><td>85%</td><td>5%</td><td>17x</td></tr></tbody></table>
### Case 2: Pagination Query Optimization
**Scenario:** Paginated product list query
– Before: Traditional pagination (very slow for deep pages)
SELECT * FROM products
WHERE category_id = 5
ORDER BY create_time DESC
LIMIT 50000, 20; – Page 2500, extremely slow
– After: Cursor-based pagination
SELECT * FROM products
WHERE category_id = 5 AND create_time < ‘2024-01-15 10:30:00’
ORDER BY create_time DESC
LIMIT 20;
**Performance Comparison:**
<table><thead><tr><th>Page</th><th>Traditional Pagination</th><th>Cursor Pagination</th><th>Performance Gain</th></tr></thead><tbody><tr><td>Page 1</td><td>5ms</td><td>3ms</td><td>1.7x</td></tr><tr><td>Page 100</td><td>50ms</td><td>3ms</td><td>16.7x</td></tr><tr><td>Page 1000</td><td>500ms</td><td>3ms</td><td>166.7x</td></tr><tr><td>Page 5000</td><td>2500ms</td><td>3ms</td><td>833.3x</td></tr></tbody></table>
12. Summary and Best Practices
----------
### Golden Rules of Index Design
**1. Basic Principles:**
* Always define a primary key index
* Create indexes on fields frequently used in WHERE clauses
* Index fields commonly used in ORDER BY clauses
* Index fields with high cardinality (distinct values)
**2. Composite Index Principles:**
* Follow the leftmost prefix principle
* Place higher-cardinality fields first
* Create composite indexes on frequently combined query fields
**3. Performance Principles:**
* More indexes are not always better
* Regularly review and remove unused indexes
* Monitor slow queries and optimize them promptly
### Common Index Misconceptions
<table><thead><tr><th>Misconception</th><th>Explanation</th><th>Best Practice</th></tr></thead><tbody><tr><td>Index every column</td><td>Wastes space and degrades write performance</td><td>Only index frequently queried columns</td></tr><tr><td>Ignore composite index order</td><td>Can cause index inefficiency or failure</td><td>Design according to the leftmost prefix rule</td></tr><tr><td>Don't monitor index usage</td><td>Leads to accumulation of unused indexes</td><td>Regularly audit and clean up unused indexes</td></tr><tr><td>Create indexes on small tables</td><td>Not worth the overhead</td><td>Avoid indexing small tables (<1,000 rows)</td></tr></tbody></table>
### Complete Index Optimization Workflow
flowchart TD
A[Identify Slow Queries] --> B[Analyze Query Patterns]
B --> C[Design Appropriate Indexes]
C --> D[Create Indexes]
D --> E[Test Performance]
E --> F{Performance Requirements Met?}
F -->| No | G[Revise Index Design]
F -->| Yes | H[Deploy to Production]
G --> C
H --> I[Continuous Monitoring]
I --> J[Regular Optimization]
Remember, index optimization is an ongoing process that requires continuous adjustment as your business evolves. A well-designed indexing strategy can improve your database performance by tens or even hundreds of times—that's the power of indexes!
With these indexing techniques, you can dramatically speed up your database queries and say goodbye to slow performance. Remember: good index design = faster queries = better user experience = more successful products!
> Want to learn more database optimization tips and real-world practices? Follow my WeChat public account **【一只划水的程序猿】** (A Swimming Programmer), where you'll find practical technical insights and down-to-earth programming tips to rapidly boost your skills! Don’t forget to like, save, and share with others who might benefit!