表初始化

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
CREATE TABLE `t2` (
`id` INT(11) NOT NULL,
`a` INT(11) DEFAULT NULL,
`b` INT(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `a` (`a`)
) ENGINE=InnoDB;

DROP PROCEDURE IF EXISTS idata;
DELIMITER ;;
CREATE PROCEDURE idata()
BEGIN
DECLARE i INT;
SET i=1;
WHILE (i <= 1000) DO
INSERT INTO t2 VALUES (i,i,i);
SET i=i+1;
END WHILE;
END;;
DELIMITER ;
CALL idata();

CREATE TABLE t1 LIKE t2;
INSERT INTO t1 (SELECT * FROM t2 WHERE id<=100);

Index Nested-Loop Join

1
2
3
4
5
6
7
8
9
10
11
-- 使用JOIN,优化器可能会选择t1或t2作为驱动表
-- 使用STRAIGHT_JOIN,使用固定的连接关系,t1为驱动表,t2为被驱动表
SELECT * FROM t1 STRAIGHT_JOIN t2 ON (t1.a=t2.a);

mysql> EXPLAIN SELECT * FROM t1 STRAIGHT_JOIN t2 ON (t1.a=t2.a);
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | ALL | a | NULL | NULL | NULL | 100 | 100.00 | Using where |
| 1 | SIMPLE | t2 | NULL | ref | a | a | 5 | test.t1.a | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+

执行过程

  1. 从t1读取一行数据R
  2. 从R中取出字段a,然后到t2去查找
  3. 取出t2中满足条件的行,与R组成一行,作为结果集的一部分
  4. 重复上面步骤,直至遍历t1完毕

扫描行数

  1. 对驱动表t1做全表扫描,需要扫描100行
  2. 对每一行R,根据字段a去t2查找,走的是树搜索过程
    • 构造的数据都是一一对应,总共扫描100行
  3. 因此,整个执行流程,总扫描行数为200行
1
2
3
4
5
# Time: 2019-03-10T11:06:13.271095Z
# User@Host: root[root] @ localhost [] Id: 8
# Query_time: 0.001391 Lock_time: 0.000135 Rows_sent: 100 Rows_examined: 200
SET timestamp=1552215973;
SELECT * FROM t1 STRAIGHT_JOIN t2 ON (t1.a=t2.a);

不使用Join

  1. 执行SELECT * FROM t1,扫描100行
  2. 循环遍历100行数据
    • 从每一行R中取出字段a的值$R.a
    • 执行SELECT * FROM t2 WHERE a=$R.a
    • 把返回的结果和R构成结果集的一行
  3. 对比Join
    • 同样扫描了200行,但总共执行了101条语句,客户端还需要自己拼接SQL语句和结果

选择驱动表

  1. 上面的查询语句,驱动表走全部扫描被驱动表走树搜索
  2. 假设被驱动表的行数为M
    • 每次在被驱动表上查一行数据,需要先搜索辅助索引a,再搜索主键索引
    • 因此,在被驱动表上查一行的时间复杂度是 $2*\log_2 M$
  3. 假设驱动表的行数为N,需要扫描驱动表N行
  4. 整个执行过程,时间复杂度为 $N + N*2*\log_2 M$
    • N对扫描行数的影响更大,因此选择小表做驱动表

Simple Nested-Loop Join

1
SELECT * FROM t1 STRAIGHT_JOIN t2 ON (t1.a=t2.b);
  1. 被驱动表t2的字段b上没有索引,因此每次到t2去做匹配的时候,都要做一次全表扫描
  2. 按照上面的算法,时间复杂度为 $N + N*M$,总扫描行数为100,100次(10W
    • 假如t1和t2都是10W行数据,那么总扫描次数为10,000,100,000次(100亿
    • 因此,MySQL本身没有使用Simple Nested-Loop Join算法

Block Nested-Loop Join

针对场景:被驱动表上没有可用的索引

join_buffer充足

执行过程

  1. 把t1的数据读入线程内存join_buffer,执行的是SELECT *,因此会把整个t1读入join_buffer
  2. 扫描t2,把t2中的每一行取出来,与join_buffer中的数据做对比
    • 如果满足join条件的行,作为结果集的一部分返回
1
2
3
4
5
6
7
8
-- 默认为256KB
-- 4194304 Bytes == 4 MB
mysql> SHOW VARIABLES LIKE '%join_buffer_size%';
+------------------+---------+
| Variable_name | Value |
+------------------+---------+
| join_buffer_size | 4194304 |
+------------------+---------+

EXPLAIN

1
2
3
4
5
6
7
8
9
10
11
12
13
mysql> EXPLAIN SELECT * FROM t1 STRAIGHT_JOIN t2 ON (t1.a=t2.b);
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | t1 | NULL | ALL | a | NULL | NULL | NULL | 100 | 100.00 | NULL |
| 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+

# Time: 2019-03-10T12:19:57.245356Z
# User@Host: root[root] @ localhost [] Id: 8
# Query_time: 0.010132 Lock_time: 0.000192 Rows_sent: 100 Rows_examined: 1100
SET timestamp=1552220397;
SELECT * FROM t1 STRAIGHT_JOIN t2 ON (t1.a=t2.b);
  1. 整个过程中,对t1和t2都做了一次全表扫描,总扫描行数为1100
  2. 由于join_buffer以无序数组的方式组织的,因此对t2的每一行数据,都需要做100次判断
    • 因此,在内存中的总判断次数为100,000次
  3. Simple Nested-Loop Join的扫描行数也是100,000次,时间复杂度是一样的
    • Block Nested-Loop Join的100,000次判断是内存操作速度会快很多
    • Simple Nested-Loop Join可能会涉及磁盘操作(全表扫描)

选择驱动表

  1. 假设小表的行数为N,大表的行数为M
  2. 两个表都要做一次全表扫描,总扫描行数为M+N
  3. 内存中的判断次数是M*N
  4. 此时,选择大表还是小表作为驱动表,_没有任何差异_

join_buffer不足

1
2
3
4
5
6
7
8
-- 放不下t1的所有数据,采取分段放的策略
SET join_buffer_size=1200;

# Time: 2019-03-10T12:30:32.194726Z
# User@Host: root[root] @ localhost [] Id: 8
# Query_time: 0.009459 Lock_time: 0.000559 Rows_sent: 100 Rows_examined: 2100
SET timestamp=1552221032;
SELECT * FROM t1 STRAIGHT_JOIN t2 ON (t1.a=t2.b);

执行过程

  1. 扫描t1,顺序读取数据行放入join_buffer,放完第88行后join_buffer满,继续第2步
  2. 扫描t2,把t2中的每一行取出来,跟join_buffer中的数据做对比
    • 如果满足join条件的行,作为结果集的一部分返回
  3. 清空join_buffer(为了复用,体现Block的核心思想)
  4. 继续扫描t1,顺序取最后12行数据加入join_buffer,继续执行第2步

性能

  1. 由于t1被分成了两次加入join_buffer,导致t2会被扫描两次,因此总扫描行数为2100
  2. 但是内存的判断次数还是不变的,依然是100,000次

选择驱动表

  1. 假设驱动表的数据行数为N,需要分K段才能完成算法流程,被驱动表的数据行数为M
    • K并非常数,N越大K越大,定义:$K=N*\lambda, \lambda \in (0,1]$
    • join_buffer_size固定且t1和2表类似的情况下,$\lambda$是常量
  2. 扫描行数为 $N + \lambda*N*M$
    • 减少N比减少M,扫描的行数会更小
    • 因此选择小表当驱动表
  3. 内存判断次数为 $N*M$(无需考虑
  4. 如果要减少$\lambda$的值,可以加大join_buffer_size的值,一次性放入的行越多,分段就越少

对比Simple Nested-Loop Join

  1. Simple Nested-Loop Join需要对被驱动表全表扫描
  2. 对被驱动表做全表扫描的时候,如果数据没有在Buffer Pool中,就需要等待这部分数据从磁盘读入
    • 从磁盘读入数据到内存,会影响正常业务的Buffer Pool命中率
      • Simple Nested-Loop Join算法天然会对被驱动表的数据做多次访问
      • 因此,更容易将这些数据页放到Buffer Pool的头部
  3. 即使被驱动表的数据都在内存中BNL算法的遍历成本更低
    • Simple Nested-Loop Join遍历的是Buffer Pool,采用链表的形式
    • BNL遍历的是join_buffer,采用数组的形式
  4. 因此BNL算法的性能会更好

小表

1
2
-- 恢复为默认值256KB
SET join_buffer_size=262144;

过滤行数

t1为驱动表

1
2
3
4
5
6
7
8
9
10
11
12
13
mysql> EXPLAIN SELECT * FROM t1 STRAIGHT_JOIN t2 ON (t1.b=t2.b) WHERE t2.id<=50;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 100 | 100.00 | NULL |
| 1 | SIMPLE | t2 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 50 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+

# Time: 2019-03-10T13:15:50.346563Z
# User@Host: root[root] @ localhost [] Id: 8
# Query_time: 0.001006 Lock_time: 0.000162 Rows_sent: 50 Rows_examined: 150
SET timestamp=1552223750;
SELECT * FROM t1 STRAIGHT_JOIN t2 ON (t1.b=t2.b) WHERE t2.id<=50;

t2为驱动表

join_buffer只需要放入t2的前50行,因此t2的前50行相对于t1的所有行来说是一个更小的表

1
2
3
4
5
6
7
8
9
10
11
12
13
mysql> EXPLAIN SELECT * FROM t2 STRAIGHT_JOIN t1 ON (t1.b=t2.b) WHERE t2.id<=50;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | t2 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 50 | 100.00 | Using where |
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 100 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+

# Time: 2019-03-10T13:18:26.656339Z
# User@Host: root[root] @ localhost [] Id: 8
# Query_time: 0.000965 Lock_time: 0.000150 Rows_sent: 50 Rows_examined: 150
SET timestamp=1552223906;
SELECT * FROM t2 STRAIGHT_JOIN t1 ON (t1.b=t2.b) WHERE t2.id<=50;

优化器选择

1
2
3
4
5
6
7
8
-- 选择t2作为驱动表
mysql> EXPLAIN SELECT * FROM t1 JOIN t2 ON (t1.b=t2.b) WHERE t2.id<=50;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | t2 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 50 | 100.00 | Using where |
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 100 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+

列数量

t1为驱动表

t1只查字段b,如果将t1放入join_buffer,只需要放入字段b的值

1
2
3
4
5
6
7
8
9
10
11
12
13
mysql> EXPLAIN SELECT t1.b,t2.* FROM t1 STRAIGHT_JOIN t2 ON (t1.b=t2.b) WHERE t2.id<=100;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 100 | 100.00 | NULL |
| 1 | SIMPLE | t2 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 100 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+

# Time: 2019-03-10T13:23:55.558748Z
# User@Host: root[root] @ localhost [] Id: 8
# Query_time: 0.002742 Lock_time: 0.000123 Rows_sent: 100 Rows_examined: 200
SET timestamp=1552224235;
SELECT t1.b,t2.* FROM t1 STRAIGHT_JOIN t2 ON (t1.b=t2.b) WHERE t2.id<=100;

t2为驱动表

t2要查所有的字段,如果将t2放入join_buffer,要放入三个字段idab,因此t1是更小的表

1
2
3
4
5
6
7
8
9
10
11
12
13
mysql> EXPLAIN SELECT t1.b,t2.* FROM t2 STRAIGHT_JOIN t1 on (t1.b=t2.b) WHERE t2.id<=100;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | t2 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 100 | 100.00 | Using where |
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 100 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+

# Time: 2019-03-10T13:24:51.561116Z
# User@Host: root[root] @ localhost [] Id: 8
# Query_time: 0.002680 Lock_time: 0.000907 Rows_sent: 100 Rows_examined: 200
SET timestamp=1552224291;
SELECT t1.b,t2.* FROM t2 STRAIGHT_JOIN t1 on (t1.b=t2.b) WHERE t2.id<=100;

优化器选择

1
2
3
4
5
6
7
8
-- 但优化器依然选择了t2作为驱动表
mysql> EXPLAIN SELECT t1.b,t2.* FROM t2 JOIN t1 on (t1.b=t2.b) WHERE t2.id<=100;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | t2 | NULL | range | PRIMARY | PRIMARY | 4 | NULL | 100 | 100.00 | Using where |
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 100 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------------------+

小结

选择驱动表时,应该是按照各自的条件过滤,然后计算参与join的各个字段的总数据量,数量量小的表,才是小表

常见问题

  1. 能否可以使用Join
    • 如果使用Index Nested-Loop Join,即用上了被驱动表上的索引,其实问题不大
    • 如果使用Block Nested-Loop Join扫描行数可能会过多尽量避免使用,通过EXPLAIN确认
  2. 选择小表还是大表作为驱动表
    • 如果使用Index Nested-Loop Join,选择小表作为驱动表
    • 如果使用Block Nested-Loop Join
      • join_buffer充足时,没有区别
      • join_buffer不足时(更常见),选择小表作为驱动表
    • 结论:选择小表做驱动表

LEFT JOIN

表初始化

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
CREATE TABLE a(f1 INT, f2 INT, INDEX(f1)) ENGINE=InnoDB;
CREATE TABLE b(f1 INT, f2 INT) ENGINE=InnoDB;
INSERT INTO a VALUES (1,1),(2,2),(3,3),(4,4),(5,5),(6,6);
INSERT INTO b VALUES (3,3),(4,4),(5,5),(6,6),(7,7),(8,8);

-- Q1
mysql> SELECT * FROM a LEFT JOIN b ON (a.f1=b.f1) AND (a.f2=b.f2);
+------+------+------+------+
| f1 | f2 | f1 | f2 |
+------+------+------+------+
| 3 | 3 | 3 | 3 |
| 4 | 4 | 4 | 4 |
| 5 | 5 | 5 | 5 |
| 6 | 6 | 6 | 6 |
| 1 | 1 | NULL | NULL |
| 2 | 2 | NULL | NULL |
+------+------+------+------+

-- Q2
mysql> SELECT * FROM a LEFT JOIN b ON (a.f1=b.f1) WHERE (a.f2=b.f2);
+------+------+------+------+
| f1 | f2 | f1 | f2 |
+------+------+------+------+
| 3 | 3 | 3 | 3 |
| 4 | 4 | 4 | 4 |
| 5 | 5 | 5 | 5 |
| 6 | 6 | 6 | 6 |
+------+------+------+------+

Q1

1
2
3
4
5
6
7
mysql> EXPLAIN SELECT * FROM a LEFT JOIN b ON (a.f1=b.f1) AND (a.f2=b.f2);
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | a | NULL | ALL | NULL | NULL | NULL | NULL | 6 | 100.00 | NULL |
| 1 | SIMPLE | b | NULL | ALL | NULL | NULL | NULL | NULL | 6 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
  1. 驱动表是表a,被驱动表是表b,与使用STRAIGHT_JOIN的效果一致
  2. 由于表b的字段f1上没有索引,所以使用的是BNL算法
    • 把表a的内容读入join_buffer
      • 因为是SELECT *,所以字段f1和字段f2都被放入到join_buffer
    • 顺序扫描表b,对于每一行数据,判断JOIN条件(a.f1=b.f1 and a.f2=b.f2)是否满足
      • 如果满足条件,作为结果集的一行返回
      • 如果语句中有WHERE字句,先判断WHERE部分满足条件后,再返回
    • 表b扫描完成后,对于没有被匹配的表a的行,把剩余字段补上NULL,再放入到结果集

Q2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
mysql> EXPLAIN SELECT * FROM a LEFT JOIN b ON (a.f1=b.f1) WHERE (a.f2=b.f2);
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| 1 | SIMPLE | b | NULL | ALL | NULL | NULL | NULL | NULL | 6 | 100.00 | Using where |
| 1 | SIMPLE | a | NULL | ref | f1 | f1 | 5 | test.b.f1 | 1 | 16.67 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS\G;
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select `test`.`a`.`f1` AS `f1`,`test`.`a`.`f2` AS `f2`,`test`.`b`.`f1` AS `f1`,`test`.`b`.`f2` AS `f2` from `test`.`a` join `test`.`b` where ((`test`.`a`.`f1` = `test`.`b`.`f1`) and (`test`.`a`.`f2` = `test`.`b`.`f2`))

  1. 驱动表是表b
  2. 如果一条JOIN语句的Extra字段什么都没写,表示使用的是NLJ算法
    • 顺序扫描表b,每一行用b.f1去表a查,匹配到记录后判断a.f2=b.f2是否满足
    • 如果满足条件的话,作为结果集的一部分返回
    • 在MySQL里,NULL跟任何值执行等值判断和不等值判断的结果都是NULL
      • SELECT NULL = NULL,返回的也是NULL
      • WHERE (a.f2=b.f2)表示查询结果里不会包含**b.f2为NULL**的行
  3. 虽然使用的是LEFT JOIN,但语义跟JOIN是一致的
    • 优化器把这条语句的LEFT JOIN改写成了JOIN,参照SHOW WARNINGS的输出
    • 因为表a的字段f1上有索引,就把表b作为驱动表,可以用上NLJ算法

小结

  1. 使用LEFT JOIN,_左边的表不一定是驱动表_
  2. 因此,如果要使用LEFT JOIN语义
    • 就不能把被驱动表的字段放在WHERE条件里面的等值判断不等值判断
    • 必须都写在ON里面

Q3 + Q4

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
-- Q3
mysql> EXPLAIN SELECT * FROM a JOIN b ON (a.f1=b.f1) AND (a.f2=b.f2);
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| 1 | SIMPLE | b | NULL | ALL | NULL | NULL | NULL | NULL | 6 | 100.00 | Using where |
| 1 | SIMPLE | a | NULL | ref | f1 | f1 | 5 | test.b.f1 | 1 | 16.67 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+

mysql> SHOW WARNINGS\G;
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select `test`.`a`.`f1` AS `f1`,`test`.`a`.`f2` AS `f2`,`test`.`b`.`f1` AS `f1`,`test`.`b`.`f2` AS `f2` from `test`.`a` join `test`.`b` where ((`test`.`a`.`f2` = `test`.`b`.`f2`) and (`test`.`a`.`f1` = `test`.`b`.`f1`))

-- Q4
mysql> EXPLAIN SELECT * FROM a JOIN b ON (a.f1=b.f1) WHERE (a.f2=b.f2);
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
| 1 | SIMPLE | b | NULL | ALL | NULL | NULL | NULL | NULL | 6 | 100.00 | Using where |
| 1 | SIMPLE | a | NULL | ref | f1 | f1 | 5 | test.b.f1 | 1 | 16.67 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-----------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

mysql> SHOW WARNINGS\G;
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select `test`.`a`.`f1` AS `f1`,`test`.`a`.`f2` AS `f2`,`test`.`b`.`f1` AS `f1`,`test`.`b`.`f2` AS `f2` from `test`.`a` join `test`.`b` where ((`test`.`a`.`f1` = `test`.`b`.`f1`) and (`test`.`a`.`f2` = `test`.`b`.`f2`))
  1. Q3和Q4都被改写成
    • SELECT * FROM a JOIN b WHERE (a.f1=b.f1) AND (a.f2=b.f2)
  2. JOIN语句:_将判断条件是否全部放在ON部分是没有区别的_

参考资料

《MySQL实战45讲》