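After identifying the slow SQL statements that need tuning, first inspect the execution plan with EXPLAIN, then optimize the SQL with the following methods: push more computation down to the storage layer (MySQL), add appropriate indexes, and tune the execution plan.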

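Push down more computation

PolarDB-X 1.0 pushes as much computation as possible down to the storage layer (MySQL). Pushdown reduces data transfer and the overhead at the network layer and the PolarDB-X 1.0 layer, which improves SQL execution efficiency. PolarDB-X 1.0 can push down almost all operators, including:

  • Filter conditions, such as the conditions in WHERE and HAVING.
  • Aggregation operators, such as COUNT and GROUP BY, which are computed in two phases.
  • Sort operators, such as ORDER BY.
  • JOINs and subqueries, provided that the JOIN keys on both sides are sharded in the same way, or that one side is a broadcast table.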
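
The two-phase aggregation mentioned in the list can be pictured with a short Python sketch. It is an illustration only, not PolarDB-X 1.0 code: the shard contents and the function names `partial_count` and `final_count` are hypothetical. Each shard computes its own COUNT per group, and the upper layer only sums the partial counts.

```python
from collections import Counter

# Hypothetical data: the c_nationkey values stored on three shards.
shards = [
    [3, 3, 7],
    [7, 7, 1],
    [3, 1],
]

def partial_count(rows):
    """Phase 1, run on each shard: COUNT(*) ... GROUP BY key."""
    return Counter(rows)

def final_count(partials):
    """Phase 2, run in the upper layer: add up the per-shard counts."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts, it does not overwrite
    return dict(total)

result = final_count(partial_count(rows) for rows in shards)
print(result)
```

Only one small row per group and shard crosses the network, instead of every base row.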

The following example shows how to push more computation down to MySQL to speed up execution.

> EXPLAIN select * from customer, nation where c_nationkey = n_nationkey and n_regionkey = 3;
Project(c_custkey="c_custkey", c_name="c_name", c_address="c_address", c_nationkey="c_nationkey", c_phone="c_phone", c_acctbal="c_acctbal", c_mktsegment="c_mktsegment", c_comment="c_comment", n_nationkey="n_nationkey", n_name="n_name", n_regionkey="n_regionkey", n_comment="n_comment")
  BKAJoin(condition="c_nationkey = n_nationkey", type="inner")
    Gather(concurrent=true)
      LogicalView(tables="nation", shardCount=2, sql="SELECT * FROM `nation` AS `nation` WHERE (`n_regionkey` = ?)")
    Gather(concurrent=true)
      LogicalView(tables="customer_[0-7]", shardCount=8, sql="SELECT * FROM `customer` AS `customer` WHERE (`c_nationkey` IN ('?'))")

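The execution plan contains a BKAJOIN. Each time BKAJOIN fetches a batch of rows from the left table, it builds an IN query to retrieve the matching rows from the right table, and performs the JOIN at the end. Because the left table is large, many batches are needed to complete the query, so execution is slow.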
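
The batching behavior can be sketched in Python as follows. This is a simplified model, not the PolarDB-X 1.0 implementation: the function `bka_join`, the `batch_size` parameter, and the sample rows are all hypothetical, and the `lookup_right` callback stands in for the `IN (...)` query sent to the storage layer.

```python
def bka_join(left_rows, lookup_right, left_key, right_key, batch_size=2):
    """Join two inputs by looking up right-side rows one batch at a time."""
    joined = []
    round_trips = 0
    for start in range(0, len(left_rows), batch_size):
        batch = left_rows[start:start + batch_size]
        keys = {row[left_key] for row in batch}
        # Stands for: SELECT ... FROM right WHERE right_key IN (keys)
        right_rows = lookup_right(keys)
        round_trips += 1
        # Index the fetched rows by join key, then match them to the batch.
        by_key = {}
        for r in right_rows:
            by_key.setdefault(r[right_key], []).append(r)
        for l in batch:
            for r in by_key.get(l[left_key], []):
                joined.append({**l, **r})
    return joined, round_trips

# Hypothetical tables.
nation = [{"n_nationkey": k} for k in (1, 2, 3, 4, 5)]
customer = [
    {"c_custkey": 10, "c_nationkey": 1},
    {"c_custkey": 11, "c_nationkey": 3},
    {"c_custkey": 12, "c_nationkey": 3},
]

def lookup_customer(keys):
    return [c for c in customer if c["c_nationkey"] in keys]

rows, trips = bka_join(nation, lookup_customer, "n_nationkey", "c_nationkey")
print(len(rows), "joined rows in", trips, "round trips")
```

The number of round trips grows with the size of the left input, which is why a large left table makes this plan slow.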

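The JOIN cannot be pushed down for the following reason: nation is currently sharded by its primary key n_nationkey and customer is sharded by c_custkey, whereas the JOIN key of this query on the customer side is c_nationkey. Because the shard key and the JOIN key differ, pushdown fails.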
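
The pushdown condition for a JOIN can be written as a small check. The sketch below is a deliberate simplification under stated assumptions: `TableMeta` and `join_can_push_down` are hypothetical names, customer is assumed to be sharded by c_custkey, and two sharded tables are assumed to use the same sharding function and shard count whenever their shard keys line up with the JOIN keys.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TableMeta:
    name: str
    shard_key: Optional[str] = None  # None for a broadcast table
    broadcast: bool = False

def join_can_push_down(left, right, left_join_col, right_join_col):
    """Return True if an equi-join can run locally on every shard."""
    # A broadcast table has a full copy on every shard.
    if left.broadcast or right.broadcast:
        return True
    # Otherwise both sides must be sharded by their own JOIN key
    # (assumption: same sharding function and shard count on both sides).
    return left.shard_key == left_join_col and right.shard_key == right_join_col

customer = TableMeta("customer", shard_key="c_custkey")
nation_sharded = TableMeta("nation", shard_key="n_nationkey")
nation_broadcast = TableMeta("nation", broadcast=True)

print(join_can_push_down(customer, nation_sharded, "c_nationkey", "n_nationkey"))
print(join_can_push_down(customer, nation_broadcast, "c_nationkey", "n_nationkey"))
```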

Because the nation table is small and is almost never modified, it can be rebuilt as a broadcast table:

CREATE TABLE `nation` (
  `n_nationkey` int(11) NOT NULL,
  `n_name` varchar(25) NOT NULL,
  `n_regionkey` int(11) NOT NULL,
  `n_comment` varchar(152) DEFAULT NULL,
  PRIMARY KEY (`n_nationkey`)
) BROADCAST;  -- declare as a broadcast table

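After this change, the JOIN no longer appears in the execution plan. Almost all computation is pushed down to the storage layer MySQL (inside LogicalView), and the upper layer only collects the results and returns them to the user (the Gather operator), so execution performance improves substantially.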

> EXPLAIN select * from customer, nation where c_nationkey = n_nationkey and n_regionkey = 3;

Gather(concurrent=true)
  LogicalView(tables="customer_[0-7],nation", shardCount=8, sql="SELECT * FROM `customer` AS `customer` INNER JOIN `nation` AS `nation` ON ((`nation`.`n_regionkey` = ?) AND (`customer`.`c_nationkey` = `nation`.`n_nationkey`))")

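For more information about the principles and optimization of pushdown, see Query rewriting and pushdown.

Add indexes

If a pushed-down (physical) SQL statement is slow, you can fix it by adding an index to the table shards. This is not covered in detail here.

Starting from version 5.4.1, PolarDB-X 1.0 supports global secondary indexes (GSI). Adding a GSI gives a logical table more than one sharding dimension.

The following slow SQL statement is used as an example to show how a GSI allows more operators to be pushed down.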

> EXPLAIN select o_orderkey, c_custkey, c_name from orders, customer
          where o_custkey = c_custkey and o_orderdate = '2019-11-11' and o_totalprice > 100;

Project(o_orderkey="o_orderkey", c_custkey="c_custkey", c_name="c_name")
  HashJoin(condition="o_custkey = c_custkey", type="inner")
    Gather(concurrent=true)
      LogicalView(tables="customer_[0-7]", shardCount=8, sql="SELECT `c_custkey`, `c_name` FROM `customer` AS `customer`")
    Gather(concurrent=true)
      LogicalView(tables="orders_[0-7]", shardCount=8, sql="SELECT `o_orderkey`, `o_custkey` FROM `orders` AS `orders` WHERE ((`o_orderdate` = ?) AND (`o_totalprice` > ?))")

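In this execution plan, orders is sharded by o_orderkey while customer is sharded by c_custkey. Because the sharding dimensions differ, the JOIN operator cannot be pushed down.

A very large number of orders placed on 2019-11-11 have a total price above 100, so the cross-shard JOIN is expensive. Create a GSI on the orders table so that the JOIN operator can be pushed down.

The query uses four columns of the orders table: o_orderkey, o_custkey, o_orderdate, and o_totalprice. o_orderkey and o_custkey are the shard keys of the primary table and the index table respectively, and o_orderdate and o_totalprice are included in the index as covering columns to avoid lookups back to the primary table.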
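
Whether a query can be answered from the index table alone comes down to a set comparison. The following Python sketch assumes, as the paragraph above implies, that the index table stores its own key, the covering columns, and the shard key of the primary table. The function name `needs_table_lookup` is hypothetical.

```python
def needs_table_lookup(query_columns, index_key, covering, primary_shard_key):
    """Return True if some queried column is missing from the index table."""
    available = set(index_key) | set(covering) | set(primary_shard_key)
    return not set(query_columns).issubset(available)

query_columns = ["o_orderkey", "o_custkey", "o_orderdate", "o_totalprice"]

# GSI on o_custkey with both filter columns as covering columns.
print(needs_table_lookup(
    query_columns,
    index_key=["o_custkey"],
    covering=["o_orderdate", "o_totalprice"],
    primary_shard_key=["o_orderkey"],
))

# Same GSI without covering columns: the filter columns are missing.
print(needs_table_lookup(
    query_columns,
    index_key=["o_custkey"],
    covering=[],
    primary_shard_key=["o_orderkey"],
))
```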

> create global index i_o_custkey on orders(`o_custkey`) covering(`o_orderdate`, `o_totalprice`)
        DBPARTITION BY HASH(`o_custkey`) TBPARTITION BY HASH(`o_custkey`) TBPARTITIONS 4;

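After the GSI is added and its use is forced with force index(i_o_custkey), the cross-shard JOIN becomes a local JOIN on MySQL (inside IndexScan), and the covering columns avoid the lookup back to the primary table, so query performance improves.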

> EXPLAIN select o_orderkey, c_custkey, c_name from orders force index(i_o_custkey), customer
          where o_custkey = c_custkey and o_orderdate = '2019-11-11' and o_totalprice > 100;

Gather(concurrent=true)
  IndexScan(tables="i_o_custkey_[0-7],customer_[0-7]", shardCount=8, sql="SELECT `i_o_custkey`.`o_orderkey`, `customer`.`c_custkey`, `customer`.`c_name` FROM `i_o_custkey` AS `i_o_custkey` INNER JOIN `customer` AS `customer` ON (((`i_o_custkey`.`o_orderdate` = ?) AND (`i_o_custkey`.`o_custkey` = `customer`.`c_custkey`)) AND (`i_o_custkey`.`o_totalprice` > ?))")

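For more details about using global secondary indexes, see Using global secondary indexes.

Tune the execution plan

Note: The following content applies to PolarDB-X 1.0 version 5.3.12 and later.

In most cases, the PolarDB-X 1.0 query optimizer automatically produces the best execution plan. In a few cases, however, missing or inaccurate statistics can lead to a suboptimal plan. You can then use hints to influence the optimizer so that it generates a better plan.

The following example shows how to tune an execution plan.

For the following query, the PolarDB-X 1.0 query optimizer weighs the cost of both sides of the JOIN and, as the plan shows, chooses a Hash JOIN.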

> EXPLAIN select o_orderkey, c_custkey, c_name from orders, customer
          where o_custkey = c_custkey and o_orderdate = '2019-11-15' and o_totalprice < 10;

Project(o_orderkey="o_orderkey", c_custkey="c_custkey", c_name="c_name")
  HashJoin(condition="o_custkey = c_custkey", type="inner")
    Gather(concurrent=true)
      LogicalView(tables="customer_[0-7]", shardCount=8, sql="SELECT `c_custkey`, `c_name` FROM `customer` AS `customer`")
    Gather(concurrent=true)
      LogicalView(tables="orders_[0-7]", shardCount=8, sql="SELECT `o_orderkey`, `o_custkey` FROM `orders` AS `orders` WHERE ((`o_orderdate` = ?) AND (`o_totalprice` < ?))")

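In fact, however, only a handful of orders placed on 2019-11-15 have a total price below 10, so BKAJOIN is a better choice than Hash JOIN here. (For an introduction to BKAJOIN and Hash JOIN, see Optimizing and executing JOINs and subqueries.)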
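
A back-of-the-envelope comparison shows why the number of filtered rows matters. The sketch below counts only the rows pulled from the storage layer. It is a toy model, not the optimizer's actual cost model, and the table sizes and function names are hypothetical.

```python
def rows_pulled_hash_join(build_rows, probe_rows):
    """A hash join above the shards reads both inputs in full."""
    return build_rows + probe_rows

def rows_pulled_bka_join(driving_rows, matches_per_driving_row):
    """A BKA join reads the driving side, then only the matching rows."""
    return driving_rows + driving_rows * matches_per_driving_row

# Hypothetical sizes: 1,000,000 customers, 5 orders pass the filter,
# and each order matches exactly one customer.
customers = 1_000_000
filtered_orders = 5

hash_cost = rows_pulled_hash_join(customers, filtered_orders)
bka_cost = rows_pulled_bka_join(filtered_orders, 1)
print(hash_cost, bka_cost)
```

With a highly selective filter on orders, the BKA plan touches a few rows while the hash plan still scans the whole customer table. With a weak filter the comparison reverses, because every extra batch adds a round trip.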

Use the /*+TDDL:BKA_JOIN(orders, customer)*/ hint to force the optimizer to use BKAJOIN (LookupJOIN):

> EXPLAIN /*+TDDL:BKA_JOIN(orders, customer)*/ select o_orderkey, c_custkey, c_name from orders, customer
          where o_custkey = c_custkey and o_orderdate = '2019-11-15' and o_totalprice < 10;

Project(o_orderkey="o_orderkey", c_custkey="c_custkey", c_name="c_name")
  BKAJoin(condition="o_custkey = c_custkey", type="inner")
    Gather(concurrent=true)
      LogicalView(tables="orders_[0-7]", shardCount=8, sql="SELECT `o_orderkey`, `o_custkey` FROM `orders` AS `orders` WHERE ((`o_orderdate` = ?) AND (`o_totalprice` < ?))")
    Gather(concurrent=true)
      LogicalView(tables="customer_[0-7]", shardCount=8, sql="SELECT `c_custkey`, `c_name` FROM `customer` AS `customer` WHERE (`c_custkey` IN ('?'))")

You can then run the query with the hint:

/*+TDDL:BKA_JOIN(orders, customer)*/ select o_orderkey, c_custkey, c_name from orders, customer where o_custkey = c_custkey and o_orderdate = '2019-11-15' and o_totalprice < 10;

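This speeds up the query. To make the hint take effect, you can add it to the SQL statement in your application or, more conveniently, use the plan management feature to fix the execution plan for this SQL statement, as follows: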

BASELINE FIX SQL /*+TDDL:BKA_JOIN(orders, customer)*/ select o_orderkey, c_custkey, c_name from orders, customer where o_custkey = c_custkey and o_orderdate = '2019-11-15' and o_totalprice < 10;

From then on, PolarDB-X 1.0 uses the fixed execution plan above for this SQL statement, even when the parameter values differ.
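
The idea that one fixed plan applies to a statement regardless of its parameter values can be illustrated by replacing literals with `?`, the same placeholder that appears in the EXPLAIN output. The Python sketch below is a toy: how PolarDB-X 1.0 actually matches statements to baselines is not described here, and the regular expression and the function name `parameterize` are hypothetical.

```python
import re

# Matches a single-quoted string literal or a standalone number.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")

def parameterize(sql):
    """Replace every literal with '?' so that only the statement shape remains."""
    return _LITERAL.sub("?", sql)

q1 = ("select o_orderkey from orders "
      "where o_orderdate = '2019-11-15' and o_totalprice < 10")
q2 = ("select o_orderkey from orders "
      "where o_orderdate = '2019-12-01' and o_totalprice < 25")

print(parameterize(q1))
print(parameterize(q1) == parameterize(q2))
```

Both statements reduce to the same template, so in this simplified picture they would share one plan.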

For more information about plan management, see Execution plan management.