Wednesday, July 3, 2013

32 Tips To Speed Up Your MySQL Queries

If you are interested in how to write fast MySQL queries, this article is for you.

1. Use persistent connections to the database to avoid the overhead of opening a new connection for every request.

2. Check that every table has a PRIMARY KEY on a column with high cardinality (many distinct values, so few rows match any one key value). For example, a `gender` column has low cardinality (poor selectivity), while a unique user id column has high cardinality and is a good candidate for the primary key.
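To make the cardinality point in tip 2 concrete, here is a minimal sketch. The `users` table and its columns are hypothetical, made up for illustration. The query compares how selective each column is: a value close to 1 means high cardinality, a value close to 0 means low cardinality.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE users (
    id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
    gender CHAR(1)      NOT NULL,
    PRIMARY KEY (id)          -- unique, so cardinality equals the row count
);

-- Selectivity = distinct values / total rows.
SELECT COUNT(DISTINCT id)     / COUNT(*) AS id_selectivity,
       COUNT(DISTINCT gender) / COUNT(*) AS gender_selectivity
FROM users;

-- The Cardinality column shows MySQL's own estimate for each index.
SHOW INDEX FROM users;
```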

3. Columns that reference other tables should usually be indexed, and the referencing and referenced columns should have identical data types so that joins on them are faster. Also check that fields you search on often (those that appear frequently in WHERE, ORDER BY or GROUP BY clauses) have indices, but don’t add too many: the worst thing you can do is to index every column of a table (I haven’t seen a table with more than 5 indices, even on tables 20-30 columns wide). If you never refer to a column in comparisons, there’s no need to index it.
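A small sketch of tip 3, assuming two hypothetical tables, `customers` and `orders` (the names are mine, not from the article). The referencing column has exactly the same type as the column it points to, and it carries an index. EXPLAIN is the usual way to check whether the index is actually chosen.

```sql
-- Hypothetical tables; orders.customer_id has exactly the same type
-- as customers.id (INT UNSIGNED).
CREATE TABLE customers (
    id   INT UNSIGNED NOT NULL,
    name VARCHAR(50)  NOT NULL,
    PRIMARY KEY (id)
);
CREATE TABLE orders (
    id          INT UNSIGNED NOT NULL,
    customer_id INT UNSIGNED NOT NULL,
    created_at  DATETIME     NOT NULL,
    PRIMARY KEY (id),
    KEY idx_customer_id (customer_id)   -- index on the referencing column
);

-- EXPLAIN reports which index, if any, is used for each table in the join.
EXPLAIN
SELECT c.name, o.created_at
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
WHERE c.id = 42;
```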

4. Use simpler permissions when you issue GRANT statements; this lets MySQL reduce permission-checking overhead when clients execute statements.

5. Use less RAM per row by declaring columns only as large as they need to be to hold the values stored in them.

6. Use the leftmost index prefix: in MySQL you can define an index on several columns, and the leftmost part of that index can be used as if it were a separate index, so you need fewer indices.
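Here is a rough sketch of the leftmost-prefix rule from tip 6, using a hypothetical `people` table (all names are illustrative). Running the three EXPLAIN statements should show which lookups can use the two-column index and which cannot.

```sql
-- Hypothetical table with one composite index.
CREATE TABLE people (
    last_name  VARCHAR(50) NOT NULL,
    first_name VARCHAR(50) NOT NULL,
    city       VARCHAR(50) NOT NULL,
    KEY idx_name (last_name, first_name)
);

-- Can use idx_name: filters on the leftmost column.
EXPLAIN SELECT * FROM people WHERE last_name = 'Smith';

-- Can use idx_name: filters on both indexed columns.
EXPLAIN SELECT * FROM people WHERE last_name = 'Smith' AND first_name = 'Ann';

-- Cannot use idx_name for the lookup: first_name alone is not a leftmost prefix.
EXPLAIN SELECT * FROM people WHERE first_name = 'Ann';
```

So a separate index on `last_name` alone would be redundant here, while one on `first_name` would not.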

7. When your index consists of many columns, why not create a hash column that is short, reasonably unique, and indexed? Then your query will look like:

-- Hash the literal search values (not the columns) so the index on hash_column can be used.
SELECT *
FROM `table`
WHERE hash_column = MD5( CONCAT('aaa', 'bbb') )
AND col1='aaa' AND col2='bbb';
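For completeness, a minimal sketch of the table side of tip 7. The table name `lookup_example` and the index name are hypothetical; `col1`, `col2` and `hash_column` follow the query above. This assumes the application keeps the hash in sync on every write.

```sql
-- Hypothetical table; col1/col2/hash_column follow the query above.
CREATE TABLE lookup_example (
    col1        VARCHAR(255) NOT NULL,
    col2        VARCHAR(255) NOT NULL,
    hash_column CHAR(32)     NOT NULL,   -- an MD5 hex digest is always 32 characters
    KEY idx_hash (hash_column)
);

-- The hash is computed from the same values that are stored in the row.
INSERT INTO lookup_example (col1, col2, hash_column)
VALUES ('aaa', 'bbb', MD5(CONCAT('aaa', 'bbb')));
```

The extra `col1` and `col2` conditions in the SELECT stay in place to filter out the rare rows whose hashes collide.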


8. Consider running ANALYZE TABLE (or myisamchk --analyze from the command line) on a table after it has been loaded with data, to help MySQL optimize queries better.
ANALYZE [NO_WRITE_TO_BINLOG | LOCAL] TABLE
    tbl_name [, tbl_name] ...

Result set columns:

Column      Value
Table       The table name
Op          Always analyze
Msg_type    status, error, info, note, or warning
Msg_text    An informational message

http://dev.mysql.com/doc/refman/5.0/en/analyze-table.html

9. Use the CHAR type when possible (instead of VARCHAR, BLOB or TEXT) when the values of a column have constant length: an MD5 hash (32 symbols), an ICAO or IATA airport code (4 and 3 symbols), a BIC bank code (8 symbols without the branch suffix), etc. Data in CHAR columns can be found faster than data in columns of variable-length types.

10. Don’t split a table just because it has many columns. When accessing a row, the biggest performance hit is the disk seek needed to find the first byte of the row.

11. Declare a column as NOT NULL if it never holds NULL; this speeds up table traversal a bit.

12. If you usually retrieve rows in the same order, such as expr1, expr2, ..., run ALTER TABLE ... ORDER BY expr1, expr2, ... to optimize the table.
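A minimal sketch of tip 12, assuming a hypothetical MyISAM table `articles` that is usually read in (category, title) order. Keep in mind that the table does not stay in this order after later inserts and deletes, so the statement has to be repeated from time to time.

```sql
-- Hypothetical MyISAM table that is usually read in (category, title) order.
CREATE TABLE articles (
    id       INT UNSIGNED NOT NULL,
    category VARCHAR(30)  NOT NULL,
    title    VARCHAR(100) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=MyISAM;

-- Rewrite the rows on disk in the order they are usually retrieved.
ALTER TABLE articles ORDER BY category, title;
```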

13. Don’t use a PHP loop to fetch rows from the database one by one just because you can; use IN instead, e.g.


SELECT *
FROM `table`
WHERE `id` IN (1,7,13,42);

14. Use column default values, and insert only those values that differ from the default. This reduces query parsing time.
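As a quick illustration of tip 14, here is a sketch with a hypothetical `accounts` table (names and default values are assumptions). Only the column without a default is listed in the INSERT.

```sql
-- Hypothetical table; status and login_cnt carry defaults.
CREATE TABLE accounts (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT,
    email     VARCHAR(100) NOT NULL,
    status    VARCHAR(10)  NOT NULL DEFAULT 'active',
    login_cnt INT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (id)
);

-- Only the non-default value is sent; status and login_cnt fall back to their defaults.
INSERT INTO accounts (email) VALUES ('user@example.com');
```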

15. Use INSERT DELAYED or INSERT LOW_PRIORITY (for MyISAM) to write to your change log table. Also, if the table is MyISAM, you can add the DELAY_KEY_WRITE=1 option; this makes index updates faster because they are not flushed to disk until the table is closed.
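A small sketch of tip 15, assuming a hypothetical MyISAM log table called `change_log`. It shows the LOW_PRIORITY variant only, because INSERT DELAYED is deprecated in later MySQL releases.

```sql
-- Hypothetical MyISAM log table with delayed index flushing.
CREATE TABLE change_log (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    message    VARCHAR(255) NOT NULL,
    created_at DATETIME     NOT NULL,
    PRIMARY KEY (id)
) ENGINE=MyISAM DELAY_KEY_WRITE=1;

-- The write waits until no other clients are reading from the table.
INSERT LOW_PRIORITY INTO change_log (message, created_at)
VALUES ('user 42 updated profile', NOW());
```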

16. Consider storing user session data (or any other non-critical data) in a MEMORY table; it’s very fast.
http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html
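For tip 16, a session table might be sketched like this. The table and column names are hypothetical. The contents of a MEMORY table are lost when the server restarts, which is why the tip limits it to non-critical data.

```sql
-- Hypothetical session table kept entirely in RAM.
CREATE TABLE sessions (
    session_id CHAR(32)      NOT NULL,
    user_id    INT UNSIGNED  NOT NULL,
    data       VARCHAR(1000) NOT NULL,   -- MEMORY tables cannot hold BLOB or TEXT columns
    updated_at DATETIME      NOT NULL,
    PRIMARY KEY (session_id)
) ENGINE=MEMORY;
```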

17. For your web application, images and other binary assets should normally be stored as files. That is, store only a reference to the file rather than the file itself in the database.

18. If you have to store large amounts of textual data, consider using a BLOB column to hold compressed data (MySQL’s COMPRESS() seems to be slow, so gzipping on the PHP side may help) and decompressing the contents on the application server. Either way, benchmark it.

19. If you often need to calculate COUNT or SUM over a lot of rows (article ratings, poll votes, user registration counts, etc.), it makes sense to create a separate table and update the counter in real time, which is much faster. If you need to collect statistics from huge log tables, use a summary table instead of scanning the entire log table every time.
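One possible shape for the counter table in tip 19, sketched with hypothetical tables `article_votes` and `article_vote_counts`. The application records each vote and bumps the counter in the same step.

```sql
-- Hypothetical detail table: one row per vote.
CREATE TABLE article_votes (
    article_id INT UNSIGNED NOT NULL,
    user_id    INT UNSIGNED NOT NULL,
    PRIMARY KEY (article_id, user_id)
);
-- Hypothetical counter table: one row per article.
CREATE TABLE article_vote_counts (
    article_id INT UNSIGNED NOT NULL,
    votes      INT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (article_id)
);

-- On each new vote, record it and bump the counter.
INSERT INTO article_votes (article_id, user_id) VALUES (7, 42);
INSERT INTO article_vote_counts (article_id, votes) VALUES (7, 1)
    ON DUPLICATE KEY UPDATE votes = votes + 1;

-- Reading the total is now a single primary-key lookup instead of
-- SELECT COUNT(*) FROM article_votes WHERE article_id = 7.
SELECT votes FROM article_vote_counts WHERE article_id = 7;
```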

20. Don’t use REPLACE (which is DELETE + INSERT and wastes ids): use INSERT … ON DUPLICATE KEY UPDATE instead (i.e. an INSERT, or an UPDATE if a key conflict occurs). The same technique can be used when you would otherwise first run a SELECT to find out whether the data is already in the database, and then run either INSERT or UPDATE. Why make that choice yourself? Let the database do it.
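A minimal sketch of the upsert from tip 20, assuming a hypothetical `user_settings` table with a composite primary key. One statement covers both the "row is new" and "row already exists" cases, with no SELECT beforehand.

```sql
-- Hypothetical table; the primary key is what triggers the "duplicate key" branch.
CREATE TABLE user_settings (
    user_id       INT UNSIGNED NOT NULL,
    setting_name  VARCHAR(30)  NOT NULL,
    setting_value VARCHAR(100) NOT NULL,
    PRIMARY KEY (user_id, setting_name)
);

-- Inserts the row if (42, 'theme') is new, otherwise overwrites setting_value.
INSERT INTO user_settings (user_id, setting_name, setting_value)
VALUES (42, 'theme', 'dark')
ON DUPLICATE KEY UPDATE setting_value = VALUES(setting_value);
```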

21. Tune MySQL caching: allocate enough memory for the query cache (e.g. SET GLOBAL query_cache_size = 1000000) and set query_cache_min_res_unit according to the average size of your query result sets.
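Before changing the settings in tip 21, it helps to look at what the cache is doing. The sketch below applies only to MySQL versions that still have the query cache (it was removed in MySQL 8.0), and the sizes are illustrative values, not recommendations.

```sql
-- Current query cache settings.
SHOW VARIABLES LIKE 'query_cache%';

-- Hit and memory statistics; a steadily growing Qcache_lowmem_prunes
-- suggests the cache is too small for the workload.
SHOW STATUS LIKE 'Qcache%';

-- Illustrative values only; tune them against your own workload.
SET GLOBAL query_cache_size = 16777216;       -- 16 MB
SET GLOBAL query_cache_min_res_unit = 2048;   -- smaller blocks suit small result sets
```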

22. Divide complex queries into several simpler ones: they have more chances to be cached, so they will be quicker.

23. Group several similar INSERTs into one long INSERT with multiple VALUES lists to insert several rows at a time: the query will be quicker because connecting, sending and parsing a query takes 5-7 times as long as the actual data insertion (depending on row size). If that is not possible, use START TRANSACTION and COMMIT if your database is InnoDB; otherwise use LOCK TABLES. This benefits performance because the index buffer is flushed to disk only once, after all INSERT statements have completed; in this case, unlock your tables every 1000 rows or so to allow other threads access to the table.
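The two options from tip 23 can be sketched as follows, using a hypothetical InnoDB table `events` (names are illustrative).

```sql
-- Hypothetical InnoDB table.
CREATE TABLE events (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id INT UNSIGNED NOT NULL,
    action  VARCHAR(20)  NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Option 1: one statement, several rows.
INSERT INTO events (user_id, action) VALUES
    (1, 'login'),
    (2, 'login'),
    (1, 'logout');

-- Option 2: when the rows must be separate statements,
-- wrap them in a single transaction.
START TRANSACTION;
INSERT INTO events (user_id, action) VALUES (3, 'login');
INSERT INTO events (user_id, action) VALUES (3, 'logout');
COMMIT;
```

For a MyISAM table, the second option would use LOCK TABLES events WRITE before the inserts and UNLOCK TABLES after them instead of the transaction.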

24. When loading a table from a text file, use LOAD DATA INFILE (or my tool for that); it’s 20-100 times faster than using INSERT statements.
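A rough sketch of tip 24. The file path, the CSV layout (comma-separated, one header line) and the table `imported_events` are all assumptions made for the example; the file has to be readable by the MySQL server itself.

```sql
-- Hypothetical target table.
CREATE TABLE imported_events (
    user_id INT UNSIGNED NOT NULL,
    action  VARCHAR(20)  NOT NULL
);

-- Hypothetical path; assumes a comma-separated file with a header line.
LOAD DATA INFILE '/tmp/events.csv'
INTO TABLE imported_events
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(user_id, action);
```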

25. Log slow queries in your dev/beta environment and investigate them. This way you can catch queries whose execution time is high, queries that don’t use indexes, and also slow administrative statements (like OPTIMIZE TABLE and ANALYZE TABLE).
http://dev.mysql.com/doc/refman/5.0/en/slow-query-log.html
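One way to switch the slow query log on without a restart is sketched below. It assumes MySQL 5.1 or later, where these variables can be set at runtime; on 5.0 the log is enabled with a server startup option instead. The one-second threshold is just an example value.

```sql
-- Assumes MySQL 5.1 or later, where these variables can be set at runtime.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;                   -- threshold in seconds
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Confirm that logging is on and see where the log file is written.
SHOW VARIABLES LIKE 'slow_query_log%';
```

The new long_query_time value applies to connections opened after the change, so reconnect before testing.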

26. Tune your database server parameters: for example, increase buffer sizes.

27. If your application runs lots of DELETEs, or updates dynamic-format rows of a MyISAM table (a row has dynamic format if the table has a VARCHAR, BLOB or TEXT column) to a longer total length (which may split the row), schedule an OPTIMIZE TABLE query to run every weekend via cron. This defragments the table, which makes queries faster. If you don’t use replication, add the LOCAL keyword to make it faster.
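The statement that tip 27 suggests scheduling might look like this. The table `message_archive` is hypothetical; the SHOW TABLE STATUS call is one way to see how much space is currently wasted before and after.

```sql
-- Hypothetical MyISAM table with dynamic-format rows.
CREATE TABLE message_archive (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    body TEXT         NOT NULL,
    PRIMARY KEY (id)
) ENGINE=MyISAM;

-- Data_free shows the bytes allocated but not used (fragmentation).
SHOW TABLE STATUS LIKE 'message_archive';

-- LOCAL (an alias for NO_WRITE_TO_BINLOG) keeps the statement out of the binary log.
OPTIMIZE LOCAL TABLE message_archive;
```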

28. Don’t use ORDER BY RAND() to fetch several random rows. Fetch 10-20 entries (the latest by time added or by ID) and call array_rand() on the PHP side. There are also other solutions.

29. Consider avoiding the HAVING clause; it’s rather slow.

30. In most cases, a DISTINCT clause can be considered a special case of GROUP BY, so the optimizations applicable to GROUP BY queries can also be applied to queries with a DISTINCT clause. Also, if you use DISTINCT, try to use LIMIT (MySQL stops as soon as it finds row_count unique rows) and avoid ORDER BY (it requires a temporary table in many cases).
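To illustrate tip 30, here is a short sketch with a hypothetical `visitors` table (names are illustrative). The first two queries return the same set of values; the third adds the LIMIT that the tip recommends.

```sql
-- Hypothetical table with an index on the column being de-duplicated.
CREATE TABLE visitors (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    city VARCHAR(50)  NOT NULL,
    PRIMARY KEY (id),
    KEY idx_city (city)
);

-- These two queries return the same set of cities.
SELECT DISTINCT city FROM visitors;
SELECT city FROM visitors GROUP BY city;

-- With LIMIT, MySQL can stop once it has found 5 unique values.
SELECT DISTINCT city FROM visitors LIMIT 5;
```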


31. When I read “Building Scalable Web Sites”, I found that it is sometimes worth de-normalising some tables (Flickr does this), i.e. duplicating some data in several tables to avoid JOINs, which are expensive. You can maintain data integrity with foreign keys or triggers.

32. If you want to test a specific MySQL function or expression, use the BENCHMARK function to do that.
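A minimal example of tip 32. The test string and the loop count are arbitrary choices. BENCHMARK() always returns 0, so the number to watch is the elapsed time the client reports for each statement.

```sql
-- Evaluate MD5() one million times; the result is always 0,
-- the useful figure is the elapsed time reported by the client.
SELECT BENCHMARK(1000000, MD5('some test string'));

-- Compare against another expression with the same loop count.
SELECT BENCHMARK(1000000, SHA1('some test string'));
```

The timing covers only expression evaluation on the server, so treat it as a relative comparison between expressions rather than a measure of full query cost.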


http://www.ajaxline.com/node/2099
