
Optimizing MySQL Application Queries

From performance logs to actionable solutions, made easy with a little help from AI.


For a long time, I tried to avoid the topic of database optimization. It felt like something somebody else could learn for me, sparing me yet another skill and learning curve; a task best delegated to more specialized employees or coworkers. After all, I am a system architect and designer, a programmer, not a system administrator (a job which I personally enjoy). Some attempts at delegating were more successful than others, but for years I never felt I had someone I could count on for this.

The last person I worked with on this project did a great job and left some very useful procedures that I based this article on.

I insisted on staying away from it, that is, until this past week. The inevitable happened, and I was forced to face it head-on. A reseller informed us that their applications were running far too slowly. I had no idea where the problem could be, but I suspected it lay somewhere deep within the database or (hopefully) in the server itself. I had no choice but to learn.

What I discovered was that it was far easier than I imagined. And, to be honest, a lot of that is thanks to generative AI, which has changed everything, here too.

Let's dive in.

The Tools

My newfound database optimization arsenal consists of just a few key tools. Together, they form a workflow that takes you from a vague sense of a problem to a precise, actionable solution.

The Workflow

1. The Quick Audit: mysqltuner

mysqltuner is a Perl script that acts as your friendly MySQL performance auditor. You run it from the command line, and it connects to your MySQL instance to give you a quick report card. It analyzes everything from your buffer settings and thread cache hits to your key-read efficiency and a host of other metrics. It's not a diagnostic tool for a specific query, but rather a "general health check" that gives you a high-level overview of where your configuration is weak.

💡
The important message here is that any optimization you make to MySQL itself benefits all applications on the server, so we need to read the output of mysqltuner carefully and implement every recommendation we can.
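If you have never run it, a typical session looks something like this (a sketch: the download URL is the one the MySQLTuner-perl project publishes, and many distributions also ship it as a mysqltuner package; the credential flags are assumptions to adapt to your setup):

```shell
# Fetch the latest script from the MySQLTuner-perl project
# (alternatively: sudo apt install mysqltuner)
wget -O mysqltuner.pl https://raw.githubusercontent.com/major/MySQLTuner-perl/master/mysqltuner.pl

# Run it against the local server, keeping a dated copy of the report
# so you can compare results after each round of tuning
perl mysqltuner.pl --user root --pass 'yourpassword' > tuner-report-$(date +%F).txt
```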

In the general case of Evolutivo, and in particular in the case I was debugging this week, the (edited) output at the start was:

[!!] /var/log/mysql/error.log contains 48 warning(s).
[OK] /var/log/mysql/error.log does not contain any error.
[OK] Maximum reached memory usage: 26.8G (45.55% of installed RAM)
[OK] Maximum possible memory usage: 28.1G (47.72% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 0% (22/36M)
[!!] Highest connection usage: 89%  (135/151)
[OK] Aborted connections: 0.00%  (4/176256)
[!!] name resolution is active : a reverse name resolution is made for each new connection and can reduce performance
[OK] Sorts requiring temporary tables: 0% (193 temp sorts / 1M sorts)
[!!] Joins performed without indexes: 50789
[OK] Temporary tables created on disk: 0% (121 on disk / 1M total)
[OK] Thread cache hit rate: 99% (1K created / 176K connections)
[!!] Table cache hit rate: 0% (2K open / 1M opened)
[OK] Open file limit used: 2% (218/10K)
[OK] Table locks acquired immediately: 99% (712K immediate / 713K locks)
[!!] Binlog cache memory access: 83.85% (225656 Memory / 269115 Total)
[OK] InnoDB File per table is activated
[OK] InnoDB buffer pool / data size: 16.0G/4.7G
[OK] Ratio InnoDB log file size / InnoDB Buffer pool size: 2.0G * 2/16.0G should be equal to 25%
[OK] InnoDB buffer pool instances: 16
[--] Number of InnoDB Buffer Pool Chunk : 128 for 16 Buffer Pool Instance(s)
[OK] Innodb_buffer_pool_size aligned with Innodb_buffer_pool_chunk_size & Innodb_buffer_pool_instances
[OK] InnoDB Read buffer efficiency: 99.99% (3807017853 hits/ 3807246008 total)
[!!] InnoDB Write Log efficiency: 60.97% (2079430 hits/ 3410771 total)
[OK] InnoDB log waits: 0.00% (0 waits / 1331341 writes)
-------- Recommendations ---------------------------------------------------------------------------
General recommendations:
    Control warning line(s) into /var/log/mysql/error.log file
    Reduce or eliminate persistent connections to reduce connection usage
    Configure your accounts with ip or subnets only, then update your configuration with skip-name-resolve=1
    Adjust your join queries to always utilize indexes
    Increase table_open_cache gradually to avoid file descriptor limits
    Read this before increasing table_open_cache over 64: https://bit.ly/1mi7c4C
    Read this before increasing for MariaDB https://mariadb.com/kb/en/library/optimizing-table_open_cache/
    This is MyISAM only table_cache scalability problem, InnoDB not affected.
    See more details here: https://bugs.mysql.com/bug.php?id=49177
    This bug already fixed in MySQL 5.7.9 and newer MySQL versions.
    Beware that open_files_limit (10000) variable 
    should be greater than table_open_cache (2000)
    Increase binlog_cache_size (Actual value: 32768)
Variables to adjust:
    max_connections (> 151)
    wait_timeout (< 28800)
    interactive_timeout (< 28800)
    join_buffer_size (> 80.0M, or always use indexes with JOINs)
    table_open_cache (> 2000)
    binlog_cache_size (16.0M)

Reading through the procedures we have implemented as a company in the past, together with the output above, the settings that stand out are:

  • table_definition_cache: When a client establishes a connection to a MySQL server, the server maintains a cache of table definitions in memory so that it can quickly look up metadata about tables that are used in queries. The table_definition_cache variable controls the size of this cache. Evolutivo has a LOT of tables, so we need to increase this size to avoid having to go to the disk for the information.

  • max_connections: on a server with many installations, including other applications like WordPress, PrestaShop, Nextcloud, … which all use the same MySQL server, this value has to be high

  • innodb_redo_log_capacity: The innodb_redo_log_capacity setting controls the total size of the redo log files. A larger capacity allows MySQL to defer flushing changes from memory to disk, which can significantly improve write performance, especially during high-volume operations.

  • skip-name-resolve: this skips the reverse DNS lookup made on every new connection, but be careful: once it is enabled, MySQL accounts must be defined with IP addresses or subnets only, since hostname-based grants will no longer match.
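Put together, the relevant fragment of my mysqld.cnf ended up looking roughly like this (the numbers are illustrative examples, not recommendations; the right values depend on your RAM, table count, and workload, and innodb_redo_log_capacity only exists in MySQL 8.0.30 and newer):

```ini
[mysqld]
# Evolutivo has a LOT of tables; keep their definitions cached in memory
table_definition_cache   = 10000

# Many applications (WordPress, PrestaShop, Nextcloud, ...) share this server
max_connections          = 500

# A larger redo log lets InnoDB defer flushing, helping write-heavy workloads
innodb_redo_log_capacity = 4G

# Skip the reverse DNS lookup on each new connection;
# accounts must then be defined with IPs or subnets only
skip-name-resolve
```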

After setting these variables (and some other work), I ended up with this output:

[OK] Maximum reached memory usage: 17.3G (29.46% of installed RAM)
[OK] Maximum possible memory usage: 36.1G (61.29% of installed RAM)
[OK] Overall possible memory usage with other process is compatible with memory available
[OK] Slow queries: 2% (19K/735K)
[OK] Highest usage of available connections: 6% (16/251)
[!!] Aborted connections: 4.03%  (295/7328)
[OK] Sorts requiring temporary tables: 0% (10 temp sorts / 46K sorts)
[!!] Joins performed without indexes: 217
[OK] Temporary tables created on disk: 0% (1 on disk / 18K total)
[OK] Thread cache hit rate: 99% (58 created / 7K connections)
[OK] Table cache hit rate: 29% (4K open / 16K opened)
[OK] Open file limit used: 5% (520/10K)
[OK] Table locks acquired immediately: 99% (74K immediate / 74K locks)
[OK] Binlog cache memory access: 100.00% (12160 Memory / 12160 Total)
[OK] InnoDB File per table is activated
[OK] InnoDB buffer pool / data size: 16.0G/4.7G
[OK] Ratio InnoDB log file size / InnoDB Buffer pool size: 2.0G * 2/16.0G should be equal to 25%
[OK] InnoDB buffer pool instances: 16
[--] Number of InnoDB Buffer Pool Chunk : 128 for 16 Buffer Pool Instance(s)
[OK] Innodb_buffer_pool_size aligned with Innodb_buffer_pool_chunk_size & Innodb_buffer_pool_instances
[OK] InnoDB Read buffer efficiency: 99.90% (44294781 hits/ 44339899 total)
[!!] InnoDB Write Log efficiency: 73.46% (154640 hits/ 210498 total)
[OK] InnoDB log waits: 0.00% (0 waits / 55858 writes)
-------- Recommendations ---------------------------------------------------------------------------
General recommendations:
    Control warning line(s) into /var/log/mysql/error.log file
    Control error line(s) into /var/log/mysql/error.log file
    MySQL was started within the last 24 hours - recommendations may be inaccurate
    Reduce or eliminate unclosed connections and network issues
    Adjust your join queries to always utilize indexes
Variables to adjust:
    join_buffer_size (> 80.0M, or always use indexes with JOINs)

A significant improvement!

Note that this step is mandatory for any production server hosting MySQL applications, and it should be repeated from time to time, as things change and real usage patterns emerge.

But before we do that, let's look at the other tools.

2. The Safety Net: etckeeper

This tool isn't directly related to MySQL performance, but it's a total lifesaver for any Linux server. etckeeper is a collection of scripts that hooks into your package manager (apt, dnf, etc.) to automatically commit changes to the /etc directory into a local Git repository.

In our case, we'll be modifying the core MySQL configuration file, mysqld.cnf. Having Git track and version every single change we make is of immeasurable value. If you make a mistake, you can simply roll back. This simple safety net removes a lot of the fear and hesitation from performing server-level changes.

💡
This is a must-have tool!
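A minimal session on a Debian/Ubuntu box looks like this (standard etckeeper and Git commands; the configuration path is the usual Debian one and may differ on your distribution):

```shell
# Installing etckeeper initializes a Git repository in /etc automatically
sudo apt install etckeeper

# Record a clean baseline before touching MySQL's configuration
sudo etckeeper commit "Baseline before MySQL tuning"

# ... edit mysqld.cnf, restart MySQL, observe ...

# If something goes wrong, inspect the history and roll the file back
cd /etc
sudo git log --oneline -- mysql/
sudo git checkout HEAD~1 -- mysql/mysql.conf.d/mysqld.cnf
sudo systemctl restart mysql
```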

3. The Configuration File: mysqld.cnf

This is the main configuration file for your MySQL server. It's typically located at /etc/mysql/mysql.conf.d/mysqld.cnf (on Debian/Ubuntu) or /etc/my.cnf. mysqltuner will suggest several changes to this file, such as adjusting buffer sizes or increasing certain cache limits. You'll make these adjustments, save the file, and restart MySQL. Don't worry: etckeeper will be there to save you if anything goes wrong.

You will also have to activate slow query logging, let your users work normally for a few days, and repeat the analysis over a few weeks.
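Activating slow query analysis is itself a small mysqld.cnf change (the one-second threshold below is a common starting point, not a rule; you can lower it once the obvious offenders are gone):

```ini
[mysqld]
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time     = 1    # in seconds; queries slower than this get logged

# Optional and potentially noisy, but useful while hunting missing indexes
log_queries_not_using_indexes = 1
```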

4. The Forensic Analysis: pt-query-digest

This is the most crucial tool in the entire workflow. pt-query-digest is a script from Percona Toolkit that analyzes your MySQL slow query log. It reads through millions of queries and then, like a skilled detective, presents you with a report that tells you exactly which queries are causing the most pain. It will identify the worst offenders based on total execution time, lock time, and number of rows examined. This report is the goldmine of information you need to move forward.
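Running it is a one-liner, assuming Percona Toolkit is installed and your slow log lives at the path configured above:

```shell
# Debian/Ubuntu: sudo apt install percona-toolkit
pt-query-digest /var/log/mysql/mysql-slow.log > digest-report.txt

# Or restrict the analysis to a recent time window
pt-query-digest --since '2025-08-30 10:00:00' /var/log/mysql/mysql-slow.log
```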

💡
Percona is an incredible company, a serious reference for anyone who needs to take their database performance to a top-notch level.

5. The Final, Easy Step: Generative AI

Now comes the fun part. You have the raw data from pt-query-digest, which, to the untrained eye, can still be a bit overwhelming. It looks like this:

# Query 11: 0.00 QPS, 0.00x concurrency, ID 0x37107AE09E34F1D59DD462CB636DEBDB at byte 1188119
# This item is included in the report because it matches --limit.
# Scores: V/M = 0.00
# Time range: 2025-08-30T10:32:43 to 2025-08-30T13:26:10
# Attribute    pct   total     min     max     avg     95%  stddev  median
# ============ === ======= ======= ======= ======= ======= ======= =======
# Count          0      44
# Exec time      0      2s    25ms    69ms    34ms    42ms     9ms    31ms
# Lock time      0   103us     1us     9us     2us     5us     1us     1us
# Rows sent      0      48       1       2    1.09    1.96    0.28    0.99
# Rows examine   8   2.25M  52.28k  52.28k  52.28k  52.28k       0  52.28k
# Query size     0   2.32k      54      54      54      54       0      54
# String:
# Databases    xxxxx
# Hosts        localhost
# Users        xxxxx
# Query_time distribution
#   1us
#  10us
# 100us
#   1ms
#  10ms  ################################################################
# 100ms
#    1s
#  10s+
# Tables
#    SHOW TABLE STATUS FROM `xxxxx` LIKE 'vtiger_email_track'\G
#    SHOW CREATE TABLE `xxxxx`.`vtiger_email_track`\G
# EXPLAIN /*!50100 PARTITIONS*/
select * from vtiger_email_track where mailid='627858'\G

The final step is to feed that output into your favorite generative AI tool.

And then... you watch.

The AI will tell you exactly what is wrong with the query and, more importantly, how to fix it. It will explain that a missing index on a WHERE clause is causing a full table scan, or that a join condition is inefficient. It will then give you the precise ALTER TABLE commands needed to create the necessary indexes.

For the output above, it gave me this response:


This pt-query-digest output shows a query that, while individually fast, is highly inefficient and contributes to overall database load due to frequent, repetitive full table scans.

Analysis

  • Frequent Execution: The query runs 44 times, indicating it's part of a routine operation (e.g., loading a user's mailbox view).

  • Excessive Rows Examined: The most critical issue is that each execution of the query examines an average of 52,280 rows to return only about one row. The total number of rows examined is a staggering 2.25 million. This is a classic symptom of a full table scan, where the database has to read every single row in the table to find the one that matches the WHERE clause.

  • Execution Time: While the average execution time is only 34 milliseconds, a full table scan for a single row is still a waste of resources. The cumulative time across all 44 executions is 2 seconds, and with more users or a larger dataset, this problem would quickly escalate.

  • The Culprit: The WHERE clause where mailid='627858' is the source of the problem. The mailid column is not indexed, forcing the database to perform an inefficient full table scan.

Recommendations

To solve this, you must add an index to the mailid column in the vtiger_email_track table. This is a simple but highly effective change.

  1. Add an Index to mailid: Run the following SQL command to create an index on the mailid column.

     ALTER TABLE vtiger_email_track ADD INDEX `idx_mailid` (`mailid`);
    

    This will allow MySQL to use a highly efficient B-tree index to quickly locate the specific row(s) that match the mailid value, reducing the Rows examine count from tens of thousands to a handful.

By adding this index, you will dramatically reduce the load on your database, speed up this frequently run query, and improve the overall performance of the application that relies on it.
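A sanity check worth doing yourself: run EXPLAIN on the query before and after creating the index. The annotations below are what the digest's figures lead me to expect, not captured output:

```sql
EXPLAIN SELECT * FROM vtiger_email_track WHERE mailid = '627858'\G
-- Before the index: type: ALL (full table scan), rows: ~52280
-- After the index:  type: ref, key: idx_mailid, rows: 1
```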


💡
It's that easy, that powerful!

Honestly, it is mind-blowing to have extracted that response and conclusion from the output of pt-query-digest. In the past, this process would have required hours of research, EXPLAIN plan analysis, and a deep understanding of database internals. Now, with a handful of powerful, open-source tools, a dash of generative AI, and some local brain intelligence, you can go from being intimidated by MySQL to being an optimization wizard in a single afternoon.

As a recap, the procedure is:

  1. Run mysqltuner for a quick health check and apply the configuration recommendations you can.
  2. Put /etc under version control with etckeeper before touching anything.
  3. Adjust mysqld.cnf, restart MySQL, and activate the slow query log.
  4. After some days of normal usage, analyze the slow log with pt-query-digest.
  5. Feed the worst offenders to your favorite generative AI tool and apply the fixes it suggests.

Go ahead, give it a try!