Specifically, the two optimization strategies I want to compare are optimizing MySQL and caching. I should point out upfront that these optimizations are orthogonal: the only reason to choose one over the other is that both cost resources, namely development time.
Optimizing MySQL
When optimizing MySQL, you usually start by looking at the queries sent to MySQL and then running EXPLAIN on them. After a little review, the usual fix is to add indexes or adjust the schema.
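The check-the-plan, add-an-index, check-again loop can be sketched as follows. To keep the example runnable without a server, it uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN; the users table, its columns, and the index name are hypothetical, and MySQL's EXPLAIN output looks different.

```python
import sqlite3

# Stand-in database: in-memory SQLite, NOT MySQL.
# The table, columns, and index name are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

QUERY = "SELECT id, email, name FROM users WHERE email = ?"


def query_plan(connection, sql, params):
    """Return the human-readable plan text (the last column of each plan row)."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Step 1: inspect the plan for the slow query (expect a full table scan).
plan_before = query_plan(conn, QUERY, ("user500@example.com",))

# Step 2: index the column used in the WHERE clause.
conn.execute("CREATE INDEX idx_users_email ON users (email)")

# Step 3: confirm that the plan now uses the index.
plan_after = query_plan(conn, QUERY, ("user500@example.com",))

# The exact wording of the plan text varies between SQLite versions.
print("before:", plan_before)
print("after: ", plan_after)
```

Against a real MySQL server the same loop would be EXPLAIN on the SELECT, then an ALTER TABLE or CREATE INDEX statement, then EXPLAIN again.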
Advantages
1. An optimized query is fast for every user of the application. Indexes retrieve data in logarithmic time (divide and conquer: much like searching a phone book, you progressively narrow the search range), so they keep performing well as the data grows. Caching the results of an unindexed query, by contrast, can perform worse and worse as the data grows: users who miss the cache may get a bad experience, to the point that the application is unusable for them.
2. When optimizing MySQL, you don’t need to worry about cache invalidation or cached data expiration.
3. Optimizing MySQL keeps the technical architecture simpler, which makes it easier to replicate and work with in a development environment.
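To make the phone-book comparison in point 1 concrete, here is a small sketch that counts how many keys are inspected by a linear scan versus a divide-and-conquer search over sorted keys. It is only an illustration of logarithmic narrowing: MySQL's indexes are typically B-trees rather than a sorted Python list, and both function names are made up for this example.

```python
def linear_scan_steps(keys, target):
    """Inspect keys one by one, like a query with no usable index."""
    steps = 0
    for key in keys:
        steps += 1
        if key == target:
            return True, steps
    return False, steps


def binary_search_steps(sorted_keys, target):
    """Halve the search range on every step, like flipping through a phone book."""
    lo, hi = 0, len(sorted_keys) - 1
    steps = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_keys[mid] == target:
            return True, steps
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, steps


for size in (1_000, 1_000_000):
    keys = list(range(size))
    target = size - 1  # worst case for the linear scan
    _, scanned = linear_scan_steps(keys, target)
    _, searched = binary_search_steps(keys, target)
    print(f"{size} rows: scan inspected {scanned}, search inspected {searched}")
```

Growing the data a thousandfold multiplies the scan's work by a thousand, while the search needs only about ten more steps.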
Disadvantages
1. Some queries cannot be made fast through indexing alone and also require schema changes. For some applications, changing the schema can be very troublesome.
2. Some schema changes involve denormalization (storing redundant copies of data). Although this is a common technique for DBAs, someone has to take ownership of ensuring that the application updates every copy, or triggers have to be installed to guarantee it.
3. Some optimizations are specific to MySQL. If the underlying software has to run on multiple databases, it is hard to guarantee that the more complex techniques, beyond simply adding indexes, will carry over.
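The trigger approach mentioned in point 2 might look roughly like this. The sketch again uses in-memory SQLite as a stand-in, so the trigger syntax is SQLite's rather than MySQL's; the posts and comments tables and the comment_count column are hypothetical.

```python
import sqlite3

# Stand-in database: in-memory SQLite, NOT MySQL. All names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        comment_count INTEGER NOT NULL DEFAULT 0  -- denormalized copy
    );
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER NOT NULL REFERENCES posts (id),
        body TEXT NOT NULL
    );
    -- Triggers keep the redundant counter in step with the comments table,
    -- so the application cannot forget to update it.
    CREATE TRIGGER comments_after_insert AFTER INSERT ON comments
    BEGIN
        UPDATE posts SET comment_count = comment_count + 1 WHERE id = NEW.post_id;
    END;
    CREATE TRIGGER comments_after_delete AFTER DELETE ON comments
    BEGIN
        UPDATE posts SET comment_count = comment_count - 1 WHERE id = OLD.post_id;
    END;
    """
)

conn.execute("INSERT INTO posts (id, title) VALUES (1, 'Hello')")
conn.executemany(
    "INSERT INTO comments (post_id, body) VALUES (?, ?)",
    [(1, "first"), (1, "second"), (1, "third")],
)
conn.execute("DELETE FROM comments WHERE body = 'second'")


def comment_counts(connection, post_id):
    """Return (stored counter, actual COUNT(*)) so the two can be compared."""
    stored = connection.execute(
        "SELECT comment_count FROM posts WHERE id = ?", (post_id,)
    ).fetchone()[0]
    actual = connection.execute(
        "SELECT COUNT(*) FROM comments WHERE post_id = ?", (post_id,)
    ).fetchone()[0]
    return stored, actual


print(comment_counts(conn, 1))
```

Reading comment_count avoids a COUNT(*) over comments on every page view, at the price of extra write work and one more thing that must stay consistent.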
Using caching
This kind of optimization requires someone to analyze how the application actually behaves, then move the expensive processing out of MySQL and serve it from a third-party cache such as memcached or Redis.
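A minimal sketch of that pattern, often called cache-aside, is shown below. It assumes a time-to-live expiry policy, which is only one possible choice. The TTLCache class is a hypothetical in-process stand-in for memcached or Redis, and the compute callable stands for whatever expensive MySQL query is being offloaded.

```python
import time


class TTLCache:
    """Hypothetical in-process stand-in for memcached or Redis."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._entries[key] = (self._clock() + ttl_seconds, value)


def get_or_compute(cache, key, compute, ttl_seconds):
    """Return the cached value, or run the expensive computation on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = compute()  # e.g. a slow aggregate query against MySQL
    cache.set(key, value, ttl_seconds)
    return value


if __name__ == "__main__":
    cache = TTLCache()
    report = get_or_compute(cache, "report:orders", lambda: {"total": 1234}, 60)
    print(report)
```

With a real cache server, get and set would become network calls, and a value of None would need separate handling, since this sketch cannot tell a cached None from a miss.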
Advantages
1. Caching works well for queries that are hard to optimize in MySQL itself, such as large aggregation or GROUP BY queries.
2. Caching can be a good way to improve the throughput of the system, for example when response times become very slow because many people are using the application at the same time.
3. The cache may be easier to build on top of another application. For example: your application may be the front end of another software package that uses MySQL to store data, and it is very difficult to make any database changes to this software package.
Disadvantages
1. If the data has multiple access patterns (for example, it is displayed in different forms on different pages), it may be difficult to expire or update the cache, and you may have to tolerate stale data. A feasible alternative is to design a more sophisticated caching mechanism, but that has its own drawback: fetching from the cache multiple times adds latency.
2. Caching an expensive object can create variance in performance for users who miss the cache (see Advantages of Optimizing MySQL #1). Good performance practice suggests that you should try to minimize the variance between users, rather than just improve the average (which is what caches tend to do).
3. Naive cache implementations are vulnerable to subtle problems such as the avalanche effect (also known as a cache stampede). Just last week I helped someone whose database server was overwhelmed by many user requests all trying to regenerate the same cached content at the same time. The correct strategy is to introduce some level of locking to serialize cache regeneration requests.
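The locking strategy from point 3 can be sketched as below, assuming a single Python process with several threads. SingleFlightCache and its methods are hypothetical names; entries never expire in this sketch, and an application spread across several servers would need a lock shared between them instead of threading.Lock.

```python
import threading
import time


class SingleFlightCache:
    """Cache that lets only one thread at a time regenerate a given key."""

    def __init__(self):
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()  # protects the per-key lock table

    def _lock_for(self, key):
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get_or_regenerate(self, key, regenerate):
        try:
            return self._values[key]  # fast path: already cached
        except KeyError:
            pass
        with self._lock_for(key):
            # Re-check: another thread may have filled the cache while we waited.
            if key in self._values:
                return self._values[key]
            value = regenerate()  # only one thread per key reaches this line
            self._values[key] = value
            return value


def demo(thread_count=8):
    """Fire concurrent requests at a cold cache and count the regenerations."""
    cache = SingleFlightCache()
    regenerations = []
    results = []

    def slow_query():
        time.sleep(0.05)  # placeholder for an expensive database query
        regenerations.append(1)
        return "rendered page"

    def worker():
        results.append(cache.get_or_regenerate("home", slow_query))

    threads = [threading.Thread(target=worker) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return len(regenerations), results


if __name__ == "__main__":
    print(demo())
```

Without the per-key lock and the second lookup inside it, every thread that arrives during the 0.05 second window would run slow_query itself, which is the pile-up described above.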
Summary
Under normal circumstances, I recommend optimizing MySQL first, because in my opinion it is the most appropriate place to start. In the long run, though, most applications will have some use cases that require both approaches to be used together to some extent.