Frequently Asked Questions About Caching

This article first appeared on the public account Preserved Egg Blackboard.

The author worked at Jingdong and has an in-depth understanding of stability assurance, agile development, advanced Java, and microservice architecture.



Cache penetration

Cache penetration refers to queries for data that does not exist. The cache is written passively, only after a miss, and for fault tolerance, data that cannot be found in the storage layer is not written to the cache. As a result, every request for nonexistent data goes to the storage layer, and the cache loses its purpose. Under heavy traffic the DB may go down, and if someone deliberately attacks our application with frequent requests for nonexistent keys, this becomes a vulnerability.



There are many ways to solve cache penetration effectively. The most common is a Bloom filter: all possible data is hashed into a sufficiently large bitmap, and a query for data that definitely does not exist is intercepted by the bitmap, which avoids query pressure on the underlying storage system. There is also a simpler and cruder way (the one we use): if a query returns empty data (whether because the data does not exist or because of a system failure), we still cache the empty result, but its expiration time is very short, no longer than five minutes.
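Both ideas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: TTLCache is an in-memory stand-in for a real cache, and BloomFilter, get_with_null_caching, load_from_db, and the one-hour TTL for real values are hypothetical names and values that do not come from the original article. Only the five-minute upper bound for empty results is taken from the text.

```python
import hashlib
import time

NULL_TTL_SECONDS = 300      # empty results live at most five minutes (from the text)
NORMAL_TTL_SECONDS = 3600   # assumed TTL for real values, not from the article
_MISSING = object()         # sentinel stored in place of "no data"


class BloomFilter:
    """Tiny Bloom filter: several hash positions per key in a fixed bit array."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class TTLCache:
    """In-memory stand-in for a real cache with per-key expiration."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[1] <= time.monotonic():
            return False, None
        return True, entry[0]

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def get_with_null_caching(key, cache, load_from_db, bloom=None):
    """Read through the cache, caching empty results with a short TTL."""
    if bloom is not None and not bloom.might_contain(key):
        return None  # the filter says the key cannot exist; skip cache and DB
    found, value = cache.get(key)
    if found:
        return None if value is _MISSING else value
    value = load_from_db(key)  # hypothetical loader; returns None when empty
    if value is None:
        cache.set(key, _MISSING, NULL_TTL_SECONDS)
    else:
        cache.set(key, value, NORMAL_TTL_SECONDS)
    return value
```

With this sketch, a second request for the same nonexistent key is answered from the cached sentinel instead of reaching the storage layer. The short TTL limits how long a key that is later created would still appear to be missing.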


Cache avalanche

Cache avalanche refers to setting the same expiration time on many cache entries, so that they all expire at the same moment. All requests are then forwarded to the DB, and the sudden pressure causes it to collapse.



The impact of a cache avalanche on the underlying system is severe. Most system designers use locks or queues to ensure that only a single thread (or process) writes the cache, so that a large number of concurrent requests do not fall on the underlying storage system when entries expire. Here is a simple alternative that spreads out the expiration times: add a random value to the original expiration time, for example a random 1-5 minutes, so that fewer entries share the same expiration time and a collective expiry becomes unlikely.
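The random offset described above might look like the following. It is a minimal sketch: the function name ttl_with_jitter and its parameters are hypothetical, and the 1-5 minute range is the example given in the text.

```python
import random


def ttl_with_jitter(base_ttl_seconds, min_jitter_seconds=60,
                    max_jitter_seconds=300, rng=random):
    """Return the base TTL plus a random 1-5 minute offset (in seconds)."""
    # randint is inclusive on both ends, so the offset is 60..300 seconds.
    return base_ttl_seconds + rng.randint(min_jitter_seconds, max_jitter_seconds)
```

Entries written at the same moment with ttl_with_jitter(1800) would then expire somewhere between 31 and 35 minutes later rather than all at once.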


Cache breakdown

An extremely “hot” piece of data expires at some point in time, which causes a sharp rise in pressure on the backend DB.

Some keys have an expiration time set, and if such a key may be accessed with very high concurrency at certain moments, it is very “hot” data. In this case we need to consider the cache being “broken down”. The difference from cache avalanche is that breakdown concerns a single key, while avalanche concerns many keys.

Suppose the cache entry expires at some moment and, at just that moment, a large number of concurrent requests for this key arrive. These requests typically find the cache expired, load the data from the backend DB, and set it back into the cache. Such a burst of concurrent requests may instantly overwhelm the backend DB. There are two common ways to handle this:



    Use a mutex: when a cache miss is detected and the DB must be queried, use a distributed lock so that only one thread loads the data from the database. Threads that fail to acquire the lock can wait.

    Manual expiration: never set an expiration time on redis. Instead, store the expiration time inside the value that corresponds to the key. If the value is found to be expired, rebuild the cache through a background asynchronous thread. This is the “manual” expiration.
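The two approaches above can be sketched in a single process as follows. This is an illustration under assumptions, not a production implementation: a plain dict stands in for redis, a threading.Lock stands in for the distributed lock, and get_with_mutex, get_with_logical_expiry, and load_from_db are hypothetical names.

```python
import threading
import time


def get_with_mutex(key, cache, lock, load_from_db):
    """Mutex approach: only one thread rebuilds a missing entry."""
    value = cache.get(key)
    if value is not None:
        return value
    with lock:  # threads that do not get the lock wait here
        value = cache.get(key)  # re-check: another thread may have rebuilt it
        if value is None:
            value = load_from_db(key)
            cache[key] = value
    return value


def get_with_logical_expiry(key, cache, lock, load_from_db, ttl_seconds,
                            now=time.time):
    """Manual expiration: the expiry time is stored next to the value."""
    entry = cache.get(key)
    if entry is None:
        # Very first load: nothing stale to serve, so load synchronously.
        value = load_from_db(key)
        cache[key] = (value, now() + ttl_seconds)
        return value
    value, expires_at = entry
    if expires_at <= now() and lock.acquire(blocking=False):
        # Logically expired: rebuild in the background, at most one at a time.
        def rebuild():
            try:
                cache[key] = (load_from_db(key), now() + ttl_seconds)
            finally:
                lock.release()
        threading.Thread(target=rebuild, daemon=True).start()
    return value  # possibly stale until the background rebuild finishes
```

The trade-off between the two is visible in the sketch: the mutex version makes callers wait but never returns stale data, while the manual-expiration version never blocks callers but may serve the old value for a short time.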


Author: Liang Songhua, Senior Engineer at Jingdong, with an in-depth understanding of stability assurance, agile development, advanced Java, and microservice architecture
