---
title: The Role, Implementation, and Reflection of singleflight
date: 2021-12-14 17:31:29
tags: ['go']
---

Recently, I learned and implemented singleflight in GeeCache and wanted to share my understanding in an article.

## What is it?

First, let's introduce the concept of cache breakdown:

At the moment a key expires in the cache, there are a large number of requests for that key, which all hit the database, causing a sudden increase in database requests and pressure.

It's easy to understand. Simply think of the cache as a `map[string]interface{}`, and the `get(key)` operation can be divided into three steps:

  1. Check if the key exists in the map. If it does, return the value directly.
  2. If the key does not exist, call fn(key) to retrieve the data from the database.
  3. After the call is completed and the database returns the result, cache the result in the map and return it.
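The three steps above can be sketched as follows. This is a minimal illustration; `naiveCache` and `slowDB` are made-up names standing in for the real cache and database:

```go
package main

import (
	"fmt"
	"sync"
)

// naiveCache illustrates the three-step get(key) described above.
type naiveCache struct {
	mu   sync.Mutex
	data map[string]interface{}
}

// slowDB stands in for fn(key): an expensive database query.
func slowDB(key string) (interface{}, error) {
	return "value-for-" + key, nil
}

func (c *naiveCache) get(key string) (interface{}, error) {
	c.mu.Lock()
	v, ok := c.data[key] // step 1: check the map
	c.mu.Unlock()
	if ok {
		return v, nil
	}
	v, err := slowDB(key) // step 2: fall through to the database
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.data[key] = v // step 3: cache the result and return it
	c.mu.Unlock()
	return v, nil
}

func main() {
	c := &naiveCache{data: make(map[string]interface{})}
	v, _ := c.get("user:1")
	fmt.Println(v)
}
```

Nothing here prevents two concurrent callers from both reaching step 2, which is exactly the gap described next.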

When a burst of requests arrives and the key is not in the map, the first request proceeds to step 2 and calls `fn(key)` to query the database. Before that call returns, subsequent requests arrive. Since the result can only be cached after `fn` completes, those requests also find the key missing and call `fn(key)` themselves, ultimately sending a large number of requests straight to the database, as if the cache had been punched through.

How to solve this problem? One straightforward idea is to let the subsequent requests "notice" that fn is being called at this moment, so that they don't make redundant calls and wait for the existing fn to return the result. This is exactly what singleflight does.

## How to do it?

Let's first refer to the implementation in groupcache:

```go
// Package singleflight provides a duplicate function call suppression
// mechanism.
package singleflight

import "sync"

// call is an in-flight or completed Do call
type call struct {
	wg  sync.WaitGroup
	val interface{}
	err error
}

// Group represents a class of work and forms a namespace in which
// units of work can be executed with duplicate suppression.
type Group struct {
	mu sync.Mutex       // protects m
	m  map[string]*call // lazily initialized
}

// Do executes and returns the results of the given function, making
// sure that only one execution is in-flight for a given key at a
// time. If a duplicate comes in, the duplicate caller waits for the
// original to complete and receives the same results.
func (g *Group) Do(key string, fn func() (interface{}, error)) (interface{}, error) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		g.mu.Unlock()
		c.wg.Wait()
		return c.val, c.err
	}
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()
	c.wg.Done()

	g.mu.Lock()
	delete(g.m, key)
	g.mu.Unlock()

	return c.val, c.err
}
```

The implementation is very simple. A function call is abstracted into the `call` struct, which stores the call's return values `val` and `err`, along with a `sync.WaitGroup` used to ensure that only one execution is in flight per key.

`Group` is the core of the duplicate-call suppression. It internally maintains a mapping from keys to function calls and protects that mapping with a mutex.

When calling the `Do` method:

  1. Acquire the mutex and lazily initialize the mapping.
  2. Check whether a call for this key is already in flight. If so, release the mutex and wait on that call's `WaitGroup`; once the original call completes, return its result.
  3. Otherwise, create a new `call`, add one to its `WaitGroup`, store it in the mapping, release the mutex, and invoke `fn`. When `fn` returns, mark the `WaitGroup` done and remove the entry from the mapping.

In this code, the use of sync.WaitGroup is particularly clever. As I mentioned in my previous article, Understanding sync.Cond in Go:

sync.WaitGroup is also used for goroutine synchronization, but its use case is exactly the opposite of sync.Cond. The latter is mostly used for multiple goroutines waiting and single goroutine notifying, while the former is mostly used for single goroutine waiting for multiple goroutines to complete.

In this case, the author achieves a similar effect to sync.Cond by using sync.WaitGroup flexibly, which can be considered elegant.

## What are the issues?

The above code works fine when fn returns normally, but we have to consider exceptional cases. What if fn encounters a problem during execution?

Consider a scenario where `fn` is delayed for some reason and does not return for a long time. A large number of requests will then block at `c.wg.Wait()`, which may lead to:

  • A large increase in the number of goroutines
  • A sharp increase in memory usage
  • ...

How to solve this? We can refer to the official extended implementation (golang.org/x/sync/singleflight), where two more public methods are added to `Group`:

  • func (g *Group) DoChan(key string, fn func() (interface{}, error)) <-chan Result

    DoChan is like Do but returns a channel that will receive the results when they are ready.

    The returned channel will not be closed.

  • func (g *Group) Forget(key string)

    Forget tells the singleflight to forget about a key. Future calls to Do for this key will call the function rather than waiting for an earlier call to complete.


The first method, DoChan, can effectively solve the above problem: because the result is returned as a channel instead of a value, the user can control the timeout to prevent requests from being blocked for a long time:

```go
ch := g.DoChan(key, func() (interface{}, error) {
    return result, err
})

timeout := time.After(500 * time.Millisecond)

select {
case <-timeout:
    // Timed out
case ret := <-ch:
    // Got the result: ret.Val, ret.Err
}
```

The main use case of the second method, Forget, can be found in the article sync.singleflight 到底怎么用才对?:

In some scenarios where availability is extremely important, a certain amount of request redundancy is often required to ensure the final success rate of the business. For the downstream service, there is little difference between one request and a handful of requests; singleflight here is used only to reduce the number of requests, not to eliminate duplicates entirely. In this case, calling Forget() can increase the concurrency of downstream requests:

```go
v, _, shared := g.Do(key, func() (interface{}, error) {
    go func() {
        time.Sleep(10 * time.Millisecond)
        fmt.Printf("Deleting key: %v\n", key)
        g.Forget(key)
    }()
    ret, err := find(context.Background(), key)
    return ret, err
})
```
If the in-flight request takes longer than 10ms, the next caller will initiate a new request. At most one request is started every 10ms, capping the downstream at roughly 100 QPS per key, while the impact of any single request failure is greatly reduced.

