singleflight: Purpose, Implementation, and Reflections

I recently studied and implemented the singleflight from GeeCache, so I am writing this post to share my understanding of it.

What is it?

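First, the concept of cache breakdown:

A key that exists receives a large number of requests at the very moment its cache entry expires. All of these requests break through to the DB, causing a momentary spike in DB requests and a sudden increase in load.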

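This is easy to understand. Think of the cache simply as a map[string]interface{}; get(key) then consists of three main steps:

1. Check whether the key exists in the map; if it does, return the value directly.
2. If the key does not exist, call fn(key) to fetch the data from the database.
3. Once the call completes and the database returns a result, store the result in the map and return it.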

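When a burst of requests arrives and the key is not in the map, the first request reaches step 2 and calls fn(key) to access the database. Later requests arrive before the first request's fn(key) has returned. The result can only be cached after the function call completes, and the function has not returned yet, so the later requests also see that the key is missing from the cache and go on to call fn(key) against the database themselves. In the end a large number of requests land directly on the database, as if the cache had been broken through.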

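How do we solve this? A straightforward idea is to let later requests "notice" that fn is already being called, so that they do not call it again and simply wait for the in-flight fn to return its result. This is exactly what singleflight does.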

How is it done?

Let us first look at the implementation in groupcache:

// Package singleflight provides a duplicate function call suppression
// mechanism.
package singleflight

import "sync"

// call is an in-flight or completed Do call
type call struct {
	wg  sync.WaitGroup
	val interface{}
	err error
}

// Group represents a class of work and forms a namespace in which
// units of work can be executed with duplicate suppression.
type Group struct {
	mu sync.Mutex       // protects m
	m  map[string]*call // lazily initialized
}

// Do executes and returns the results of the given function, making
// sure that only one execution is in-flight for a given key at a
// time. If a duplicate comes in, the duplicate caller waits for the
// original to complete and receives the same results.
func (g *Group) Do(key string, fn func() (interface{}, error)) (interface{}, error) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		g.mu.Unlock()
		c.wg.Wait()
		return c.val, c.err
	}
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val, c.err = fn()
	c.wg.Done()

	g.mu.Lock()
	delete(g.m, key)
	g.mu.Unlock()

	return c.val, c.err
}

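The implementation is very simple. A single function call is abstracted as the call struct, which stores the call's return values val and err, plus a sync.WaitGroup used to make the call a "singleton".

Group is the core of duplicate suppression. It holds the mapping from key to function call, together with the mutex that protects that mapping.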

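When the Do method is called:

1. The map is lazily initialized.
2. It checks whether a function call for the key already exists; if so, it simply waits for that call to return its result.
3. Otherwise it creates a new function call, stores it in the map, and then invokes the function; once the function call completes, the entry is deleted from the map.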

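The use of sync.WaitGroup in this code is especially clever. As I mentioned in my previous post:

sync.WaitGroup is also used for goroutine synchronization, but its use case is the opposite of sync.Cond: the latter is mostly used when many goroutines wait and a single goroutine notifies, whereas the former is mostly used when a single goroutine waits for many goroutines to finish.

Here, however, the author uses sync.WaitGroup flexibly to achieve an effect similar to sync.Cond, which is quite elegant.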

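What are the problems?

The code above has no problems as long as fn returns normally, but we have to consider abnormal cases. What if fn runs into trouble while executing?

Consider a scenario in which fn, for whatever reason, takes a long time to return. A large number of requests will then block at c.wg.Wait(), which may lead to:

A surge in the number of goroutines
A surge in memory usage
...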

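How can this be solved? We can look at the official implementation. In the official extended version (golang.org/x/sync/singleflight), Group gains two more public methods: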

func (g *Group) DoChan(key string, fn func() (interface{}, error)) <-chan Result

DoChan is like Do but returns a channel that will receive the results when they are ready.

The returned channel will not be closed.

func (g *Group) Forget(key string)

Forget tells the singleflight to forget about a key. Future calls to Do for this key will call the function rather than waiting for an earlier call to complete.

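The former, DoChan, solves the problem above nicely: because it returns a channel rather than a value, the caller can apply a timeout to it and keep requests from blocking for a long time: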

ch := g.DoChan(key, func() (interface{}, error) {
	// ...
	return result, err
})

timeout := time.After(500 * time.Millisecond)

select {
case <-timeout:
	// timed out
	return
case res := <-ch:
	// result received
	fmt.Println(res.Val, res.Err)
}

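As for the latter, I found the answer to its main use case in the article "sync.singleflight 到底怎麼用才對？" (How should sync.singleflight really be used?):

In scenarios with extremely high availability requirements, a certain degree of request saturation is often needed to guarantee the eventual success rate of the business. Whether one request or several are sent makes little difference to the downstream service. Here singleflight is used only to reduce the order of magnitude of the requests, so Forget() can be used to raise the concurrency of downstream requests: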
v, _, shared := g.Do(key, func() (interface{}, error) {
    go func() {
        time.Sleep(10 * time.Millisecond)
        fmt.Printf("Deleting key: %v\n", key)
        g.Forget(key)
    }()
    ret, err := find(context.Background(), key)
    return ret, err
})
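Once an in-flight request has been running for more than 10 ms, a second request will be issued. Requests arriving within the same 10 ms window trigger at most one downstream request, so the maximum downstream rate is 100 QPS. The impact of any single failed request is greatly reduced.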

References

In no particular order: