

WBOY | 2024-07-20 09:15:39

How to add Kubernetes-powered leader election to your Go apps

Originally published on the author's blog.

The Kubernetes standard library is full of gems, hidden away in the many subpackages that make up the ecosystem. One such example that I recently discovered is k8s.io/client-go/tools/leaderelection, which can be used to add a leader election protocol to any application running inside a Kubernetes cluster. This article will discuss what leader election is, how it's implemented in this Kubernetes package, and provide an example of how we can use this library in our own applications.

Leader Election

Leader election is a distributed systems concept that is a core building block of highly available software. It allows multiple concurrent processes to coordinate among themselves and elect a single "leader" process, which is then responsible for performing synchronized actions such as writing to a data store.

This is useful in systems such as distributed databases or caches, where multiple processes run to provide redundancy against hardware or network failures, but cannot write to storage simultaneously without compromising data consistency. If the leader process becomes unresponsive at some point in the future, the remaining processes kick off a new leader election, eventually choosing a new process to act as the leader.

Using this concept, we can create highly available software with a single leader and multiple standby replicas.

In Kubernetes, the controller-runtime package uses leader election to make controllers highly available. In a controller deployment, resource reconciliation only happens while a process is the leader; the other replicas wait on standby. If the leader pod becomes unresponsive, the remaining replicas elect a new leader to perform subsequent reconciliations and resume normal operation.

Kubernetes Leases

This library uses a Kubernetes Lease, or distributed lock, which can be acquired by a process. Leases are native Kubernetes resources that are held by a single identity for a given duration, with an option to renew. Here's an example spec from the docs:

apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  labels:
    apiserver.kubernetes.io/identity: kube-apiserver
    kubernetes.io/hostname: master-1
  name: apiserver-07a5ea9b9b072c4a5f3d1c3702
  namespace: kube-system
spec:
  holderIdentity: apiserver-07a5ea9b9b072c4a5f3d1c3702_0c8914f7-0f35-440e-8676-7844977d3a05
  leaseDurationSeconds: 3600
  renewTime: "2023-07-04T21:58:48.065888Z"

Leases are used in three ways within the k8s ecosystem:

  1. Node heartbeats: every Node has a corresponding Lease resource and continually updates its renewTime field. If a Lease's renewTime hasn't been updated in a while, the Node is tainted as unavailable and no further Pods are scheduled to it.
  2. Leader election: in this case, a Lease is used to coordinate between multiple processes by having the leader update the Lease's holderIdentity. Standby replicas, which have different identities, sit waiting for the Lease to expire. If the Lease does expire and is not renewed by the leader, a new election takes place in which the remaining replicas attempt to take ownership of the Lease by updating its holderIdentity with their own. Since the Kubernetes API server disallows updates to stale objects, only a single standby is able to update the Lease successfully, at which point it continues execution as the new leader.
  3. API server identity: starting in v1.26, as a beta feature, each kube-apiserver replica publishes its identity by creating a dedicated Lease. Since this is a relatively slim new feature, there isn't much else that can be derived from the Lease object besides the number of API servers running, but it does leave room for more metadata to be added to these Leases in future k8s versions.

Now let's explore this second use case of Leases by writing a sample program to demonstrate how you can use them in leader election scenarios.

Example Program

In this code sample, we use the leaderelection package to handle the specifics of the leader election and Lease manipulation.

package main

import (
    "context"
    "fmt"
    "os"
    "time"

    "k8s.io/client-go/tools/leaderelection"
    rl "k8s.io/client-go/tools/leaderelection/resourcelock"
    ctrl "sigs.k8s.io/controller-runtime"
)

var (
    // lockName and lockNamespace need to be shared across all running instances
    lockName      = "my-lock"
    lockNamespace = "default"

    // identity is unique to the individual process. This will not work for
    // anything outside of a toy example, since processes running in different
    // containers or computers can share the same PID.
    identity = fmt.Sprintf("%d", os.Getpid())
)

func main() {
    // Get the active kubernetes context
    cfg, err := ctrl.GetConfig()
    if err != nil {
        panic(err.Error())
    }

    // Create a new lock. This will be used to create a Lease resource in the cluster.
    l, err := rl.NewFromKubeconfig(
        rl.LeasesResourceLock,
        lockNamespace,
        lockName,
        rl.ResourceLockConfig{
            Identity: identity,
        },
        cfg,
        time.Second*10,
    )
    if err != nil {
        panic(err)
    }

    // Create a new leader election configuration with a 15 second lease duration.
    // Visit https://pkg.go.dev/k8s.io/client-go/tools/leaderelection#LeaderElectionConfig
    // for more information on the LeaderElectionConfig struct fields
    el, err := leaderelection.NewLeaderElector(leaderelection.LeaderElectionConfig{
        Lock:          l,
        LeaseDuration: time.Second * 15,
        RenewDeadline: time.Second * 10,
        RetryPeriod:   time.Second * 2,
        Name:          lockName,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: func(ctx context.Context) { println("I am the leader!") },
            OnStoppedLeading: func() { println("I am not the leader anymore!") },
            OnNewLeader:      func(identity string) { fmt.Printf("the leader is %s\n", identity) },
        },
    })
    if err != nil {
        panic(err)
    }

    // Begin the leader election process. This will block.
    el.Run(context.Background())

}

What's nice about the leaderelection package is that it provides a callback-based framework for handling leader elections. This way you can act on specific state changes in a granular way and properly release resources when a new leader is elected. By running these callbacks in separate goroutines, the package takes advantage of Go's strong concurrency support to efficiently utilize machine resources.

Testing it out

To test this out, let's spin up a test cluster using kind.

$ kind create cluster

Copy the sample code into main.go, create a new module (go mod init leaderelectiontest), and tidy it (go mod tidy) to install its dependencies. Once you run go run main.go, you should see output like this:

$ go run main.go
I0716 11:43:50.337947     138 leaderelection.go:250] attempting to acquire leader lease default/my-lock...
I0716 11:43:50.351264     138 leaderelection.go:260] successfully acquired lease default/my-lock
the leader is 138
I am the leader!

The exact leader identity will be different from what's in the example (138), since this is just the PID of the process that was running on my computer at the time of writing.

And here's the Lease that was created in the test cluster:

$ kubectl describe lease/my-lock
Name:         my-lock
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  coordination.k8s.io/v1
Kind:         Lease
Metadata:
  Creation Timestamp:  2024-07-16T15:43:50Z
  Resource Version:    613
  UID:                 1d978362-69c5-43e9-af13-7b319dd452a6
Spec:
  Acquire Time:            2024-07-16T15:43:50.338049Z
  Holder Identity:         138
  Lease Duration Seconds:  15
  Lease Transitions:       0
  Renew Time:              2024-07-16T15:45:31.122956Z
Events:                    <none>

See that the "Holder Identity" is the same as the process's PID, 138.

Now, let's open up another terminal and run the same main.go file in a separate process:

$ go run main.go
I0716 11:48:34.489953     604 leaderelection.go:250] attempting to acquire leader lease default/my-lock...
the leader is 138

This second process will wait forever, until the first one is not responsive. Let's kill the first process and wait around 15 seconds. Now that the first process is not renewing its claim on the Lease, the .spec.renewTime field won't be updated anymore. This will eventually cause the second process to trigger a new leader election, since the Lease's renew time is older than its duration. Because this process is the only one now running, it will elect itself as the new leader.

the leader is 604
I0716 11:48:51.904732     604 leaderelection.go:260] successfully acquired lease default/my-lock
I am the leader!

If there were multiple processes still running after the initial leader exited, the first process to acquire the Lease would be the new leader, and the rest would continue to be on standby.

No single-leader guarantees

This package is not foolproof, in that it "does not guarantee that only one client is acting as a leader (a.k.a. fencing)". For example, if a leader is paused and lets its Lease expire, another standby replica will acquire the Lease. Then, once the original leader resumes execution, it will think that it's still the leader and continue doing work alongside the newly-elected leader. In this way, you can end up with two leaders running simultaneously.

To fix this, a fencing token which references the Lease needs to be included in each request to the server. A fencing token is effectively an integer that increases by 1 every time a Lease changes hands. So a client with an old fencing token will have its requests rejected by the server. In this scenario, if an old leader wakes up from sleep and a new leader has already incremented the fencing token, all of the old leader's requests would be rejected because it is sending an older (smaller) token than what the server has seen from the newer leader.

Implementing fencing in Kubernetes would be difficult without modifying the core API server to account for corresponding fencing tokens for each Lease. However, the risk of having multiple leader controllers is somewhat mitigated by the k8s API server itself. Because updates to stale objects are rejected, only controllers with the most up-to-date version of an object can modify it. So while we could have multiple controller leaders running, a resource's state would never regress to older versions if a controller misses a change made by another leader. Instead, reconciliation time would increase as both leaders need to refresh their own internal states of resources to ensure that they are acting on the most recent versions.

Still, if you're using this package to implement leader election using a different data store, this is an important caveat to be aware of.

Conclusion

Leader election and distributed locking are critical building blocks of distributed systems. When trying to build fault-tolerant and highly-available applications, having tools like these at your disposal is critical. The Kubernetes standard library gives us a battle-tested wrapper around its primitives to allow application developers to easily build leader election into their own applications.

While use of this particular library does limit you to deploying your application on Kubernetes, that seems to be the way the world is going recently. If in fact that is a dealbreaker, you can of course fork the library and modify it to work against any ACID-compliant and highly-available datastore.

Stay tuned for more k8s source deep dives!

