v3rpc: map lease.ErrNotPrimary to codes.Unavailable instead of codes.Unknown#21856
…Unknown

When LeaseRenew returns lease.ErrNotPrimary, togRPCError falls through to status.Error(codes.Unknown, ...) because ErrNotPrimary is absent from the toGRPCErrorMap. This surfaces to clients as gRPC code 2 (Unknown) with message "not a primary lessor", which gives no indication that a retry on a different member is appropriate.

Add ErrGRPCNotPrimary = codes.Unavailable alongside the other lease errors, register it in errStringToError and the client-side error vars, and map lease.ErrNotPrimary to it in toGRPCErrorMap. Unavailable is consistent with ErrGRPCNoLeader and ErrGRPCLeaderChanged and signals to clients that the current member is temporarily unavailable for this operation.

Fixes etcd-io#21671

Signed-off-by: crawfordxx <crawfordxx@users.noreply.github.com>
Which issue(s) does the PR fix:
Fixes #21671
Release notes for end users (ALL commits must be considered).
Reviewers should verify clarity and quality.
Summary
lease.ErrNotPrimary was missing from toGRPCErrorMap in server/etcdserver/api/v3rpc/util.go. When LeaseRenew returned this error (e.g. after a leader switch causes ensureLeadership() to fail), togRPCError fell through to the catch-all status.Error(codes.Unknown, …), surfacing as gRPC code 2 (Unknown) with the message "not a primary lessor", which gave clients no signal that retrying on a different member would succeed.
Root cause
togRPCError uses a direct map lookup (toGRPCErrorMap[err]), so every sentinel error that should map to a specific gRPC code must be listed explicitly. lease.ErrNotPrimary was never added.
Fix
api/v3rpc/rpctypes/error.go: add ErrGRPCNotPrimary = codes.Unavailable alongside the other lease errors, register it in errStringToError and the client-side ErrNotPrimary var.
server/etcdserver/api/v3rpc/util.go: map lease.ErrNotPrimary → rpctypes.ErrGRPCNotPrimary.
codes.Unavailable is consistent with ErrGRPCNoLeader and ErrGRPCLeaderChanged and is the correct code for "this member can't serve the request right now, try another one".