Fix retries during inference #523

Draft
borzunov wants to merge 1 commit into main from fix-inference-retry

borzunov (Collaborator) commented on Sep 28, 2023

#331 introduced a bug during inference retries that caused this:

[INFO] Route found: 0:18 via1EBzGt
[WARN] [petals.client.inference_session.step:327] Caught exception when running inference via RemoteSpanInfo(peer_id=<libp2p.peer.id.ID (12D3KooWLRHAtX9ccW9i1NvpPLigwGX9MstGw3oyCZwmh21EBzGt)>, start=0, end=18, server_info=ServerInfo(state=<ServerState.ONLINE: 2>, throughput=1040.823002928876, public_name=':duck:FYY:sun_with_face:', version='2.2.0', network_rps=1185.4980980484086, forward_rps=9887.81852782432, inference_rps=343.100557603763, adapters=(), torch_dtype='bfloat16', quant_type='nf4', using_relay=False, cache_tokens_left=1179648, next_pings={...})) (retry in 2 sec): AssertionError("Broken input cache: span=RemoteSpanInfo(peer_id=<libp2p.peer.id.ID (12D3KooWLRHAtX9ccW9i1NvpPLigwGX9MstGw3oyCZwmh21EBzGt)>, start=0, end=18, server_info=ServerInfo(state=<ServerState.ONLINE: 2>, throughput=1040.823002928876, public_name=':duck:FYY:sun_with_face:', version='2.2.0', network_rps=1185.4980980484086, forward_rps=9887.81852782432, inference_rps=343.100557603763, adapters=(), torch_dtype='bfloat16', quant_type='nf4', using_relay=False, cache_tokens_left=1179648, next_pings={...})) shape=torch.Size([1, 579, 8192]) position=0 n_input_tokens=1")
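
The assertion message above reports a cached input of 579 tokens (`shape=torch.Size([1, 579, 8192])`) together with `position=0` and `n_input_tokens=1`. A minimal sketch of how a check of this kind can trip is given below. It assumes the invariant is "history length equals position plus the number of new tokens", which is inferred from the fields in the message rather than taken from the Petals source. `ToySession` and `fails_like_the_log` are hypothetical names, and plain lists stand in for tensors.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToySession:
    """Hypothetical stand-in for a per-server session; not the Petals class."""

    history: List[int] = field(default_factory=list)  # inputs cached so far
    position: int = 0  # number of tokens the session believes it has processed

    def step(self, new_tokens: List[int]) -> None:
        n_input_tokens = len(new_tokens)
        # Append the new tokens only when the cache is in sync with the position
        if len(self.history) == self.position:
            self.history = self.history + new_tokens
        # Assumed invariant: the cache covers exactly position + new tokens
        assert len(self.history) == self.position + n_input_tokens, (
            f"Broken input cache: shape={len(self.history)} "
            f"position={self.position} n_input_tokens={n_input_tokens}"
        )
        self.position += n_input_tokens


def fails_like_the_log() -> bool:
    """Return True if a 579-token cache with position 0 trips the check."""
    broken = ToySession(history=list(range(579)), position=0)
    try:
        broken.step([0])
    except AssertionError:
        return True
    return False
```

Under that assumption, a session that carries a 579-token history but starts from position 0 fails on the next one-token step because 579 != 0 + 1, which matches the numbers in the log.
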
Comment thread src/petals/client/inference_session.py Outdated
    # If there is a failed span, this code replaces it, otherwise it just adds new ones
    if server_idx < n_prev_spans:
        updated_sessions[0].history = self._server_sessions[server_idx].history
        updated_sessions[0].position = self._position

borzunov (Collaborator, Author) commented:

This is the fix
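
For readers without the full diff, the idea behind the commented lines can be sketched in isolation: when a failed span is replaced, the first replacement session takes over both the failed session's history and the client's current position, so the two stay consistent. The names `SessionStub` and `replace_failed_span` are hypothetical, and the list-splicing step is an assumption about how the surrounding code uses `updated_sessions`; only the two assignments come from the snippet in this thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SessionStub:
    """Hypothetical stand-in for a server session; not the Petals class."""

    history: List[int] = field(default_factory=list)
    position: int = 0


def replace_failed_span(
    sessions: List[SessionStub],
    server_idx: int,
    updated_sessions: List[SessionStub],
    client_position: int,
) -> List[SessionStub]:
    """Return a new session list with sessions[server_idx] replaced.

    If server_idx points past the end, the new sessions are simply appended.
    """
    n_prev_spans = len(sessions)
    # If there is a failed span, carry its state over to the first replacement
    if server_idx < n_prev_spans:
        updated_sessions[0].history = sessions[server_idx].history
        updated_sessions[0].position = client_position
    # Assumed splice: swap the single failed session for the new ones
    return sessions[:server_idx] + updated_sessions + sessions[server_idx + 1:]
```

A replacement built this way starts with a history and a position that agree with each other, instead of a copied history paired with a position of 0.
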

borzunov marked this pull request as draft on September 28, 2023, 17:33
borzunov force-pushed the fix-inference-retry branch 2 times, most recently from 160211a to 567e34b, on September 28, 2023, 18:40
borzunov force-pushed the fix-inference-retry branch 2 times, most recently from ef59cf6 to 3f70ab6, on October 23, 2023, 16:50
borzunov force-pushed the fix-inference-retry branch from 9289d93 to 63282af on October 23, 2023, 19:00