[Bug]: Configuration for custom model endpoint does not work #1460
Bug Description
Hi,
I have an LLM served by vLLM.
During setup I run:
hermes model
and set the base URL to:
http://vllm:8000/v1
This returns the error: httpx.ReadError: [Errno 104] Connection reset by peer
When I instead set:
http://vllm:8000
the vLLM logs show the error: "POST /chat/completions HTTP/1.1" 404 Not Found
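The 404 is consistent with how OpenAI-compatible clients usually build the request URL: the chat completions route is appended to whatever base URL was configured, so a base without the /v1 prefix produces a path vLLM does not serve. A minimal sketch of that joining (the helper name is mine, not Hermes's actual code):

```python
def chat_completions_url(base_url: str) -> str:
    """Join the configured base URL with the OpenAI chat completions route."""
    return base_url.rstrip("/") + "/chat/completions"

# With the /v1 suffix -- the path vLLM actually serves:
print(chat_completions_url("http://vllm:8000/v1"))   # http://vllm:8000/v1/chat/completions
# Without it -- the path that produced the 404 in the vLLM logs:
print(chat_completions_url("http://vllm:8000"))      # http://vllm:8000/chat/completions
```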
curl works from the environment where the agent is installed, for example:
curl http://vllm:8000/v1/models
The most annoying part is the lack of any information about what the final URL to the LLM inference service looks like. That makes it hard to debug and to see where the problem is.
The biggest problem in this application is the lack of service verification during setup. When running hermes model, connectivity should be verified against some endpoint (for example /models), with fast feedback to the user if there is a problem and the URL needs to be changed.
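The setup-time check suggested above could be as simple as probing the /models endpoint of the configured base URL. A minimal sketch using only the standard library (the function name and return convention are my assumptions, not Hermes's implementation):

```python
import json
import urllib.error
import urllib.request

def verify_endpoint(base_url: str, api_key: str = "", timeout: float = 5.0) -> bool:
    """Probe the OpenAI-compatible /models endpoint of base_url.

    Returns True if the endpoint answers 200 with a model list,
    False on any connection error, timeout, or bad response --
    giving the user fast feedback before the config is saved.
    """
    url = base_url.rstrip("/") + "/models"
    req = urllib.request.Request(url)
    if api_key:
        req.add_header("Authorization", f"Bearer {api_key}")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            body = json.load(resp)
            return resp.status == 200 and "data" in body
    except (OSError, ValueError):
        # OSError covers urllib.error.URLError (connection refused/reset,
        # timeouts); ValueError covers non-JSON responses.
        return False
```

With a base URL like http://vllm:8000/v1 this would hit the same http://vllm:8000/v1/models route that the working curl command uses, so a misconfigured base (missing /v1, wrong host) would be caught immediately.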
Steps to Reproduce
Run hermes chat, then type hi.
The error appears: httpx.ReadError: [Errno 104] Connection reset by peer
Expected Behavior
During installation the setup sequence was:
⚠ An inference provider is required for Hermes to work.
Select your inference provider:
↑/↓ Navigate Enter Select Esc Skip Ctrl+C Exit
◆ Custom OpenAI-Compatible Endpoint
Works with any API that follows OpenAI's chat completions spec
API base URL (e.g., https://api.example.com/v1): http://vllm:8000/v1
API key:
Model name (e.g., gpt-4, claude-3-opus) [anthropic/claude-opus-4.6]: llm
✓ Custom endpoint configured
Because the example shown is https://api.example.com/v1, I used a URL with the /v1 suffix.
Actual Behavior
The vLLM logs show no access to the /v1/chat/completions endpoint at all.
Affected Component
Setup / Installation
Messaging Platform (if gateway-related)
N/A (CLI only)
Operating System
ubuntu:26.04
Python Version
bash: python: command not found
Hermes Version
0.2.0
Relevant Logs / Traceback
Root Cause Analysis (optional)
No response
Proposed Fix (optional)
No response
Are you willing to submit a PR for this?
- I'd like to fix this myself and submit a PR