I am running some tests with Ollama on a local computer, using Llama 3.2. Each test consists of running a prompt against a document.

I read that once the maximum context has been reached, I should restart the session: https://www.reddit.com/r/ollama/comments/1hrsoav/comment/n2bgj2r/

That statement confuses me, because I don't know what a session is when I start the server on a local computer.

It makes me wonder whether I should restart the Ollama server each time I run an experiment.

Each experiment consists of executing a prompt on a document. I am testing the effects of different prompts and of context size, so I re-run the test over and over, each time assigning the result to the same variable, like so:

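from ollama import chat, ChatResponse, Options

MODEL_NAME = "llama3.2"  # assumed model tag for Llama 3.2; adjust to the tag you pulled

def get_completion(prompt: str, system_prompt="", prefill=""):
    response: ChatResponse = chat(
        model=MODEL_NAME,
        options=Options(
            num_predict=2000,  # Ollama's option for the maximum number of generated tokens
            temperature=0.0,
            num_ctx=2048 * 4,  # context window size, in tokens
        ),
        # The full conversation is passed explicitly on every call
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": prefill},
        ],
    )
    return response.message.content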

And then I save the results:

results = get_completion(PROMPT, SYSTEM_PROMPT, PREFILL)
# Save on file

# Change Prompt
# Repeat
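
The "change prompt, repeat" loop above could be sketched roughly as follows. This is only an illustration: `run_prompt_variants`, the `complete` callable, and the one-JSON-file-per-variant layout are hypothetical names and choices, not part of the Ollama API. `complete` stands in for a wrapper around `get_completion`.

```python
import json
from pathlib import Path
from typing import Callable, Dict


def run_prompt_variants(
    variants: Dict[str, str],
    complete: Callable[[str], str],
    out_dir: Path,
) -> Dict[str, Path]:
    """Run each named prompt once and save its result to its own JSON file.

    `complete` is a hypothetical stand-in for a function such as
    get_completion; it takes a prompt and returns the model's text.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    saved: Dict[str, Path] = {}
    for name, prompt in variants.items():
        result = complete(prompt)  # one separate call per prompt variant
        path = out_dir / f"{name}.json"
        record = {"prompt": prompt, "result": result}
        path.write_text(json.dumps(record, indent=2), encoding="utf-8")
        saved[name] = path
    return saved
```

With the function from the question, a call might look like `run_prompt_variants({"v1": PROMPT}, lambda p: get_completion(p, SYSTEM_PROMPT, PREFILL), Path("results"))`. Keeping one file per variant avoids overwriting earlier results when the same variable is reused.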

However, I am not sure whether the model internally keeps a memory of prior prompts and their results: is the latest prompt independent, or is it biased by prior "chats"?
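
One way this could be probed empirically is to run a fixed probe prompt, interleave unrelated prompts, and check whether the probe's output changes. The sketch below assumes deterministic decoding (for example `temperature=0.0`, ideally with a fixed seed); `is_order_independent` and `complete` are hypothetical names, with `complete` again standing in for a wrapper around `get_completion`. Identical outputs are consistent with independent calls but do not strictly prove it, and differing outputs could also come from nondeterminism rather than memory.

```python
from typing import Callable, Iterable


def is_order_independent(
    complete: Callable[[str], str],
    probe: str,
    distractors: Iterable[str],
) -> bool:
    """Return True if the probe's output is unchanged after unrelated calls."""
    baseline = complete(probe)
    for distractor in distractors:
        complete(distractor)  # unrelated call that would affect a stateful backend
        if complete(probe) != baseline:
            return False  # the probe's output depends on what ran before it
    return True
```

For example, `is_order_independent(lambda p: get_completion(p, SYSTEM_PROMPT, PREFILL), PROMPT, ["Unrelated question 1", "Unrelated question 2"])` would re-run the same prompt around two unrelated ones.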

Should I restart the server each time to ensure that each prompt test is independent of prior tests, or is that unnecessary?

I would be grateful if you could clarify what a session is when Ollama runs a model on a local computer and prompts are executed in Jupyter cells (I mean, the chat is not continuous).
