By the way, we can get the AZURE_API_KEY from the Home page.
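As a minimal sketch of the client setup (assuming the openai SDK's AzureOpenAI class; the endpoint and API version below are placeholders, not values from the source):

import os
from openai import AzureOpenAI

# AZURE_API_KEY is assumed to be set in the environment
client = AzureOpenAI(
    api_key=os.getenv("AZURE_API_KEY"),
    api_version="2024-06-01",  # placeholder API version
    azure_endpoint="https://<your-resource>.openai.azure.com/"  # placeholder endpoint
)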
2. Chat Client
2.1. Simple Chat Demonstration
Let's test a simple conversation with our chat model:
messages = [{"role": "user", "content": "What is 2+2?"}]
response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=messages
)
print(response.choices[0].message.content)
Running this prints the model's answer to the console.
2.2. System Prompt and User Prompt
For a chat client there are two kinds of prompts that control the behaviour of a chat model (a short sketch follows below):
The system prompt provides the overall instructions that set the context for the task.
The user prompt is the actual question coming from the user.
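As a minimal sketch (reusing the client and model from above), the two prompts simply become the first entries of the messages list:

messages = [
    {"role": "system", "content": "You are a concise assistant. Answer in one short sentence."},
    {"role": "user", "content": "What is 2+2?"}
]
response = client.chat.completions.create(model="gpt-4.1-mini", messages=messages)
print(response.choices[0].message.content)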
2.3. Example: Read my Resume (PDF Format) and Answer Questions about It
2.3.1. Preparation for Background Data
Let's load our resume (in PDF format) into text variables:
from pypdf import PdfReader

reader = PdfReader("me/james_lee.pdf")
linkedin = ""
for page in reader.pages:
    text = page.extract_text()
    if text:
        linkedin += text

with open("me/summary.txt", "r", encoding="utf-8") as f:
    summary = f.read()

name = "James Lee"
This prepares the variables linkedin, summary and name.
2.3.2. Preparation for System Prompt
system_prompt = f"""
You are acting as {name}. You are answering questions on {name}'s website,
particularly questions related to {name}'s career, background, skills and experience.
Your responsibility is to represent {name} for interactions on the website as faithfully as possible.
You are given a summary of {name}'s background and LinkedIn profile which you can use to answer questions.
Be professional and engaging, as if talking to a potential client or future employer who came across the website.
If you don't know the answer, say so.
"""

system_prompt += f"\n\n## Summary:\n{summary}\n\n## LinkedIn Profile:\n{linkedin}\n\n"
system_prompt += f"With this context, please chat with the user, always staying in character as {name}."
Now we can constrain the response schema via:
# note it is parse(), not create()
response = client.chat.completions.parse(
    model="gpt-4.1-mini",
    messages=messages,
    response_format=Evaluation
)
result = response.choices[0].message.parsed
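The parsed result is then an Evaluation instance, so its fields can be read directly:

print(result.is_acceptable)
print(result.feedback)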
2.5. System Prompt can be Dynamic
Note that the system prompt can be changed according to the incoming message before we push the data to the LLM for an answer:
def chat(message, history):
    if "patent" in message:
        system = system_prompt + "\n\nEverything in your reply needs to be in pig latin - \
            it is mandatory that you respond only and entirely in pig latin"
    else:
        system = system_prompt

    # the updated system prompt:
    messages = [{"role": "system", "content": system}] \
        + history \
        + [{"role": "user", "content": message}]
    response = openai.chat.completions.create(model="gpt-4o-mini", messages=messages)
    reply = response.choices[0].message.content

    # we evaluate the response via another model:
    evaluation = evaluate(reply, message, history)

    if evaluation.is_acceptable:
        print("Passed evaluation - returning reply")
    else:
        print("Failed evaluation - retrying")
        print(evaluation.feedback)
        reply = rerun(reply, message, history, evaluation.feedback)
    return reply
We are free to prepend any adjusted system prompt to messages and inject it into our chat client.
When we need to rerun the flow, we adjust the system prompt again in rerun:
def evaluate(reply, message, history) -> Evaluation:
    # system prompt: provides the rules for the evaluation
    # user prompt: provides the data for the evaluation and asks the model to evaluate it
    messages = [{"role": "system", "content": evaluator_system_prompt}] \
        + [{"role": "user", "content": evaluator_user_prompt(reply, message, history)}]
    response = gemini.beta.chat.completions.parse(model="gemini-2.0-flash", messages=messages, response_format=Evaluation)
    return response.choices[0].message.parsed


def rerun(reply, message, history, feedback):
    updated_system_prompt = system_prompt + "\n\n## Previous answer rejected\nYou just tried to reply, but the quality control rejected your reply\n"
    updated_system_prompt += f"## Your attempted answer:\n{reply}\n\n"
    updated_system_prompt += f"## Reason for rejection:\n{feedback}\n\n"
    messages = [{"role": "system", "content": updated_system_prompt}] + history + [{"role": "user", "content": message}]
    response = openai.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response.choices[0].message.content
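The evaluate function above references evaluator_system_prompt and evaluator_user_prompt, which are not defined in this excerpt. A rough, hypothetical sketch of what they could look like (not necessarily the original prompts):

evaluator_system_prompt = (
    f"You are an evaluator that decides whether a response to a question is acceptable. "
    f"You are given a conversation between a user and an agent acting as {name}. "
    f"Decide whether the agent's latest reply is acceptable and give your feedback."
)

def evaluator_user_prompt(reply, message, history):
    # bundle the conversation, the latest user message and the candidate reply for the evaluator
    prompt = f"Here's the conversation so far:\n\n{history}\n\n"
    prompt += f"Here's the latest message from the user:\n\n{message}\n\n"
    prompt += f"Here's the agent's reply:\n\n{reply}\n\n"
    prompt += "Please evaluate the reply, stating whether it is acceptable and giving your feedback."
    return prompt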
3. Agents and Tools
This part is now replaced by MCP, but we leave a record here to understand what's happening under the hood.
The purpose of this section is to show how complex it is to bring tools into an application, and thus why we need MCP as an abstraction.
3.1. Define Tools
Usually we define tools as ordinary functions:
def record_user_details(email, name="Name not provided", notes="not provided"):
    push(f"Recording interest from {name} with email {email} and notes {notes}")
    return {"recorded": "ok"}

def record_unknown_question(question):
    push(f"Recording {question} asked that I couldn't answer")
    return {"recorded": "ok"}
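Both tools call a push helper that is not shown in this excerpt. One common choice is a push-notification service such as Pushover; the following is only a sketch under that assumption (PUSHOVER_TOKEN and PUSHOVER_USER are hypothetical environment variables, and any logging or notification mechanism would do):

import os
import requests

def push(text):
    # send a push notification via the Pushover API (assumes a Pushover account)
    requests.post(
        "https://api.pushover.net/1/messages.json",
        data={
            "token": os.getenv("PUSHOVER_TOKEN"),
            "user": os.getenv("PUSHOVER_USER"),
            "message": text,
        },
    )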
Next we define the metadata for the tools so that our chat client can pick the right tool(s) based on their descriptions and the user prompt (i.e., the user message):
record_user_details_json = {
    "name": "record_user_details",
    "description": "Use this tool to record that a user is interested in being in touch and provided an email address",
    "parameters": {
        "type": "object",
        "properties": {
            "email": {
                "type": "string",
                "description": "The email address of this user"
            },
            "name": {
                "type": "string",
                "description": "The user's name, if they provided it"
            },
            "notes": {
                "type": "string",
                "description": "Any additional information about the conversation that's worth recording to give context"
            }
        },
        "required": ["email"],
        "additionalProperties": False
    }
}

record_unknown_question_json = {
    "name": "record_unknown_question",
    "description": "Always use this tool to record any question that couldn't be answered as you didn't know the answer",
    "parameters": {
        "type": "object",
        "properties": {
            "question": {
                "type": "string",
                "description": "The question that couldn't be answered"
            },
        },
        "required": ["question"],
        "additionalProperties": False
    }
}
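These schemas then have to be wrapped in the format the OpenAI tools API expects; a minimal sketch of assembling the tools list that the chat loop below passes to the model:

tools = [
    {"type": "function", "function": record_user_details_json},
    {"type": "function", "function": record_unknown_question_json},
]

Finally, when the model decides to call a tool, we need a dispatcher that executes the matching Python function and feeds the result back into the conversation: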
import json

def handle_tool_calls(tool_calls):
    results = []
    for tool_call in tool_calls:
        tool_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)
        print(f"Tool called: {tool_name}", flush=True)
        tool = globals().get(tool_name)
        result = tool(**arguments) if tool else {}
        results.append({"role": "tool", "content": json.dumps(result), "tool_call_id": tool_call.id})
    return results


def chat(message, history):
    messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": message}]
    done = False
    while not done:

        # This is the call to the LLM - see that we pass in the tools json
        response = client.chat.completions.create(
            model="gpt-4.1-mini", messages=messages, tools=tools)

        finish_reason = response.choices[0].finish_reason

        # If the LLM wants to call a tool, we do that!
        if finish_reason == "tool_calls":
            message = response.choices[0].message
            tool_calls = message.tool_calls
            results = handle_tool_calls(tool_calls)
            messages.append(message)
            messages.extend(results)
        else:
            done = True
    return response.choices[0].message.content
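The chat(message, history) signature is the one a chat UI such as Gradio expects; wiring the function into a web interface (an assumption, not shown in the source) would look roughly like this:

import gradio as gr

gr.ChatInterface(chat, type="messages").launch()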
Bringing tools into a chat client by hand is tedious, complex and hard to maintain; the advent of MCP simplifies the tooling process considerably.