Commit 252f7a4

Refactor agent for testability and add integration test
- Refactored `spendee_agent.py` to accept a question as an argument and return the LLM response, making it easier to test.
- Replaced the previous mock-based test with a new integration test that makes a real LLM call.
- The test verifies that the agent's response to a specific question about the sky's color contains the keyword "Rayleigh".
- The prompt in the test was made more specific to ensure a reliable response from the LLM.
1 parent 5990638 commit 252f7a4

3 files changed

Lines changed: 33 additions & 3 deletions

agent-test/spendee_agent.py

Lines changed: 4 additions & 3 deletions
@@ -14,7 +14,7 @@

 app = MCPApp(name="hello_world_agent")

-async def example_usage():
+async def example_usage(question: str):
     async with app.run() as mcp_agent_app:
         logger = mcp_agent_app.logger
         # This agent can read the filesystem or fetch URLs
@@ -32,10 +32,11 @@ async def example_usage():

             # This will perform a file lookup and read using the filesystem server
             result = await llm.generate_str(
-                message="Why is the sky blue? Explain in two sentences."
+                message=question
             )
             logger.info(f"Response: {result}")
+            return result


 if __name__ == "__main__":
-    asyncio.run(example_usage())
+    asyncio.run(example_usage("Why is the sky blue? Explain in two sentences."))
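The shape of this refactor (take the question as a parameter, return the result instead of only logging it) can be sketched without any dependency on the agent framework. In the sketch below, `FakeLLM` and the optional `llm` parameter are hypothetical stand-ins introduced for illustration only; the function in this commit takes just `question` and talks to a real LLM.

```python
import asyncio


class FakeLLM:
    """Hypothetical stand-in for the attached LLM; not part of the commit."""

    async def generate_str(self, message: str) -> str:
        # Echo the prompt so a caller can verify exactly what was sent.
        return f"echo: {message}"


async def example_usage(question: str, llm=None) -> str:
    # `llm` is an illustrative injection point (an assumption); the
    # committed function accepts only `question`.
    llm = llm or FakeLLM()
    result = await llm.generate_str(message=question)
    # Returning the result is what lets a test assert on it.
    return result


if __name__ == "__main__":
    print(asyncio.run(example_usage("Why is the sky blue?")))
```

Because the function returns its result, a caller can inspect the response directly rather than scraping log output.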

agent-test/test_spendee_agent.py

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+import pytest
+import spendee_agent
+
+@pytest.mark.asyncio
+async def test_example_usage_with_real_llm():
+    """
+    Tests that the example_usage function, when making a real LLM call,
+    returns a response that contains the expected keyword.
+    """
+    # Arrange
+    question = "Why is the sky blue? Explain in two sentences, mentioning the scientific name for the scattering effect."
+
+    # Act
+    response = await spendee_agent.example_usage(question)
+
+    # Assert
+    assert response is not None, "The LLM response should not be None."
+    assert "rayleigh" in response.lower(), "The response should contain the word 'Rayleigh'."
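The two assertions above amount to a case-insensitive keyword check that also guards against a missing response. A minimal sketch of that check as a standalone function follows; `mentions_keyword` is a hypothetical helper name, not something defined in this commit.

```python
def mentions_keyword(response, keyword: str) -> bool:
    """Return True if `response` is a string containing `keyword`, ignoring case.

    Hypothetical helper for illustration; the committed test inlines this
    logic as two separate assertions.
    """
    # A None (or otherwise non-string) response can never contain the keyword.
    if not isinstance(response, str):
        return False
    # casefold() is a slightly more aggressive lower() intended for
    # caseless comparison.
    return keyword.casefold() in response.casefold()
```

Matching on a single keyword keeps the test tolerant of wording differences between LLM runs, which is presumably why the prompt explicitly asks for the scientific name of the effect.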

agents.md

Lines changed: 11 additions & 0 deletions
@@ -55,3 +55,14 @@ The authentication in the firebase_client.py is not intuitive, you may used to h


 If you face authorization problems, troubleshoot what identities and wallets are used, or you may experiment with new firebase centric functions, but the authentication steps in the login flow should be only modified if user approved or explicitly asked.
+
+## Jules Agent
+
+### Session Start Checklist
+- `git pull origin main`
+- `./setup.sh`
+- `source .venv/bin/activate`
+
+### Session End Checklist
+- All tests pass without errors.
+- All learnings from the development process are documented in either the existing docs, `agents.md`, or a new `docs/session-learnings-<date>-<topic>.md` file.
