# math

**Repository Path**: xlldkj/math

## Basic Information

- **Project Name**: math
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-06-03
- **Last Updated**: 2026-06-03

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# math-tutor-agent

A long-running HTTP service that wraps **MiniMax (M2)** into a high-school math tutoring agent. Designed to be called by an external AI agent (e.g. an OpenAI-compatible orchestrator, a Claude/GPT tool-use loop, or a custom dispatcher) over plain HTTP on `127.0.0.1:8765`.

The service holds three hard product constraints that are enforced by **prompt + guard + response flag** at every layer:

| # | Constraint                                          | Endpoint               |
|---|-----------------------------------------------------|------------------------|
| 1 | Image question parsing: ignore "参考答案" / 解/…    | `POST /v1/image/parse` |
| 2 | Tutoring: guidance only, **never** a final answer   | `POST /v1/ask`         |
| 3 | Grading: point out the error, **never** the answer  | `POST /v1/grade`       |

---

## Table of contents

1. [Quick start](#quick-start)
2. [Configuration](#configuration)
3. [Running the service](#running-the-service)
4. [HTTP API reference](#http-api-reference)
5. [AI agent integration](#ai-agent-integration)
6. [Three hard constraints — how they're enforced](#three-hard-constraints--how-theyre-enforced)
7. [Project layout](#project-layout)
8. [Tests](#tests)
9. [Security & secrets](#security--secrets)
10. [Limitations](#limitations)
11. [License](#license)

---

## Quick start

> Requires **Python ≥ 3.10**, a working internet connection to
> `https://api.minimaxi.com`, and a valid MiniMax API key.

```bash
git clone math-tutor-agent
cd math-tutor-agent

# 1. Create and activate a virtualenv
python3 -m venv .venv
source .venv/bin/activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Configure your MiniMax key
cp .env.example .env
# edit .env and replace MINIMAX_API_KEY=

# 4. Start the service in the background
./start.sh

# 5. Verify it's healthy
curl -s http://127.0.0.1:8765/v1/health
# → {"status":"ok","minimax_reachable":true,...}

# 6. Tail logs
tail -f logs/app.log

# 7. Stop the service
./stop.sh
```

If you'd rather run it in the foreground (e.g. under `systemd` or in a container), skip `start.sh` and run:

```bash
uvicorn app.main:app --host 127.0.0.1 --port 8765
```

---

## Configuration

All configuration is read from environment variables (or a `.env` file in the project root). See [`.env.example`](.env.example) for the full template.

| Variable               | Required | Default                       | Notes                                            |
|------------------------|----------|-------------------------------|--------------------------------------------------|
| `MINIMAX_API_KEY`      | **yes**  | *(empty)*                     | MiniMax official API key. **Never commit this.** |
| `MINIMAX_BASE_URL`     | no       | `https://api.minimaxi.com/v1` | OpenAI-compatible chat-completion endpoint.      |
| `MINIMAX_VLM_MODEL`    | no       | `MiniMax-M3`                  | Multimodal model for `/v1/image/parse`.          |
| `MINIMAX_TEXT_MODEL`   | no       | `MiniMax-M3`                  | Text model for `/v1/ask` and `/v1/grade`.        |
| `MINIMAX_TIMEOUT`      | no       | `60`                          | Seconds before a MiniMax request is aborted.     |
| `MATH_TUTOR_HOST`      | no       | `127.0.0.1`                   | Bind address. Loopback only by default.          |
| `MATH_TUTOR_PORT`      | no       | `8765`                        | Bind port.                                       |
| `MATH_TUTOR_LOG_LEVEL` | no       | `INFO`                        | `DEBUG` / `INFO` / `WARNING` / `ERROR`.          |

> ⚠️ **Never put a real key in `.env.example`.** The example file is tracked in
> git and exists solely as a template. The actual `.env` is in `.gitignore`.

---

## Running the service

### `start.sh` / `stop.sh`

`start.sh` activates the local `.venv` (if present), reads `.env`, and runs `uvicorn` in the background with `nohup`, redirecting output to `logs/app.log` and writing the child PID to `.math-tutor.pid`.
`stop.sh` reads that PID file and sends `SIGTERM` (then `SIGKILL` after a 3 s grace period).

```bash
./start.sh
# Starting math-tutor-agent on 127.0.0.1:8765...
# math-tutor-agent started: PID=12345, log=logs/app.log
#   - health: curl http://127.0.0.1:8765/v1/health
#   - stop:   ./stop.sh

tail -f logs/app.log
# 2026-06-03 13:46:00 INFO app.main :: math-tutor-agent 0.1.0 starting on 127.0.0.1:8765
# 2026-06-03 13:46:00 INFO uvicorn.error :: Application startup complete.

./stop.sh
# Stopping math-tutor-agent (PID 12345)...
# math-tutor-agent stopped.
```

### Under `systemd` / Docker / a process manager

`start.sh` is just a thin wrapper. The canonical entry point is:

```bash
python3 -m uvicorn app.main:app --host 127.0.0.1 --port 8765
```

You can drop it into a `systemd` unit, a Dockerfile, a `supervisord` program, or a `Procfile` verbatim. Make sure `MINIMAX_API_KEY` is in the process environment.

---

## HTTP API reference

The service speaks OpenAPI 3; once it is running, interactive docs are at `http://127.0.0.1:8765/docs`.

### `GET /v1/health`

Cheap liveness probe. Pings MiniMax with a 32-token completion to verify the key + network path.

```bash
curl -s http://127.0.0.1:8765/v1/health
```

```json
{
  "status": "ok",
  "minimax_reachable": true
}
```

If `minimax_reachable` is `false`, the service is up but will return `502 BAD_GATEWAY` from any `/v1/ask` / `/v1/grade` / `/v1/image/parse` call.

### `POST /v1/session` / `DELETE /v1/session/{sid}`

Sessions are the unit of memory for the ask/grade endpoints. The store is **in-memory and process-local**: restarting the service clears all sessions, and a multi-replica deployment will not share state.

```bash
SID=$(curl -s -X POST http://127.0.0.1:8765/v1/session | jq -r .session_id)
echo "$SID"
# e.g. 6f1c2c4e8a6f4d27a3e2c8a3b9d1f5a0

curl -s -X DELETE "http://127.0.0.1:8765/v1/session/$SID"
# → {"session_id":"6f1c...","deleted":true}
```

You may also pass an explicit `history` array in the request body of `/v1/ask`, in which case the server-side session is not consulted.

### `POST /v1/image/parse` *(constraint 1)*

`multipart/form-data` upload of a single problem image. Returns the structured stem: `raw_text`, `latex`, sub-questions, topic, etc. The service is **prompted to ignore** any "参考答案:…" (reference answer), "解:…" (solution), or "标准答案:…" (standard answer) residue in the image; a defensive regex sweep scrubs anything that slips through.

```bash
curl -s -X POST http://127.0.0.1:8765/v1/image/parse \
  -F "image=@/path/to/problem.png" | jq
```

Response:

```json
{
  "parsed": {
    "raw_text": "已知 f(x) = 2tan(x - π/3), 求其对称中心的最小正 a 值。",
    "latex": "f(x) = 2\\tan\\left(x - \\frac{\\pi}{3}\\right)",
    "has_figure": false,
    "figure_description": "",
    "sub_questions": ["(1) 求最小正 a 值"],
    "topic": "trigonometric_functions",
    "notes": ""
  },
  "contains_reference_answer": false,
  "image_size": 142330
}
```

Constraints:

- Allowed MIME: `image/png`, `image/jpeg`, `image/jpg`, `image/webp`.
- Hard size cap: **8 MiB**. Larger uploads return `413 Payload Too Large`.

### `POST /v1/ask` *(constraint 2)*

Sends a question + a specific ask to the tutor. The tutor is **forbidden from returning a final answer**; it will give you a direction, a hint, or a re-prompt, and never a solved-out result. The raw model output is also passed through a regex guard (`app/utils/guard.py`); if a leak is detected, the response is replaced with a Socratic re-prompt and `guard_triggered` is set to `true`.

```bash
curl -s -X POST http://127.0.0.1:8765/v1/ask \
  -H "Content-Type: application/json" \
  -d "{
    \"session_id\": \"$SID\",
    \"question\": \"求 f(x) = 2tan(x - π/3) 的对称中心, 最小 a 是多少。\",
    \"ask\": \"完全不会, 怎么入手?\"
  }" | jq
```

Response:

```json
{
  "session_id": "6f1c...",
  "answer": "1. 切线方程需要切点 (1, f(1)) 与斜率 f'(1)。\n2. 请先算 f(1)...\n3. 再求 f'(1), 代入 y - f(1) = f'(1)(x - 1)。",
  "contains_final_answer": false,
  "guard_triggered": false
}
```

Request body schema:

| Field        | Type           | Required | Notes                                                                    |
|--------------|----------------|----------|--------------------------------------------------------------------------|
| `session_id` | string         | yes      | 1–128 chars. Server-side history is keyed by this.                       |
| `question`   | string         | yes      | The problem text the student is working on.                              |
| `ask`        | string         | yes      | The student's specific question / ask.                                   |
| `history`    | list \| null   | no       | Optional override. If present, server history is bypassed for this call. |

### `POST /v1/grade` *(constraint 3)*

Sends a question + a list of student-written steps. The grader points out **the first erroneous step** and the **direction** of the fix; it is prompted to never produce a complete correct derivation, and the regex guard catches the most common leak patterns.

```bash
curl -s -X POST http://127.0.0.1:8765/v1/grade \
  -H "Content-Type: application/json" \
  -d "{
    \"session_id\": \"$SID\",
    \"question\": \"求 f(x) = sin(x) + x^2 的导数。\",
    \"steps_text\": [
      \"f'(x) = cos(x) + 2x\",
      \"所以 f'(x) = -cos(x) + 2x\"
    ]
  }" | jq
```

Response:

```json
{
  "session_id": "6f1c...",
  "error_info": {
    "error_step": 2,
    "reason": "这一步把 sin(x) 的导数写成了 -cos(x)。",
    "concept_misuse": "基本求导公式记反。",
    "formula_error": "(sin x)' = cos x。",
    "reading_error": null,
    "fix_direction": "请重新对照基本求导表, 把三角函数的导数再核一遍。"
  },
  "contains_correct_answer": false,
  "guard_triggered": false
}
```

Request body schema:

| Field             | Type            | Required | Notes                                            |
|-------------------|-----------------|----------|--------------------------------------------------|
| `session_id`      | string          | yes      | 1–128 chars.                                     |
| `question`        | string          | yes      | The problem text.                                |
| `steps_text`      | list of strings | yes*     | Plain-text steps, in order.                      |
| `steps_image_b64` | string \| null  | no       | Optional base64 of a handwritten solution image. |

\* At least one of `steps_text` / `steps_image_b64` must be non-empty.
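The field rules in the `/v1/grade` table can be checked on the caller's side before a request is sent. The sketch below is only an illustration: `build_grade_payload` is a hypothetical helper that does not exist in this repository, and treating blank steps as empty is an assumption about how the server interprets "non-empty".

```python
from typing import Optional


def build_grade_payload(
    session_id: str,
    question: str,
    steps_text: Optional[list[str]] = None,
    steps_image_b64: Optional[str] = None,
) -> dict:
    """Assemble a /v1/grade request body (hypothetical client-side helper)."""
    # The schema table documents session_id as 1-128 characters.
    if not 1 <= len(session_id) <= 128:
        raise ValueError("session_id must be 1-128 characters")
    # `question` is a required field; rejecting blank text is an assumption.
    if not question.strip():
        raise ValueError("question must be non-empty")
    # Assumption: whitespace-only steps do not count as a submitted solution.
    steps = [s for s in (steps_text or []) if s.strip()]
    # At least one of steps_text / steps_image_b64 must be non-empty.
    if not steps and not steps_image_b64:
        raise ValueError("provide steps_text and/or steps_image_b64")
    payload = {
        "session_id": session_id,
        "question": question,
        "steps_text": steps,
    }
    if steps_image_b64:
        payload["steps_image_b64"] = steps_image_b64
    return payload
```

The returned dict can be serialized with `json.dumps(payload, ensure_ascii=False)` and sent as the body of `POST /v1/grade`. Failing early on the client avoids spending a round trip on a request the service would reject anyway.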
### Error envelope

All 4xx/5xx responses use FastAPI's default `{"detail": "..."}` shape. Expect:

- `400 Bad Request`: empty image, missing field, validation failure.
- `404 Not Found`: unknown `session_id` on `DELETE`.
- `413 Payload Too Large`: image > 8 MiB.
- `415 Unsupported Media Type`: wrong MIME.
- `502 Bad Gateway`: MiniMax call failed; see `logs/app.log` for the underlying error.

---

## AI agent integration

The service is designed to be called by an **outer agent loop** (Claude tool use, GPT function calling, LangChain, a custom dispatcher, …). The recommended integration is to register **three** tools on the calling side, all of which forward to the corresponding `POST /v1/...` endpoint. The caller should **never** invent answers on top of the service's replies; it should forward them verbatim, because the service's response has already been guard-scrubbed.

### OpenAI-compatible function-calling schema

Drop these into your tool manifest:

```jsonc
[
  {
    "type": "function",
    "function": {
      "name": "math_parse_image",
      "description": "Upload a problem image. Returns the structured problem statement (raw text, LaTeX, sub-questions, topic). The service ignores any 'reference answer' or '解:…' content in the image.",
      "parameters": {
        "type": "object",
        "properties": {
          "image_path": {
            "type": "string",
            "description": "Absolute path to the image file on the agent host."
          }
        },
        "required": ["image_path"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "math_ask",
      "description": "Ask the math tutor a Socratic question about a problem the student is working on. The tutor will give guidance but NEVER a final answer. Forward the tutor's reply verbatim; do not add, remove, or solve any step on the caller's side.",
      "parameters": {
        "type": "object",
        "properties": {
          "session_id": { "type": "string" },
          "question": {
            "type": "string",
            "description": "The problem text the student is working on."
          },
          "ask": {
            "type": "string",
            "description": "The student's specific question / ask (free-form Chinese or English)."
          }
        },
        "required": ["session_id", "question", "ask"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "math_grade",
      "description": "Submit a student's solution (as a list of plain-text steps) for grading. The grader will point out the first erroneous step and the direction of the fix, but NEVER the correct derivation. Forward the response verbatim.",
      "parameters": {
        "type": "object",
        "properties": {
          "session_id": { "type": "string" },
          "question": { "type": "string" },
          "steps_text": {
            "type": "array",
            "items": { "type": "string" }
          }
        },
        "required": ["session_id", "question", "steps_text"]
      }
    }
  }
]
```

### Reference dispatcher (Python)

A complete reference dispatcher is provided in [`openclaw_agent.py`](openclaw_agent.py). It is roughly 350 lines and demonstrates the full pattern: tool registration, multi-turn session memory, error handling, and a guardrail system prompt that prevents the outer LLM from re-introducing the answer. Run it with:

```bash
# interactive REPL: type problems, /img for images
python3 openclaw_agent.py

# canned 5-turn dialogue against /path/to/problem.png
python3 openclaw_agent.py --demo --image /path/to/problem.png
```

### Caller-side guardrail prompt

When wiring this into a multi-LLM agent, the outer LLM's system prompt **must** include language equivalent to:

> You are a dispatcher, NOT a tutor. For any student math question, you
> must call one of `math_parse_image`, `math_ask`, or `math_grade`. You
> must never solve the problem yourself. You must forward the backend's
> response to the student verbatim, even if it looks too terse.

Without this, the outer LLM will sometimes "fix up" the backend's guidance by solving the rest of the problem, which is exactly what constraint 2 / 3 forbid.

---

## Three hard constraints — how they're enforced

Each constraint has **three** independent layers of defense.
If the model slips past layer 1, layer 2 catches it; if both fail, layer 3 makes the leak visible to the caller for post-hoc audit.

| Layer | Mechanism                                                                          | Source                                                     |
|-------|------------------------------------------------------------------------------------|------------------------------------------------------------|
| 1     | System prompt with explicit "DO NOT" rules and worked examples                     | `app/prompts/{parse_image,ask,grade}.py`                   |
| 2     | Regex / keyword guard on the raw model output; replaces leaky text with a fallback | `app/utils/guard.py`                                       |
| 3     | `contains_final_answer` / `contains_correct_answer` / `guard_triggered` flags      | `app/services/{tutor,grader}.py`, `app/services/parser.py` |

For constraint 1 (image parse), a second defensive sweep is applied **after** layer 1: any text matching `参考答案[:：]`, `解[:：]`, `解析[:：]`, or `https?://` is stripped from the response before it leaves the service.

To verify the constraints end-to-end after any change:

```bash
# constraint 1: upload an image with a "参考答案" watermark and assert
# that the response's raw_text does not contain "参考答案"
curl -s -X POST http://127.0.0.1:8765/v1/image/parse \
  -F "image=@tests/fixtures/q_watermark.png" | jq '.parsed.raw_text'

# constraint 2: ask "答案是什么" and assert the response does not match
# any pattern in app.utils.guard.FINAL_ANSWER_PATTERNS
curl -s -X POST http://127.0.0.1:8765/v1/ask \
  -H "Content-Type: application/json" \
  -d '{"session_id":"t","question":"1+1=?","ask":"答案是什么"}' | jq

# constraint 3: submit a deliberately wrong step and assert
# error_info.error_step is populated, error_info.fix_direction is
# a *direction* (no complete correct derivation).
curl -s -X POST http://127.0.0.1:8765/v1/grade \
  -H "Content-Type: application/json" \
  -d '{"session_id":"t","question":"d/dx sin(x)","steps_text":["cos(x)","-cos(x)"]}' | jq
```

---

## Project layout

```
math-tutor-agent/
├── app/
│   ├── main.py                # FastAPI app, lifespan, router wiring
│   ├── config.py              # pydantic-settings (reads .env)
│   ├── minimax_client.py      # async wrapper over MiniMax chat-completion v2
│   ├── prompts/               # the three hard-constraint SYSTEM prompts
│   │   ├── parse_image.py     # constraint 1
│   │   ├── ask.py             # constraint 2
│   │   └── grade.py           # constraint 3
│   ├── services/              # business logic
│   │   ├── parser.py          # image → ParsedQuestion
│   │   ├── tutor.py           # Socratic answer
│   │   └── grader.py          # error-pointing grading
│   ├── routers/               # one router per resource, mounted under /v1
│   │   ├── health.py
│   │   ├── image.py
│   │   ├── ask.py
│   │   ├── grade.py
│   │   └── session.py
│   ├── models/schemas.py      # Pydantic request/response models
│   └── utils/
│       ├── guard.py           # answer-leak detector & sanitizer
│       └── session.py         # in-memory session store
├── tests/                     # pytest, FakeMiniMax stub, no real network
├── openclaw_agent.py          # reference AI-agent dispatcher
├── start.sh / stop.sh         # background lifecycle
├── requirements.txt
├── pyproject.toml
├── .env.example               # template (tracked)
├── .env                       # actual key (gitignored)
└── README.md
```

---

## Tests

```bash
source .venv/bin/activate
pytest -q
```

The test suite uses FastAPI's `TestClient` and a `FakeMiniMax` stub (`tests/conftest.py`) so it makes **no real network calls**. The `test_guard.py` module exercises the regex blacklists against both safe and leaking text corpora. `test_tutor.py` and `test_grader.py` verify that the guard fires on a model output that contains a leak and that the response is replaced with the Socratic fallback.

---

## Security & secrets

- **Never** commit `.env`. It is in `.gitignore`. The tracked template is `.env.example`.
- If you accidentally commit a key, **rotate it immediately** on the MiniMax dashboard; git history is forever.
- The service binds to `127.0.0.1` by default. If you change `MATH_TUTOR_HOST` to `0.0.0.0`, anyone reachable on the network can call your endpoint (and burn through your API quota). Put it behind a reverse proxy with auth if you need to expose it.
- Image uploads are capped at 8 MiB and validated to be PNG/JPEG/WebP. Larger or wrong-type payloads are rejected at the router before they reach the model.
- The service does not log the API key. It does log session IDs and short error excerpts to `logs/app.log`; if you ship that log off-host, scrub it first.

---

## Limitations

- **Single-process session store.** Restarting the service clears all sessions. Multi-replica deployments will not share memory; each replica has its own store. Use the explicit `history` field on `/v1/ask` to drive a fully stateless agent.
- **No persistence.** There is no database; image, question, and grading data live only for the duration of the request.
- **No auth.** The endpoints are open. Bind to loopback and/or put behind a reverse proxy.
- **No streaming.** The current model integration is request/response only. SSE / chunked transfer is a future-work item.
- **Best-effort guard.** The regex guard catches the most common leak shapes but is not a complete formal proof. Treat `contains_final_answer` / `contains_correct_answer` as a hint, not as a guarantee.

---

## License

[MIT](LICENSE)