# math

**Repository Path**: xlldkj/math

## Basic Information

- **Project Name**: math
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-06-03
- **Last Updated**: 2026-06-03

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# math-tutor-agent

A long-running HTTP service that wraps **MiniMax (M2)** into a high-school math
tutoring agent. Designed to be called by an external AI agent (e.g. an
OpenAI-compatible orchestrator, a Claude/GPT tool-use loop, or a custom
dispatcher) over plain HTTP on `127.0.0.1:8765`.

The service holds three hard product constraints that are enforced by **prompt
+ guard + response flag** at every layer:

| # | Constraint                                             | Endpoint                |
|---|--------------------------------------------------------|-------------------------|
| 1 | Image question parsing — ignore "参考答案" / 解/…   | `POST /v1/image/parse`  |
| 2 | Tutoring — guidance only, **never** a final answer     | `POST /v1/ask`          |
| 3 | Grading — point out the error, **never** the answer    | `POST /v1/grade`        |

---

## Table of contents

1. [Quick start](#quick-start)
2. [Configuration](#configuration)
3. [Running the service](#running-the-service)
4. [HTTP API reference](#http-api-reference)
5. [AI agent integration](#ai-agent-integration)
6. [Three hard constraints — how they're enforced](#three-hard-constraints--how-theyre-enforced)
7. [Project layout](#project-layout)
8. [Tests](#tests)
9. [Security & secrets](#security--secrets)
10. [Limitations](#limitations)
11. [License](#license)

---

## Quick start

> Requires **Python ≥ 3.10**, a working internet connection to
> `https://api.minimaxi.com`, and a valid MiniMax API key.

```bash
git clone <your-repo-url> math-tutor-agent
cd math-tutor-agent

# 1. Create and activate a virtualenv
python3 -m venv .venv
source .venv/bin/activate

# 2. Install dependencies
pip install -r requirements.txt

# 3. Configure your MiniMax key
cp .env.example .env
# edit .env and replace MINIMAX_API_KEY=<your-key>

# 4. Start the service in the background
./start.sh

# 5. Verify it's healthy
curl -s http://127.0.0.1:8765/v1/health
# → {"status":"ok","minimax_reachable":true,...}

# 6. Tail logs
tail -f logs/app.log

# 7. Stop the service
./stop.sh
```

If you'd rather run it in the foreground (e.g. under `systemd` or in a
container), skip `start.sh` and run:

```bash
uvicorn app.main:app --host 127.0.0.1 --port 8765
```

---

## Configuration

All configuration is read from environment variables (or a `.env` file in
the project root). See [`.env.example`](.env.example) for the full template.

| Variable               | Required | Default                              | Notes                                              |
|------------------------|----------|--------------------------------------|----------------------------------------------------|
| `MINIMAX_API_KEY`      | **yes**  | *(empty)*                            | MiniMax official API key. **Never commit this.**   |
| `MINIMAX_BASE_URL`     | no       | `https://api.minimaxi.com/v1`        | OpenAI-compatible chat-completion endpoint.        |
| `MINIMAX_VLM_MODEL`    | no       | `MiniMax-M3`                         | Multimodal model for `/v1/image/parse`.            |
| `MINIMAX_TEXT_MODEL`   | no       | `MiniMax-M3`                         | Text model for `/v1/ask` and `/v1/grade`.          |
| `MINIMAX_TIMEOUT`      | no       | `60`                                 | Seconds before a MiniMax request is aborted.       |
| `MATH_TUTOR_HOST`      | no       | `127.0.0.1`                          | Bind address. Loopback only by default.            |
| `MATH_TUTOR_PORT`      | no       | `8765`                               | Bind port.                                         |
| `MATH_TUTOR_LOG_LEVEL` | no       | `INFO`                               | `DEBUG` / `INFO` / `WARNING` / `ERROR`.            |

> ⚠️ **Never put a real key in `.env.example`.** The example file is tracked in
> git and exists solely as a template. The actual `.env` is in `.gitignore`.

---

## Running the service

### `start.sh` / `stop.sh`

`start.sh` activates the local `.venv` (if present), reads `.env`, and runs
`uvicorn` in the background with `nohup`, redirecting output to `logs/app.log`
and writing the child PID to `.math-tutor.pid`. `stop.sh` reads that PID
file and sends `SIGTERM` (then `SIGKILL` after a 3 s grace period).

```bash
./start.sh
# Starting math-tutor-agent on 127.0.0.1:8765...
# math-tutor-agent started: PID=12345, log=logs/app.log
#   - health: curl http://127.0.0.1:8765/v1/health
#   - stop:   ./stop.sh

tail -f logs/app.log
# 2026-06-03 13:46:00 INFO    app.main :: math-tutor-agent 0.1.0 starting on 127.0.0.1:8765
# 2026-06-03 13:46:00 INFO    uvicorn.error :: Application startup complete.

./stop.sh
# Stopping math-tutor-agent (PID 12345)...
# math-tutor-agent stopped.
```

### Under `systemd` / Docker / a process manager

`start.sh` is just a thin wrapper. The canonical entry point is:

```bash
python3 -m uvicorn app.main:app --host 127.0.0.1 --port 8765
```

You can drop it into a `systemd` unit, a Dockerfile, a `supervisord`
program, or a `Procfile` verbatim. Make sure `MINIMAX_API_KEY` is in the
process environment.

---

## HTTP API reference

The service speaks OpenAPI 3 — once it is running, interactive docs are at
`http://127.0.0.1:8765/docs`.

### `GET /v1/health`

Cheap liveness probe. Pings MiniMax with a 32-token completion to verify
the key + network path.

```bash
curl -s http://127.0.0.1:8765/v1/health
```

```json
{ "status": "ok", "minimax_reachable": true }
```

If `minimax_reachable` is `false`, the service is up but will return
`502 BAD_GATEWAY` from any `/v1/ask` / `/v1/grade` / `/v1/image/parse` call.

### `POST /v1/session` / `DELETE /v1/session/{sid}`

Sessions are the unit of memory for the ask/grade endpoints. The store is
**in-memory and process-local** — restarting the service clears all
sessions, and a multi-replica deployment will not share state.

```bash
SID=$(curl -s -X POST http://127.0.0.1:8765/v1/session | jq -r .session_id)
echo "$SID"
# e.g. 6f1c2c4e8a6f4d27a3e2c8a3b9d1f5a0

curl -s -X DELETE "http://127.0.0.1:8765/v1/session/$SID"
# → {"session_id":"6f1c...","deleted":true}
```

You may also pass an explicit `history` array in the request body of
`/v1/ask`, in which case the server-side session is not consulted.

### `POST /v1/image/parse`  *(constraint 1)*

`multipart/form-data` upload of a single problem image. Returns the
structured stem — `raw_text`, `latex`, sub-questions, topic, etc. The
service is **prompted to ignore** any "参考答案：…", "解：…", or
"标准答案：…" residue in the image; a defensive regex sweep scrubs anything
that slips through.

```bash
curl -s -X POST http://127.0.0.1:8765/v1/image/parse \
    -F "image=@/path/to/problem.png" | jq
```

Response:

```json
{
  "parsed": {
    "raw_text": "已知 f(x) = 2tan(x - π/3), 求其对称中心的最小正 a 值。",
    "latex": "f(x) = 2\\tan\\left(x - \\frac{\\pi}{3}\\right)",
    "has_figure": false,
    "figure_description": "",
    "sub_questions": ["(1) 求最小正 a 值"],
    "topic": "trigonometric_functions",
    "notes": ""
  },
  "contains_reference_answer": false,
  "image_size": 142330
}
```

Constraints:
- Allowed MIME: `image/png`, `image/jpeg`, `image/jpg`, `image/webp`.
- Hard size cap: **8 MiB**. Larger uploads return `413 Payload Too Large`.

### `POST /v1/ask`  *(constraint 2)*

Sends a question + a specific ask to the tutor. The tutor is **forbidden
from returning a final answer** — it will give you a direction, a hint,
or a re-prompt, and never a solved-out result. The raw model output is
also passed through a regex guard (`app/utils/guard.py`); if a leak is
detected, the response is replaced with a Socratic re-prompt and
`guard_triggered` is set to `true`.

```bash
curl -s -X POST http://127.0.0.1:8765/v1/ask \
    -H "Content-Type: application/json" \
    -d "{
      \"session_id\": \"$SID\",
      \"question\": \"求 f(x) = 2tan(x - π/3) 的对称中心, 最小 a 是多少。\",
      \"ask\": \"完全不会, 怎么入手?\"
    }" | jq
```

Response:

```json
{
  "session_id": "6f1c...",
  "answer": "1. 切线方程需要切点 (1, f(1)) 与斜率 f'(1)。\n2. 请先算 f(1)...\n3. 再求 f'(1), 代入 y - f(1) = f'(1)(x - 1)。",
  "contains_final_answer": false,
  "guard_triggered": false
}
```

Request body schema:

| Field        | Type                | Required | Notes                                               |
|--------------|---------------------|----------|-----------------------------------------------------|
| `session_id` | string              | yes      | 1–128 chars. Server-side history is keyed by this.  |
| `question`   | string              | yes      | The problem text the student is working on.         |
| `ask`        | string              | yes      | The student's specific question / ask.              |
| `history`    | list\<object\> \| null | no   | Optional override. If present, server history is bypassed for this call. |

### `POST /v1/grade`  *(constraint 3)*

Sends a question + a list of student-written steps. The grader points out
**the first erroneous step** and the **direction** of the fix; it is
prompted to never produce a complete correct derivation, and the regex
guard catches the most common leak patterns.

```bash
curl -s -X POST http://127.0.0.1:8765/v1/grade \
    -H "Content-Type: application/json" \
    -d "{
      \"session_id\": \"$SID\",
      \"question\": \"求 f(x) = sin(x) + x^2 的导数。\",
      \"steps_text\": [
        \"f'(x) = cos(x) + 2x\",
        \"所以 f'(x) = -cos(x) + 2x\"
      ]
    }" | jq
```

Response:

```json
{
  "session_id": "6f1c...",
  "error_info": {
    "error_step": 2,
    "reason": "这一步把 sin(x) 的导数写成了 -cos(x)。",
    "concept_misuse": "基本求导公式记反。",
    "formula_error": "(sin x)' = cos x。",
    "reading_error": null,
    "fix_direction": "请重新对照基本求导表, 把三角函数的导数再核一遍。"
  },
  "contains_correct_answer": false,
  "guard_triggered": false
}
```

Request body schema:

| Field            | Type            | Required | Notes                                              |
|------------------|-----------------|----------|----------------------------------------------------|
| `session_id`     | string          | yes      | 1–128 chars.                                       |
| `question`       | string          | yes      | The problem text.                                  |
| `steps_text`     | list\<string\>  | yes*     | Plain-text steps, in order.                        |
| `steps_image_b64`| string \| null  | no       | Optional base64 of a handwritten solution image.   |

At least one of `steps_text` / `steps_image_b64` must be non-empty.

### Error envelope

All 4xx/5xx responses use FastAPI's default `{"detail": "..."}` shape.
Expect:

- `400 Bad Request` — empty image, missing field, validation failure.
- `404 Not Found` — unknown `session_id` on `DELETE`.
- `413 Payload Too Large` — image > 8 MiB.
- `415 Unsupported Media Type` — wrong MIME.
- `502 Bad Gateway` — MiniMax call failed; see `logs/app.log` for the
  underlying error.

---

## AI agent integration

The service is designed to be called by an **outer agent loop** (Claude
tool use, GPT function calling, LangChain, a custom dispatcher, …). The
recommended integration is to register **three** tools on the calling
side, all of which forward to the corresponding `POST /v1/...`
endpoint. The caller should **never** invent answers on top of the
service's replies — it should forward them verbatim, because the
service's response has already been guard-scrubbed.

### OpenAI-compatible function-calling schema

Drop these into your tool manifest:

```jsonc
[
  {
    "type": "function",
    "function": {
      "name": "math_parse_image",
      "description": "Upload a problem image. Returns the structured problem statement (raw text, LaTeX, sub-questions, topic). The service ignores any 'reference answer' or '解：…' content in the image.",
      "parameters": {
        "type": "object",
        "properties": {
          "image_path": { "type": "string", "description": "Absolute path to the image file on the agent host." }
        },
        "required": ["image_path"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "math_ask",
      "description": "Ask the math tutor a Socratic question about a problem the student is working on. The tutor will give guidance but NEVER a final answer. Forward the tutor's reply verbatim — do not add, remove, or solve any step on the caller's side.",
      "parameters": {
        "type": "object",
        "properties": {
          "session_id": { "type": "string" },
          "question":    { "type": "string", "description": "The problem text the student is working on." },
          "ask":         { "type": "string", "description": "The student's specific question / ask (free-form Chinese or English)." }
        },
        "required": ["session_id", "question", "ask"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "math_grade",
      "description": "Submit a student's solution (as a list of plain-text steps) for grading. The grader will point out the first erroneous step and the direction of the fix, but NEVER the correct derivation. Forward the response verbatim.",
      "parameters": {
        "type": "object",
        "properties": {
          "session_id": { "type": "string" },
          "question":   { "type": "string" },
          "steps_text": { "type": "array", "items": { "type": "string" } }
        },
        "required": ["session_id", "question", "steps_text"]
      }
    }
  }
]
```

### Reference dispatcher (Python)

A complete reference dispatcher is provided in
[`openclaw_agent.py`](openclaw_agent.py). It is roughly 350 lines and
demonstrates the full pattern: tool registration, multi-turn session
memory, error handling, and a guardrail system prompt that prevents the
outer LLM from re-introducing the answer. Run it with:

```bash
# interactive REPL — type problems, /img <path> for images
python3 openclaw_agent.py

# canned 5-turn dialogue against /path/to/problem.png
python3 openclaw_agent.py --demo --image /path/to/problem.png
```

### Caller-side guardrail prompt

When wiring this into a multi-LLM agent, the outer LLM's system prompt
**must** include language equivalent to:

> You are a dispatcher, NOT a tutor. For any student math question, you
> must call one of `math_parse_image`, `math_ask`, or `math_grade`. You
> must never solve the problem yourself. You must forward the backend's
> response to the student verbatim, even if it looks too terse.

Without this, the outer LLM will sometimes "fix up" the backend's
guidance by solving the rest of the problem — which is exactly what
constraint 2 / 3 forbid.

---

## Three hard constraints — how they're enforced

Each constraint has **three** independent layers of defense. If the
model slips past layer 1, layer 2 catches it; if both fail, layer 3
makes the leak visible to the caller for post-hoc audit.

| Layer | Mechanism                                                                       | Source                              |
|-------|---------------------------------------------------------------------------------|-------------------------------------|
| 1     | System prompt with explicit "DO NOT" rules and worked examples                  | `app/prompts/{parse_image,ask,grade}.py` |
| 2     | Regex / keyword guard on the raw model output; replaces leaky text with a fallback | `app/utils/guard.py`              |
| 3     | `contains_final_answer` / `contains_correct_answer` / `guard_triggered` flags   | `app/services/{tutor,grader}.py`, `app/services/parser.py` |

For constraint 1 (image parse), a second defensive sweep is applied
**after** layer 1: any text matching `参考答案[：:]`, `解[：:]`,
`解析[：:]`, or `https?://` is stripped from the response before it
leaves the service.

To verify the constraints end-to-end after any change:

```bash
# constraint 1: upload an image with a "参考答案" watermark and assert
# that the response's raw_text does not contain "参考答案"
curl -s -X POST http://127.0.0.1:8765/v1/image/parse \
    -F "image=@tests/fixtures/q_watermark.png" | jq '.parsed.raw_text'

# constraint 2: ask "答案是什么" and assert the response does not match
# any pattern in app.utils.guard.FINAL_ANSWER_PATTERNS
curl -s -X POST http://127.0.0.1:8765/v1/ask \
    -H "Content-Type: application/json" \
    -d '{"session_id":"t","question":"1+1=?","ask":"答案是什么"}' | jq

# constraint 3: submit a deliberately wrong step and assert
# error_info.error_step is populated, error_info.fix_direction is
# a *direction* (no complete correct derivation).
curl -s -X POST http://127.0.0.1:8765/v1/grade \
    -H "Content-Type: application/json" \
    -d '{"session_id":"t","question":"d/dx sin(x)","steps_text":["cos(x)","-cos(x)"]}' | jq
```

---

## Project layout

```
math-tutor-agent/
├── app/
│   ├── main.py                # FastAPI app, lifespan, router wiring
│   ├── config.py              # pydantic-settings (reads .env)
│   ├── minimax_client.py      # async wrapper over MiniMax chat-completion v2
│   ├── prompts/               # the three hard-constraint SYSTEM prompts
│   │   ├── parse_image.py     #   constraint 1
│   │   ├── ask.py             #   constraint 2
│   │   └── grade.py           #   constraint 3
│   ├── services/              # business logic
│   │   ├── parser.py          #   image → ParsedQuestion
│   │   ├── tutor.py           #   Socratic answer
│   │   └── grader.py          #   error-pointing grading
│   ├── routers/               # one router per resource, mounted under /v1
│   │   ├── health.py
│   │   ├── image.py
│   │   ├── ask.py
│   │   ├── grade.py
│   │   └── session.py
│   ├── models/schemas.py      # Pydantic request/response models
│   └── utils/
│       ├── guard.py           # answer-leak detector & sanitizer
│       └── session.py         # in-memory session store
├── tests/                     # pytest, FakeMiniMax stub, no real network
├── openclaw_agent.py          # reference AI-agent dispatcher
├── start.sh / stop.sh         # background lifecycle
├── requirements.txt
├── pyproject.toml
├── .env.example               # template — tracked
├── .env                       # actual key — gitignored
└── README.md
```

---

## Tests

```bash
source .venv/bin/activate
pytest -q
```

The test suite uses FastAPI's `TestClient` and a `FakeMiniMax` stub
(`tests/conftest.py`) so it makes **no real network calls**. The
`test_guard.py` module exercises the regex blacklists against both
safe and leaking text corpora. `test_tutor.py` and `test_grader.py`
verify that the guard fires on a model output that contains a leak
and that the response is replaced with the Socratic fallback.

---

## Security & secrets

- **Never** commit `.env`. It is in `.gitignore`. The tracked template
  is `.env.example`.
- If you accidentally commit a key, **rotate it immediately** on the
  MiniMax dashboard — git history is forever.
- The service binds to `127.0.0.1` by default. If you change
  `MATH_TUTOR_HOST` to `0.0.0.0`, anyone reachable on the network can
  call your endpoint (and burn through your API quota). Put it behind
  a reverse proxy with auth if you need to expose it.
- Image uploads are capped at 8 MiB and validated to be PNG/JPEG/WebP.
  Larger or wrong-type payloads are rejected at the router before they
  reach the model.
- The service does not log the API key. It does log session IDs and
  short error excerpts to `logs/app.log`; if you ship that log
  off-host, scrub it first.

---

## Limitations

- **Single-process session store.** Restarting the service clears all
  sessions. Multi-replica deployments will not share memory — each
  replica has its own store. Use the explicit `history` field on
  `/v1/ask` to drive a fully stateless agent.
- **No persistence.** There is no database; image, question, and
  grading data live only for the duration of the request.
- **No auth.** The endpoints are open. Bind to loopback and/or put
  behind a reverse proxy.
- **No streaming.** The current model integration is request/response
  only. SSE / chunked transfer is a future-work item.
- **Best-effort guard.** The regex guard catches the most common leak
  shapes but is not a complete formal proof. Treat
  `contains_final_answer` / `contains_correct_answer` as a hint, not
  as a guarantee.

---

## License

[MIT](LICENSE)