mirror of
https://github.com/therealaleph/MasterHttpRelayVPN-RUST.git
synced 2026-05-18 07:44:47 +03:00
1c9e73e4e7
Bumps v1.9.23 → v1.9.24. Two PRs from @dazzling-no-more: - #1108 (#1088): batch header read honors request_timeout_secs. Closes the 10s inner timeout cliff that was cascade-killing tunnel sessions under slow Apps Script edges. +22 regression tests (231 total). - #1117 (#620): cargo-chef Dockerfile so tunnel-node builds without BuildKit. Cloud Run's gcloud-deploy path now works. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
26 lines
6.2 KiB
Markdown
26 lines
6.2 KiB
Markdown
<!-- see docs/changelog/v1.1.0.md for the file format: Persian, then `---`, then English. -->
|
|
• **Fix Full mode timeout cascade — \`batch header read honors request_timeout_secs\`** ([#1088](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/1088), [PR #1108](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/1108) by @dazzling-no-more). در Full mode، یک Apps Script edge کند، تمام تونل sessionهای hot-and-flowing رو cascade-kill میکرد. کاربرها روی v1.9.21+ مرتب 10s "batch timeout" میدیدن و download progress تلگرام/browser رو از دست میدادن. **Root cause**: \`read_http_response\` در \`domain_fronter.rs\` یک hardcoded 10s header-read timeout داشت که داخل \`tunnel_batch_request_to\` اجرا میشد — مستقل از و کوتاهتر از outer \`tokio::time::timeout(batch_timeout, ...)\` در \`fire_batch\`. Apps Script cold starts معمولاً 8-12s طول میکشن (PR #1040's A/B 4/30 H1 batches رو ثبت کرد که دقیقاً 10s timeout میشدن)، پس inner cliff بهعنوان false-positive batch timeout قبل از اینکه \`request_timeout_secs\` (default 30s) trigger بشه fire میشد. **Fix**: (1) \`tunnel_batch_request_to\` حالا \`batch_timeout\` رو به header read pass میکنه via new \`read_http_response_with_header_timeout\` helper. (2) Header read یک absolute deadline استفاده میکنه (\`timeout_at\`) به جای per-read \`timeout()\` — slow drip-feed peer دیگه نمیتونه silently extend بزنه. (3) Bonus: \`TunnelMux::reply_timeout\` با \`batch_timeout\` co-vary میکنه (\`batch_timeout + 5s slack\`). ۲۰۹ → **۲۳۱ lib test** (+22 regression).
|
|
• **Docker: cargo-chef برای build بدون BuildKit** ([#620](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/620), [PR #1117](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/1117) by @dazzling-no-more). \`tunnel-node/Dockerfile\` از BuildKit-only \`RUN --mount=type=cache\` استفاده میکرد که روی Cloud Run's \`gcloud run deploy --source .\` path شکست میخورد (underlying \`gcr.io/cloud-builders/docker\` builder BuildKit رو enable نمیکنه). cargo-chef pattern: \`recipe.json\` planner stage + \`cargo chef cook\` deps stage + final build with \`src/\` on top. Docker's regular layer cache حالا dependency reuse رو handle میکنه — warm rebuilds تنها \`src/\` رو compile میکنن. Base bump \`rust:1.85-slim\` → \`rust:1.90-slim\` (cargo-chef نیاز به rustc 1.86+ داره).
|
|
---
|
|
• **Fix Full mode timeout cascade — `batch header read honors request_timeout_secs`** ([#1088](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/1088), [PR #1108](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/1108) by @dazzling-no-more). Under Full mode, a single slow Apps Script edge cascade-killed every in-flight tunnel session sharing its batch. Users on v1.9.21+ saw frequent 10s "batch timeout" errors and lost download progress on Telegram / browser sessions.
|
|
|
|
**Root cause**: `read_http_response` in `domain_fronter.rs` had a **hardcoded 10s header-read timeout** that ran *inside* `tunnel_batch_request_to` — independent of and shorter than the outer `tokio::time::timeout(batch_timeout, …)` in `fire_batch`. Apps Script cold starts routinely land in the 8-12s range (PR #1040's A/B recorded 4/30 H1 batches timing out at exactly 10s after the H2→H1 switch), so the inner cliff fired as a false-positive batch timeout well before `request_timeout_secs` (default 30s) could.
|
|
|
|
**Fix** (in `domain_fronter.rs` + `tunnel_client.rs`):
|
|
|
|
1. `tunnel_batch_request_to` passes `batch_timeout` to the header read via new `read_http_response_with_header_timeout` helper. `Config::request_timeout_secs` is now the only knob controlling how long we wait for an Apps Script edge to start responding. Other callers (relay path, exit-node) keep the historical 10s value.
|
|
2. Header read uses a single **absolute deadline** (`timeout_at`) instead of per-read `timeout()`. Total elapsed across all header reads is bounded regardless of read cadence — a slow drip-feed peer can no longer silently extend.
|
|
3. **`TunnelMux::reply_timeout`** co-varies with `batch_timeout` (computed at construction as `fronter.batch_timeout() + 5s slack` instead of fixed 35s const). Operators raising `request_timeout_secs` no longer have sessions abandon `reply_rx` just before `fire_batch`'s HTTP round-trip would complete.
|
|
|
|
209 → **231 lib tests** (+22 regression covering the deadline/co-variance behavior).
|
|
|
|
• **Docker: cargo-chef so tunnel-node builds without BuildKit** ([#620](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/620), [PR #1117](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/1117) by @dazzling-no-more). `tunnel-node/Dockerfile` used BuildKit-only `RUN --mount=type=cache` directives, breaking on Cloud Run's `gcloud run deploy --source .` path (the underlying `gcr.io/cloud-builders/docker` builder doesn't enable BuildKit, and `--set-build-env-vars DOCKER_BUILDKIT=1` doesn't flip it on either).
|
|
|
|
Reworked to use **cargo-chef**: a dedicated planner stage emits `recipe.json` for dependency metadata, a `cargo chef cook` stage builds just the deps in their own Docker layer, the final build stage adds `src/` on top. Docker's regular layer cache handles dependency reuse — warm rebuilds where only `src/` changes still skip the slow crate compile.
|
|
|
|
Base bump `rust:1.85-slim` → `rust:1.90-slim` (cargo-chef's transitive deps require rustc 1.86+; tunnel-node's `Cargo.toml` has no `rust-version` pin so the bump is internal-only).
|
|
|
|
**Action for Cloud Run users blocked on #620**: pull v1.9.24 of the tunnel-node Docker image (`ghcr.io/therealaleph/mhrv-tunnel-node:v1.9.24` or `:latest`) — your `gcloud run deploy --source .` should now succeed without BuildKit.
|
|
|
|
**Followup**: issue #1131 (BuffOvrFlw) reports `h1 open timed out after 8s` — that's the `H1_OPEN_TIMEOUT_SECS = 8` from PR #1029 firing on `open()` (TCP+TLS handshake), separate from the header-read timeout this release fixes. Worth a follow-up PR to make `H1_OPEN_TIMEOUT_SECS` parameterized via `request_timeout_secs` too.
|