From 1c9e73e4e7b56ed352a1c83a0a41dcbb6d6161a2 Mon Sep 17 00:00:00 2001 From: therealaleph Date: Wed, 13 May 2026 13:58:16 +0300 Subject: [PATCH] =?UTF-8?q?chore(release):=20v1.9.24=20=E2=80=94=20fix=20t?= =?UTF-8?q?imeout=20cascade=20+=20Cloud=20Run=20docker=20build=20(#1088,?= =?UTF-8?q?=20#620)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Bumps v1.9.23 → v1.9.24. Two PRs from @dazzling-no-more: - #1108 (#1088): batch header read honors request_timeout_secs. Closes the 10s inner timeout cliff that was cascade-killing tunnel sessions under slow Apps Script edges. +22 regression tests (231 total). - #1117 (#620): cargo-chef Dockerfile so tunnel-node builds without BuildKit. Cloud Run's gcloud-deploy path now works. Co-Authored-By: Claude Opus 4.7 (1M context) --- Cargo.lock | 2 +- Cargo.toml | 2 +- docs/changelog/v1.9.24.md | 25 +++++++++++++++++++++++++ 3 files changed, 27 insertions(+), 2 deletions(-) create mode 100644 docs/changelog/v1.9.24.md diff --git a/Cargo.lock b/Cargo.lock index 24e20bd..5de28a8 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -2624,7 +2624,7 @@ dependencies = [ [[package]] name = "mhrv-rs" -version = "1.9.23" +version = "1.9.24" dependencies = [ "base64 0.22.1", "bytes", diff --git a/Cargo.toml b/Cargo.toml index f5ab9e3..5ec6267 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -1,6 +1,6 @@ [package] name = "mhrv-rs" -version = "1.9.23" +version = "1.9.24" edition = "2021" description = "Rust port of MasterHttpRelayVPN -- DPI bypass via Google Apps Script relay with domain fronting" license = "MIT" diff --git a/docs/changelog/v1.9.24.md b/docs/changelog/v1.9.24.md new file mode 100644 index 0000000..6a4a8c4 --- /dev/null +++ b/docs/changelog/v1.9.24.md @@ -0,0 +1,25 @@ + +• **Fix Full mode timeout cascade — \`batch header read honors request_timeout_secs\`** ([#1088](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/1088), [PR #1108](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/1108) by @dazzling-no-more). در Full mode، یک Apps Script edge کند، تمام تونل sessionهای hot-and-flowing رو cascade-kill می‌کرد. کاربرها روی v1.9.21+ مرتب 10s "batch timeout" می‌دیدن و download progress تلگرام/browser رو از دست می‌دادن. **Root cause**: \`read_http_response\` در \`domain_fronter.rs\` یک hardcoded 10s header-read timeout داشت که داخل \`tunnel_batch_request_to\` اجرا می‌شد — مستقل از و کوتاه‌تر از outer \`tokio::time::timeout(batch_timeout, ...)\` در \`fire_batch\`. Apps Script cold starts معمولاً 8-12s طول می‌کشن (PR #1040's A/B 4/30 H1 batches رو ثبت کرد که دقیقاً 10s timeout می‌شدن)، پس inner cliff به‌عنوان false-positive batch timeout قبل از اینکه \`request_timeout_secs\` (default 30s) trigger بشه fire می‌شد. **Fix**: (1) \`tunnel_batch_request_to\` حالا \`batch_timeout\` رو به header read pass می‌کنه via new \`read_http_response_with_header_timeout\` helper. (2) Header read یک absolute deadline استفاده می‌کنه (\`timeout_at\`) به جای per-read \`timeout()\` — slow drip-feed peer دیگه نمی‌تونه silently extend بزنه. (3) Bonus: \`TunnelMux::reply_timeout\` با \`batch_timeout\` co-vary می‌کنه (\`batch_timeout + 5s slack\`). ۲۰۹ → **۲۳۱ lib test** (+22 regression). +• **Docker: cargo-chef برای build بدون BuildKit** ([#620](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/620), [PR #1117](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/1117) by @dazzling-no-more). \`tunnel-node/Dockerfile\` از BuildKit-only \`RUN --mount=type=cache\` استفاده می‌کرد که روی Cloud Run's \`gcloud run deploy --source .\` path شکست می‌خورد (underlying \`gcr.io/cloud-builders/docker\` builder BuildKit رو enable نمی‌کنه). cargo-chef pattern: \`recipe.json\` planner stage + \`cargo chef cook\` deps stage + final build with \`src/\` on top. Docker's regular layer cache حالا dependency reuse رو handle می‌کنه — warm rebuilds تنها \`src/\` رو compile می‌کنن. Base bump \`rust:1.85-slim\` → \`rust:1.90-slim\` (cargo-chef نیاز به rustc 1.86+ داره). +--- +• **Fix Full mode timeout cascade — `batch header read honors request_timeout_secs`** ([#1088](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/1088), [PR #1108](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/1108) by @dazzling-no-more). Under Full mode, a single slow Apps Script edge cascade-killed every in-flight tunnel session sharing its batch. Users on v1.9.21+ saw frequent 10s "batch timeout" errors and lost download progress on Telegram / browser sessions. + +**Root cause**: `read_http_response` in `domain_fronter.rs` had a **hardcoded 10s header-read timeout** that ran *inside* `tunnel_batch_request_to` — independent of and shorter than the outer `tokio::time::timeout(batch_timeout, …)` in `fire_batch`. Apps Script cold starts routinely land in the 8-12s range (PR #1040's A/B recorded 4/30 H1 batches timing out at exactly 10s after the H2→H1 switch), so the inner cliff fired as a false-positive batch timeout well before `request_timeout_secs` (default 30s) could. + +**Fix** (in `domain_fronter.rs` + `tunnel_client.rs`): + +1. `tunnel_batch_request_to` passes `batch_timeout` to the header read via new `read_http_response_with_header_timeout` helper. `Config::request_timeout_secs` is now the only knob controlling how long we wait for an Apps Script edge to start responding. Other callers (relay path, exit-node) keep the historical 10s value. +2. Header read uses a single **absolute deadline** (`timeout_at`) instead of per-read `timeout()`. Total elapsed across all header reads is bounded regardless of read cadence — a slow drip-feed peer can no longer silently extend. +3. **`TunnelMux::reply_timeout`** co-varies with `batch_timeout` (computed at construction as `fronter.batch_timeout() + 5s slack` instead of fixed 35s const). Operators raising `request_timeout_secs` no longer have sessions abandon `reply_rx` just before `fire_batch`'s HTTP round-trip would complete. + +209 → **231 lib tests** (+22 regression covering the deadline/co-variance behavior). + +• **Docker: cargo-chef so tunnel-node builds without BuildKit** ([#620](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/620), [PR #1117](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/1117) by @dazzling-no-more). `tunnel-node/Dockerfile` used BuildKit-only `RUN --mount=type=cache` directives, breaking on Cloud Run's `gcloud run deploy --source .` path (the underlying `gcr.io/cloud-builders/docker` builder doesn't enable BuildKit, and `--set-build-env-vars DOCKER_BUILDKIT=1` doesn't flip it on either). + +Reworked to use **cargo-chef**: a dedicated planner stage emits `recipe.json` for dependency metadata, a `cargo chef cook` stage builds just the deps in their own Docker layer, the final build stage adds `src/` on top. Docker's regular layer cache handles dependency reuse — warm rebuilds where only `src/` changes still skip the slow crate compile. + +Base bump `rust:1.85-slim` → `rust:1.90-slim` (cargo-chef's transitive deps require rustc 1.86+; tunnel-node's `Cargo.toml` has no `rust-version` pin so the bump is internal-only). + +**Action for Cloud Run users blocked on #620**: pull v1.9.24 of the tunnel-node Docker image (`ghcr.io/therealaleph/mhrv-tunnel-node:v1.9.24` or `:latest`) — your `gcloud run deploy --source .` should now succeed without BuildKit. + +**Followup**: issue #1131 (BuffOvrFlw) reports `h1 open timed out after 8s` — that's the `H1_OPEN_TIMEOUT_SECS = 8` from PR #1029 firing on `open()` (TCP+TLS handshake), separate from the header-read timeout this release fixes. Worth a follow-up PR to make `H1_OPEN_TIMEOUT_SECS` parameterized via `request_timeout_secs` too.