chore(release): v1.9.20 — fix Full-mode warm-up race (#924, #1029)

Bumps Cargo.toml v1.9.19 → v1.9.20 and ships the changelog. Headline fix: the v1.9.15 Full-mode regression that's been tracking in #924 for ~3 weeks is resolved by @rezaisrad's PR #1029. Bisect-quality root cause (h1 prewarm gated behind h2 handshake, both stall on cold start under the same network conditions). Affected users can drop the `force_http1: true` workaround now. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 21:24:48 +03:00 · 2026-05-11 00:37:38 +03:00
parent 4d3e62195c
commit 786a9703c9
3 changed files with 25 additions and 2 deletions
@@ -2624,7 +2624,7 @@ dependencies = [

 [[package]]
 name = "mhrv-rs"
-version = "1.9.19"
+version = "1.9.20"
 dependencies = [
 "base64 0.22.1",
 "bytes",
@@ -1,6 +1,6 @@
 [package]
 name = "mhrv-rs"
-version = "1.9.19"
+version = "1.9.20"
 edition = "2021"
 description = "Rust port of MasterHttpRelayVPN -- DPI bypass via Google Apps Script relay with domain fronting"
 license = "MIT"
@@ -0,0 +1,23 @@
+<!-- see docs/changelog/v1.1.0.md for the file format: Persian, then `---`, then English. -->
+• **Fix Full mode regression از v1.9.15** ([#924](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/924) — یک ۳-هفته‌ای tracking thread با ۱۸+ duplicate report، fixed by [@rezaisrad](https://github.com/rezaisrad) in [PR #1029](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/1029)). علامت: \`batch timed out after 30s\` در Full mode، در حالی که apps_script mode normal کار می‌کرد. فقط workaround موجود \`"force_http1": true\` kill switch بود. Bisect دقیق این رو به [\`0e678630a\`](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/commit/0e678630a) (PR #799 که h2 multiplexing رو اضافه کرد) رساند. روت کاز یک‌ line ordering: \`warm()\` در v1.9.15 h1 prewarm loop رو پشت \`ensure_h2().await\` گذاشت — وقتی h2 handshake کند بود (تا 8s)، pool h1 خالی می‌موند. اگر در آن window یک request می‌آمد، h1 fallback یک TCP+TLS handshake cold می‌زد که خود stall می‌شد، outside the 30s batch_timeout. Fix: h1 prewarm parallel با h2 handshake (v1.9.14 ordering restored)، plus بستنک‌های پیرامون با \`H1_OPEN_TIMEOUT_SECS = 8\` و \`H2Cell.dead\` AtomicBool. ۲۰۸ → **۲۰۹ lib test** (+1 regression: \`ensure_h2_rejects_dead_cell_within_ttl\`). تأیید end-to-end: 5/5 cold restarts pass (9.6-22.5s)، 5/5 concurrent SOCKS5 burst.
+---
+• **Fix Full mode regression since v1.9.15** ([#924](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/issues/924), [PR #1029](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/pull/1029) by @rezaisrad). #924 was the canonical tracking thread for an 18+ duplicate cluster spanning ~3 weeks; affected users saw `batch timed out after 30s` on every Full-mode request while apps_script mode kept working. The only available workaround was the `"force_http1": true` kill switch.
+
+**Root cause** (rigorously bisected to [`0e678630a`](https://github.com/therealaleph/MasterHttpRelayVPN-RUST/commit/0e678630a) — PR #799 which added HTTP/2 multiplexing): PR #799 gated the h1 socket-pool prewarm behind `ensure_h2().await`. `ensure_h2()` is bounded by `H2_OPEN_TIMEOUT_SECS = 8s` but can take the full window on a cold first connection. During that window the h1 fallback pool was empty, so any request that arrived would:
+
+1. Get `Err((Relay("h2 unavailable"), No))` immediately → fall back to h1
+2. Empty pool → cold `open()` → fresh TCP+TLS to `connect_host:443`
+3. Same network conditions that stalled h2 also stalled h1; cold open exceeded the 30s `batch_timeout`
+4. User saw `batch timed out after 30s` that "works on apps_script" couldn't explain
+
+**Fix** (two commits, `domain_fronter.rs`-only):
+
+1. **`warm h1 pool in parallel with h2`**: spawn h2 prewarm in a separate task so the h1 prewarm loop runs concurrently. Full `n` h1 sockets are warm before user traffic, even when h2 stalls. `run_pool_refill` trims back to `POOL_MIN_H2_FALLBACK = 2` within 5s once h2 lands as the fast path.
+
+2. **`bound h1 open() + detect dead h2 cells synchronously`**: `H1_OPEN_TIMEOUT_SECS = 8` wraps the TCP+TLS handshake in `open()` so a stuck handshake doesn't block `acquire()` until the outer batch budget elapses. `H2Cell.dead: Arc<AtomicBool>` flipped by the connection driver task when `Connection::await` ends — known-dead cells are rejected within ≤5s instead of waiting for `H2_CONN_TTL_SECS = 540s` to expire.
+
+**API impact**: `h2_handshake_post_tls` return type changes to `(SendRequest, Arc<AtomicBool>)`. One existing test (`h2_handshake_post_tls_returns_alpn_refused_when_peer_picks_h1`) tweaks its `Ok` arm to match — no panic message change.
+
+208 → **209 lib tests** (+1 regression: `ensure_h2_rejects_dead_cell_within_ttl`). Live end-to-end (per PR notes): 5/5 cold restarts pass in 9.6-22.5s, 5/5 concurrent SOCKS5 burst, default full.json baseline 200 OK in 13.3s.
+
+**Action for affected users**: update to v1.9.20, drop the `"force_http1": true` workaround from `config.json` if you had it set. Full mode should work reliably on cold restart again.