Bumps Cargo.toml v1.9.19 → v1.9.20 and ships the changelog. Headline fix: the v1.9.15 Full-mode regression tracked in #924 for ~3 weeks is resolved by @rezaisrad's PR #1029. Root cause, confirmed by bisect: the h1 prewarm was gated behind the h2 handshake, and both stall on cold start under the same network conditions. Affected users can now drop the `force_http1: true` workaround.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
• Fix Full mode regression since v1.9.15 (#924, a 3-week tracking thread with 18+ duplicate reports, fixed by @rezaisrad in PR #1029). Symptom: `batch timed out after 30s` in Full mode, while apps_script mode worked normally. The only available workaround was the `"force_http1": true` kill switch. A precise bisect traced this to `0e678630a` (PR #799, which added h2 multiplexing). The root cause was a line-ordering issue: in v1.9.15, `warm()` placed the h1 prewarm loop behind `ensure_h2().await`, so when the h2 handshake was slow (up to 8s) the h1 pool stayed empty. If a request arrived in that window, the h1 fallback performed a cold TCP+TLS handshake that itself stalled, running past the 30s batch_timeout. Fix: h1 prewarm runs in parallel with the h2 handshake (v1.9.14 ordering restored), plus surrounding bounds via `H1_OPEN_TIMEOUT_SECS = 8` and an `H2Cell.dead` AtomicBool. 208 → 209 lib tests (+1 regression: `ensure_h2_rejects_dead_cell_within_ttl`). End-to-end verification: 5/5 cold restarts pass (9.6-22.5s), 5/5 concurrent SOCKS5 burst.
• Fix Full mode regression since v1.9.15 (#924, PR #1029 by @rezaisrad). #924 was the canonical tracking thread for a cluster of 18+ duplicate reports spanning ~3 weeks; affected users saw `batch timed out after 30s` on every Full-mode request while apps_script mode kept working. The only available workaround was the `"force_http1": true` kill switch.
Root cause (rigorously bisected to 0e678630a — PR #799 which added HTTP/2 multiplexing): PR #799 gated the h1 socket-pool prewarm behind ensure_h2().await. ensure_h2() is bounded by H2_OPEN_TIMEOUT_SECS = 8s but can take the full window on a cold first connection. During that window the h1 fallback pool was empty, so any request that arrived would:
- Get `Err((Relay("h2 unavailable"), No))` immediately → fall back to h1
- Empty pool → cold `open()` → fresh TCP+TLS to `connect_host:443`
- Same network conditions that stalled h2 also stalled h1; cold open exceeded the 30s `batch_timeout`
- User saw `batch timed out after 30s` that "works on apps_script" couldn't explain
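
The ordering problem described above can be modelled with a small std-only sketch. This is not the code from domain_fronter.rs: `ensure_h2` and `open_h1` here are hypothetical stand-ins, the real functions are async, and the stalled handshake is simulated with a channel that blocks until it is released. The function reports how many h1 sockets a request would find in the pool while the h2 handshake is still stalled.

```rust
use std::sync::mpsc;
use std::thread;

/// Simulated h2 handshake (hypothetical stand-in): blocks until `release`
/// fires, modelling a handshake that stalls on a cold start.
fn ensure_h2(release: mpsc::Receiver<()>) -> bool {
    release.recv().is_ok()
}

/// Simulated h1 socket open (hypothetical stand-in); returns a placeholder id.
fn open_h1(id: usize) -> usize {
    id
}

/// Returns the h1 pool size observed while the h2 handshake is still stalled.
fn pool_size_during_h2_stall(parallel: bool, n: usize) -> usize {
    let (release_tx, release_rx) = mpsc::channel::<()>();
    let mut pool: Vec<usize> = Vec::new();
    let observed;

    if parallel {
        // Parallel ordering: h2 runs on its own thread, so the h1 prewarm
        // loop proceeds even though h2 has not finished.
        let h2 = thread::spawn(move || ensure_h2(release_rx));
        pool.extend((0..n).map(open_h1));
        observed = pool.len();
        release_tx.send(()).unwrap(); // let the stalled handshake finish
        h2.join().unwrap();
    } else {
        // Serial ordering: the prewarm loop is queued behind the handshake,
        // so a request arriving during the stall sees an empty pool.
        observed = pool.len();
        release_tx.send(()).unwrap();
        ensure_h2(release_rx);
        pool.extend((0..n).map(open_h1));
    }
    observed
}
```

Calling `pool_size_during_h2_stall(false, 4)` models the regressed ordering, and passing `true` models running the prewarm loop concurrently with the handshake.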
Fix (two commits, domain_fronter.rs-only):
- `warm h1 pool in parallel with h2`: spawn h2 prewarm in a separate task so the h1 prewarm loop runs concurrently. The full `n` h1 sockets are warm before user traffic, even when h2 stalls. `run_pool_refill` trims back to `POOL_MIN_H2_FALLBACK = 2` within 5s once h2 lands as the fast path.
- `bound h1 open() + detect dead h2 cells synchronously`: `H1_OPEN_TIMEOUT_SECS = 8` wraps the TCP+TLS handshake in `open()` so a stuck handshake doesn't block `acquire()` until the outer batch budget elapses. `H2Cell.dead: Arc<AtomicBool>` is flipped by the connection driver task when `Connection::await` ends, so known-dead cells are rejected within ≤5s instead of waiting for `H2_CONN_TTL_SECS = 540s` to expire.
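
The two guards in the second commit can be illustrated with a minimal std-only sketch. Everything below is an assumption made for illustration: the real `open()` is async and presumably uses an async timeout, whereas this version uses a helper thread plus `recv_timeout`; `H1Socket`, `open_with_timeout`, `is_usable`, and `dead_cell_demo` are hypothetical names, and only the `dead: Arc<AtomicBool>` field and the 540s TTL come from the changelog.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::{Duration, Instant};

// TTL value quoted in the changelog.
const H2_CONN_TTL_SECS: u64 = 540;

/// Hypothetical stand-in for an opened h1 socket.
#[derive(Debug, PartialEq)]
struct H1Socket(u32);

/// Run `open` on a helper thread and give up once `budget` has elapsed,
/// so a stuck handshake cannot block the caller indefinitely.
fn open_with_timeout<F>(open: F, budget: Duration) -> Result<H1Socket, &'static str>
where
    F: FnOnce() -> H1Socket + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may be gone if the budget already elapsed; ignore that.
        let _ = tx.send(open());
    });
    rx.recv_timeout(budget).map_err(|_| "h1 open timed out")
}

/// Hypothetical shape of a cached h2 connection cell.
struct H2Cell {
    opened_at: Instant,
    dead: Arc<AtomicBool>,
}

impl H2Cell {
    /// Reusable only while inside the TTL and not flagged dead.
    fn is_usable(&self) -> bool {
        let within_ttl = self.opened_at.elapsed() < Duration::from_secs(H2_CONN_TTL_SECS);
        within_ttl && !self.dead.load(Ordering::Acquire)
    }
}

/// Returns (usable before the driver ends, usable after the driver ends).
fn dead_cell_demo() -> (bool, bool) {
    let dead = Arc::new(AtomicBool::new(false));
    let cell = H2Cell { opened_at: Instant::now(), dead: Arc::clone(&dead) };
    let before = cell.is_usable();
    // Stand-in for the connection driver task noticing the connection ended.
    thread::spawn(move || dead.store(true, Ordering::Release))
        .join()
        .unwrap();
    (before, cell.is_usable())
}
```

The point of the shared flag is that the cell is rejected as soon as the driver observes the connection ending, even though it is still well inside its TTL.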
API impact: h2_handshake_post_tls return type changes to (SendRequest, Arc<AtomicBool>). One existing test (h2_handshake_post_tls_returns_alpn_refused_when_peer_picks_h1) tweaks its Ok arm to match — no panic message change.
208 → 209 lib tests (+1 regression: ensure_h2_rejects_dead_cell_within_ttl). Live end-to-end (per PR notes): 5/5 cold restarts pass in 9.6-22.5s, 5/5 concurrent SOCKS5 burst, default full.json baseline 200 OK in 13.3s.
Action for affected users: update to v1.9.20 and, if you had it set, drop the `"force_http1": true` workaround from config.json. Full mode should work reliably on cold restart again.