perf: skip H2 for full-tunnel batch requests (#1040)

Skip H2 for `tunnel_batch_request_to` (the Full-mode batch path).
Tunnel batches already coalesce N ops into one HTTP request, so H2 stream
multiplexing has nothing to multiplex. On this code path, the H2
try/fallback logic introduced three regressions relative to v1.9.14:

1. **Long-poll stalls**: idle polls completed at 16-17s
   (`LONGPOLL_DEADLINE` + network latency) instead of timing out at 10s
   on H1. Each poll held an Apps Script execution slot 60-70% longer.
2. **Silent batch drops**: `RequestSent::Maybe` failures dropped the
   entire batch with no retry — a failure mode H1 doesn't have.
3. **Pool starvation**: `POOL_MIN_H2_FALLBACK = 2` trimmed the H1 pool
   from 8 → 2 once H2 connected, but tunnel batches still used H1 and
   needed the full pool.
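
The silent-drop behavior in item 2 can be illustrated with a minimal sketch of the decision the removed H2 block made after a failure. `RequestSent` and its two variants come from the diff; `BatchOutcome` and `after_h2_failure` are hypothetical names introduced only for this illustration, not the project's API.

```rust
/// Whether a failed h2 request may have reached the server.
/// Only the two variant names are taken from the diff; the real
/// type in the project may look different.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestSent {
    /// The request definitely never left the client.
    No,
    /// The request may already have executed upstream.
    Maybe,
}

/// Hypothetical outcome type, for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub enum BatchOutcome {
    /// Safe to replay the whole batch on the h1 pool.
    RetryOnH1,
    /// Batch is surfaced as an error and not replayed.
    Dropped,
}

/// Sketch of the old decision: replay only when nothing was sent,
/// because replaying stateful tunnel ops could duplicate them.
pub fn after_h2_failure(sent: RequestSent) -> BatchOutcome {
    match sent {
        RequestSent::No => BatchOutcome::RetryOnH1,
        RequestSent::Maybe => BatchOutcome::Dropped,
    }
}
```

Going straight to H1 removes the `Maybe` branch entirely, which is why the drop mode disappears on the tunnel batch path.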

H2 multiplexing is **kept active for relay mode** (non-full) where each
browser request is a separate HTTP call that genuinely benefits from
stream multiplexing. r0ar's controlled A/B test in #962 confirmed H2
is strictly better than `force_http1: true` for apps_script-mode users,
and that path is unchanged here.

## Changes

- `tunnel_batch_request_to`: remove H2 try/fallback/NonRetryable block,
  go straight to H1 pool `acquire()` (-54 lines).
- `run_pool_refill`: always maintain `POOL_MIN = 8`. Remove the
  `POOL_MIN_H2_FALLBACK = 2` trim that was starving tunnel batches
  (-12 lines).
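
To make the pool-target change concrete, here is a rough sketch of how a refill pass might compute its deficit. `POOL_MIN = 8` and the 20 s minimum-remaining rule are from the diff; the function name `connections_to_open`, the `ages` slice, and passing the TTL as a parameter are assumptions made for this example (the real `POOL_TTL_SECS` value is not shown in this commit).

```rust
use std::time::Duration;

/// From the diff: the pool is always kept at this size.
pub const POOL_MIN: usize = 8;
/// From the diff: a connection counts only with this much TTL left.
pub const MIN_REMAINING_SECS: u64 = 20;

/// Hypothetical helper: given the age of each pooled connection and
/// the pool TTL, return how many replacements are needed to reach
/// `target` healthy connections.
pub fn connections_to_open(ages: &[Duration], ttl: Duration, target: usize) -> usize {
    let min_remaining = Duration::from_secs(MIN_REMAINING_SECS);
    let healthy = ages
        .iter()
        // saturating_sub yields zero for connections already past the TTL
        .filter(|age| ttl.saturating_sub(**age) >= min_remaining)
        .count();
    target.saturating_sub(healthy)
}
```

With an assumed 60 s TTL and connections aged 5, 30, 45 and 50 s, only two have at least 20 s left. A target of 8 asks for six replacements, while the removed target of 2 asks for none, which is the starvation described above. The actual loop opens replacements one at a time rather than all at once.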

## A/B results (Pixel 6 Pro, 30 batch samples each)

| Metric | H2 (stock v1.9.20) | H1 (this PR) | v1.9.14 (baseline) |
|---|---|---|---|
| 16-17s batches | **8-10/30** | **0/30** | **0/30** |
| 10s timeouts | 0/30 | 4/30 | 5/30 |
| Active RTTs | 1.4-2.4s | 1.3-2.2s | 1.4-2.3s |

Restores v1.9.14 tunnel performance while keeping all v1.9.15+
improvements (H2 for relay, zero-copy mux, block DoH/QUIC, TLS pool
tuning, PR #1029's warm-race fix).

## Verified locally on top of v1.9.20

- `cargo test --lib --release`: 209/209 (matches v1.9.20 baseline)
- `cargo build --release --features ui --bin mhrv-rs-ui`: clean

## Interaction with PR #1029 (just shipped in v1.9.20)

PR #1029 added `H2Cell.dead: Arc<AtomicBool>` for synchronous dead-cell
detection. With this PR removing the H2 path for tunnel batches, the
dead-cell flag is no longer consulted on the tunnel batch path — that's
intentional (the flag now scopes to relay mode, which is the path it
was protecting in practice).
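
For readers unfamiliar with the pattern, a shared `Arc<AtomicBool>` lets one task mark a connection dead while another checks it without awaiting a lock. The sketch below is a simplified stand-in: only the `dead` field and the `Ordering::Relaxed` load appear in the source, and `is_usable` and `mark_dead` are hypothetical helpers.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Simplified stand-in for the project's `H2Cell`; the real struct
/// holds more state (for example a creation timestamp).
pub struct H2Cell {
    pub dead: Arc<AtomicBool>,
}

impl H2Cell {
    pub fn new() -> Self {
        Self { dead: Arc::new(AtomicBool::new(false)) }
    }

    /// Synchronous check: no lock and no await needed.
    pub fn is_usable(&self) -> bool {
        !self.dead.load(Ordering::Relaxed)
    }
}

/// What a connection-driver task might call when the h2 connection
/// ends; it holds its own clone of the flag.
pub fn mark_dead(flag: &Arc<AtomicBool>) {
    flag.store(true, Ordering::Relaxed);
}
```

After this PR, only relay-mode requests would perform a check like `is_usable`; the tunnel batch path never touches the cell.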

Reviewed via Anthropic Claude.

Co-Authored-By: yyoyoian-pixel <noreply@github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
yyoyoian-pixel
2026-05-11 01:41:59 +02:00
committed by GitHub
parent 9611279fbd
commit dd7b3553ec
+12 -66
@@ -983,16 +983,13 @@ impl DomainFronter {
 }
 }
-/// Background loop that keeps the h1 fallback pool warm.
+/// Background loop that keeps the h1 pool warm.
 ///
-/// Target depends on whether the h2 fast path is active:
-/// - h2 disabled (or peer refused ALPN h2): keep `POOL_MIN` (8)
-/// sockets so the per-request acquire never pays a cold handshake
-/// — the pre-h2 default behavior.
-/// - h2 active: keep just `POOL_MIN_H2_FALLBACK` (2). All real
-/// traffic rides the multiplexed h2 connection; the h1 pool only
-/// exists to cover the case where h2 dies and we need to fall
-/// back instantly without a cold handshake.
+/// Always maintains `POOL_MIN` (8) connections. Full-tunnel mode
+/// uses the h1 pool for all batch traffic (h2 is skipped for
+/// tunnel batches), so the pool must stay at full capacity
+/// regardless of h2 status. Relay mode also benefits from a warm
+/// pool as h1 fallback.
 ///
 /// A connection only counts toward the minimum if it has at least
 /// 20 s of TTL remaining — nearly-expired entries don't help.
@@ -1000,7 +997,6 @@ impl DomainFronter {
 /// and opens replacements one at a time so there's no burst.
 pub async fn run_pool_refill(self: Arc<Self>) {
 const MIN_REMAINING_SECS: u64 = 20;
-const POOL_MIN_H2_FALLBACK: usize = 2;
 loop {
 tokio::time::sleep(Duration::from_secs(POOL_REFILL_INTERVAL_SECS)).await;
@@ -1010,24 +1006,7 @@ impl DomainFronter {
 pool.retain(|e| e.created.elapsed().as_secs() < POOL_TTL_SECS);
 }
-// Decide target. We treat "h2 active right now" as having a
-// fresh, non-poisoned cell. h2_disabled is the sticky flag
-// (peer never agreed to h2); a transient cell-poison after
-// h2 success briefly drops back to the larger target until
-// ensure_h2 reopens.
-let target = if self.h2_disabled.load(Ordering::Relaxed) {
-POOL_MIN
-} else {
-let cell = self.h2_cell.lock().await;
-let h2_alive = cell
-.as_ref()
-.map(|c| {
-c.created.elapsed().as_secs() < H2_CONN_TTL_SECS
-&& !c.dead.load(Ordering::Relaxed)
-})
-.unwrap_or(false);
-if h2_alive { POOL_MIN_H2_FALLBACK } else { POOL_MIN }
-};
+let target = POOL_MIN;
 // Count only connections with enough life left.
 // Refill one at a time to avoid bursting TLS handshakes.
@@ -2876,44 +2855,11 @@ impl DomainFronter {
 let path = format!("/macros/s/{}/exec", script_id);
-// h2 fast path. A batch carries N stateful tunnel ops — each
-// `data`/`udp_data`/`connect` may have already executed
-// upstream when the response framing failed. Replaying the
-// whole batch on h1 risks duplicating every op in it. Only
-// fall back when h2 definitely never sent. Honors
-// user-configured batch_timeout so a slow but legitimate
-// batch isn't cut off at an arbitrary fixed cap.
-match self
-.h2_relay_request(&path, payload.clone(), self.batch_timeout)
-.await
-{
-Ok((status, _hdrs, _resp_body)) if is_h2_fronting_refusal_status(status) => {
-// Edge rejected the batch before forwarding. Safe to
-// fall back: no batched op reached Apps Script, so
-// replaying via h1 won't double-fire any of them.
-self.sticky_disable_h2_for_fronting_refusal(status, "tunnel batch")
-.await;
-// fall through to h1
-}
-Ok((status, _hdrs, resp_body)) => {
-return self.finalize_batch_response(script_id, status, resp_body);
-}
-Err((e, RequestSent::No)) => {
-tracing::debug!(
-"h2 batch request pre-send failure: {} — falling back to h1",
-e
-);
-}
-Err((e, RequestSent::Maybe)) => {
-tracing::warn!(
-"h2 batch request post-send failure: {} — \
-not replaying on h1 to avoid duplicating batched ops",
-e
-);
-return Err(e);
-}
-}
+// Skip h2 for tunnel batches. Batched ops are already coalesced
+// into one HTTP request so h2 multiplexing adds no benefit.
+// The h1 pool path is simpler and avoids h2-specific overhead
+// (ready timeout, NonRetryable errors, concurrent stream
+// contention with long-poll batches).
 let mut entry = self.acquire().await?;
 let req_head = format!(