Commit Graph

22 Commits

Author SHA1 Message Date
yyoyoian-pixel 919b13b166 feat(tunnel): pipelined polls with adaptive depth, wseq ordering, STUN blocking (#1115)
feat(tunnel): pipelined full-tunnel polls, ordered writes, and STUN blocking

Merged trusted PR #1115 by @yyoyoian-pixel after local verification and a small maintainer fix on the PR branch.

---
Answered via LLM, Supervised @therealaleph
2026-05-16 17:46:08 +03:00
dazzling-no-more e70947ff0d fix(udpgw): move magic IP out of tun2proxy virtual-DNS range (#251, #1143)
Closes #251. In Android Full mode, Telegram worked but Google search and most other websites failed silently. `apps_script` mode on the same setup was unaffected.

**Root cause**: the udpgw magic destination (`198.18.0.1:7300`) was inside `198.18.0.0/15` — the exact range tun2proxy's `--dns virtual` allocator uses to synthesise fake IPs for hostname lookups. Whenever virtual DNS assigned `198.18.0.1` to a real hostname, that hostname's traffic was intercepted by tun2proxy *itself* as a udpgw connection and dropped. Telegram was immune because it uses hardcoded numeric IPs; `apps_script` mode was immune because it never sets `--udpgw-server`.

**Fix**: move `UDPGW_MAGIC_IP` to `192.0.2.1` (RFC 5737 TEST-NET-1) — outside any virtual-DNS allocation pool. Coordinated change across the tunnel-node constant and the Android `--udpgw-server` flag.

## Back-compat

v1.9.25 tunnel-nodes still recognise the legacy `198.18.0.1:7300` for one deprecation cycle (removal in v1.10.0).

| Android | Tunnel-node | Full-mode UDP |
|---|---|---|
| v1.9.25 | v1.9.25 |  fully fixed |
| ≤v1.9.24 | v1.9.25 | ⚠️ handshake works (legacy IP still recognised), but the old client still asks tun2proxy for `198.18.0.1`, so the #251 virtual-DNS collision is still live on-device |
| v1.9.25 | ≤v1.9.24 |  breaks silently (old node rejects `192.0.2.1`) |

The fix lives on the client side (which magic IP it asks tun2proxy to reserve). The back-compat is on the tunnel-node side (accepting both during the deprecation window).

## Verified locally

- `cargo test --lib --release`: 231/231 
- `cargo build --release --features ui --bin mhrv-rs-ui`: clean 
- `(cd tunnel-node && cargo test --release)`: 38/38  (+2 new tests for the IP change)

## Version bump

Cargo.toml already bumped to 1.9.25 in this PR; `docs/changelog/v1.9.25.md` pre-baked. Will combine with any other PRs landing into v1.9.25 before tagging.

Reviewed via Anthropic Claude.

Co-Authored-By: dazzling-no-more <noreply@github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 22:45:23 +03:00
dazzling-no-more 4d2ce91c04 fix(docker): cargo-chef so tunnel-node builds without BuildKit (#620, #1117)
Fixes #620 — `tunnel-node/Dockerfile` used BuildKit-only `RUN --mount=type=cache` directives, breaking on Cloud Run's `gcloud run deploy --source .` path (the underlying `gcr.io/cloud-builders/docker` builder doesn't enable BuildKit, and `--set-build-env-vars DOCKER_BUILDKIT=1` doesn't flip it on either).

Reworked to use **cargo-chef**: a dedicated planner stage emits `recipe.json` for dependency metadata, a `cargo chef cook` stage builds just the deps in their own Docker layer, the final build stage adds `src/` on top. Docker's regular layer cache handles dependency reuse — warm rebuilds where only `src/` changes still skip the slow crate compile.

## Changes (`tunnel-node/Dockerfile`-only)

- Dropped `# syntax=docker/dockerfile:1` parser directive and all `RUN --mount=type=cache,...` blocks
- Added cargo-chef multi-stage build (`chef` → `planner` → `builder`)
- Pinned `cargo-chef` to exact `0.1.77` with `--locked` for reproducible installs
- Bumped base from `rust:1.85-slim` → `rust:1.90-slim` (cargo-chef's transitive deps require rustc 1.86+; tunnel-node's `Cargo.toml` has no `rust-version` pin so the bump is internal-only)
- Removed `ARG TARGETPLATFORM` per-platform cache-id workaround — Docker's regular layer cache is already arch-scoped

## Non-changes (deliberate)

- `tunnel-node/Cargo.toml` left alone — the old Dockerfile comment claimed "matches MSRV in Cargo.toml" but no `rust-version` field actually exists. The Docker base bump is internal build-env, not a declared MSRV.
- Base image digest pinning left on tag refs — without Renovate/Dependabot to keep digests fresh, pinning trades automatic glibc/openssl/ca-certificates CVE patching for a reproducibility property this repo doesn't currently need.

## Verified locally

- `cd tunnel-node && cargo build --release`: clean (binary side unchanged)
- `cd tunnel-node && cargo test --release`: 36/36
- Local `docker build` couldn't run (daemon not started on the dev machine); the PR author's test plan documents successful build under classic Docker daemon.

Reviewed via Anthropic Claude.

Co-Authored-By: dazzling-no-more <noreply@github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 13:55:03 +03:00
therealaleph 2c9c693d13 fix: v1.9.16 — Full mode 50 MiB batch-response truncation (#863)
Apps Script's response body cap is ~50 MiB. tunnel-node had a TCP_DRAIN_MAX_BYTES = 16 MiB per-session cap to stay under it, but multiple sessions in the same batch each contributed up to 16 MiB raw, summing past 50 MiB on busy VPS — N≥4 concurrent sessions × 16 MiB → ≥64 MiB raw → ≥85 MiB after base64. Steam updates and other CDN-served large downloads hit this exactly: `EOF while parsing a string at line 1 column 52428630` from the client and the session aborts mid-stream.

Fix: new BATCH_RESPONSE_BUDGET = 32 MiB total-batch cap. Drain loop tracks remaining budget across sessions and stops one short of the cliff. drain_now() now takes max_bytes; effective cap = min(budget, TCP_DRAIN_MAX_BYTES). Sessions deferred this batch keep their buffered data — no data loss, they drain on the next poll.

Single-op-path callers and existing tests pass usize::MAX (no extra constraint, original TCP_DRAIN_MAX_BYTES still enforced). New regression test `drain_now_respects_caller_budget_below_per_session_cap` covers the new behavior.

Tests: 197 lib + 36 tunnel-node (was 35) all green. UI release build green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 19:25:50 +03:00
dazzling-no-more 38d9d9fcd6 fix(tunnel-node): batch drain correctness and lock contention (#695)
* fix(tunnel-node): batch drain correctness and lock contention

* fix(tunnel-node): single-op lock contention and batch base64 consistency
2026-05-04 03:33:49 +03:00
yyoyoian-pixel 994dd0b23c tune: lower coalesce/settle step from 40 → 10 ms, raise tunnel-node settle max to 1 s (#674)
The batch coalesce step controls how long the client (and the
tunnel-node's straggler settle) waits between checking for more ops
to pack into the same batch.  At 40 ms the wait was conservative —
good for packing uploads but needlessly slow on the download path
where the tunnel-node round-trip, not coalescing, is the bottleneck.

Lowering the step to 10 ms means we fire batches almost immediately
when there's nothing else queued, cutting ~30 ms of dead air on
every download-dominated round-trip.  When both sides DO have data
in flight (uploads, bursty page loads), the adaptive reset still
works: each arriving op resets the 10 ms step timer, so a rapid
burst naturally coalesces up to the 1 s hard cap without wasting
quota on many small batches.

In short: don't wait when there's nothing to wait for; batch
aggressively when there is.

Client side:
  - DEFAULT_COALESCE_STEP_MS  40 → 10 ms
  - DEFAULT_COALESCE_MAX_MS   unchanged at 1000 ms

Tunnel-node side:
  - STRAGGLER_SETTLE_STEP     40 → 10 ms  (matches client step)
  - STRAGGLER_SETTLE_MAX     500 → 1000 ms (more room to pack
    straggler responses when upstream targets reply at different
    speeds — saves Apps Script quota on the return leg)

Users who prefer the old behaviour can set "coalesce_step_ms": 40
in config.json.

Co-authored-by: yyoyoian-pixel <279225925+yyoyoian-pixel@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-03 15:45:52 +03:00
therealaleph cd6ff8d381 feat: v1.9.1 — operator quality-of-life: tunable auto-blacklist, configurable batch timeout, MHRV_AUTH_KEY hint, run.bat CLI fallback
Four small fixes that address recurring user-issue patterns:

- src/config.rs / src/domain_fronter.rs: auto_blacklist_strikes,
  auto_blacklist_window_secs, auto_blacklist_cooldown_secs config fields
  (#391, #444). Previously 3 strikes / 30s window / 120s cooldown were
  hard-coded. Single-deployment users on flaky networks hit this too
  aggressively; multi-deployment users want tighter fail-fast. Defaults
  preserve historical behavior. Power-user file edit only — no UI
  control yet. Clamps to [1, 86400] for durations.

- src/config.rs / src/domain_fronter.rs / src/tunnel_client.rs:
  request_timeout_secs config field (#430, masterking32 PR #25).
  Replaces hard-coded BATCH_TIMEOUT 30s. DomainFronter::batch_timeout()
  exposes the value, fire_batch reads it. Clamped to [5s, 300s].

- tunnel-node/src/main.rs: detect MHRV_AUTH_KEY env var being set
  while TUNNEL_AUTH_KEY is unset, and emit a specific warning pointing
  at the right env var name. Catches the recurring #391/#444 docker
  run typo that made users chase phantom AUTH_KEY-mismatch decoys.

- assets/launchers/run.bat: when both UI renderers (glow + wgpu)
  fail on older Windows / RDP / VM-without-GPU, fall back to launching
  mhrv-rs.exe (CLI) instead of just printing "open an issue".
  Addresses #417 / #426 / #487. CLI has the same proxy functionality
  on 127.0.0.1:8085 (HTTP) / :8086 (SOCKS5).

169 mhrv-rs lib tests + 33 tunnel-node tests still passing. UI build
clean. ConfigWire round-trips the new fields with skip-default-on-write
so unchanged configs stay clean.
2026-04-30 08:33:53 +03:00
dazzling-no-more 8ed8e85687 feat: multi-edge fronting_groups + rename google_only to direct 2026-04-29 17:12:15 +04:00
therealaleph 75401acf0c fix: v1.8.5 — tunnel-node caps TCP drain at 16 MiB to stay under Apps Script body ceiling
@bankbunk reported (#460) that on a 1 Gbps VPS, raw MP4 streams in Full
mode died with `batch JSON parse error: EOF while parsing a string at
line 1 column 52428685` minutes into playback. Root cause: drain_now
took the entire per-session read buffer in one shot. On high-bandwidth
VPS the reader task fills the buffer with tens of MiB between polls;
the resulting batch response (raw + base64 1.33× + JSON envelope)
exceeded Apps Script's ~50 MiB hard cap; Apps Script truncated mid-base64;
the client's serde_json parse hit EOF and the stream tore.

Fix: drain_now now returns at most TCP_DRAIN_MAX_BYTES (16 MiB) per call
and leaves the tail in the buffer for the next poll. EOF is held back
until the buffer is fully drained so partial drains don't tear the
session prematurely. Three regression tests cover the cap, the under-cap
pass-through, and the EOF-holdback case (33 tunnel-node tests passing).

@bankbunk's wondershaper rate-limit workaround (40 Mbps cap on the VPS
interface) is no longer necessary — high-bandwidth VPS users can run at
line rate again.
2026-04-29 06:28:17 +03:00
yyoyoian-pixel ca76fe91d0 fix(tunnel-node): raise long-poll to 15s, adaptive straggler settle up to 500ms
Apps like Telegram maintain persistent XMPP connections (:5222) and
Google Push uses :5228 — both rely on long-lived sessions with
periodic heartbeats. At the previous 5s long-poll deadline, the
tunnel-node returned empty responses frequently enough that Telegram
interpreted it as connection instability and rotated sessions. Each
reconnect costs a full TLS handshake (~4s through Apps Script),
causing visible video/voice interruptions and buffering.

Raising the long-poll deadline to 15s keeps these persistent
connections alive: the tunnel-node holds the response open until
server data actually arrives (push notification, chat message, media
chunk) rather than returning empty every 5s. Tested on censored
networks in Iran where users reported smoother Telegram video
playback and fewer session resets.

The straggler settle is now adaptive (40ms steps, 500ms max): after
the first session in a batch gets data, keep checking every 40ms
whether neighboring sessions also have data. Break early when all
sessions are ready — no fixed 500ms wait when data is already there.
On high-latency relays where each Apps Script call costs ~1.5s
overhead, packing more session responses into one batch saves quota
and reduces total round-trips.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 21:44:14 +02:00
therealaleph 2c4c0a9a05 docs(tunnel-node): add Persian translation of README (#372)
Mirror of the English README with the same setup paths (Cloud Run /
Docker prebuilt / Docker source / direct binary), env-var table, and
protocol section, plus a Persian-language FAQ section answering the
specific questions users keep filing:

- bandwidth overhead (~25-30% from base64 + JSON envelope + v1.8.0
  random padding)
- whether all Android apps get tunneled (yes in Tunnel mode + VpnService;
  no in Proxy mode)
- realistic per-flow throughput (~50-200 KB/s, bound by Apps Script's
  per-roundtrip floor; horizontal-scale via more deployments)
- whether a VPS is required for Full mode (yes; not required for
  apps_script or google_only)
- which VPS to pick (Hetzner CX11 €4/mo for general use; Cloud Run
  free tier specifically for Iran users hit by #313 since destination
  IP stays Google-internal)

Adds an `MHRV_DIAGNOSTIC` env-var row to both the English and Persian
env-var tables — was added in v1.8.0 but never documented.

Closes #372.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 03:09:23 +03:00
therealaleph cb3732f920 feat: v1.8.0 — DPI evasion, active-probing defense, full-mode usage counters
Five user-visible changes shipping together. Each is independently
useful + bounded; bundled because they're all "small architectural
hardening" that benefits from one release announcement.

1. Random payload padding (#313, #365 §1)

   Every outbound Apps Script JSON request now carries a `_pad` field
   of uniform-random length 0..1024 bytes (base64). Defeats DPI that
   fingerprints on the tight length distribution of mhrv-rs's previous
   per-mode-bound packet sizes. ~25% bandwidth on a typical 2 KB batch,
   negligible against Apps Script's per-call latency floor. Backward-
   compatible — old `Code.gs` deployments ignore the unknown field.
   Applied at all three payload-build sites: single relay, single
   tunnel op, batch tunnel.

2. Active-probing decoy: GAS bad-auth → 200 HTML (#365 §3)

   `Code.gs` and `CodeFull.gs` now return a benign Apps-Script-style
   placeholder HTML page on bad/missing AUTH_KEY instead of the JSON
   `{"e":"unauthorized"}`. To an active scanner the deployment looks
   like one of the millions of forgotten public Apps Script projects
   rather than an obvious API endpoint. New `DIAGNOSTIC_MODE` const
   restores JSON errors during setup; default false (production-strong).

3. Active-probing decoy: tunnel-node bad-auth → 404 nginx (#365 §3)

   `tunnel-node` returns an HTTP 404 with an nginx-style HTML body on
   bad auth instead of `{"e":"unauthorized"}`. Active scanners cataloging
   the host see "static web server, nothing tunnel-shaped here." New
   `MHRV_DIAGNOSTIC=1` env var restores verbose JSON during setup.

4. Fix: Full-mode usage counter stuck at zero (#230, #362)

   `today_calls` / `today_bytes` were only being incremented on the
   apps_script-mode relay path. Full-mode batches go through
   `tunnel_client::fire_batch` which never wired into the counter.
   Now `fire_batch` calls `record_today(response_bytes)` after each
   successful batch — bytes estimated from the `d` (TCP payload) and
   `pkts` (UDP datagrams) sizes in the BatchTunnelResponse. Full-mode
   users now see real usage numbers.

5. Fix: quota reset countdown was UTC, should be PT (#230, #362)

   Apps Script's UrlFetchApp daily quota resets at midnight Pacific
   Time, not UTC. We were displaying the countdown to UTC midnight,
   off by 7-8h depending on DST. New `current_pt_day_key()` and
   `seconds_until_pacific_midnight()` helpers with hand-rolled US DST
   detection (2nd Sunday March → 1st Sunday November = PDT, else PST)
   so we don't pull `chrono-tz` and a ~3 MB IANA tzdb just for one
   helper. UI label "UTC day" → "PT day". Tests pin DST window
   boundaries against March/November of 2024, 2026, 2027 to catch
   regressions in the day-of-week math.

Tested:
- cargo test --lib: 154 passed (was 152, +2 for DST window + day-of-week)
- cargo build --release: clean
- cargo build --release --bin mhrv-rs-ui --features ui: clean (macOS arm64)
- tunnel-node cargo test: 30 passed
- Android: ./gradlew assembleDebug succeeds; APK installs + launches
  on mhrv_test emulator (arm64-v8a), no UnsatisfiedLink, no crash

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 01:39:47 +03:00
therealaleph 08efbc571d fix(docker): isolate cargo cache per TARGETPLATFORM (multi-arch race)
The tunnel-docker job in v1.7.3 release failed with:

  error: failed to unpack package `serde_json v1.0.149`
  Caused by: failed to open `/usr/local/cargo/registry/src/.../serde_json-1.0.149/.cargo-ok`
  Caused by: File exists (os error 17)

Root cause: BuildKit's default cache-mount sharing is "shared" — both
linux/amd64 and linux/arm64 build stages mount the SAME on-disk cache
dir. Cargo's registry source extraction is non-atomic; both arches
race on `tar -xzf serde_json-1.0.149.crate` into the same destination,
and the loser hits EEXIST mid-unpack.

Fix: scope each cache mount with `id=cargo-registry-${TARGETPLATFORM}`
(and matching for cargo-git + target). BuildKit then keeps separate
on-disk caches per architecture — no race. Per-arch warm-build speedup
is preserved (each cache fills with that arch's pre-built deps); the
only loss is one cache miss per arch on the first build after this
change, which we already paid in v1.7.3.

The target/ mount is also platform-scoped since target/ holds compiled
object files for a single ABI; sharing across arches would either miss
or, worse, link wrong-ABI objects together.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 22:04:19 +03:00
yyoyoian-pixel 1057797109 feat: native udpgw without QUIC/DNS - QUIC/DNS with udp associate — stable VoIP, faster browsing - needs new tunnel deployment for udpgw (#222)
* feat: native udpgw protocol alongside existing UDP associate

Why udpgw is needed even with UDP associate:

UDP associate (udp_open/udp_data) creates one tunnel session per UDP
destination and polls each independently. On high-latency or shaky
networks this compounds — N simultaneous UDP flows need N separate
polling loops, each paying its own batch round-trip overhead. Google
Meet calls, which fire dozens of concurrent STUN + RTP flows, stall
or fail entirely because the per-destination polling can't keep up.

udpgw multiplexes ALL UDP over one persistent TCP-like session using
conn_id framing. One batch op carries frames for many destinations.
Persistent sockets per (conn_id, dest) with continuous reader tasks
keep source ports stable — critical for protocols like Telegram VoIP
and STUN that expect replies on the same port.

Both paths coexist — they serve different traffic:
  - UDP associate (SOCKS5): apps that negotiate SOCKS5 UDP relay
  - udpgw (198.18.0.1:7300): TUN-captured UDP (DNS, QUIC, Meet, etc.)

tun2proxy vendored as git submodule at v0.7.20 with one transparent
commit adding udpgw_server to the Android JNI run() function.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: block QUIC (UDP 443) and DNS (UDP 53) from udpgw

QUIC through udpgw is slower than TCP/HTTP2 through the batch pipeline
— blocking it forces browsers to fall back to TCP, improving YouTube
and general browsing speed.

DNS is better handled by tun2proxy's virtual DNS / SOCKS5 UDP associate
path which is more reliable for single request-response exchanges.

VoIP (Telegram, Meet) still flows through udpgw normally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: replace submodule with [patch.crates-io] for tun2proxy udpgw

Use the idiomatic Rust [patch.crates-io] mechanism instead of a git
submodule. Points to yyoyoian-pixel/tun2proxy fork with the udpgw
JNI parameter patch (upstream PR: tun2proxy/tun2proxy#247).

Will be removed once upstream ships the change in tun2proxy >= 0.8.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: pin tun2proxy patch SHA in Cargo.lock

Locks tun2proxy at dfc24ed1 so the patch resolution is recorded and
any branch rewrite is visible in the lockfile diff.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use AbortHandle for ConnSocket readers to prevent FD leaks

JoinHandle::drop detaches the task without aborting it. When
udpgw_server_task is cancelled (session close), the post-loop
cleanup never runs and per-(conn_id, dest) reader tasks become
zombies holding Arc<UdpSocket> file descriptors.

AbortHandle::drop aborts the task automatically, so cleanup is
correct by construction regardless of how the parent task exits.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: yyoyoian-pixel <279225925+yyoyoian-pixel@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-26 17:08:45 +03:00
dazzling-no-more bf6fab31ab fix(udp): surface upstream eof + bound sessions and payload size 2026-04-25 16:19:23 +04:00
dazzling-no-more 40c2b6c509 feat(udp): SOCKS5 UDP ASSOCIATE relay through full tunnel
Adds end-to-end UDP support: SOCKS5 client UDP ASSOCIATE → tunnel-mux
udp_open/udp_data ops → tunnel-node UDP sessions → real UDP to upstream.
QUIC/HTTP3, DNS, and STUN now traverse full mode without falling back to
TCP or leaking outside the tunnel.

Apps Script proxies the new ops opaquely through the existing batch
endpoint; CodeFull.gs only gets a doc-comment update.

Highlights:
- proxy_server.rs: SOCKS5 UDP ASSOCIATE handler with per-session task,
  bounded uplink mpsc channel, adaptive empty-poll backoff (500 ms → 30 s),
  source-IP validation against the control TCP peer, port-locking on
  first valid datagram, and self-removal from the dispatch map on eof.
- tunnel_client.rs: UdpOpen / UdpData / close_session mux variants
  alongside the existing TCP plumbing; pkts decoder helper.
- tunnel-node: UdpSessionInner with bounded VecDeque queue, drop-oldest
  on overflow with queue_drops counter and warn-then-throttled logs,
  last_active refreshed only on real activity (uplink send or upstream
  recv — empty polls do not refresh), independent TCP/UDP drain in
  handle_batch Phase 2, separate active-drain (150 ms) and retry
  (250 ms) windows for UDP, idle long-poll (5 s).
- Tests: SOCKS5 UDP packet parser (IPv4/IPv6/DOMAIN round-trips,
  truncation rejects, fragmented rejects), UDP queue overflow drop +
  counter, regression test that batch with both UDP and TCP-data ops
  still runs the TCP retry pass.

Docs: README + android.{md,fa.md} updated to reflect UDP availability
in full mode; tunnel-node/README documents the new ops.
2026-04-25 16:19:23 +04:00
therealaleph fb552c227d v1.5.0: long-poll Full Tunnel + Docker tunnel-node + brief FA release notes
Ships PR #173 (event-driven drain) plus three operational improvements:

PR #173 — long-poll tunnel mode. The tunnel-node's batch drain
switched from a fixed 150 ms sleep to an event-driven Notify wait;
idle sessions long-poll up to 5 s and wake on the first byte from
upstream. Push notifications and chat messages now arrive in roughly
RTT instead of waiting for the next client poll tick. Backward compat
with pre-#173 tunnel-nodes is automatic via a sticky AtomicBool that
detects fast empty replies and reverts to the legacy cadence.
92 client tests + 17 tunnel-node tests pass, including end-to-end
TCP-pair verification of the notify wiring.

Docker image for tunnel-node. Adds a hardened Dockerfile (BuildKit
cache mounts, non-root runtime user, ca-certificates for HTTPS
upstreams) and a .dockerignore to keep build context small. New
`tunnel-docker` job in the release workflow builds + pushes
multi-arch (linux/amd64 + linux/arm64) to
ghcr.io/therealaleph/mhrv-tunnel-node with `:latest`, `:1.5`, and
`:1.5.0` tags on every release. Setting up Full Tunnel mode goes
from "rustup + cargo build on a 1 GB VPS" (which fails on memory
half the time) to a one-liner. tunnel-node/README.md updated with
prebuilt-image + docker-compose recipes.

Brief Persian release note in Telegram caption. The release-post
caption now leads with a `<blockquote>`-wrapped FA bullet headlines
extracted from `docs/changelog/v<ver>.md`, above the existing two
links (repo + release). Markdown links → Telegram HTML <a> for
clickability. Cap-budget-aware truncation at bullet boundaries
keeps total caption under Telegram's 1024-char limit. Headlines-only
rather than full bullets so multiple "what's new" items fit
comfortably (the full bullets remain on the GH release page and as
the optional --with-changelog reply-threaded message).

GitHub Releases page bodies now lead with the changelog content
(Persian section + `---` + English) instead of just a Full Changelog
comparison link. The auto comparison link is appended at the bottom
via `append_body: true` rather than removed.

Workflow changes:
- New `permissions: packages: write` at the workflow level (required
  for ghcr push via docker/login-action).
- New `tunnel-docker` job needs `build` (not the full matrix) to
  serialize the QEMU buildx layer with the matrix cache.
- Release job composes the body from `docs/changelog/v${VER}.md`
  in a pre-step that handles both tag-push and workflow_dispatch
  paths (uses inputs.version || github.ref_name like the rest of
  the workflow).

Tested locally:
- `cargo test` — 92 lib tests pass
- `cargo test -p mhrv-tunnel-node` — 17 tests pass
- `docker build` of tunnel-node Dockerfile — 32 MB image, runs as
  non-root, /health returns "ok", auth rejection works correctly,
  legitimate requests open sessions to remote hosts
- Telegram script `--dry-run` mode added; rendered captions for
  v1.4.0, v1.4.1, v1.5.0 all fit under 900 chars

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 11:56:41 +03:00
dazzling-no-more 1d45dba2c2 feat(tunnel): event-driven drain with adaptive long-poll 2026-04-25 12:14:33 +04:00
dazzling-no-more 0a58943433 feat(tunnel): save one RTT per new HTTPS flow via connect_data op 2026-04-25 01:38:30 +04:00
vahidlazio d091c48da6 feat: remove pipeline depth cap — scales with deployment count
Pipeline depth was artificially capped at 12. Users with 20+
deployments across multiple accounts were wasting pipeline capacity.

Now: pipeline_depth = num_scripts (minimum 2, no upper cap).
The connection pool (80) is the natural ceiling.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 19:00:21 +02:00
vahidlazio b73bbe2106 feat: Mode::Full + batch tunnel client (#94)
Adds a new `mode: full` that tunnels ALL traffic end-to-end through Apps Script → a remote tunnel node. Browser does TLS directly with the destination. No MITM, no CA installation needed on the client device.

Ships as part of the 3-PR series: #93 (tunnel-node service + CodeFull.gs, merged) + this (Rust-side Mode::Full + batch tunnel client) + #95 (Android UI dropdown, now rolled into this PR post-rebase).

### Architecture
- Client → mhrv-rs → script.google.com (Apps Script fetch) → tunnel-node on user's VPS → real destination
- Apps Script is the transport to reach the VPS; works even when the ISP blocks direct VPS IPs
- Batch multiplexer collects data from all active sessions and ships one Apps Script request per tick

### Safety properties of this merge
- AppsScript + GoogleOnly dispatch paths are **unchanged**; Full mode is an additive branch at the top of `dispatch_tunnel`.
- `tunnel_client.rs` is a new isolated module (387 LOC).
- `tunnel_request()` is a new method on `DomainFronter`, no change to `relay()` / `relay_parallel_range()`.
- Config: additive `Mode::Full` variant + validation tests (2 new); existing validation rules untouched.
- Local build: clean compile. `cargo test --quiet`: 75 passed (73 → 75 with 2 new config tests).

### Closes
Unblocks the feature requested in #61, #69, #100, #105, #110, #111, #113, #116.

### Testing
vahidlazio has iterated on prior review feedback. End-to-end testing with a real tunnel-node deployment will follow post-merge from @Feiabyte (volunteered in #61). Post-merge CI will exercise compile + full test matrix across all targets; any regression caught there gets a fast-follow fix.
2026-04-24 12:48:56 +03:00
vahidlazio 31ae569aa2 feat: tunnel-node service + CodeFull.gs (#93)
Standalone Rust/axum HTTP server + Apps Script-side CodeFull.gs for users who want to deploy a remote tunnel node. All new files; no changes to the main Rust crate. This is part 1 of 3 of the full-tunnel feature — it adds scaffolding that users can opt into once the Rust-side Mode::Full lands in #94.
2026-04-23 23:34:51 +03:00