WASM I/O

Sessions

Discover our confirmed talks!

P2P Distributed LLM Inference Across Browsers With llama.cpp And WebRTC

Kohei Tokunaga - NTT, Inc

Web browsers offer portable execution environments such as WebAssembly and WebGPU, making them convenient platforms for running LLMs. However, they can’t run models that exceed their memory capacity, which limits the range of the models they can handle.

In this talk, Kohei will introduce LLMlet, a llama.cpp-based in-browser model runner that supports distributed LLM inference across browsers. By integrating llama.cpp’s distributed inference feature (RPC) with WebRTC, LLMlet allows a model to be split and executed across multiple P2P-connected browsers. The talk will provide a deep technical dive, explore potential use cases, and share the current status of integration with community tools.
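
To make the idea of splitting a model across browsers concrete, here is a minimal sketch of one way layers could be assigned to peers. It is an illustration only, not LLMlet's or llama.cpp's actual algorithm: the `Peer` and `LayerRange` types, the `partitionLayers` function, and the assumption that every layer has the same size are all hypothetical. Each peer is filled greedily, in order, with as many contiguous layers as its memory budget allows.

```typescript
// Hypothetical description of one participating browser.
interface Peer {
  id: string;
  memoryBytes: number; // memory this peer can devote to model weights
}

// A contiguous block of layers assigned to one peer.
interface LayerRange {
  peerId: string;
  start: number; // first layer index (inclusive)
  end: number;   // last layer index (exclusive)
}

// Greedily assign contiguous layer ranges to peers in the order given.
// Assumes (for simplicity) that every layer needs `bytesPerLayer` bytes.
function partitionLayers(
  numLayers: number,
  bytesPerLayer: number,
  peers: Peer[],
): LayerRange[] {
  // How many whole layers each peer can hold.
  const capacity = peers.map((p) => Math.floor(p.memoryBytes / bytesPerLayer));
  const total = capacity.reduce((a, b) => a + b, 0);
  if (total < numLayers) {
    throw new Error("combined peer memory is too small for the model");
  }

  const ranges: LayerRange[] = [];
  let next = 0; // next unassigned layer index
  for (let i = 0; i < peers.length && next < numLayers; i++) {
    const take = Math.min(capacity[i], numLayers - next);
    if (take === 0) continue; // this peer cannot hold even one layer
    ranges.push({ peerId: peers[i].id, start: next, end: next + take });
    next += take;
  }
  return ranges;
}

// Example: a 32-layer model at 200 MB per layer across three browsers.
const plan = partitionLayers(32, 200_000_000, [
  { id: "browser-a", memoryBytes: 4_000_000_000 },
  { id: "browser-b", memoryBytes: 2_000_000_000 },
  { id: "browser-c", memoryBytes: 2_000_000_000 },
]);
console.log(plan);
```

No single browser in the example could hold all 32 layers, but the three together can. In a real system the peers would also need a transport (here, WebRTC) to pass intermediate activations from one layer range to the next.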

Secure your ticket!

  • Early Bird
    Conference Ticket WASM I/O 26

    299 €

    Until December 4th

    All Things WebAssembly

    Barcelona

    Mar 19-20, 2026

    2-Day Conference
    AXA Convention Center

  • Standard
    After 4th Dec

    379 €

    Until February 19th

  • Late Bird
    After 19th Feb

    24 Feb 26 - 18 Mar 26