
Kohei Tokunaga, NTT, Inc.
Web browsers offer portable execution environments such as WebAssembly and WebGPU, making them convenient platforms for running LLMs. However, a browser cannot run a model that exceeds its memory capacity, which limits the range of models it can handle.
In this talk, Kohei will introduce LLMlet, a llama.cpp-based in-browser model runner that supports distributed LLM inference across browsers. By integrating llama.cpp’s distributed inference feature (RPC) with WebRTC, LLMlet can split a model and execute it across multiple P2P-connected browsers. The talk will provide a deep technical dive, explore potential use cases, and share the current status of integration with community tools.
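For context, the llama.cpp RPC feature that LLMlet bridges over WebRTC is normally used between native hosts roughly as follows. This is an illustrative sketch, not LLMlet itself: the hostnames, ports, and model path are placeholders, and exact flag names may differ across llama.cpp builds (the RPC backend must be enabled at build time with `-DGGML_RPC=ON`).

```shell
# On each worker machine, start an RPC server that will hold part of the model
# (rpc-server is built as part of llama.cpp when GGML_RPC is enabled).
rpc-server --host 0.0.0.0 --port 50052

# On the driver machine, list the workers via --rpc; llama.cpp distributes
# the model's layers across the listed backends. Placeholders throughout.
llama-cli -m model.gguf --rpc "worker1:50052,worker2:50052" -p "Hello"
```

LLMlet's contribution, per the abstract, is replacing the plain TCP transport between these RPC endpoints with WebRTC data channels, so that each "worker" can be another browser tab connected peer-to-peer.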
Conference Ticket — WASM I/O 26
2-Day Conference • AXA Convention Center, Barcelona • Mar 19–20, 2026

Early Bird — until December 4th
Standard — after December 4th, until February 19th
Late Bird — after February 19th (24 Feb 26 – 18 Mar 26)