Transmit files, images, and data through sound. Fountain codes. DEFLATE compression. Reliable transfer protocol. No internet required.
ggwave by Georgi Gerganov is a brilliant data-over-sound library that handles the hardest part — turning bytes into precisely tuned audio tones and decoding them back using FSK modulation and FFT. It supports 9 protocols across audible, ultrasound, and dual-tone frequency bands. It’s the foundation that makes everything here possible.
WAVEPX is an application framework layer on top of ggwave. It adds multi-frame chunked images with progressive rendering, DEFLATE compression for higher throughput, fountain codes for reliable delivery, palette quantization, dithering, and full application protocols for file transfer and games.
Drag-drop any file up to 64KB. SYN/ACK handshake, fountain-coded data burst, CRC-32 verification. Reliable delivery over a lossy audio channel. (fountain codes)
Load photos, apply dithering (Floyd-Steinberg, Atkinson), and transmit as multi-chunk pixel data. Supports B&W, 4-gray, 16-gray, and 16-color palette modes. (chunked protocol)
Send UTF-8 text up to 138 bytes in a single frame. With DEFLATE compression in transfer mode, send entire paragraphs efficiently. (single frame)
Generate and transmit QR codes (versions 1–4) as bit-packed audio frames. URLs, contact cards, WiFi configs, no camera needed. (auto-version)
Full game protocol over sound: ship placement, shot/result exchanges, win detection. Coordinate labels, ghost preview, ship roster, status messages. (game protocol)
Side-by-side comparison of 7 encoding strategies. See raw vs. RLE vs. palette, byte counts, chunk counts, and visual previews. Learn by seeing. (educational)
Interactive pixel canvas with adjustable grid (8×8 to 512×512), grayscale modes, and real-time encoding stats. Draw and send pixel art over audio. (interactive)
Floyd-Steinberg and Atkinson error diffusion algorithms. Convert photos to low-bit-depth pixel data while preserving visual quality through perceptual tricks. (image processing)
Every tab has expandable blurbs explaining the science: how FSK works, why fountain codes tolerate loss, what dithering does. Learning is the feature. (details/summary)

K source blocks + 50% parity blocks. Receiver reconstructs from any sufficient subset.
CompressionStream API. RLE for pixel data. Palette quantization with row-run encoding for color images.
140-byte frames with headers: chunk index, total count, dimensions, compression flags, CRC-32.
48kHz sample rate, AudioWorklet capture.

`npx wavepx` for the full app.

```javascript
import { SonicPixel } from 'wavepx';

const sp = new SonicPixel();
await sp.init();

// Send text
await sp.sendText('hello from the other side');

// Send a QR code
await sp.sendQr('https://github.com/0xNtive/wavepx');

// Send a pixel image (16x16 B&W)
const pixels = new Array(256).fill(false);
await sp.sendImage(16, 16, pixels);
```
```javascript
import { SonicPixel, FrameType } from 'wavepx';

const sp = new SonicPixel({
  onReceive: (msg) => {
    switch (msg.type) {
      case FrameType.TXT:
        console.log('Text:', msg.text);
        break;
      case FrameType.IMG:
        console.log(`Image: ${msg.width}x${msg.height}`);
        break;
      case FrameType.CHUNK:
        console.log('Chunked image complete');
        break;
    }
  },
});

await sp.init();
await sp.startListening(); // mic access required
```
```javascript
import { SonicPixel } from 'wavepx';

const sp = new SonicPixel({
  onTransferMessage: (msg) => {
    // Handle incoming transfer protocol frames
    // SYN, DATA, DONE are handled automatically
  },
});

await sp.init();

// Send a file with fountain codes + compression
const data = new TextEncoder().encode('file contents...');
await sp.sendFileTransfer(
  data.buffer,
  'notes.txt',
  (sent, total, state) => {
    console.log(`${state}: ${sent}/${total} blocks`);
  }
);
```
```sh
# Launch the full app locally
$ npx wavepx

# Custom port
$ npx wavepx -p 8080

# Or install globally
$ npm install -g wavepx
$ wavepx

  WAVEPX data over sound
  Local: http://localhost:3000
  Press Ctrl+C to stop
```
SonicPixel class API. Import anything from 'wavepx'.

Constructor options: onReceive, onError, onStateChange, onAudioLevel, onChunkProgress, onGameMessage, onTransferMessage. Also protocol (0-8) and volume (0-100).
onReceive callback. Requires mic permission.
(sent, total) => void.
bitDepth: 1, 2 (4 levels), or 4 (16 levels). Chooses raw or RLE automatically.
quantizeColors() to prepare.
(sent, total, state) => void.
encodeBlocks(data, blockSize, redundancy) produces K systematic + parity blocks. Decoder reconstructs from any sufficient subset of received blocks (at least K).

Frame types:

| Type | Byte 0 | Header | Payload | Max Size |
|---|---|---|---|---|
| QR | 0x01 | version/EC (1B) + proto ver (1B) | Bit-packed QR modules | 140B |
| IMG | 0x02 | width (1B) + height (1B) + proto ver (1B) | Bit-packed pixels | 140B |
| TXT | 0x03 | proto ver (1B) | UTF-8 text (up to 138B) | 140B |
| CHUNK | 0x04 | index + total + flags + dimensions | Compressed pixel data | 140B/frame |
| GAME | 0x05 | subtype (1B) + session fields | Setup/shot/result/win data | 7B |
| TRANSFER | 0x06 | subtype (1B) + session ID + fields | Fountain-coded blocks | 140B/frame |
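
As a rough illustration of the TXT row above, the sketch below packs a string into a frame: type byte 0x03, one protocol-version byte, then up to 138 bytes of UTF-8. The names `encodeTextFrame` and `decodeTextFrame` and the version value 1 are assumptions made for this example; they are not taken from the wavepx API.

```javascript
// Hypothetical helpers, not part of wavepx: they follow the TXT layout
// from the frame table (type byte, protocol version, UTF-8 payload).
const FRAME_TXT = 0x03;
const MAX_FRAME = 140;
const PROTO_VERSION = 1; // assumed value; the real version byte is not documented here

function encodeTextFrame(text) {
  const payload = new TextEncoder().encode(text);
  const maxPayload = MAX_FRAME - 2; // 138 bytes remain after the two header bytes
  if (payload.length > maxPayload) {
    throw new RangeError(`text is ${payload.length} bytes, limit is ${maxPayload}`);
  }
  const frame = new Uint8Array(2 + payload.length);
  frame[0] = FRAME_TXT;
  frame[1] = PROTO_VERSION;
  frame.set(payload, 2);
  return frame;
}

function decodeTextFrame(frame) {
  if (frame[0] !== FRAME_TXT) throw new TypeError('not a TXT frame');
  return {
    version: frame[1],
    text: new TextDecoder().decode(frame.subarray(2)),
  };
}
```

The limit is checked in bytes rather than characters, since multi-byte UTF-8 characters can push a short-looking string past 138 bytes.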

TRANSFER frame subtypes:

| Subtype | Code | Direction | Purpose |
|---|---|---|---|
| SYN | 0x01 | Sender → Receiver | File metadata, block params, CRC-32, filename |
| SYN_ACK | 0x02 | Receiver → Sender | Acknowledge, ready to receive |
| DATA | 0x03 | Sender → Receiver | Fountain-coded block (index, degree, source indices, payload) |
| DONE | 0x04 | Receiver → Sender | CRC verified: success or mismatch |
| ABORT | 0x05 | Either | Cancel: user, timeout, or error |
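
The DONE step depends on both sides computing the same checksum. The sketch below assumes "CRC-32" means the common IEEE 802.3 variant (the one used by zlib and PNG); the source does not say which polynomial wavepx uses, and `verifyTransfer` is a hypothetical name.

```javascript
// Lookup table for the reflected IEEE 802.3 polynomial 0xEDB88320.
// Assumption: this is the CRC-32 variant in use; the document does not specify.
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
  }
  CRC_TABLE[n] = c >>> 0;
}

function crc32(bytes) {
  let crc = 0xFFFFFFFF;
  for (const b of bytes) {
    crc = CRC_TABLE[(crc ^ b) & 0xFF] ^ (crc >>> 8);
  }
  return (crc ^ 0xFFFFFFFF) >>> 0; // force an unsigned 32-bit result
}

// Receiver side: compare the reassembled file against the CRC announced in SYN.
function verifyTransfer(receivedBytes, expectedCrc) {
  return crc32(receivedBytes) === (expectedCrc >>> 0);
}
```

A mismatch here is what the DONE frame would report back to the sender.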

CHUNK frame layout:

| Byte | First Chunk (idx=0) | Subsequent Chunks |
|---|---|---|
| 0 | 0x04 (type) | 0x04 (type) |
| 1 | Chunk index | Chunk index |
| 2 | Total chunks | Total chunks |
| 3 | Flags [compress:2, depth:2, reserved:4] | Flags [compress:2, depth:2, reserved:4] |
| 4-5 | Width (big-endian) | Byte 4: protocol version; payload starts at byte 5 |
| 6-7 | Height (big-endian) | Payload (continued) |
| 8 | Protocol version | Payload (continued) |
| 9-139 | Payload (131B max) | Payload (continued; 135B max in total, bytes 5-139) |
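
To make the layout concrete, here is a minimal sketch that splits an already-compressed payload into CHUNK frames. The function names are hypothetical, the protocol version value is a placeholder, and the bit order inside the flags byte (compress in the top two bits, depth in the next two) is an assumption read off the `[compress:2, depth:2, reserved:4]` notation.

```javascript
// Hypothetical sketch of the CHUNK layout; not wavepx's actual encoder.
const FRAME_CHUNK = 0x04;
const FIRST_PAYLOAD = 131; // bytes 9-139 of the first chunk
const NEXT_PAYLOAD = 135;  // bytes 5-139 of every later chunk

function chunkCount(dataLength) {
  if (dataLength <= FIRST_PAYLOAD) return 1;
  return 1 + Math.ceil((dataLength - FIRST_PAYLOAD) / NEXT_PAYLOAD);
}

function packFlags(compress, depth) {
  // Assumed MSB-first: compress (2 bits), depth (2 bits), 4 reserved bits left at 0.
  return ((compress & 0b11) << 6) | ((depth & 0b11) << 4);
}

function buildChunks(data, width, height, compress, depth, protoVersion = 1) {
  const total = chunkCount(data.length);
  if (total > 255) throw new RangeError('too many chunks for a 1-byte count');
  const flags = packFlags(compress, depth);
  const frames = [];
  let offset = 0;
  for (let i = 0; i < total; i++) {
    // Only the first chunk carries the big-endian width and height.
    const header = i === 0
      ? [FRAME_CHUNK, i, total, flags,
         width >> 8, width & 0xFF, height >> 8, height & 0xFF, protoVersion]
      : [FRAME_CHUNK, i, total, flags, protoVersion];
    const payload = data.subarray(offset, offset + (140 - header.length));
    offset += payload.length;
    const frame = new Uint8Array(header.length + payload.length);
    frame.set(header, 0);
    frame.set(payload, header.length);
    frames.push(frame);
  }
  return frames;
}
```

With these sizes, a 300-byte payload needs three frames: 131 bytes in the first, 135 in the second, and the remaining 34 in the third.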

Compression strategy (the 2-bit compress field in the flags byte):

| Value | Strategy | Best For |
|---|---|---|
| 00 | Raw | Random patterns, high-entropy images |
| 01 | RLE (1-bit) | Solid regions, line art, sparse graphics |
| 10 | RLE (gray) | Grayscale with large uniform areas |
| 11 | Palette | Color images with ≤16 colors |
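
The trade-off between the Raw and RLE (1-bit) rows can be sketched as follows. The run-length format used here (alternating run lengths starting with `false`, one byte per run, zero-length runs to split long ones) is an assumption for illustration; wavepx's actual wire format is not described in this document.

```javascript
// Illustrative 1-bit RLE: alternating run lengths, starting with a run of `false`.
// A run longer than 255 is split by inserting a zero-length run of the other value.
function rleEncode1Bit(pixels) {
  const out = [];
  let current = false;
  let run = 0;
  for (const p of pixels) {
    if (p === current) {
      if (run === 255) { out.push(255, 0); run = 0; }
      run++;
    } else {
      out.push(run);
      current = p;
      run = 1;
    }
  }
  out.push(run);
  return Uint8Array.from(out);
}

function rleDecode1Bit(runs, pixelCount) {
  const pixels = [];
  let current = false;
  for (const run of runs) {
    for (let i = 0; i < run; i++) pixels.push(current);
    current = !current;
  }
  if (pixels.length !== pixelCount) throw new RangeError('runs do not match image size');
  return pixels;
}

// Raw encoding: 8 pixels per byte, most significant bit first (assumed order).
function packBits(pixels) {
  const out = new Uint8Array(Math.ceil(pixels.length / 8));
  pixels.forEach((p, i) => { if (p) out[i >> 3] |= 0x80 >> (i & 7); });
  return out;
}

// Pick whichever encoding is smaller and report the matching compress value.
function chooseEncoding(pixels) {
  const raw = packBits(pixels);
  const rle = rleEncode1Bit(pixels);
  return rle.length < raw.length
    ? { compress: 0b01, data: rle }
    : { compress: 0b00, data: raw };
}
```

A blank 16x16 image collapses to a handful of run bytes, while a checkerboard produces one run per pixel and is better sent raw, which matches the "Best For" column.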

GAME frame subtypes:

| Subtype | Code | Size | Fields |
|---|---|---|---|
| SETUP | 0x01 | 7B | hash (4 bytes, FNV-1a of ship placement) |
| SHOT | 0x02 | 5B | x, y coordinates |
| RESULT | 0x03 | 6B | x, y, result (0=miss, 1=hit, 2=sunk) |
| WIN | 0x04 | 3B | (none) |
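
The SETUP hash lets a player commit to a ship layout without revealing it. Below is a sketch of 32-bit FNV-1a and a 7-byte SETUP frame. How wavepx serializes the ship placement before hashing is not documented here, so `placementBytes` is a stand-in. The single session byte is inferred from the 3-byte WIN frame, and the big-endian hash order is an assumption.

```javascript
// 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime.
function fnv1a32(bytes) {
  let h = 0x811C9DC5;              // FNV offset basis
  for (const b of bytes) {
    h ^= b;
    h = Math.imul(h, 0x01000193);  // FNV prime, 32-bit wrapping multiply
  }
  return h >>> 0;
}

// Hypothetical frame builder: type 0x05, subtype 0x01, one session byte (inferred),
// then the 4-byte hash in big-endian order (assumed).
function buildSetupFrame(sessionId, placementBytes) {
  const hash = fnv1a32(placementBytes);
  return Uint8Array.of(
    0x05, 0x01, sessionId & 0xFF,
    hash >>> 24, (hash >>> 16) & 0xFF, (hash >>> 8) & 0xFF, hash & 0xFF
  );
}
```

FNV-1a is fast but not cryptographic, so a commitment like this deters casual cheating rather than a determined attacker.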
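
Powered by ggwave. Frequency-shift keying (FSK) maps data to audio frequencies: ggwave splits each byte into 4-bit chunks and sends every chunk as a tone at a specific frequency, with several tones sounding at once. The receiver uses an FFT (Fast Fourier Transform) to decompose the audio back into frequency components and read the data. Like a musical barcode.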
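
The idea can be tried without any audio hardware. The sketch below encodes one 4-bit value as a sine tone and recovers it by measuring the energy at each candidate frequency. It is not ggwave's implementation: the base frequency and tone spacing are illustrative values, and the Goertzel algorithm (which evaluates a single frequency bin) stands in for a full FFT.

```javascript
const SAMPLE_RATE = 48000;        // the 48 kHz rate quoted for wavepx
const SAMPLES_PER_SYMBOL = 1024;
const BASE_HZ = 1875;             // illustrative values, chosen so each tone
const STEP_HZ = 46.875;           // lands exactly on a bin (48000 / 1024)

// Modulate: one 4-bit value becomes one pure tone.
function toneForNibble(nibble) {
  const freq = BASE_HZ + nibble * STEP_HZ;
  const samples = new Float32Array(SAMPLES_PER_SYMBOL);
  for (let i = 0; i < samples.length; i++) {
    samples[i] = Math.sin(2 * Math.PI * freq * i / SAMPLE_RATE);
  }
  return samples;
}

// Goertzel algorithm: signal power at a single frequency.
function binPower(samples, freq) {
  const coeff = 2 * Math.cos(2 * Math.PI * freq / SAMPLE_RATE);
  let s1 = 0;
  let s2 = 0;
  for (const x of samples) {
    const s0 = x + coeff * s1 - s2;
    s2 = s1;
    s1 = s0;
  }
  return s1 * s1 + s2 * s2 - coeff * s1 * s2;
}

// Demodulate: the candidate frequency with the most energy wins.
function nibbleFromTone(samples) {
  let best = 0;
  let bestPower = -Infinity;
  for (let n = 0; n < 16; n++) {
    const p = binPower(samples, BASE_HZ + n * STEP_HZ);
    if (p > bestPower) { bestPower = p; best = n; }
  }
  return best;
}
```

Calling `nibbleFromTone(toneForNibble(9))` returns 9. A real channel adds noise, echo, and clock drift, which is why error correction sits on top of the raw tones.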
Fountain codes are rateless erasure codes that let the receiver reconstruct data from any sufficient subset of encoded blocks. Here, each parity block is the XOR of 2–3 source blocks, so a lost block can be recovered from a parity block whose other source blocks arrived, without retransmission.
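
A small sketch of that recovery process follows. It is not wavepx's `encodeBlocks`: the function names are hypothetical, and the fixed pattern for choosing which source blocks go into each parity block is an assumption made to keep the example deterministic (a real encoder would more likely pick them pseudo-randomly).

```javascript
function xorInto(target, source) {
  for (let i = 0; i < target.length; i++) target[i] ^= source[i];
}

// Parity block j XORs 2 or 3 consecutive source blocks starting at index j.
function makeParity(sources, parityCount) {
  const k = sources.length;
  const parity = [];
  for (let j = 0; j < parityCount; j++) {
    const degree = Math.min(k, j % 2 === 0 ? 2 : 3);
    const indices = [];
    for (let d = 0; d < degree; d++) indices.push((j + d) % k);
    const payload = new Uint8Array(sources[0].length);
    for (const idx of indices) xorInto(payload, sources[idx]);
    parity.push({ indices, payload });
  }
  return parity;
}

// Peeling decoder: any parity block with exactly one missing source block
// reveals that block; repeat until nothing more can be recovered.
// `received` is a Map from source index to block.
function peelDecode(k, received, parity) {
  const known = new Map(received);
  let progress = true;
  while (known.size < k && progress) {
    progress = false;
    for (const p of parity) {
      const missing = p.indices.filter((i) => !known.has(i));
      if (missing.length !== 1) continue;
      const block = Uint8Array.from(p.payload);
      for (const i of p.indices) {
        if (known.has(i)) xorInto(block, known.get(i));
      }
      known.set(missing[0], block);
      progress = true;
    }
  }
  if (known.size < k) return null; // not enough information yet
  return Array.from({ length: k }, (_, i) => known.get(i));
}
```

With XOR parity of this kind, recovery depends on which blocks were lost as well as how many, so the decoder returns `null` when the surviving blocks are not sufficient.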
Floyd-Steinberg dithering distributes quantization error to neighboring pixels, creating the illusion of more gray levels than actually exist. Your eye averages nearby pixels, perceiving smooth gradients from binary dots.
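
The error-diffusion step can be written in a few lines. This sketch uses the standard Floyd-Steinberg weights (7/16 right, 3/16 below-left, 5/16 below, 1/16 below-right) to reduce 8-bit grayscale to 1 bit. The function name and the output convention (1 = white, 0 = black) are choices made for this example, not wavepx's.

```javascript
// gray: array of 0-255 values in row-major order; returns one 0/1 value per pixel.
function floydSteinberg(gray, width, height) {
  const buf = Float32Array.from(gray); // working copy that accumulates diffused error
  const out = new Uint8Array(width * height);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = y * width + x;
      const value = buf[i] < 128 ? 0 : 255; // nearest available level
      const err = buf[i] - value;           // what quantization threw away
      out[i] = value === 255 ? 1 : 0;
      // Push the error onto neighbors that have not been visited yet.
      if (x + 1 < width) buf[i + 1] += err * 7 / 16;
      if (y + 1 < height) {
        if (x > 0) buf[i + width - 1] += err * 3 / 16;
        buf[i + width] += err * 5 / 16;
        if (x + 1 < width) buf[i + width + 1] += err * 1 / 16;
      }
    }
  }
  return out;
}
```

A flat mid-gray input comes out as a roughly even mix of black and white pixels, which is the averaging effect described above.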
Color quantization finds the N most representative colors by recursively splitting the color space along its widest axis. Each pixel is then mapped to the nearest palette entry, with row-run compression for efficient encoding.
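
The description matches the classic median-cut approach, so the sketch below assumes that is the technique meant; wavepx's own `quantizeColors()` may differ in detail. Both function names here are hypothetical, and the row-run compression step is left out.

```javascript
// colors: array of [r, g, b] triples; returns at most n palette entries.
function medianCut(colors, n) {
  if (colors.length === 0) throw new RangeError('no colors to quantize');
  const boxes = [colors.slice()];
  while (boxes.length < n) {
    // Find the box and channel with the widest value range.
    let best = -1;
    let bestRange = 0;
    let bestChannel = 0;
    boxes.forEach((box, bi) => {
      for (let c = 0; c < 3; c++) {
        let lo = 255;
        let hi = 0;
        for (const px of box) {
          lo = Math.min(lo, px[c]);
          hi = Math.max(hi, px[c]);
        }
        if (hi - lo > bestRange) { bestRange = hi - lo; best = bi; bestChannel = c; }
      }
    });
    if (best === -1) break; // every box already holds a single color
    // Split that box at the median along its widest channel.
    const box = boxes[best].sort((a, b) => a[bestChannel] - b[bestChannel]);
    const mid = box.length >> 1;
    boxes.splice(best, 1, box.slice(0, mid), box.slice(mid));
  }
  // Each palette entry is the average color of its box.
  return boxes.map((box) => {
    const sum = [0, 0, 0];
    for (const px of box) for (let c = 0; c < 3; c++) sum[c] += px[c];
    return sum.map((s) => Math.round(s / box.length));
  });
}

// Map a pixel to the closest palette entry by squared RGB distance.
function nearestIndex(palette, px) {
  let best = 0;
  let bestDist = Infinity;
  palette.forEach((p, i) => {
    const d = (p[0] - px[0]) ** 2 + (p[1] - px[1]) ** 2 + (p[2] - px[2]) ** 2;
    if (d < bestDist) { bestDist = d; best = i; }
  });
  return best;
}
```

With a 16-entry palette each pixel index fits in 4 bits, which fits the 16-color limit of the Palette strategy.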