Mediasoup: Building Scalable WebRTC Video Conferencing Applications

January 12, 2026
14 min read

What Is Mediasoup?

Mediasoup is an open-source SFU (Selective Forwarding Unit) implementation written in C++ with Node.js bindings. It’s production-ready and used by companies building real-time communication platforms.

Unlike a full conferencing solution (like Jitsi or Kurento), mediasoup is a lower-level building block. You use it as the media engine, then build the rest (signaling, UI, authentication) yourself.

Your Application (signaling, UI, auth)
              │
    mediasoup (media routing)
              │
       WebRTC ↔ Clients

This gives you complete control and flexibility.


Part 1: Understanding SFU Through Mediasoup

What Mediasoup Does

Mediasoup receives audio and video streams from endpoints and selectively forwards them to other endpoints.

Client A sends stream → mediasoup → forwards to B, C, D
Client B sends stream → mediasoup → forwards to A, C, D
Client C sends stream → mediasoup → forwards to A, B, D
Client D sends stream → mediasoup → forwards to A, B, C

Each client:

  • Sends one stream (audio + video)
  • Receives multiple streams (from all other clients)
  • Can choose which streams to receive (selective)
  • Can choose spatial and temporal layers (quality/bandwidth control)

This is the SFU model: receive multiple streams, process them client-side, display as you wish.

Why Mediasoup Over MCU?

MCU (Multipoint Control Unit) would:

  • Mix all streams into one
  • Cause high latency (processing)
  • Require powerful servers

Mediasoup just routes. Minimal processing. Low latency.

MCU: Client A + B + C → Mix → Compose → Single output → All clients

SFU: Client A → Forward → All clients (separately)
     Client B → Forward → All clients (separately)
     Client C → Forward → All clients (separately)

Part 2: The Architecture — Workers and Routers

Worker Model

Mediasoup uses a worker pool model:

┌──────────────────────────────────────────────┐
│    Node.js Application (single-threaded)     │
└───────┬──────────┬───────────┬──────────┬────┘
        │          │           │          │
        ▼          ▼           ▼          ▼
   [Worker 0] [Worker 1]  [Worker 2] [Worker 3]
  (CPU core 0)(CPU core 1)(CPU core 2)(CPU core 3)
        │          │           │          │
        └──────────┴───────────┴──────────┘
             (each worker hosts routers)

Key concept: Each worker consumes one CPU core.

If your server has 8 cores, keep 8 workers ready. Each handles media routing independently.

import * as mediasoup from 'mediasoup';

const workers: mediasoup.types.Worker[] = [];

async function createWorkers(numWorkers: number) {
  for (let i = 0; i < numWorkers; i++) {
    const worker = await mediasoup.createWorker({
      logLevel: 'warn',
      logTags: ['rtp', 'rtcp'],
      rtcMinPort: 40000,
      rtcMaxPort: 40100,
    });

    worker.on('died', () => {
      console.error(`Worker ${i} died, restart it`);
      // Restart logic here
    });

    workers.push(worker);
  }
  return workers;
}

Router: The Media Hub

When a meeting starts, you create a router on a worker:

const router = await selectedWorker.createRouter({
  mediaCodecs: [
    {
      kind: 'audio',
      mimeType: 'audio/opus',
      clockRate: 48000,
      channels: 2, // Opus is negotiated as 2 channels
    },
    {
      kind: 'video',
      mimeType: 'video/VP8',
      clockRate: 90000,
    },
  ],
});

The router is responsible for:

  • Receiving streams from producers
  • Forwarding to consumers
  • Managing transports
  • Handling media codec negotiation

One router per meeting. All participants in a meeting use the same router.

Note

Important: A router is tied to a worker, which is tied to a CPU core. If you have 1000 concurrent meetings and 8 workers, you’ll distribute ~125 meetings per worker. This is why worker pool design is critical for scaling.
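
One simple way to handle this is to assign each new meeting to the worker currently hosting the fewest routers. A minimal sketch of that idea; the rooms and routerCountByWorker maps are application-level bookkeeping (not mediasoup APIs), and workers is the array created above:

// Illustrative bookkeeping: meetingId → router, plus a per-worker router count
const rooms = new Map<string, mediasoup.types.Router>();
const routerCountByWorker = new Map<mediasoup.types.Worker, number>();

async function getOrCreateRouter(
  meetingId: string,
  mediaCodecs: mediasoup.types.RtpCodecCapability[],
) {
  const existing = rooms.get(meetingId);
  if (existing) return existing;

  // Pick the worker with the fewest routers (rough least-loaded strategy)
  let selectedWorker = workers[0];
  for (const worker of workers) {
    const count = routerCountByWorker.get(worker) ?? 0;
    if (count < (routerCountByWorker.get(selectedWorker) ?? 0)) {
      selectedWorker = worker;
    }
  }

  const router = await selectedWorker.createRouter({ mediaCodecs });
  rooms.set(meetingId, router);
  routerCountByWorker.set(selectedWorker, (routerCountByWorker.get(selectedWorker) ?? 0) + 1);
  return router;
}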


Part 3: Peers, Transports, Producers, and Consumers

What Is a “Peer”?

Tricky part: Mediasoup doesn’t have a native “peer” concept. A peer is an application-level abstraction.

Correct way to think about it:

interface Peer {
  id: string;                                        // Unique peer ID
  userId: string;                                    // Associated user account
  websocket: WebSocket;                              // WebSocket connection (for signaling)
  sendTransport?: mediasoup.types.WebRtcTransport;   // For sending media
  recvTransport?: mediasoup.types.WebRtcTransport;   // For receiving media
  producers: Map<string, mediasoup.types.Producer>;  // Audio, video producers
  consumers: Map<string, mediasoup.types.Consumer>;  // Consuming from others
  dataProducer?: mediasoup.types.DataProducer;       // DataChannel producer
  dataConsumers: Map<string, mediasoup.types.DataConsumer>;
}

A peer is metadata that ties together:

  • User identity
  • WebSocket connection (signaling)
  • Mediasoup transports, producers, consumers

const peers = new Map<string, Peer>();

// When user connects via WebSocket
websocket.on('message', async (msg) => {
  const { peerId, action, data } = JSON.parse(msg);

  let peer = peers.get(peerId);
  if (!peer) {
    peer = {
      id: peerId,
      userId: data.userId,
      websocket,
      producers: new Map(),
      consumers: new Map(),
      dataConsumers: new Map(),
    };
    peers.set(peerId, peer);
  }

  // Handle actions (join, produce, consume, etc.)
});

Part 4: The Critical Transport Concept

Why Two Separate Transports?

Tricky part: Many developers think one transport handles both sending and receiving.

Wrong:

const transport = await router.createWebRtcTransport();
// Try to both produce and consume on the same transport
// This doesn't work as expected!

Correct:

Mediasoup requires separate transports:

  • Send Transport: For producing media (audio/video)
  • Receive Transport: For consuming media

// Create send transport on server
const sendTransport = await router.createWebRtcTransport({
  listenInfos: [{ protocol: 'udp', ip: '0.0.0.0', announcedAddress: 'example.com' }],
  enableSctp: true, // For DataChannel
  numSctpStreams: { OS: 1024, MIS: 1024 },
});

// Create recv transport on server
const recvTransport = await router.createWebRtcTransport({
  listenInfos: [{ protocol: 'udp', ip: '0.0.0.0', announcedAddress: 'example.com' }],
  enableSctp: true,
  numSctpStreams: { OS: 1024, MIS: 1024 },
});

peer.sendTransport = sendTransport;
peer.recvTransport = recvTransport;

Why two transports?

  • Asymmetry in WebRTC: Sending and receiving use different SDP negotiations
  • Independent control: You can pause sending without affecting receiving
  • Better resource management: Clearer separation of concerns

Creating Transports on the Client

The client-side must mirror the server-side transports:

// Client-side TypeScript
import * as mediasoupClient from 'mediasoup-client';

const device = new mediasoupClient.Device();

// After loading device RTP capabilities
const sendTransport = device.createSendTransport({
  id: serverSendTransport.id,
  iceParameters: serverSendTransport.iceParameters,
  iceCandidates: serverSendTransport.iceCandidates,
  dtlsParameters: serverSendTransport.dtlsParameters,
});

const recvTransport = device.createRecvTransport({
  id: serverRecvTransport.id,
  iceParameters: serverRecvTransport.iceParameters,
  iceCandidates: serverRecvTransport.iceCandidates,
  dtlsParameters: serverRecvTransport.dtlsParameters,
});

// Handle connect event (DTLS handshake)
sendTransport.on('connect', async ({ dtlsParameters }, callback, errback) => {
  try {
    // Signal to server to complete DTLS handshake
    await signalServer('connectTransport', {
      transportId: sendTransport.id,
      dtlsParameters,
    });
    callback(); // Confirm to mediasoup-client
  } catch (error) {
    errback(error);
  }
});
Note

Common mistake: Not handling the “connect” event properly. The client won’t complete the DTLS handshake unless you signal the server from the connect event handler.
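
On the server, the matching signal handler simply calls connect() on the corresponding transport. A minimal sketch, assuming the Peer structure from Part 3 and the same signaling plumbing; the handler name and lookup are illustrative, only transport.connect() is a mediasoup API:

// Server-side handler for the 'connectTransport' signal (sketch)
async function onConnectTransport(peer: Peer, { transportId, dtlsParameters }) {
  const transport =
    peer.sendTransport?.id === transportId ? peer.sendTransport : peer.recvTransport;
  if (!transport) throw new Error(`Unknown transport ${transportId}`);

  // Completes the DTLS handshake for this transport
  await transport.connect({ dtlsParameters });
}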


Part 5: Producing Media (Sending Video/Audio)

The Flow: Client → Server

Step 1: Client gets media stream

const stream = await navigator.mediaDevices.getUserMedia({
  audio: true,
  video: { width: { ideal: 1280 }, height: { ideal: 720 } },
});

const audioTrack = stream.getAudioTracks()[0];
const videoTrack = stream.getVideoTracks()[0];

Step 2: Client produces on send transport

// Produce audio
const audioProducer = await sendTransport.produce({
  track: audioTrack,
  encodings: [
    {
      maxBitrate: 128000, // 128 kbps for audio
    },
  ],
  codecOptions: {
    opusDtx: true, // Discontinuous transmission during silence
    opusFec: true, // In-band forward error correction
  },
});

// Produce video
const videoProducer = await sendTransport.produce({
  track: videoTrack,
  encodings: [
    {
      maxBitrate: 5000000, // 5 Mbps for video
      dtx: false,
    },
  ],
  codecOptions: {
    videoGoogleStartBitrate: 1000,
  },
});

// Store for later
peer.producers.set('audio', audioProducer);
peer.producers.set('video', videoProducer);

Step 3: Transport emits “produce” event

The send transport emits this event when produce() is called, so your application can create the matching server-side Producer:

sendTransport.on('produce', async ({ kind, rtpParameters }, callback, errback) => {
  try {
    // Signal server to create server-side Producer
    const producerId = await signalServer('produce', {
      transportId: sendTransport.id,
      kind,
      rtpParameters,
    });
    callback({ id: producerId }); // Confirm to mediasoup-client
  } catch (error) {
    errback(error);
  }
});

Step 4: Server creates Producer

// On server, handle the 'produce' signal
const producer = await peer.sendTransport.produce({
  kind,
  rtpParameters,
});

peer.producers.set(kind, producer);

// Notify all other peers that this peer produced media
broadcast('newProducer', {
  peerId: peer.id,
  producerId: producer.id,
  kind,
});
Note

Timing is important: The “produce” event fires while transport.produce() is in progress, and the returned promise does not resolve until you invoke the callback with the server-side producer id. Signal the server inside the handler; the client waits for your callback.


Part 6: Consuming Media (Receiving Video/Audio)

The Flow: Server → Client

Important: Consuming works in the reverse order: the server initiates, and the client responds.

Step 1: Other peers receive “newProducer” notification

When peer A produces, all other peers are notified about peer A’s producer.

Step 2: Server creates Consumer

Peer B’s server-side code:

async function subscribeToProducer(peerB: Peer, producerId: string) {
  // Check if peer B can consume this producer
  if (!router.canConsume({
    producerId,
    rtpCapabilities: peerB.device.rtpCapabilities,
  })) {
    throw new Error('Cannot consume this producer');
  }

  // Create consumer on peer B's recv transport
  const consumer = await peerB.recvTransport.consume({
    producerId,
    rtpCapabilities: peerB.device.rtpCapabilities,
    paused: true, // Start paused!
  });

  peerB.consumers.set(consumer.id, consumer);

  return {
    id: consumer.id,
    producerId,
    kind: consumer.kind,
    rtpParameters: consumer.rtpParameters,
  };
}

// Signal peer B about the new consumer (using the parameters returned above)
const consumerParams = await subscribeToProducer(peerB, producerId);
await signalClient(peerB, 'newConsumer', consumerParams);

Step 3: Client creates Consumer

recvTransport.on('connect', async ({ dtlsParameters }, callback, errback) => {
  // Same as send transport
  await signalServer('connectTransport', {
    transportId: recvTransport.id,
    dtlsParameters,
  });
  callback();
});

// Handle newConsumer notification
websocket.on('newConsumer', async (msg) => {
  const { id, producerId, kind, rtpParameters } = msg;

  const consumer = await recvTransport.consume({
    id,
    producerId,
    kind,
    rtpParameters,
  });

  // Store the consumer
  peer.consumers.set(id, consumer);

  // Get the track and attach to a video element
  const track = consumer.track;
  const videoElement = document.createElement('video');
  videoElement.srcObject = new MediaStream([track]);
  videoElement.play();
  document.body.appendChild(videoElement);

  // Resume the consumer (server created it paused)
  await signalServer('resumeConsumer', { consumerId: id });
  await consumer.resume();
});

Step 4: Server resumes Consumer

// Handle resume signal
const consumer = peer.consumers.get(consumerId);
await consumer.resume();
Note

Critical detail: Start consumers in the “paused” state and resume them only after the client confirms it has created its side. This prevents video freezes and audio glitches. The docs recommend it, but many developers skip this step.


Part 7: Spatial and Temporal Layers (Quality Control)

What Are Layers?

When a client sends video, it can use simulcast or SVC (Scalable Video Coding):

Simulcast: Send multiple resolutions simultaneously
  - High:   1280x720 @ 30fps
  - Medium: 640x360  @ 30fps
  - Low:    320x180  @ 30fps

SVC: Single stream with nested quality layers
  - Temporal: 30fps, 15fps, 7.5fps
  - Spatial:  1280x720, 640x360, 320x180

The producer sends all layers. The consumer (on server) can select which layers to forward:

// Producer sends with simulcast (encodings ordered from lowest to highest layer)
const videoProducer = await sendTransport.produce({
  track: videoTrack,
  encodings: [
    { maxBitrate: 300000, rid: 'l' },  // Low    (spatial layer 0)
    { maxBitrate: 1000000, rid: 'm' }, // Medium (spatial layer 1)
    { maxBitrate: 5000000, rid: 'h' }, // High   (spatial layer 2)
  ],
  codecOptions: {
    videoGoogleStartBitrate: 1000,
  },
});

// Server-side consumer can select preferred layers
const consumer = await peer.recvTransport.consume({
  producerId,
  rtpCapabilities: peer.device.rtpCapabilities,
});

// Set preferred layers (e.g., medium resolution)
await consumer.setPreferredLayers({
  spatialLayer: 1,  // Medium (0=low, 1=medium, 2=high)
  temporalLayer: 1, // 15fps
});

// Listen for actual layer changes
consumer.on('layerschange', (layers) => {
  console.log('Effective layers:', layers);
  // Notify client about actual layers being sent
});

This is how Zoom handles bandwidth optimization: it downgrades layers for slow clients.
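
You can approximate that behavior by reacting to the server-side consumer's “score” event. A sketch under the assumption that a low score means the downlink is struggling; the thresholds are arbitrary and should be tuned for your deployment:

// Downgrade/upgrade forwarded layers based on the consumer score (sketch)
consumer.on('score', async ({ score }) => {
  if (score < 5) {
    // Receiver is struggling: forward the lowest spatial/temporal layers
    await consumer.setPreferredLayers({ spatialLayer: 0, temporalLayer: 0 });
  } else if (score >= 8) {
    // Healthy downlink: let it climb back to the highest layers
    await consumer.setPreferredLayers({ spatialLayer: 2, temporalLayer: 2 });
  }
});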


Part 8: Data Channels (Non-Media Communication)

Producing Data (Sending)

For things like chat messages, you can use DataChannels:

// Client produces data
const dataProducer = await sendTransport.produceData({
  ordered: true, // Reliable, ordered delivery (don't combine with maxRetransmits)
  label: 'chat',
});

peer.dataProducer = dataProducer;

// Send data through it
dataProducer.send(JSON.stringify({
  type: 'chat',
  message: 'Hello!',
  timestamp: Date.now(),
}));
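
As with media, the send transport emits a “producedata” event that you must answer by creating the server-side DataProducer and returning its id. A minimal sketch, reusing the same signalServer helper as above (the 'produceData' action name is illustrative):

// Client side: answer 'producedata' so mediasoup-client gets a server-side id
sendTransport.on('producedata', async ({ sctpStreamParameters, label, protocol }, callback, errback) => {
  try {
    const dataProducerId = await signalServer('produceData', {
      transportId: sendTransport.id,
      sctpStreamParameters,
      label,
      protocol,
    });
    callback({ id: dataProducerId });
  } catch (error) {
    errback(error);
  }
});

// Server side: handle the 'produceData' signal
const serverDataProducer = await peer.sendTransport.produceData({
  sctpStreamParameters,
  label,
  protocol,
});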

Consuming Data (Receiving)

// Server creates the data consumer and signals its parameters to the client
const serverDataConsumer = await peer.recvTransport.consumeData({
  dataProducerId,
});
peer.dataConsumers.set(serverDataConsumer.id, serverDataConsumer);

// Client side: create the matching DataConsumer from the signaled parameters
const dataConsumer = await recvTransport.consumeData({
  id: serverDataConsumer.id,
  dataProducerId,
  sctpStreamParameters: serverDataConsumer.sctpStreamParameters,
  label: serverDataConsumer.label,
  protocol: serverDataConsumer.protocol,
});

// Handle incoming data on the client
dataConsumer.on('message', (msg) => {
  const data = JSON.parse(msg);
  console.log('Received:', data);
});

dataConsumer.on('open', () => {
  console.log('DataChannel opened');
});
Note

Use cases for DataChannels:

  • Chat messages
  • Typing indicators
  • Pointer/cursor sharing
  • Game state sync
  • Metadata exchange

Anything that’s not audio/video but needs low-latency delivery.


Part 9: The Tricky Part — Signaling Synchronization

Common Mistake: Forgetting to Signal Both Sides

Many developers create mediasoup objects but forget to tell the other side:

Wrong:

// Server creates consumer
const consumer = await transport.consume({
  producerId,
  rtpCapabilities,
});

// Oops, forgot to signal the client!
// Client has no idea consumer was created
// Video won't appear

Correct:

// Server creates consumer
const consumer = await transport.consume({
  producerId,
  rtpCapabilities,
  paused: true,
});

// Signal client about new consumer
await sendSignal(peerId, 'newConsumer', {
  id: consumer.id,
  producerId,
  kind: consumer.kind,
  rtpParameters: consumer.rtpParameters,
});

// Client receives signal and creates matching consumer
// Client calls resume when ready

Another Mistake: Not Handling Close Events

Wrong:

// Server closes a producer
await producer.close();
// Client doesn't know producer closed
// Client's consumer becomes stale

Correct:

// Listen for close events and notify clients
producer.on('transportclose', () => {
  // Transport closed, all producers/consumers on it are closed
  broadcast('producerClosed', { producerId: producer.id });
});

// Client receives close signal
websocket.on('producerClosed', async ({ producerId }) => {
  const consumer = findConsumerByProducerId(producerId);
  await consumer.close(); // Clean up client-side
});

The core principle: Mediasoup doesn’t auto-sync between client and server. You must signal every action.
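
Since every helper above (signalServer, sendSignal, broadcast) is application code, a common pattern is a small promise-based request/response layer over the WebSocket. A minimal sketch, assuming a JSON message format with a requestId field; all names here are illustrative, none of this is a mediasoup API:

// Minimal request/response signaling over a browser WebSocket (sketch)
let nextRequestId = 0;
const pending = new Map<number, (response: any) => void>();

function signalServer(action: string, data: unknown): Promise<any> {
  return new Promise((resolve) => {
    const requestId = nextRequestId++;
    pending.set(requestId, resolve);
    websocket.send(JSON.stringify({ requestId, action, data }));
  });
}

// Resolve the matching promise when the server answers
websocket.addEventListener('message', (event) => {
  const msg = JSON.parse(event.data);
  const resolve = pending.get(msg.requestId);
  if (resolve) {
    pending.delete(msg.requestId);
    resolve(msg.data);
  }
});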


Part 10: Troubleshooting Common Issues

Issue 1: “Producer is closed” Error

Cause: Producer closed on server, but client tried to produce again.

Solution: Listen to producer events and handle closure:

// Client-side producer: the transport it lives on was closed
audioProducer.on('transportclose', () => {
  console.log('Transport closed, producer is dead');
  setProducerActive(false);
});

// Server-side producer emits a quality score per RTP stream (0-10)
serverAudioProducer.on('score', (scores) => {
  // Monitor producer quality
  if (scores.some((s) => s.score < 5)) {
    console.warn('Poor producer quality');
  }
});

Issue 2: “No suitable codec” Error

Cause: Client’s RTP capabilities don’t match router’s codecs.

Solution: Ensure device is loaded properly:

const device = new mediasoupClient.Device();

// Load device with the router's RTP capabilities
await device.load({
  routerRtpCapabilities: routerCapabilities,
});

// Now device knows what codecs are supported
const sendTransport = device.createSendTransport({
  id: serverTransport.id,
  // ...
});
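
The routerCapabilities object has to come from the server, where it is exposed as router.rtpCapabilities. A typical exchange, reusing the signalServer helper (the 'getRouterRtpCapabilities' action name is just an example):

// Server side: expose the router's RTP capabilities through signaling
// (handler shape is illustrative)
function onGetRouterRtpCapabilities() {
  return router.rtpCapabilities; // Send this back to the requesting client
}

// Client side: fetch them before calling device.load()
const routerCapabilities = await signalServer('getRouterRtpCapabilities', {});
await device.load({ routerRtpCapabilities: routerCapabilities });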

Issue 3: Video Freezes or Choppy Audio

Cause: Consumer’s preferred layers don’t match producer’s capabilities.

Solution: Check consumer state:

// Monitor consumer stats (mediasoup has no 'statschange' event; poll getStats() instead)
setInterval(async () => {
  const statsArray = await consumer.getStats();
  for (const stats of statsArray) {
    // Check jitter, packet loss, bitrate
    if (typeof stats.jitter === 'number' && stats.jitter > 100) {
      console.warn('High jitter');
    }
  }
}, 2000);

// Check if consumer is paused
if (consumer.paused) {
  console.log('Consumer is paused');
  // Might need to resume
}

Part 11: The Connection Flow (End-to-End)

Let’s trace a complete flow:

// 1. User A joins meeting
websocket.emit('joinMeeting', { meetingId, userId: 'A' });

// 2. Server creates router and transports for A
const router = await worker.createRouter({ mediaCodecs });
const sendTransportA = await router.createWebRtcTransport({ /* listenInfos, etc. */ });
const recvTransportA = await router.createWebRtcTransport({ /* listenInfos, etc. */ });

// 3. Server sends transport parameters to client A
websocket.emit('joinedMeeting', {
  sendTransport: {
    id: sendTransportA.id,
    iceParameters: sendTransportA.iceParameters,
    // ...
  },
  recvTransport: {
    // ...
  },
});

// 4. Client A creates local transports
const sendTransport = device.createSendTransport(/* server data */);
const recvTransport = device.createRecvTransport(/* server data */);

// 5. Client A connects sendTransport (DTLS handshake)
sendTransport.on('connect', async ({ dtlsParameters }, callback) => {
  await signalServer('connectTransport', {
    transportId: sendTransport.id,
    dtlsParameters,
  });
  callback();
});

// 6. User B joins meeting
websocket.emit('joinMeeting', { meetingId, userId: 'B' });

// 7. Server creates transports for B
const sendTransportB = await router.createWebRtcTransport({ /* listenInfos, etc. */ });
const recvTransportB = await router.createWebRtcTransport({ /* listenInfos, etc. */ });

// 8. Server notifies A about new peer B
websocket.emit('newPeer', { peerId: 'B' });

// 9. Client A registers the "produce" handler (must happen before producing)
sendTransport.on('produce', async ({ kind, rtpParameters }, callback) => {
  const producerId = await signalServer('produce', {
    transportId: sendTransport.id,
    kind,
    rtpParameters,
  });
  callback({ id: producerId });
});

// 10. User A produces video (this triggers the "produce" event above)
const videoProducer = await sendTransport.produce({
  track: videoTrack,
  encodings: [{ maxBitrate: 5000000 }],
});

// 11. Server creates producer on A's sendTransport
const producerA = await sendTransportA.produce({
  kind: 'video',
  rtpParameters,
});

// 12. Server notifies all peers (including B) about A's producer
broadcast('newProducer', {
  peerId: 'A',
  producerId: producerA.id,
  kind: 'video',
});

// 13. Server creates consumer on B's recvTransport
const consumerA_onB = await recvTransportB.consume({
  producerId: producerA.id,
  rtpCapabilities: rtpCapabilitiesB,
  paused: true,
});

// 14. Server signals B about A's video
websocket.emit('newConsumer', {
  id: consumerA_onB.id,
  producerId: producerA.id,
  kind: 'video',
  rtpParameters: consumerA_onB.rtpParameters,
});

// 15. Client B creates consumer on its recvTransport
const consumer = await recvTransport.consume({
  id: consumerA_onB.id,
  producerId: producerA.id,
  kind: 'video',
  rtpParameters,
});

// 16. B's track is ready, attach to video element
const videoElement = document.createElement('video');
videoElement.srcObject = new MediaStream([consumer.track]);
videoElement.play();

// 17. Client B asks the server to resume, then resumes locally
websocket.emit('resumeConsumer', { consumerId: consumerA_onB.id });
await consumer.resume();

// 18. Server resumes its consumer
const serverConsumer = peerB.consumers.get(consumerId);
await serverConsumer.resume();

// 19. Video flows: A → mediasoup router → B
// Success!

That’s the complete flow.


Part 12: Production Considerations

Monitoring

Monitor these metrics:

// Worker health ('died' fires if the worker process crashes)
worker.on('died', () => {
  console.error('Worker crashed');
  // Restart logic
});

// Producer quality (server-side; scores is an array, one entry per RTP stream)
producer.on('score', (scores) => {
  metrics.recordProducerScore(producer.id, scores);
});

// Consumer stats
setInterval(async () => {
  const stats = await consumer.getStats();
  metrics.recordConsumerStats(consumer.id, stats);
}, 5000);
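
For actual CPU usage, workers expose getResourceUsage(). A small sketch that polls it and forwards the numbers to the same placeholder metrics object used above (metrics.recordWorkerUsage is illustrative):

// Poll per-worker resource usage (wraps getrusage() of the worker process)
setInterval(async () => {
  for (const worker of workers) {
    const usage = await worker.getResourceUsage();
    // ru_utime / ru_stime are user/system CPU time consumed by the worker
    metrics.recordWorkerUsage(worker.pid, usage.ru_utime, usage.ru_stime);
  }
}, 10000);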

Scaling

For 1000+ concurrent users:

  1. Multiple workers: 1 per CPU core
  2. Load balancing: Distribute users across workers (see the pipeToRouter sketch after this list)
  3. Router pooling: Reuse routers across meetings
  4. Memory management: Close transports/consumers cleanly
  5. Database: Track user → router → worker mappings
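
When a single meeting outgrows one worker, router.pipeToRouter() lets you forward a producer from one router to another so consumers can be created on the second router. A minimal sketch; routerA, routerB, and someTransportOnRouterB are illustrative names for routers on two different workers:

// Pipe a producer from routerA (worker 1) into routerB (worker 2)
const { pipeConsumer, pipeProducer } = await routerA.pipeToRouter({
  producerId: producer.id,
  router: routerB,
});

// Consumers for this producer can now be created on transports owned by routerB
const consumer = await someTransportOnRouterB.consume({
  producerId: producer.id,
  rtpCapabilities,
  paused: true,
});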

Security

  • Authenticate WebSocket connections
  • Validate user permissions before allowing consume (sketch after this list)
  • Rate limit signaling messages
  • Use TLS for all connections
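
A sketch of the permission check: gate the “consume” signal behind your own authorization before touching mediasoup. canUserView and findPeerByProducerId are hypothetical application-level helpers; subscribeToProducer is the function from Part 6:

// Guard the 'consume' signal with application-level authorization (sketch)
async function onConsumeRequest(peer: Peer, producerId: string) {
  const ownerPeer = findPeerByProducerId(producerId); // app-level lookup, not a mediasoup API

  if (!ownerPeer || !(await canUserView(peer.userId, ownerPeer.userId))) {
    throw new Error('Not allowed to consume this producer');
  }

  // Only now do we create the mediasoup consumer
  return subscribeToProducer(peer, producerId);
}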

Conclusion: Mediasoup Is Powerful But Requires Care

Mediasoup is an excellent choice for building video conferencing. It’s:

  • Low-level (you control everything)
  • High-performance (forwarding, not mixing)
  • Scalable (worker model)
  • Flexible (you build the app layer)

But it requires understanding:

  • The SFU model
  • Producer/Consumer flows
  • Transport mechanics
  • Signaling synchronization
  • Event handling

The most common mistakes:

  1. Forgetting to signal both client and server
  2. Not handling close events
  3. Not starting consumers in paused state
  4. Misunderstanding separate send/recv transports
  5. Not monitoring producer/consumer quality

Master these, and you can build production-grade video conferencing systems.


Now go build something amazing.