Real-Time Audio Stream Access for ML Transcription with OpenVidu 2.31.0

Hello everyone,

I hope you’re all doing great!

I’m currently working with OpenVidu 2.31.0 and looking for a way to access participants’ audio streams in real time. My goal is to pass the audio directly to a machine learning model for live transcription and audio analysis.

Has anyone here implemented something similar, or could anyone point me in the right direction for capturing audio streams on the server side (or in the browser, if that’s the only way)?

Any insights or suggestions would be greatly appreciated. Thanks in advance!

Unfortunately, real-time access to participants’ audio is not possible with 2.31.0.

However, you can try running OpenVidu v3 PRO with the v2compatibility module enabled, and use LiveKit Agents to extract the audio in real time.

Your app will still work with the v2 API, and you can use the Agents functionality to access the audio.
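
Once an agent is receiving audio, the frames usually arrive in small chunks (often 10 to 20 ms each), while most transcription models expect longer windows. Below is a minimal sketch, standard library only, of buffering raw PCM into fixed-length windows before handing them to a model. It assumes the agent gives you 16-bit mono PCM bytes at a known sample rate; `PcmWindowBuffer` and `fake_transcribe` are hypothetical names made up for this example and are not part of OpenVidu or LiveKit. How you obtain the bytes from the subscribed track depends on the Agents SDK version, so check its documentation for that part.

```python
from typing import List


class PcmWindowBuffer:
    """Accumulates raw 16-bit mono PCM and emits fixed-length windows.

    Hypothetical helper for illustration; not an OpenVidu or LiveKit API.
    """

    def __init__(self, sample_rate: int = 16000, window_seconds: float = 1.0) -> None:
        # 2 bytes per sample because the audio is assumed to be int16 mono.
        self.bytes_per_window = int(sample_rate * window_seconds) * 2
        self._pending = bytearray()

    def push(self, pcm: bytes) -> List[bytes]:
        """Add one incoming frame and return any complete windows."""
        self._pending.extend(pcm)
        windows: List[bytes] = []
        while len(self._pending) >= self.bytes_per_window:
            windows.append(bytes(self._pending[: self.bytes_per_window]))
            del self._pending[: self.bytes_per_window]
        return windows

    def flush(self) -> bytes:
        """Return leftover audio (shorter than one window) and reset."""
        rest = bytes(self._pending)
        self._pending.clear()
        return rest


def fake_transcribe(window: bytes) -> str:
    """Stand-in for a real ML model call; only reports the window size."""
    return f"<{len(window) // 2} samples>"
```

In the agent's audio loop you would call `buf.push(frame_bytes)` for every frame and pass each returned window to your model, then call `buf.flush()` when the participant leaves so the trailing audio is not lost. Shorter windows lower latency but give the model less context, so the window length is worth tuning for your transcription model.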