Server-side speech detection

Hi everyone,

I know there are built-in client-side events such as RoomEvent.ConnectionStateChanged, RoomEvent.ParticipantConnected, and RoomEvent.ActiveSpeakersChanged for detecting participant activity and dominant speaker changes.

However, since these events are only available on the client side, I’m looking for a server-side solution to capture active speaker changes along with their corresponding timestamps — essentially tracking when each participant starts and stops speaking during a session.

Has anyone implemented or come across an approach to achieve this on the server side?

LiveKit does not expose ActiveSpeakersChanged or IsSpeakingChanged events on the server side. They are only available in the client SDKs.

A simple solution is to propagate the event from client to server: whenever a client receives the IsSpeakingChanged event for its LocalParticipant, have it notify your server, including the timestamp. You can then process and store each event on your server as you wish.
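To make that a bit more concrete, here is a rough sketch of what the server could do with those reports once they arrive. It only covers the bookkeeping: how the client delivers the report (HTTP, WebSocket, etc.) is up to you, and the `SpeakingLog` class, its `record` method, and the millisecond timestamps are names and choices I made up for illustration, not anything from LiveKit.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SpeakingLog:
    """Hypothetical helper: turns client-reported speaking events into intervals."""

    # participant identity -> timestamp (ms) at which the open interval started
    open_since: Dict[str, int] = field(default_factory=dict)
    # participant identity -> closed (start_ms, stop_ms) intervals
    intervals: Dict[str, List[Tuple[int, int]]] = field(default_factory=dict)

    def record(self, identity: str, speaking: bool, timestamp_ms: int) -> None:
        if speaking:
            # Keep the earliest start if a duplicate "started" report arrives.
            self.open_since.setdefault(identity, timestamp_ms)
            return
        start = self.open_since.pop(identity, None)
        if start is not None:
            # Close the interval; a "stopped" report with no open interval is ignored.
            self.intervals.setdefault(identity, []).append((start, timestamp_ms))


# Example: reports as they might be forwarded by two clients.
log = SpeakingLog()
log.record("alice", True, 1000)
log.record("alice", False, 2500)
log.record("bob", True, 2000)
print(log.intervals)   # closed intervals so far
print(log.open_since)  # participants still speaking
```

Keep in mind that the timestamps here come from each client's clock, so you may want to record the server's receive time as well if you need to compare participants precisely.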

Another solution, maybe a little less direct but more elegant, would be to connect a server-side participant using a supported server-side SDK:

Those are the 4 supported server-side SDKs for real time. If you are building your server in any of those 4 languages, you could join a “hidden” participant to each Room, which would receive the speaking events like any other regular participant. This would save you from having your clients send the event information to your server.
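If you go the hidden-participant route, the event you receive is typically the full list of currently active speakers rather than individual start/stop notifications, so you would need to diff consecutive lists yourself. Below is a minimal sketch of that diffing step only. It deliberately leaves out the SDK wiring, and `diff_speakers` plus the event tuple layout are hypothetical names of mine; I'm also assuming you stamp each update with the server's own clock when the callback fires.

```python
from typing import Iterable, List, Set, Tuple

# (identity, "start" or "stop", timestamp_ms) -- illustrative layout, not a LiveKit type
Event = Tuple[str, str, int]


def diff_speakers(
    previous: Set[str], current: Iterable[str], timestamp_ms: int
) -> Tuple[Set[str], List[Event]]:
    """Compare two active-speaker snapshots and emit start/stop events."""
    now = set(current)
    # Anyone newly present started speaking; anyone missing stopped.
    events: List[Event] = [(i, "start", timestamp_ms) for i in sorted(now - previous)]
    events += [(i, "stop", timestamp_ms) for i in sorted(previous - now)]
    return now, events


# Example: three consecutive active-speaker updates for one room.
speakers: Set[str] = set()
timeline: List[Event] = []
for ts, active in [(1000, ["alice"]), (1800, ["bob"]), (2600, [])]:
    speakers, events = diff_speakers(speakers, active, ts)
    timeline.extend(events)

for event in timeline:
    print(event)
```

You would call `diff_speakers` from whatever active-speakers callback your chosen SDK provides, keeping one `speakers` set per Room.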

Cheers!