Intermittent disconnection between OpenVidu Server and Media Node (Kurento)

william07 · August 18, 2025, 2:44pm

Hello,

We recently faced an intermittent issue in our deployment and I’d like to better understand the potential root causes.

Setup:

1 VM running OpenVidu Server
2 VMs running Media Node Controller
OpenVidu v2 Pro (2.31.0)
Everything had been working normally before the incident

Problem observed:

Suddenly, one Media Node started rejecting every session created on it, while the other node was still fine.
OpenVidu Server wasn’t able to connect to that Media Node, which eventually crashed.
The next day, without any intervention or configuration change, the failing node started working again.

Troubleshooting already done:

Removed the failing node from the Media Node list and created a new VM for workload continuity.
Left the problematic VM untouched to investigate root cause.
Checked ELK logs → nothing alarming.
Checked CPU usage → high during the disconnection, but this seems more like a consequence (loop between OpenVidu Server and the unreachable node) than the root cause.
Verified network connectivity → looks normal.
Verified disk usage → looks fine (the Media Node auto-cleans its log files).

My questions:

What could cause a disconnection between OpenVidu Server and a Media Node, that later resolves without any action?
What could be potential root causes for a Media Node to suddenly start rejecting sessions and then recover on its own?
Are there additional logs or metrics (besides ELK, CPU, disk, network) that you recommend monitoring to detect or explain these transient failures?

Thanks a lot in advance for your help and insights! If you need any details, let me know!

cruizba · August 21, 2025, 1:56pm

The next day, without any intervention or configuration change, the failing node started working again.

What could cause a disconnection between OpenVidu Server and a Media Node, that later resolves without any action?

What could be potential root causes for a Media Node to suddenly start rejecting sessions and then recover on its own?

The fact that the node reconnected the next day makes me think of a temporary network error at the time of the disconnection.

Are there additional logs or metrics (besides ELK, CPU, disk, network) that you recommend monitoring to detect or explain these transient failures?

Yes: the logs from the openvidu-server container on the master node and the logs from the kms container on the Media Node.

What logs do you have from that specific moment? If this is the first time it has happened to you, it is most likely a temporary network issue.
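
If it happens again, a timestamped record of connectivity from the master node to the Media Node would help confirm or rule out the network theory. Below is a minimal sketch of such a probe in Python, not an official OpenVidu tool. The function names (probe, watch) are made up for illustration, and the default ports are an assumption: 8888 is the usual Kurento Media Server port and 3000 the usual Media Node Controller port, so adjust them to whatever your deployment actually exposes.

```python
import socket
import time
from datetime import datetime, timezone


def probe(host, port, timeout=3.0):
    """Attempt one TCP connection and return (ok, detail).

    detail is the connect latency on success, or the error on failure.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed_ms = (time.monotonic() - start) * 1000
            return True, f"{elapsed_ms:.1f} ms"
    except OSError as exc:  # refused, timed out, unreachable, DNS failure...
        return False, f"{type(exc).__name__}: {exc}"


def watch(host, ports=(8888, 3000), interval=10.0, rounds=None):
    """Probe each port every `interval` seconds and print one line per probe.

    The default ports are assumptions (KMS and Media Node Controller).
    Runs forever when rounds is None. Returns the list of failed probes.
    """
    failures = []
    done = 0
    while rounds is None or done < rounds:
        for port in ports:
            ok, detail = probe(host, port)
            stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
            status = "OK" if ok else "FAIL"
            print(f"{stamp} {host}:{port} {status} {detail}", flush=True)
            if not ok:
                failures.append((stamp, port, detail))
        done += 1
        time.sleep(interval)
    return failures
```

Run from the master node with something like watch("MEDIA_NODE_PRIVATE_IP") (placeholder address) and redirect the output to a file. The FAIL lines can then be lined up against the openvidu-server and kms log timestamps. Note that this only checks that a TCP connection can be opened. It does not speak the Kurento WebSocket protocol, so a node that accepts connections but is otherwise unhealthy would still show as OK.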
