Ways to health check

Peter_Sigma · June 3, 2020, 5:38pm

Hello community.

I had an issue where a video session (a two-person session) was dropped while using OpenVidu Call on a 2.14.0 instance. The instance itself (an AWS server) became unreachable. Although it came back up after a server reboot, it made me think about a proper way to monitor the health of each of OpenVidu's components and of OpenVidu as a whole.

How does everyone health-check OpenVidu (regardless of CE or Pro)?
Also, are there any recommended thresholds for metrics like CPU utilization?

Peter

imran · June 5, 2020, 1:10pm

It is very important to know which key areas to monitor when the OpenVidu stack is not working as expected. Maybe it can be done via logs somehow.

Most recently, we had an issue where the video stream was showing up very late, and we had to restart the stack to fix it.

A way to monitor this, or a key indicator to watch, would be great.

Thanks

micael.gallego · June 5, 2020, 11:13pm

You should monitor all your systems for CPU usage.

As OpenVidu works in real time, it is important not to saturate the CPU and to keep it below 100%. For example, 90% is a safe threshold.
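As a rough illustration of that threshold, here is a minimal Python sketch of an alert rule. The function name `cpu_saturated` and the "three consecutive samples" rule are assumptions made for the example, not anything OpenVidu provides; the samples themselves would come from whatever monitoring agent you already run.

```python
from typing import Sequence


def cpu_saturated(
    samples: Sequence[float],
    threshold: float = 90.0,    # the 90% figure suggested above
    min_consecutive: int = 3,   # hypothetical: ignore short spikes
) -> bool:
    """Return True when the latest `min_consecutive` CPU samples
    (percentages, oldest first) are all at or above `threshold`."""
    if min_consecutive <= 0:
        raise ValueError("min_consecutive must be positive")
    if len(samples) < min_consecutive:
        # Not enough history yet to call it sustained load.
        return False
    recent = samples[-min_consecutive:]
    return all(sample >= threshold for sample in recent)
```

Requiring several consecutive samples is only one possible design choice; it avoids alerting on a single brief spike, at the cost of reacting a little later.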

We don’t have a proper “health-check” endpoint in the REST API, but you can use the /config endpoint to test whether OpenVidu is working as expected.
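A minimal sketch of such a probe in Python, using only the standard library. It assumes the REST API accepts HTTP Basic auth with the user `OPENVIDUAPP` and your OpenVidu secret, and that a 200 response from /config means the server is up; the function name `openvidu_is_up` and the `opener` parameter are hypothetical, and the path may differ in other releases.

```python
import base64
import urllib.error
import urllib.request


def openvidu_is_up(base_url, secret, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if GET <base_url>/config answers with HTTP 200.

    `opener` is injectable so the check can be exercised without a server.
    """
    # Assumption: Basic auth as OPENVIDUAPP:<secret>.
    token = base64.b64encode(f"OPENVIDUAPP:{secret}".encode()).decode()
    request = urllib.request.Request(
        base_url.rstrip("/") + "/config",
        headers={"Authorization": "Basic " + token},
    )
    try:
        with opener(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, TLS failure, or a non-2xx status.
        return False
```

You could call something like `openvidu_is_up("https://your-openvidu-host", "YOUR_SECRET")` from a cron job or an external monitor. Note that with a self-signed certificate the default TLS verification fails, so this sketch would report the server as down.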

In OpenVidu PRO 2.15 (to be released in a week), the CPU usage of all cluster nodes will be sent to Elasticsearch, so you will be able to use the Elasticsearch alerting system to be notified when CPU usage is too high.

Peter_Sigma · June 7, 2020, 9:43pm

@micael.gallego

Thank you for the advice. I will try the /config approach in conjunction with monitoring CPU utilization.
