I have been successfully running a session of 30 participants (publisher: audio + video, subscriber: audio) on a 16-core CPU, and the maximum CPU utilization observed was around 20%. Based on this data, I downsized the system to an 8-core CPU, but a sudden spike in CPU utilization occurred mid-session and the server crashed. Adding the screenshot below.
2020-03-27T04:07:50+00:00 – New execution
(kurento-media-server:1587): GStreamer-CRITICAL **: Element rtpbin126 already has a pad named send_rtp_sink_0, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
(kurento-media-server:1587): GStreamer-CRITICAL **: Element rtpbin659 already has a pad named send_rtp_sink_0, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
(kurento-media-server:1587): GStreamer-CRITICAL **: Element rtpbin2336 already has a pad named send_rtp_sink_0, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
(kurento-media-server:1587): GStreamer-CRITICAL **: Element rtpbin3589 already has a pad named send_rtp_sink_0, the behaviour of gst_element_get_request_pad() for existing pads is undefined
2020-03-27T05:02:07+00:00 – New execution
2020-03-27T05:04:37+00:00 – New execution
(kurento-media-server:1886): GStreamer-CRITICAL **: Element rtpbin37 already has a pad named send_rtp_sink_0, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
2020-03-27T05:05:19+00:00 – New execution
2020-03-27T05:08:13+00:00 – New execution
(kurento-media-server:4266): GStreamer-CRITICAL **: Element rtpbin5 already has a pad named send_rtp_sink_0, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
(kurento-media-server:4266): GStreamer-CRITICAL **: Element rtpbin524 already has a pad named send_rtp_sink_0, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
My questions are:
What could cause an unexpected increase in CPU utilization?
Is there any useful information in Kibana related to this?
Hi @tibinpaul, do you have total control over the server where OpenVidu is running?
I would like you to edit the file /etc/default/kurento-media-server and do two things:
Uncomment the line with DAEMON_CORE_PATTERN. You can leave it with the default proposed value, or change the directory to some other existing path. For example, you might leave it like this: DAEMON_CORE_PATTERN="/home/ubuntu/core_%e_%p_%u_%t"
Add this new line to the end of the file: export G_DEBUG="fatal-warnings"
With this done, restart Kurento Media Server: sudo service kurento-media-server restart
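For reference, after both changes the relevant lines of /etc/default/kurento-media-server would look roughly like this (the /home/ubuntu directory is just the example path from above and must already exist; the rest of the file stays as it is):

    # Write a core dump file whenever KMS crashes.
    # %e = executable name, %p = PID, %u = UID, %t = timestamp of the crash.
    DAEMON_CORE_PATTERN="/home/ubuntu/core_%e_%p_%u_%t"

    # Make GLib/GStreamer warnings fatal, so the first GStreamer-CRITICAL
    # aborts KMS and triggers the core dump above.
    export G_DEBUG="fatal-warnings"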
Now, KMS will crash immediately after the first GStreamer-CRITICAL error happens. In doing so, it will generate a core dump file in the path that was specified earlier. Please compress that core dump file and send it to us. As soon as you get one, you can disable the immediate crashing behavior by commenting out or deleting the “G_DEBUG” line.
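For example, with the pattern above the core file will appear in /home/ubuntu with a name starting with core_, so something along these lines would compress it for sending (the exact file name will differ on your machine):

    # Compress whatever core file(s) KMS produced; adjust the path if you
    # configured a different directory in DAEMON_CORE_PATTERN.
    tar -czf kms_core_dump.tar.gz /home/ubuntu/core_*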
We need this core dump file in order to investigate the issue caused by all those “GStreamer-CRITICAL” messages, so it will be very helpful if you can provide one. Thanks!
(kurento-media-server:2042): GStreamer-CRITICAL **: Element rtpbin19348 already has a pad named send_rtp_sink_1, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
(kurento-media-server:2042): GStreamer-CRITICAL **: Element rtpbin19406 already has a pad named send_rtp_sink_0, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
(kurento-media-server:2042): GStreamer-CRITICAL **: Element rtpbin19406 already has a pad named send_rtp_sink_1, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
(kurento-media-server:2042): GStreamer-CRITICAL **: Element rtpbin20876 already has a pad named send_rtp_sink_0, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
(kurento-media-server:2042): GStreamer-CRITICAL **: Element rtpbin20876 already has a pad named send_rtp_sink_1, the behaviour of gst_element_get_request_pad() for existing pads is undefined!
No crash, just a spike in CPU usage. Right now I’m using different instances for each school and it works fine. It looks like four c5.4xlarge are better than just one c5.24xlarge (and cheaper).
Several small machines are better than just one big machine, if your use case allows it.
We have detected a performance problem in some operations in Kurento Media Server. We have published a beta version that we hope fixes the problem. Can you test it and provide feedback?
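In case it helps, here is a rough sketch of how the upgrade could be applied on an Ubuntu server, assuming the beta is distributed through an apt repository; the repository line below is only a placeholder, use whatever line the Kurento team indicates for the beta:

    # Placeholder: replace with the exact repository line provided for the beta.
    # echo "deb [arch=amd64] http://ubuntu.openvidu.io/dev bionic kms6" | sudo tee /etc/apt/sources.list.d/kurento.list

    sudo apt-get update
    sudo apt-get install --only-upgrade kurento-media-server
    sudo service kurento-media-server restart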