White screen on recording, China server

Description of the issue

When we start a video recording on a server in China, the first 30 seconds of the recorded video show a white screen. After these first 30 seconds, the recording is correct.
On our other deployments with the same configuration we do not experience this issue; it happens only on a server in China.

If we empty the RAM cache, the first recording we make does not have the 30-second white screen; the recordings after that show the issue again.

The OpenVidu version is 2.20.0; the OpenVidu recording image version is 2.19.0-custom.

Additionally, when we start a video recording on the server in China, we get the following error (not present in our other deployments):

Sep 27 21:30:20 localhost systemd[1]: docker-e08b6cebaa9942903914158f520c8c9d0d9bdbc6a370298629272b0d875a42d3.scope: Consumed 4.082s CPU time.
Sep 27 21:30:20 localhost containerd[1234147]: time="2022-09-27T21:30:20.276672410+08:00" level=info msg="shim disconnected" id=e08b6cebaa9942903914158f520c8c9d0d9bdbc6a370298629272b0d875a42d3
Sep 27 21:30:20 localhost dockerd[594949]: time="2022-09-27T21:30:20.276765144+08:00" level=info msg="ignoring event" container=e08b6cebaa9942903914158f520c8c9d0d9bdbc6a370298629272b0d875a42d3 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Sep 27 21:30:20 localhost containerd[1234147]: time="2022-09-27T21:30:20.277642671+08:00" level=error msg="copy shim log" error="read /proc/self/fd/36: file already closed"

In the container of the video recording we have the following warning (this happens on other servers as well):

Here are the details of the server in use:

This is the RAM usage at the start of the video recording:

free-h

What cloud provider do you use? We have no way to test this issue. Are you the same person who opened this one?: White recording on China Server · Issue #655 · OpenVidu/openvidu · GitHub

The server provider is Tencent. Yes, this is the same issue.

Any update on this? Could the problem be related to a resource consumption problem?

As I was saying, the first recording after we empty the RAM cache works as expected, but the next ones still have this issue.

Did you try with another cloud provider in the China region? This is a specific issue in a specific region with a specific cloud provider, so it is really hard for me to check what may be happening.

Are you using OpenVidu CE or OpenVidu PRO? How much RAM and CPU does the machine have?

We have also tried on another deployment we have in China, which runs on a VM with the ISP Chinanet. There, the white screen lasts around two minutes.

We also checked the link with the custom layout, and there the streaming works as expected; it’s only in the final file that we see the white screen.

We are using OpenVidu CE. The machine has 15 GB of RAM and 8 CPUs.

Whenever we stop a recording, we get the following error in syslog:

localhost containerd[1234147]: time="2022-09-29T18:44:25.932848306+08:00" level=error msg="copy shim log" error="read /proc/self/fd/36: file already closed"

This happens only on the deployments that have the problem.

Can you try to set up this environment variable?

OPENVIDU_RECORDING_COMPOSED_URL=http://localhost:5443/dashboard

Just in case it is a hairpinning issue.

localhost containerd[1234147]: time="2022-09-29T18:44:25.932848306+08:00" level=error msg="copy shim log" error="read /proc/self/fd/36: file already closed"

I don’t think this has anything to do with the problem. It is probably a network problem (DNS firewalls/filtering at Chinese ISPs or something like that). The machine looks sufficient in terms of resources.

The copy shim log / file already closed message seems to be a harmless warning from containerd: #5130 Error in log: copy shim log read /proc/self/fd/15: file already closed. Apparently, it was fixed in containerd >= 1.5.0.

My version from Docker install, right now, is 1.6.8:

$ containerd --version
containerd containerd.io 1.6.8

so it seems that 1.4.x is already pretty old, and I’d suggest looking into upgrading Docker to a more up-to-date version.
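To check a server against the 1.5.0 threshold mentioned above, something like the following could be used. This is only a sketch: the helper names are made up for illustration, the threshold is simply the one quoted in this thread, and the parsing assumes the output of containerd --version contains a plain x.y.z version number as in the example above.

```python
import re
import subprocess

# Threshold quoted in this thread; not independently verified here.
FIXED_IN = (1, 5, 0)


def parse_containerd_version(output: str) -> tuple:
    """Extract the first x.y.z version number from `containerd --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number found in: {output!r}")
    return tuple(int(part) for part in match.groups())


def has_shim_log_fix(output: str) -> bool:
    """True if the reported version is at least the threshold from the thread."""
    return parse_containerd_version(output) >= FIXED_IN


def check_local_containerd() -> bool:
    """Run `containerd --version` on this machine and compare it to the threshold."""
    result = subprocess.run(
        ["containerd", "--version"], capture_output=True, text=True, check=True
    )
    return has_shim_log_fix(result.stdout)
```

Calling check_local_containerd() on the affected server should return False for a 1.4.x install and True for the 1.6.8 shown above.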

We are using a custom URL for the recording layout.
Is this necessary anyway?

So, are you using this?: Recording - OpenVidu Docs

With this parameter, the custom-layout will be loaded in the container from localhost, instead of using the defined DOMAIN_OR_PUBLIC_IP

If you are using custom layouts… try this environment variable instead

OPENVIDU_RECORDING_COMPOSED_URL=https://localhost:443/dashboard
OPENVIDU_RECORDING_DEBUG=true

If it does not work, send me the log files generated in the recording directory.

Ahhh, I think I understand you. You are using a custom URL for the recording? Like:

...
"customLayout": "https://USER:PASS@my.domain.com:8888/path?myParam=123"
...

Then, make sure that this URL loads properly from the server in China. If that’s the case, you don’t need the environment variables I mentioned before.

I think the problem is that the web page does not load properly on the China server. If the URL of your custom layout is available from the China server, then try to use localhost to access it.
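One way to check how the layout URL behaves from the China server is to time a plain fetch of it there. The sketch below is an assumption-laden illustration, not part of OpenVidu: the function names are hypothetical, and it only measures the initial HTML response, not the scripts, fonts or other resources that Chrome loads afterwards inside the recording container. Because urllib does not accept USER:PASS@ inside the URL, the credentials are moved into a Basic auth header first.

```python
import base64
import time
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def build_request(url: str) -> urllib.request.Request:
    """Move USER:PASS@ credentials from the URL into a Basic auth header."""
    parts = urlsplit(url)
    headers = {}
    netloc = parts.netloc
    if parts.username is not None:
        credentials = f"{parts.username}:{parts.password or ''}"
        token = base64.b64encode(credentials.encode("utf-8")).decode("ascii")
        headers["Authorization"] = f"Basic {token}"
        # Keep only host[:port], dropping the userinfo part.
        netloc = parts.netloc.rpartition("@")[2]
    clean_url = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
    return urllib.request.Request(clean_url, headers=headers)


def time_page_load(url: str, timeout: float = 60.0, opener=urllib.request.urlopen):
    """Return (HTTP status, seconds elapsed, bytes read) for one fetch of url."""
    request = build_request(url)
    start = time.monotonic()
    with opener(request, timeout=timeout) as response:
        body = response.read()
        status = response.status
    return status, time.monotonic() - start, len(body)
```

Running time_page_load() with the real custom layout URL on the China server and on a server elsewhere gives a first comparison. A fast response here does not rule out slow third-party resources inside the page.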

I’ve tried to load the custom URL from the server. Using the DNS name, the URL loads properly;

with localhost I get a 404 error page.

Obviously, if the page cannot be accessed through localhost because it is not deployed next to OpenVidu, you will receive that 404.

Then, forget about localhost… And try to run with OPENVIDU_RECORDING_DEBUG=true. This will print next to the recording some logs about the recording container, which may expose some problem if something is happening.

I suspect that the page takes too long to load when it is loaded from the recording container running on the China server. That would explain why the white screen lasts so long.

Does this happen to you without using custom layout?

If this works correctly without the custom layout, then the problem occurs when the webpage is loaded from the recording container in that server.

After checking the recording logs in the file chrome_debug.logs, we found some differences between the recordings without the white screen, the recordings with the white screen, and the recordings on a server outside China.

In both logs of the recordings in China there is the following error:
[71:71:0930/080849.457385:VERBOSE1:extension_downloader.cc(792)] Failed to fetch manifest 'https://clients2.google.com/service/update2/crx?os=linux&arch=x64&os_arch=x86_64&nacl_arch=x86-64&prod=chromecrx&prodchannel=&prodversion=86.0.4240.193&lang=en-US&acceptformat=crx3&x=id%3Dnmmhkkegccagdldgiimedpiccmgmieda%26v%3D0.0.0.0%26installedby%3Dother%26uc%26ping%3Dr%253D-1%2526e%253D1&x=id%3Dpkedcjkdefgpdelpbcmbmeomcjbeemfm%26v%3D0.0.0.0%26installedby%3Dother%26uc%26ping%3Dr%253D-1%2526e%253D1&x=id%3Daapocclcgogkmnckokdopfmhonfmgoek%26v%3D0.0.0.0%26installedby%3Dinternal%26uc%26ping%3Dr%253D-1%2526e%253D1&x=id%3Dfelcaaldnbdncclmgdcncolpebgiejap%26v%3D0.0.0.0%26installedby%3Dinternal%26uc%26ping%3Dr%253D-1%2526e%253D1&x=id%3Dghbmnnjooekpmoecnnnilnnbdlolhkhi%26v%3D0.0.0.0%26installedby%3Dinternal%26uc%26ping%3Dr%253D-1%2526e%253D1&x=id%3Daohghmighlieiainnegkcijnfilokake%26v%3D0.0.0.0%26installedby%3Dinternal%26uc%26ping%3Dr%253D-1%2526e%253D1&x=id%3Dapdfllckaahabafndbhieahigkjlhalf%26v%3D0.0.0.0%26installedby%3Dinternal%26uc%26ping%3Dr%253D-1%2526e%253D1&x=id%3Dblpcfgokakmgnkcojhhkbfbldkacnbeo%26v%3D0.0.0.0%26installedby%3Dinternal%26uc%26ping%3Dr%253D-1%2526e%253D1&x=id%3Dpjkljhegncpnkpknbcohdijeoejaedia%26v%3D0.0.0.0%26installedby%3Dinternal%26uc%26ping%3Dr%253D-1%2526e%253D1' response code:-1

Only in the logs of the recordings that have the issue did we find some strange, random HTTP requests:

These requests are present whenever we get a recording with the white screen at the beginning. The URLs are always random.

In the logs of the recordings without the issue, also taken on the China server, we have the following requests at the same point:

The URL request that I highlighted in the last screenshot isn’t present in the recordings with the white screen.

I don’t know if maybe one of those fetches is causing some kind of timeout. I don’t understand those random URLs.

Is your custom layout using those Google fonts? Maybe Google services are making those requests time out because of the China IP (sad, but it is a possibility). Some big services use long timeouts like this to mitigate DDoS attacks. Maybe I am being too paranoid, but it is a possibility.

Hmm, maybe I am not being too paranoid and this is the real problem: Link

The custom URL is a link that works as expected from China.
We are doing some tests and analysis to be sure, but what I’ve noticed so far is that in the recordings with the white screen, Chrome tries three times to fetch https://clients2.google.com/service/update2/crx and all the requests fail. In the videos without the white screen, after the second failed request Chrome manages to fetch this URL correctly, and the issue does not appear.
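To make this comparison between logs less manual, the failed update fetches and the contacted hosts could be counted with a short script. This is a rough sketch that assumes the log lines look like the "Failed to fetch manifest" line quoted earlier in this thread; the function name and the returned keys are made up for illustration.

```python
import re

# Marker text and host copied from the log line quoted in this thread.
FAILED_MARKER = "Failed to fetch manifest"
UPDATE_HOST = "clients2.google.com"
# A URL ends at whitespace or at a straight or curly quote.
URL_PATTERN = re.compile(r"https?://[^\s'\"‘’]+")


def summarize_chrome_log(log_text: str) -> dict:
    """Count failed extension-update fetches and every host seen in the log."""
    failed_updates = 0
    hosts = {}
    for line in log_text.splitlines():
        if FAILED_MARKER in line and UPDATE_HOST in line:
            failed_updates += 1
        for url in URL_PATTERN.findall(line):
            # "https://host/path".split("/") -> ["https:", "", "host", "path"]
            host = url.split("/")[2]
            hosts[host] = hosts.get(host, 0) + 1
    return {"failed_update_fetches": failed_updates, "hosts": hosts}
```

Running it on the log of a recording with the white screen and on one without should show whether the number of failed fetches, or the set of hosts, really differs between the two cases.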

From what I can read in the article, the load of resources outside of China fails inconsistently depending on how firewalls behave. It varies depending on the China region, and it may sometimes work and sometimes not.

What I would do is host all the files needed by your custom layout in China, so that when the page is loaded there, the resources come from a server inside the country. That way, when the page is loaded from China, the files of your custom layout will not need to pass through the country’s firewalls.
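To find out which files would need to be moved, the layout page can be scanned for resources served from other hosts. The following is a minimal sketch under clear assumptions: the class and function names are hypothetical, and it only inspects src attributes and link href attributes in the static HTML, so anything loaded later by scripts or by CSS @import will not show up.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ResourceCollector(HTMLParser):
    """Collect URLs from src attributes and from <link href=...> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            if name == "src" or (tag == "link" and name == "href"):
                self.urls.append(value)


def external_hosts(html: str, page_url: str) -> list:
    """Return the sorted hosts, other than the page's own, that the page loads from."""
    page_host = urlsplit(page_url).hostname
    collector = ResourceCollector()
    collector.feed(html)
    hosts = set()
    for raw in collector.urls:
        # Resolve relative and protocol-relative URLs against the page URL.
        host = urlsplit(urljoin(page_url, raw)).hostname
        if host and host != page_host:
            hosts.add(host)
    return sorted(hosts)
```

Any host in the result (a font or CDN domain, for example) is a candidate for being blocked or slowed down on the way out of China.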

The https://clients2.google.com/service/update2/crx request comes from the browser itself. Some people run composed recordings without an internet connection, so I don’t think this is the problem. But I am just speculating.