Connection to 5.9.0 on low bandwidth results in Timeout #7316

Open
opened 2024-08-16 23:11:02 +02:00 by AliasAlreadyTaken · 6 comments

Similar to your-land/administration#208, people on low-bandwidth connections report issues when connecting to the testserver.

Author
Owner

On the 5.9.0 testserver (30001), I keep getting these:

2024-08-16 20:46:09: TRACE[ConnectionSend]: con(4/1) Handle per peer queues: peer_id=55489 packet quota: 2048
2024-08-16 20:46:09: VERBOSE[Server]: httpfetch_caller_alloc_secure: allocating 1451523435837045027
2024-08-16 20:46:09: TRACE[ConnectionSend]: con(4/1) processing queued reliable command 
2024-08-16 20:46:09: VERBOSE[ConnectionSend]: con(4/1)Ran out of sequence numbers!
2024-08-16 20:46:09: TRACE[ConnectionSend]: con(4/1) Failed to queue packets for peer_id: 55489, delaying sending of 601170 bytes

The client cannot log in; it's either stuck or gets a timeout. That happens on the same device that can log in to the 5.8.0 main server (30000) without problems. The same device can also log in to the 5.9.0 testserver if it uses a normal DSL landline. If I use mobile data, I cannot log in. It eventually ends in a timeout.

This affects ALL servers that were upgraded to 5.9.0: Test, builder and even the NPC server (30003) which is not very content-heavy.
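
To compare attempts, the two messages quoted above can be counted in a debug.txt. This is a minimal sketch, assuming the log lines keep exactly the format shown in this comment; the function name `summarize` and the `SAMPLE` text are illustrative and not part of the engine or its tooling.

```python
import re

# Both patterns are taken from the log lines quoted in this issue.
OUT_OF_SEQNUMS = "Ran out of sequence numbers!"
DELAYED = re.compile(
    r"Failed to queue packets for peer_id: (\d+), delaying sending of (\d+) bytes"
)


def summarize(lines):
    """Return (count of seqnum messages, {peer_id: largest delayed backlog in bytes})."""
    exhausted = 0
    max_delayed = {}
    for line in lines:
        if OUT_OF_SEQNUMS in line:
            exhausted += 1
        match = DELAYED.search(line)
        if match:
            peer = int(match.group(1))
            nbytes = int(match.group(2))
            max_delayed[peer] = max(max_delayed.get(peer, 0), nbytes)
    return exhausted, max_delayed


# Two lines copied from the excerpt above, used as a self-contained sample.
SAMPLE = """\
2024-08-16 20:46:09: VERBOSE[ConnectionSend]: con(4/1)Ran out of sequence numbers!
2024-08-16 20:46:09: TRACE[ConnectionSend]: con(4/1) Failed to queue packets for peer_id: 55489, delaying sending of 601170 bytes
"""

print(summarize(SAMPLE.splitlines()))
```

For a real log, pass an open file instead of the sample, for example `summarize(open("debug.txt", errors="replace"))`.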

Diaeresis suggested removing a check: https://gitlab.com/tunnelers-abyss/minetest/-/commit/c3a8205ed9a808917d456f217ba99583bc99fa01

I did that, but it yielded no positive result.
AliasAlreadyTaken added the
1. kind/bug
2. prio/critical
3. source/engine
labels 2024-08-16 23:15:36 +02:00
AliasAlreadyTaken added this to the minetest 5.9.0 milestone 2024-08-16 23:15:39 +02:00

Seeing that Alias was able to reproduce the issue and had trouble logging in over a mobile connection, I repeated the same tests he did. With a mobile connection I am able to connect to the main, test, and build servers. With DSL, however, I can only log onto the main server. On the test and build servers the client gets hung up on loading item definitions, then times out.

Author
Owner

Apparently there is an upstream issue already, thank you Lars for notifying me!

https://github.com/minetest/minetest/issues/14765

We're also getting "Ran out of sequence numbers!" spam. We had `max_packets_per_iteration = 60000`, but decreasing it to 2048 didn't change much either.

Here are three debug.txt files from the NPC server with `debug_log_level = trace`. That server is not very large on media compared to the others:

* https://your-land.de/additional/7316/attempt1.txt
* https://your-land.de/additional/7316/attempt2.txt
* https://your-land.de/additional/7316/attempt3.txt

Since we have a repro, I offered to provide more info or add debugging code, and asked them to let me know if we can otherwise assist.
Author
Owner

Testing with a poor network connection on the `Your Neighbours` server, chosen for its smaller media footprint.

https://your-land.de/additional/7316/attempt4.txt : YN main, packet quota 1024, deleted media cache, none of the patches: Took a while, but I was able to log in.
https://your-land.de/additional/7316/attempt5.txt : YL Testserver, packet quota 2048, deleted media cache, none of the patches: Took a while, but I was able to log in. I saw the message "Ran out of sequence numbers" but then it started anew.

Finally, I used up all my data volume:

https://your-land.de/additional/7316/attempt6.txt : YN main, packet quota 1024, deleted media cache, none of the patches: Timeout and many "ran out of sequence numbers". Yay, a repro.
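
For a sense of scale, the backlog from the first log excerpt in this issue (601170 bytes) can be converted into a best-case transfer time. The link rates below are illustrative assumptions, not measurements from these attempts, and the calculation ignores protocol overhead, latency, and retransmissions.

```python
def transfer_seconds(num_bytes, kbit_per_s):
    """Best-case time to move num_bytes over a link of kbit_per_s (1 kbit = 1000 bits)."""
    return num_bytes * 8 / (kbit_per_s * 1000)


BACKLOG_BYTES = 601170  # "delaying sending of 601170 bytes" from the log excerpt

# Hypothetical link rates in kbit/s, chosen only for illustration.
for rate in (32, 64, 1000):
    print(f"{rate} kbit/s: {transfer_seconds(BACKLOG_BYTES, rate):.1f} s")
```

On a heavily throttled link, a backlog of that size alone takes on the order of a minute or more, before any resends are counted.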

Now I built https://github.com/sfan5/minetest.git branch lessbrokennetwork:

https://your-land.de/additional/7316/attempt7.txt : YN main, packet quota 1024, kept media cache, patch df8b3b6631a488d818f30d27f6d44cfeddb30f14: Many "Ran out of sequence numbers!", then DC

https://your-land.de/additional/7316/attempt8.txt : YN main, packet quota 1024, kept media cache, patch f28d2a78edfaab8f6cced5b7921c647dd5fa0ff1: Many "Ran out of sequence numbers!", then DC

Even with both commits I can still reproduce the problem. NOK
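
One way to picture why changing the packet quota made little difference in these attempts: if the sender may only have a limited number of unacknowledged reliable packets in flight, a slow trickle of acknowledgements caps throughput no matter how many packets it is allowed to attempt per iteration. The following is a toy model under that assumption, not the engine's actual implementation; the class, the function, and all numbers are hypothetical.

```python
from collections import deque

SEQNUM_MODULUS = 1 << 16  # assumption: 16-bit sequence numbers that wrap around


class ReliableWindow:
    """Toy sender: at most window_size reliable packets may be unacknowledged."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.next_seqnum = 0
        self.unacked = deque()  # sequence numbers in flight, oldest first

    def try_send(self):
        """Return a sequence number, or None when the window is full."""
        if len(self.unacked) >= self.window_size:
            return None  # the toy equivalent of "ran out of sequence numbers"
        seqnum = self.next_seqnum
        self.next_seqnum = (self.next_seqnum + 1) % SEQNUM_MODULUS
        self.unacked.append(seqnum)
        return seqnum

    def ack_oldest(self, count):
        """Acknowledge up to count of the oldest in-flight packets."""
        for _ in range(min(count, len(self.unacked))):
            self.unacked.popleft()


def simulate(packets, window_size, acks_per_tick, sends_per_tick):
    """Return (ticks needed to send everything, number of refused send attempts)."""
    window = ReliableWindow(window_size)
    sent = refused = ticks = 0
    while sent < packets:
        ticks += 1
        for _ in range(sends_per_tick):
            if sent >= packets:
                break
            if window.try_send() is None:
                refused += 1
                break  # give up for this tick, retry on the next one
            sent += 1
        window.ack_oldest(acks_per_tick)
    return ticks, refused


# Slow acknowledgements: the per-tick quota (2048 vs 128) does not matter.
print(simulate(100, 10, 1, 2048), simulate(100, 10, 1, 128))
# Fast acknowledgements: the same window drains quickly.
print(simulate(100, 10, 10, 2048))
```

In this model the window and the acknowledgement rate decide how long the transfer takes; the quota only matters once it drops below what the window allows.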

Author
Owner

Yes, with `max_packets_per_iteration=128` and a good connection I can do stuff; with a bad connection it DCs. Please note, YN main (address: your-land.de port 30003) does not have much going on; it's mostly static with a few NPCs standing around.

Good connection to 5.9.0 server with 5.9.0 mobile client:

https://your-land.de/additional/7316/attempt9.txt : YN main, packet quota 128, kept media cache, none of the patches: I can log in and run around

Bad connection to 5.9.0 server with 5.9.0 mobile client:

https://your-land.de/additional/7316/attempt10.txt : YN main, packet quota 128, kept media cache, none of the patches: Many "Ran out of sequence numbers!", then DC

Bad connection to 5.8.0 server with 5.9.0 mobile client:

YL main, packet quota 60000, kept media cache, server 5.8.0: DC. No debug.txt, because it is the main server

Bad connection to 5.9.0 server with 5.9.0 mobile client:

https://your-land.de/additional/7316/attempt11.txt : YN main, packet quota 1024, kept media cache, commit 1b952a9e34ef5841c0f39513bf6a5ddee3ba53a8: Took a while, then brief login, then DC again

Eventually we'll have to admit that Minetest requires bandwidth and ping above carrier pigeon grade - at least smoke signals.

Author
Owner

This was fixed with 5.11.0 and fixed again for 5.12.0

AliasAlreadyTaken modified the milestone from minetest 5.9.0 to luanti 5.12.0 2025-03-08 18:25:56 +01:00
AliasAlreadyTaken added the
4. step/ready to QA test
label 2025-04-26 01:44:29 +02:00

Reference: your-land/bugtracker#7316