Please separate compound fields and add modification date to the database #4526
Reference: your-land/bugtracker#4526
Problem
Currently a stored mapblock is an opaque amalgamation of blocks, entities, metadata and more in the database. Without in-depth knowledge of the format it is hard to select, debug or - when issues arise - modify. Avoiding compound fields is a widely accepted principle when it comes to databases.
Solutions
Please separate block data, entity data and metadata (like modification_date and creation_date). Maybe even node meta and everything else you could find that allows for reasonable separation.
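To make the request concrete, here is a minimal sketch of what a separated layout could look like in SQLite. The table name, the column names and the save_block helper are all made up for illustration; they are not taken from any existing Minetest backend.

```python
import sqlite3
import time

# Hypothetical schema: every name below is illustrative only and does not
# come from an existing Minetest database backend.
SCHEMA = """
CREATE TABLE IF NOT EXISTS mapblocks (
    pos               INTEGER PRIMARY KEY,  -- encoded mapblock position
    node_data         BLOB NOT NULL,        -- the nodes themselves
    node_meta         BLOB,                 -- node metadata, NULL if none
    entities          BLOB,                 -- stored entities, NULL if none
    creation_date     INTEGER NOT NULL,     -- Unix timestamp
    modification_date INTEGER NOT NULL      -- Unix timestamp
)
"""


def open_db(path=":memory:"):
    """Open (or create) a database that uses the sketched schema."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_block(conn, pos, node_data, node_meta=None, entities=None, now=None):
    """Insert or update one mapblock; creation_date is kept on update."""
    now = int(time.time()) if now is None else now
    cur = conn.execute(
        "UPDATE mapblocks SET node_data = ?, node_meta = ?, entities = ?, "
        "modification_date = ? WHERE pos = ?",
        (node_data, node_meta, entities, now, pos),
    )
    if cur.rowcount == 0:
        # No existing row for this position, so this is a new block.
        conn.execute(
            "INSERT INTO mapblocks VALUES (?, ?, ?, ?, ?, ?)",
            (pos, node_data, node_meta, entities, now, now),
        )
    conn.commit()
```

With a layout like this, each part can be selected or inspected on its own, for example `SELECT pos FROM mapblocks WHERE entities IS NOT NULL`.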
Alternatives
A similar smaller approach was discussed here: https://github.com/minetest/minetest/issues/10671
Not sure if only mapblocks are affected, but those are the most mysterious to me ;)
A modification_date and creation_date already exist on the player table, I'd like those added to the others as well. Sure, I could do that myself from database POV, but IMO everyone who wants to maintain their DB needs those values on most, if not every table.
Additional context
This would break compat with older maps yet again, so - if implemented - needs much more thought than I most likely put into this request, by smarter people than me, in an effort to only break compat once and then hopefully live with it for a while.
Adding more fields might make the database use more storage for the same amount of world.
A default modification date could help with all those "my harddisk was full, now my sqlite database is broken" cases and similar ones. Throwing away entities for /clearobjects would come down to clearing the entity field, instead of having to unpack the mapblock, do its thing and repack it.
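As a rough illustration of those two points, the sketch below assumes a hypothetical mapblocks table with a separate entities column and a modification_date column. None of these names exist in Minetest today, and the real /clearobjects does more than this.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory table with made-up rows for the examples below."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mapblocks ("
        "pos INTEGER PRIMARY KEY, node_data BLOB NOT NULL, "
        "entities BLOB, modification_date INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO mapblocks VALUES (?, ?, ?, ?)",
        [(1, b"n1", b"e1", 100), (2, b"n2", None, 150), (3, b"n3", b"e3", 200)],
    )
    conn.commit()
    return conn


def clear_objects(conn, now):
    """Drop all stored entities without unpacking or touching node data."""
    cur = conn.execute(
        "UPDATE mapblocks SET entities = NULL, modification_date = ? "
        "WHERE entities IS NOT NULL",
        (now,),
    )
    conn.commit()
    return cur.rowcount  # number of mapblocks that had entities


def modified_since(conn, timestamp):
    """List positions written at or after a timestamp, e.g. just before a crash."""
    rows = conn.execute(
        "SELECT pos FROM mapblocks WHERE modification_date >= ? ORDER BY pos",
        (timestamp,),
    )
    return [row[0] for row in rows]
```

Something like modified_since would let an admin narrow down which blocks were written shortly before the disk filled up, assuming the timestamps themselves survived.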
while i'm theoretically in support of this, there are some technical issues that need to be addressed. the data in mapblocks is deliberately chunked together so that it can be compressed more efficiently.
but there's certainly an argument to be made that heterogeneous data like node IDs, node metadata, and entity data is not actually a good target for being compressed together
When it's a question of compression, I doubt entities and metadata and blocks have any overlap in their compression dictionaries (assuming the modern compression algos still use such a thing)
I don't know enough about MT or modern DBs, so take this with a grain (or lots) of salt:
If we assume that mapblocks are loaded only when someone is near that block and saved when player leaves - then in this scenario both nodes and entities inside a mapblock always load/unload together.
And if nodes/ents are always loaded/unloaded at the same time, then it makes sense to keep them closer "on a disk" too. And you can't get much closer than just putting them in one blob: you get both with just one seek() and one read().
There is probably a way to force an RDBMS to optimize for this, but it's beyond my knowledge. (I tried searching "interleaving" or something, I don't really know)
Just my assumption about why the blob may make sense.
They do overlap, because the compression algorithm doesn't care what kind of data it gets or what it is used for, as long as it can find patterns in that data that get repeated.
That's why some algorithms have a static dictionary of common patterns.
Yep, in that case it wouldn't make sense to separate them, and it would increase the size because now you've got 2 dictionaries etc. to save.
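Whether separate compression really costs space is easy to measure. The sketch below uses zlib from the Python standard library on entirely made-up data (the node IDs, the metadata strings and the mobs:sheep entity name are invented), so the numbers say nothing about real mapblocks; it only shows how one could run the comparison on real dumps.

```python
import random
import zlib


def compressed_sizes(parts):
    """Return (combined, separate) zlib sizes in bytes for a list of byte strings."""
    combined = len(zlib.compress(b"".join(parts)))
    separate = sum(len(zlib.compress(part)) for part in parts)
    return combined, separate


def synthetic_block(seed=0):
    """Build made-up stand-ins for node data, node metadata and entity data."""
    rng = random.Random(seed)
    # 4096 two-byte node IDs drawn from a tiny palette (assumption: mostly
    # uniform terrain).
    palette = [b"\x00\x01", b"\x00\x02", b"\x00\x7f"]
    node_data = b"".join(rng.choice(palette) for _ in range(4096))
    # Invented key=value metadata and JSON-like entity records.
    node_meta = b"".join(
        b"owner=player%d;infotext=Chest;" % rng.randrange(5) for _ in range(20)
    )
    entities = b"".join(
        b'{"name":"mobs:sheep","hp":%d}' % rng.randrange(20) for _ in range(10)
    )
    return [node_data, node_meta, entities]


if __name__ == "__main__":
    combined, separate = compressed_sizes(synthetic_block())
    print("combined:", combined, "bytes; separate:", separate, "bytes")
```

Each separate zlib stream carries its own header and checksum, so some fixed overhead per field is expected; how much the shared history across fields helps depends on the actual data.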
Minetest already caches changes in memory and only sends the whole mapblock to the DB at a fixed interval, to reduce load and to make sure changes that happened between loading and unloading aren't completely lost if something goes wrong.
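The general pattern described here is a write-back cache with a periodic flush. The sketch below is a simplified illustration of that pattern, not Minetest's actual implementation; the class name, the write_block callback and the 5 second default interval are assumptions.

```python
class WriteBackCache:
    """Keep changed blocks in memory and write them out at a fixed interval."""

    def __init__(self, write_block, interval=5.0):
        self._write_block = write_block  # callback(pos, data), e.g. a DB write
        self._interval = interval        # seconds between flushes (assumed value)
        self._blocks = {}                # pos -> latest block data
        self._dirty = set()              # positions changed since the last flush
        self._last_flush = 0.0

    def set_block(self, pos, data):
        """Record a change in memory only; nothing is written yet."""
        self._blocks[pos] = data
        self._dirty.add(pos)

    def step(self, now):
        """Call regularly; flushes only once the interval has elapsed."""
        if now - self._last_flush < self._interval:
            return 0
        return self.flush(now)

    def flush(self, now):
        """Write every dirty block as a whole and return how many were written."""
        written = 0
        for pos in sorted(self._dirty):
            self._write_block(pos, self._blocks[pos])
            written += 1
        self._dirty.clear()
        self._last_flush = now
        return written
```

Several changes to the same block within one interval cost a single write, which is the load reduction mentioned above; the price is that up to one interval of changes can be lost on a crash.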
In general it is a complex thing to balance with tradeoffs.
For example
a mob walking through a mapblock
yeah but most mapblocks contain plants that grow or nodes that change based on daytime etc
do we really need to save a whole mapblock (4096 nodes) because 1 plant node changed?
In the end you might come up with an acceptable balance between all those cases.
Only to then notice that it works great with backend "A" but has poor performance on "B" etc.
some links
current mapblock storage format documentation:
180ec92ef9/doc/world_format.txt (L230-L475)
map parser in go: https://pkg.go.dev/github.com/minetest-go/mapparser
rust library: https://lib.rs/crates/minetestworld
i thought there was a python library somewhere, but i can't currently find it.
postgres (at least the Postgres Pro Enterprise variant linked here) can have compressed tablespaces, obviating the need to store compressed data in the db https://postgrespro.com/docs/enterprise/12/cfs-usage
there's also citus, which seems capable of creating compressed columns, though it seems like it's aimed at professional DBAs and might be hard to set up.
i think trying to make major structural changes to the minetest database would require a huge effort, particularly if you want to create a way to migrate old databases to the new format. you'd have to write new code for every possible map backend as well.
i dug into the mapblock format again, and it's not the whole thing that's compressed, it just compresses the node data and node metadata, and it compresses those separately.
And here we go again
You might also want to dig into my comment again and hopefully notice this time that it was a reply to whosit's assumption and I never said minetest does that.
But yes with that carefully quoted snippet of my sentence/text it might appear so.
apologies, i wasn't meaning to correct you, i was just trying to make it clear what was currently happening. i didn't fully remember myself.