est31 wrote: Hi Argos, I like your mapper!
Thanks!
I have developed a small sloppy mapper script that calls your program to generate tiles. The main design idea behind this script was not to have one minetestmapper instance make a huge map and then cut it into pieces ("top down"), but to have minetestmapper generate many small tiles and then stitch them together for the zoom-out layers ("bottom up"). This approach avoids keeping the whole map in RAM. minetestmapper, and also the cutting tools other sloppy mappers use, require some resources here.
Good idea. And a very nice interactive map!
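To make the "bottom up" idea concrete, here is a minimal sketch of the tile bookkeeping only. It assumes each zoom-out level combines 2x2 tiles of the level below into one tile; that factor, and all function names, are my own assumptions and not taken from est31's script.

```python
from collections import defaultdict

def parent_tile(x, y):
    """Tile at the next zoom-out level that covers tile (x, y)."""
    # Floor division also behaves for negative coordinates: -1 // 2 == -1.
    return (x // 2, y // 2)

def group_by_parent(tiles):
    """Map each parent tile to the sorted list of existing child tiles."""
    groups = defaultdict(list)
    for x, y in tiles:
        groups[parent_tile(x, y)].append((x, y))
    return {parent: sorted(children) for parent, children in groups.items()}

def build_pyramid(base_tiles, levels):
    """Return one set of tile coordinates per zoom level, base level first."""
    pyramid = [set(base_tiles)]
    for _ in range(levels):
        pyramid.append({parent_tile(x, y) for x, y in pyramid[-1]})
    return pyramid
```

Only the (at most four) child images of one parent need to be in memory at a time, which is where the RAM saving comes from.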
My suspicion is that most of the time is spent loading all area position hashes. I see that this is the best way of doing it when you request a map of the whole world, which is usually only sparsely populated. But when you have many instances of minetestmapper called one after another, they spend more time loading the keys than doing anything else.
Could be. In particular on a leveldb database. I think sqlite3 (which uses an index) and redis (which keeps everything in memory anyway) are pretty efficient at producing block lists. Does VanessaE's world use leveldb?
But then, even for a leveldb database, I presume that the block list should be in the disk cache by the time the first mapper process terminates, so that subsequent instances don't need the disk access...
I know from my own tests that decompression of map blocks is often quite a large percentage of the execution time. Much larger than reading block numbers. But those tests were on larger images (I think 10000x10000), and I must confess I don't own a quad-core i7...
Nevertheless, whatever my results, YMMV. There are a few easy ways you could determine where your version actually spends time. It would be helpful if you could analyze where your version really spends its time...
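One rough way to get a first impression (not necessarily the method meant above) is to compare wall-clock time with CPU time for a single mapper run. This is a sketch under assumptions: it uses Python's Unix-only `resource` module, `measure` and `cpu_share` are hypothetical helper names, and `cmd` is whatever command line your script already uses to start minetestmapper.

```python
import resource  # Unix only
import subprocess
import time

def measure(cmd):
    """Run cmd and return (wall_seconds, cpu_seconds) for the child process."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User plus system CPU time consumed by children since 'before'.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu

def cpu_share(wall, cpu):
    """Fraction of wall-clock time that was spent on the CPU."""
    return cpu / wall if wall > 0 else 0.0
```

A share close to 1 hints that the run is CPU bound (for example decompression); a much lower share hints that it is mostly waiting on disk. It is only a heuristic, and a real profiler will tell you more.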
There are some options to solve this (ordered from easy to hard to do, but also by how much I think the solution helps solve the problem):
- Adding a command line option that disables this loading, and instead tries to look up the map for every required position.
- If minetestmapper could store those keys in a file in a way that is optimised for fast access without having to load the whole file, it would greatly improve speed, too.
- If minetestmapper could be configured to generate many small tiles, preferably multithreaded (but that's no priority), in one instance, and not store so much in RAM, it would be best.
I agree. That last option would be a very nice feature. I have actually thought about such a feature, but not yet analyzed the impact, or even decided to do so :-)
An added advantage would be that special care could be taken to ensure that the boundaries between tiles have correct shading. Now they probably don't, unless you generate larger images and cut off the borders (1 pixel should suffice).
What do you think, is the best option?
The third option has definite architectural advantages, though not in ease of implementation. I think the second option should be relatively painless. Even the cost of loading a 60MB to 120MB file of block numbers (for a HUGE world with 20M blocks) would probably be negligible. The first option may indeed be the easiest to implement and to verify for its benefit - which it might very well not have in a significant way... If it does have a significant benefit, it will probably be a quick win.
So the 'best' option depends on one's goals...
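For the second option, a minimal sketch of what "fast access without loading the whole file" could look like: a sorted file of fixed-width records that is searched with seeks. The file layout (one signed 64-bit block position per record) and the function names are my own assumptions; minetestmapper has no such file format today.

```python
import os
import struct

RECORD = struct.Struct("<q")  # assumed layout: one signed 64-bit position per record

def write_index(path, positions):
    """Write the unique positions in sorted order, one fixed-width record each."""
    with open(path, "wb") as f:
        for pos in sorted(set(positions)):
            f.write(RECORD.pack(pos))

def contains(path, pos):
    """Binary search in the file; reads about log2(n) records, not the whole file."""
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path) // RECORD.size
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid * RECORD.size)
            (value,) = RECORD.unpack(f.read(RECORD.size))
            if value == pos:
                return True
            if value < pos:
                lo = mid + 1
            else:
                hi = mid
    return False
```

With 8 bytes per record, 20M blocks would take 160MB, a bit above my estimate, so a tighter encoding might be worth it. Each lookup would need roughly 25 seeks for a world of that size.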
However, I do think there is an option that is easier still: you could consider generating 10240x10240 maps instead of 1024x1024. That reduces the number of times the blocks need to be read from the database by a factor of 100, while still generating reasonably small maps (requiring maybe 0.5 GB of RAM per image).
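The arithmetic behind those numbers, as a small sketch. The helper names are made up, and reading "0.5 GB" as 0.5 * 1024^3 bytes is an assumption; the RAM figure itself is only my rough estimate.

```python
def tile_reduction_factor(small_side, large_side):
    """How many small square tiles fit into one large square tile."""
    per_axis = large_side // small_side
    return per_axis * per_axis

def bytes_per_pixel(budget_bytes, side):
    """Memory budget available per pixel of a square image."""
    return budget_bytes / (side * side)

factor = tile_reduction_factor(1024, 10240)       # 10 per axis, so 10 * 10
budget = bytes_per_pixel(0.5 * 1024**3, 10240)    # bytes available per pixel
print(factor, budget)
```

So one large image replaces 100 small ones, and a 0.5 GB budget works out to about 5 bytes per pixel.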
WRT multithreading: I suspect the best way to make use of multithreading is to parallelize the decompression of map blocks. The easiest way to make use of multiprocessing though, at the moment, is probably to start several instances of minetestmapper in parallel :-)
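Starting several instances in parallel could be sketched like this from a driver script. `run_all` is a hypothetical helper, and each entry in `commands` stands for a complete minetestmapper command line that you build yourself; no mapper options are assumed here.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_all(commands, max_workers=4):
    """Run the given command lines, at most max_workers at a time.

    Returns the exit codes in the same order as the commands.
    """
    def run_one(cmd):
        done = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        return done.returncode

    # Threads are enough here: each one only waits for an external process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, commands))
```

Setting max_workers to the number of cores is a reasonable starting point, as long as the instances together fit in RAM.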