PlomRogue Development Blog

Arrays for Two-Dimensional Maps – 1D, or 2D?

I recently blogged on how my game map geometry is set up. One issue: How my 2D (two-dimensional) maps are encoded in 1D (one-dimensional) arrays. Let's assume a game map of n rows and m columns: Such a map can be stored in a 1D array of this size: n × m × size of an individual element. This 1D array may then be read as a 2D map – by transforming 2D coordinates to 1D array indexes. The formula: 1D array index = (number of row to access × width of map in columns) + number of column to access. In C code:

/* sizeof(char) is just for clarity. It always equals 1, so it's redundant. */ 
char * map = malloc(map_height_in_rows * map_width_in_columns * sizeof(char)); 
 
/* Set map cell in row y, column x to value 1. (Parentheses are included for 
 * clarity, but redundant due to operator precedence.) 
 */ 
map[(y * map_width_in_columns) + x] = 1; 

Lukas, maker of tinyRL, wrote a comment on this handling of array business: "Why didn't you consider a two-dimensional array? This would solve the coordinates problem nicely and since the map is of fixed size it wouldn't make any problems. Actually this is the way I do store the map in tinyRL."

I had not considered a 2D array for my maps before. Of course, I could encode my map as an array of arrays. The second-level arrays would be my map's rows – indexes of which would match my x coordinates. A row would be selected with the first-level array's index – which would match my y coordinates. The above setting of a map cell value would look like this:

/* Let's presuppose my char * map somehow got replaced by a char ** map. */ 
map[y][x] = 1; 

This seems clearer than the "map[(y * map_width_in_columns) + x] = 1" style. It also needs one less variable. That fact forced me to give Lukas' suggestion some considering. I considered, and experimentally implemented it. After some more considering, and testing, I dropped it again. 2D arrays solved some old problems, and created some new ones. Neither kind of problem was especially severe – but my priorities gave the 1D array a slight edge. My choice might change again with small shifts in my priorities – or learning better ways to solve some 2D array problems. Here I want to document the problems I encountered, and how my priorities affected their evaluation.

Creating 2D map arrays, the tinyRL way

Lukas in his comment seemed to assume my maps were of fixed size. That's not completely correct: The map size is configured by the user. The user's map size command is read in during program runtime. The numbers of the map's rows and columns are not known in advance. But that's no excuse against Lukas' proposal: His tinyRL 2D array maps are not strictly of fixed size, either. Let's look at how tinyRL sets up its map. First, the map's dimensions are determined:

map_dimensions[0] = tb_width(); 
map_dimensions[1] = tb_height(); 

Both tb_width() and tb_height() belong to the termbox library included by tinyRL. According to termbox.h, they return the dimensions of the "terminal's window size in characters": the terminal screen's numbers of rows and columns. In other words: The map's dimensions are determined only during runtime. They're stored in an array read when the map's 2D array is set up:

int map[map_dimensions[0]][map_dimensions[1]]; 

This creates a 2D array that can be accessed in the "map[y][x]" way. The manner of this creation deserves a closer look, though. The 2D array is not set up in heap memory (as calls to malloc() would do). By declaring it as above, it's set up in stack memory – as an automatic variable. Setting up arrays as automatic variables by itself is nothing special. But for a long time, the C standard only allowed automatic variables of fixed size. Such an array's dimensions needed to be hard-coded, like this:

int map[24][80]; 

In C99, variable-length arrays (VLAs) became part of the standard: automatic variables of arrays whose size was determined only during runtime. Which is just the kind of arrays tinyRL uses for its map. It's an easy way to set up multi-dimensional arrays. Could I use it for my PlomRogue maps?

Variable-length arrays: yay, or nay?

Turns out VLAs (and their predecessor alloca()) are somewhat controversial. Memory allocated with them doesn't need to be freed with free(), their notation is simple, and they allow more flexible use of stack memory. But the latter is also the main argument against them: Stack memory is usually very limited. It should be used up conservatively, so as to not cause stack overflows. VLAs could grow arbitrarely large, and thus use up too much stack. In contrast, heap memory is easily extended when more of it is needed. In theory (not on Linux), malloc() demanding more from the heap than available will return NULL – which can be tested, and handled gracefully as an error. No such tests can be done on declarations of VLAs: The stack either overflows (crashing the program in the most harmless case), or it doesn't. Implementing VLAs in compilers also seems to be non-trivial. In C11, VLAs were relegated from "mandatory" to "optional": Compilers need no longer implement them to be compliant. The portability of VLAs is questionable.

PlomRogue originally used lots of VLAs. Then I read about these issues, and got more careful. I try in general to keep my use of the stack small. (There may be some performance gains in keeping data on the stack. I may explore these at a later stage of development.) Some commits ago, I fully eliminated all my VLAs. I plan to avoid them until I see strong reasons to use them. I therefore won't copy tinyRL's VLA version of 2D map arrays for now.

Dynamic 2D map arrays

But that doesn't rule out 2D arrays of variable size. I just have to set them up on heap memory, with malloc(). That's more complicated than with VLAs, but hardly impossible. First, I allocate an array for pointers to map rows, of this size: number of map rows × size of a map row address. Next, I might get each row's address from a separate malloc(), of this size: number of columns × size of a single map cell:

char ** map = malloc(n_of_rows * sizeof(char *)); 
int i; 
for (i = 0; i < n_of_rows; i++) 
{ 
    map[i] = malloc(n_of_columns * sizeof(char)); 
} 

As straightforward as this method seems, it's considered bad practice. Not only is it a hassle to free such a 2D array: Before a free() of map, one free() must be called on each of its rows. And malloc() may set up each row somewhere else in heap memory. Such splitting up of arrays may slow down iteration through them.

The solution: Allocate the same array for pointers to map rows. Then point its first slot to an address malloc()'d with this size: number of rows × number of columns × size of a single map cell. This sets up one array to store all rows in sequence – just like my current 1D map arrays do. Then point the next address slots to the next row starts in this array. The result: The first array indexes a good old 1D map array. In code:

char ** map = malloc(n_of_rows * sizeof(char *)); 
map[0] = malloc(n_of_rows * n_of_columns * sizeof(char)); 
int i; 
for (i = 1; i < n_of_rows; i++) 
{ 
    map[i] = map[0] + (i * n_of_columns * sizeof(char)); 
} 

Setting up and destroying experimental 2D map arrays in PlomRogue

For PlomRogue, I generalized the above into this function:

extern char ** malloc_map_cells(uint16_t map_height, uint16_t map_width, 
                                size_t element_size) 
{ 
    char ** map_cells = malloc(map_height * sizeof(char *)); 
    map_cells[0] = malloc(map_height * map_width * element_size); 
    uint32_t i; 
    for (i = 1; i < map_height; i++) 
    { 
        map_cells[i] = map_cells[0] + (i * map_width * element_size); 
    } 
    return map_cells; 
} 

(Two notes on this function: 1. It makes no assumptions about the size of a single map cell. In some of my algorithms, I use maps of greater cell size than "sizeof(char)". 2. I don't test the return value of malloc() for NULL. I'd catch that case if I were to commit this. In fact, PlomRogue has a system for handling such errors. But that needs an introduction of its own, maybe in a later post.)

2D arrays thus created with malloc() need to be freed with free(). For my 1D map arrays, a single free() on them sufficed. On a 2D array, free() needs to be called twice: on the array that contains the map cells proper, and on the array indexing that array. Now, consider this: If a pointer is of value NULL, calling free() on it is harmless: "If ptr is a null pointer, no action shall occur." In my current code, I rely on this guarantee of harmlessness: Often, I call free() on map pointers without testing their values first. But now, free() would run on a pointer pointed to by a pointer: "free(map[0])" would (by necessity) come before the more harmless "free(map)". If map is NULL, accessing "map[0]" at best crashes the program. So I had to replace my calls to free() on map arrays with this:

extern void free_map_cells(char ** map_cells) 
{ 
    if (map_cells) 
    { 
        free(map_cells[0]); 
        free(map_cells); 
    } 
} 

What 2D map arrays do for PlomRogue's code

With these two functions (and changing all "char *" declarations of map arrays to "char **" declarations), I could replace all my 1D array maps. In turn, I could replace all "map[(y * map_height) + x]" addressing with "map[y][x]". That does make certain parts of my code easier to read. But some parts actually get slightly more complicated. Turns out, I don't always index map cells with 2D coordinates. Sometimes I just iterate over all map cells linearly. Somewhat like this:

/* Set all map cells to value 1. (Bad example, that'd be a job for memset().) 
 * Let's presuppose a 1D map array. 
 */ 
int i; 
for (i = 0; i < map_height * map_width; i++) 
{ 
    map[i] = 1; 
} 

Using proper 2D coordinates, one more loop is needed:

/* Let's presuppose a 2D map array again. */ 
int y, x; 
for (y = 0; y < map_height; y++) 
{ 
    for (x = 0; x < map_width; x++) 
    { 
        map[y][x] = 1; 
    } 
} 

There is a way to cheat through this with just one loop. Remember that the whole map cells array is allocated to "map[0]". One may abuse this to treat 2D arrays just like 1D ones:

int i; 
for (i = 0; i < map_height * map_width; i++) 
{ 
    map[0][i] = 1; 
} 

It's not a big abuse if you know how the array is set up. Otherwise, it may look somewhat esoteric. Which mirrors the problem with indexing 1D map arrays: "map[y * map_width + x]" may look somewhat esoteric – until you know how the array is set up. Seems to me both approaches cause bizarre indexing notations. They just do so under different kinds of circumstances: treating maps as 1D, or 2D arrays; linearly iterating over whole maps, or choosing cells by their latitude and longitude. I don't see any of these circumstances dominate in PlomRogue's code. Therefore, notation-wise, one approach seems as good as the other.

What 2D map arrays do for PlomRogue's performance

Performance testing for PlomRogue right now happens in a rather crude way: It records game world start data and all player moves to a file "record". PlomRogue can be run in "replay" mode – automatically replaying a game by walking through that file. The same calculations are done, but without pause (no waiting for player input). These calculations include computationally heavy things: enemies' pathfinding algorithms determining their moves; and, after each move, actors' fields of view getting re-processed. I can measure how long such a replay of a "record" file takes. I may then tweak my algorithms one way or another. When done, I replay the same "record" and compare how long it takes now. Thus I roughly learn what changes make the game faster or slower.

And that's how I tested my changes in map array dimensions: I set up two copies of PlomRogue's source code in separate directories. One used 1D map arrays. The other used 2D map arrays as described above. I compiled both versions on gcc with the same compiler flags – the same levels of compiler optimization in particular. I played some amount of turns in one version, and so built a "record" file. I copied that file over into the directory of the other version. I replayed that "record" several times with one version's executable, and then with the other version's executable. Thus I got to measure mean replay durations for both versions, which I then could compare. I repeated this with different compiler flags, different map sizes, and also occasional code tweaks.

I was not very disciplined with recording test results. But I can roughly say this: The 1D map array version was always as fast – or slightly faster. Speed seemed equal on the highest compiler optimizations (-O2, -O3). Inequality turned up on -O1, and on -O0 especially. But its amount changed with other factors, too: Inequality was less with smaller map sizes. It was also reduced by use of the "map[0][i]" hack to remove loops. The greatest difference was with -O0, with maps of 256 × 256 cells size, and sans "map[0][i]" hack: The 2D map array version then took 25% longer than its counterpart. Using the "map[0][i]" hack, that excess impressively dropped to 10%.

On the whole, 2D map arrays don't seem to boost PlomRogue's performance. Nor do they hurt it but on low optimization levels. I can but speculate on what issues slow things down on these. There is the more elaborate setting up and freeing of maps – but those functions should not get called all that often. Indirection in map accesses may play a bigger part: In 1D map arrays, each map cell is but one pointer away: The value of map plus some offset is the address of the cell. With 2D map arrays, two pointers must be followed to access any cell: The value of the map pointer plus some offset is the address of … another address, that of a row. Only that row's address plus one more offset is the cell's address. 2D arrays need two memory jumps where 1D ones need one. (Seems like the problem is a known one.) That difference may not sound like much. But PlomRogue's most expensive calculations involve: lots of map access, lots of iterating over maps, lots of reading or setting of individual map cells – run again and again, at the end of deep loops.

tl;dr

To sum up my findings, and reasons to stay with 1D map arrays: 2D map arrays make for nicer index notations – in some cases. But they complicate index notations in others. Linear iteration over a map is simpler with 1D map arrays. To allocate and free 1D map arrays, not much is needed: a single malloc(), and a single free(); with 2D map arrays, more complex constructions are needed. 2D map arrays mean more indirection in map cell access. Some of these factors may slow down code that uses them. Compiler-optimized, 2D and 1D map array operations seem equal in speed. 2D map arrays don't seem much worse than 1D map arrays. But neither do they seem much better. So status quo wins.

Comments

No comments on this page.

Write your own comment

Commenting currently impossible: Captcha not set.