I found another way to save ROM space: implementing custom malloc, free, realloc and calloc
functions.
This reduces ROM size by 3 KB and IWRAM usage by 1 KB (by eliminating __malloc_av). The original
implementation is much larger and more complex than it needs to be.
The custom malloc is implemented as a bitmap allocator: it keeps a bitmap that tracks which pages of
the heap are allocated. Like the original allocator, it uses the free space in EWRAM after the multiboot
GBA ROM. Unlike the original allocator, we control the pool size with CUSTOM_MALLOC_POOL_SIZE.
The custom malloc can be disabled through the USE_CUSTOM_MALLOC option.
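
For readers unfamiliar with the technique, here is a minimal sketch of a bitmap allocator of the kind described above. It is not the code from this change. The `sketch_` names, the 32-byte `PAGE_SIZE`, the 8-byte size header, and the default pool size are all assumptions made for illustration, and a static array stands in for the free EWRAM region after the ROM. realloc is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef CUSTOM_MALLOC_POOL_SIZE
#define CUSTOM_MALLOC_POOL_SIZE 4096u /* assumed value, for illustration only */
#endif
#define PAGE_SIZE   32u /* hypothetical page granularity */
#define HEADER_SIZE 8u  /* hypothetical header: holds the page count, keeps 8-byte alignment */
#define PAGE_COUNT  (CUSTOM_MALLOC_POOL_SIZE / PAGE_SIZE)

/* Stand-in for the EWRAM region; the union forces 8-byte alignment. */
static union { uint64_t align; uint8_t bytes[CUSTOM_MALLOC_POOL_SIZE]; } heap;
/* One bit per page: 1 = allocated, 0 = free. */
static uint8_t page_bitmap[(PAGE_COUNT + 7) / 8];

static int page_used(size_t i) { return (page_bitmap[i / 8] >> (i % 8)) & 1; }

static void page_mark(size_t i, int used)
{
    if (used) page_bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
    else      page_bitmap[i / 8] &= (uint8_t)~(1u << (i % 8));
}

void *sketch_malloc(size_t size)
{
    if (size == 0 || size > CUSTOM_MALLOC_POOL_SIZE - HEADER_SIZE) return NULL;
    /* Pages needed for the payload plus the header, rounded up. */
    size_t need = (size + HEADER_SIZE + PAGE_SIZE - 1) / PAGE_SIZE;
    size_t run = 0;
    /* First fit: scan for `need` consecutive free pages. */
    for (size_t i = 0; i < PAGE_COUNT; i++) {
        run = page_used(i) ? 0 : run + 1;
        if (run == need) {
            size_t first = i + 1 - need;
            for (size_t p = first; p <= i; p++) page_mark(p, 1);
            uint8_t *base = &heap.bytes[first * PAGE_SIZE];
            uint32_t pages = (uint32_t)need;
            memcpy(base, &pages, sizeof pages); /* remember the length for free */
            return base + HEADER_SIZE;
        }
    }
    return NULL; /* no run of free pages is long enough */
}

void sketch_free(void *ptr)
{
    if (ptr == NULL) return;
    uint8_t *base = (uint8_t *)ptr - HEADER_SIZE;
    assert(base >= heap.bytes && base < heap.bytes + CUSTOM_MALLOC_POOL_SIZE);
    uint32_t pages;
    memcpy(&pages, base, sizeof pages);
    size_t first = (size_t)(base - heap.bytes) / PAGE_SIZE;
    for (size_t p = first; p < first + pages; p++) page_mark(p, 0);
}

void *sketch_calloc(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size) return NULL; /* multiplication overflow */
    void *ptr = sketch_malloc(count * size);
    if (ptr != NULL) memset(ptr, 0, count * size);
    return ptr;
}
```

The bitmap costs one bit per page, so with these assumed values a 4 KB pool needs only 16 bytes of bookkeeping. The trade-off is that every allocation is rounded up to whole pages and the search for free pages is linear in the number of pages.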