I am investigating strange memory allocation behavior in 32 bit Windows. The program below allocates n arrays of size m (as a linked list) until out of memory, and then frees the list. m is passed as a parameter. Next it allocates all available memory again, but this time using a binary search to find the largest array it can allocate at each step and continuing until this size is 0.
I am testing in 32 bit Vista with 3 GB RAM compiling with g++ and VC++. With either compiler, about 2 GB can be allocated in either pass regardless of m. I tried g++ -Wl,--large-address-aware with several builds (4.8.1 Mingw-x64, 4.8.1 GCC, 4.8.0 niXman), which is supposed to increase the address space to 3 GB, but the option has no effect.
The strange behavior I am observing is that if the node size of the linked list is m = 520185 or larger, then the second pass is able to allocate a single contiguous array of a little less than 2 GB followed by several smaller arrays. If m is 520184 or smaller then malloc() is unable to allocate any arrays larger than 520184, although the total is still about 2 GB. This behavior is the same with g++ and VC++. So I am guessing it is a bug in the operating system or some other shared code that freeing all memory does not unfragment it.
Here is an example. mtg was compiled with g++ 4.8.1 -O3. mt was compiled with VC++ 16.00.30319.0 /O2.
Code:
C:\tmp>mtg 520185
4071 x 520185 = 2117673135 allocated
allocated 1968701408 at 009E0020
allocated 140378080 at 77110020
allocated 13500384 at 76300020
allocated 8126432 at 7F7F0020
allocated 2162656 at 76040020
allocated 1834976 at 00240020
allocated 1376224 at 00410020
allocated 44032 at 000233E0
allocated 8184 at 0002DFE8
allocated 48 at 000217A0
allocated 0 at 00000000
4071 x 520185 = 2117673135 initially allocated and freed
total reallocated 2136132424
C:\tmp>mt 520185
4074 x 520185 = 2119233690 allocated
allocated 1965293536 at 00D20020
allocated 140378080 at 77110020
allocated 16383968 at 76040020
allocated 8126432 at 7F7F0020
allocated 5767136 at 00780020
allocated 917472 at 00690020
allocated 720864 at 00040020
allocated 73624 at 001F0048
allocated 65432 at 00020048
allocated 57336 at 00201FE8
allocated 42672 at 00773930
allocated 8184 at 0077DFE8
allocated 208 at 00771A80
allocated 0 at 00771A80
4074 x 520185 = 2119233690 initially allocated and freed
total reallocated 2137834944
C:\tmp>mtg 520184
3979 x 520184 = 2069812136 allocated
allocated 520184 at 7A8EF048
allocated 520184 at 7B9BD048
allocated 520184 at 7CA8B048
allocated 520184 at 7DB59048
allocated 520184 at 7EC27048
allocated 520184 at 76973048
allocated 520184 at 7FC6F048
... (4K lines not shown)
C:\tmp>mt 520184
3981 x 520184 = 2070852504 allocated
allocated 520184 at 7A9ED048
allocated 520184 at 7BABB048
allocated 520184 at 7CB89048
allocated 520184 at 7DC57048
allocated 520184 at 7ED25048
allocated 520184 at 75B9F048
allocated 520184 at 7F40F048
... (4K lines not shown)
When I sort the pointers, they are mostly at intervals of 520192 = 2^19 - 2^12, which is 8 bytes larger than the array size.
Of course this is different from 64 bit code. I tested a similar program in 64 bit Ubuntu with 4 GB real memory and 3 GB swap. It can allocate 140 TB in chunks of 7 GB as long as you don't write to it.
Code:
// mt.cpp - test allocating all memory. Public domain.
#include <stdio.h>
#include <stdlib.h>
// Allocate the largest array possible. Print and return its size.
size_t mtest() {
size_t lo=1, hi=size_t(-1), mid=0, best=0;
void* p;
while (lo<hi) {
mid=lo/2+hi/2+((lo&1)*(hi&1));
void* p=malloc(mid);
if (p) best=mid, free(p), lo=mid+1;
else hi=mid-1;
}
while (best>0 && (p=malloc(best))==0) --best;
printf("allocated %1.0f at %p\n", best+.0, p);
return best;
}
int main(int argc, char** argv) {
int m=0;
if (argc>1) m=atoi(argv[1]); // get m
if (m<sizeof(void*)) m=sizeof(void*);
// Build a linked list of n nodes of size m bytes until out of memory.
int n=0;
void* start=0;
void *p;
while ((p=malloc(m))!=0) {
++n;
*(void**)p=start;
start=p;
}
printf("%d x %d = %1.0f allocated\n", n, m, n*1.0*m);
// Free the list
while (start) {
p=*(void**)start;
free(start);
start=p;
}
// Allocate all available memory as arrays as large as possible
size_t sum=0, t;
while ((t=mtest())>0) sum+=t;
printf("%d x %d = %1.0f initially allocated and freed\n", n, m, n*1.0*m);
printf("total reallocated %1.0f\n", sum+.0);
return 0;
}