Floresy 
I'm not 100% convinced that aligned allocation is a primary concern for most developers using malloc. Sure, it's good when you're doing nitty-gritty performance tuning, but if your software has other bottlenecks it might be moot, and it is almost certainly not the first thing you should be thinking about. Also, the standard library does offer an aligned version of malloc.
Well, memory allocations always need to be aligned. The only reason malloc doesn't take an alignment value is that it always aligns maximally, i.e. suitably for any fundamental type.
The problem with malloc isn't just alignment. If you implement malloc as specified, you can't write a well-performing allocator, purely because of the API. Having the allocator store the size of allocated regions is bad. Whenever you free something, more likely than not the thing you are freeing isn't in the cache. If the allocator then reads from that region to get the size, you take a cache miss on nearly every free. Even if custom alignment isn't important most of the time, allocator performance itself is.
The API that I suggested gives you, as the library implementer, the most freedom when it comes to implementing an allocator.
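As a rough sketch of what that freedom buys: below is a toy size-class allocator whose free never reads the block being freed, because the caller passes size and alignment back in. The names sized_alloc and sized_free, the four size classes, and the fixed arena are all my own assumptions for illustration. It is not a production allocator.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical size-aware API: the caller supplies size and alignment to BOTH calls. */
enum { CLASS_COUNT = 4, MIN_CLASS = 16, STACK_CAP = 32, ARENA_SIZE = 4096 };

static alignas(128) unsigned char arena[ARENA_SIZE]; /* backing memory for the sketch */
static size_t arena_used;                            /* bump pointer into the arena   */
static void*  free_stack[CLASS_COUNT][STACK_CAP];    /* recycled blocks, kept OUTSIDE */
static int    free_count[CLASS_COUNT];               /* the blocks themselves         */

/* Map a request to a class (16, 32, 64 or 128 bytes), or -1 if it does not fit. */
static int size_class(size_t size, size_t align)
{
    size_t need = size > align ? size : align;
    size_t class_size = MIN_CLASS;
    for (int c = 0; c < CLASS_COUNT; ++c, class_size *= 2)
        if (need <= class_size)
            return c;
    return -1;
}

void* sized_alloc(size_t size, size_t align)
{
    int c = size_class(size, align);
    if (c < 0)
        return NULL;                      /* sketch: large requests are not handled */
    if (free_count[c] > 0)
        return free_stack[c][--free_count[c]];
    size_t class_size = (size_t)MIN_CLASS << c;
    /* Blocks are aligned to their class size, which is >= the requested alignment. */
    size_t offset = (arena_used + class_size - 1) & ~(class_size - 1);
    if (offset + class_size > (size_t)ARENA_SIZE)
        return NULL;
    arena_used = offset + class_size;
    return arena + offset;
}

void sized_free(void* ptr, size_t size, size_t align)
{
    /* The class comes from the arguments; ptr is stored but never dereferenced. */
    int c = size_class(size, align);
    if (ptr == NULL || c < 0)
        return;
    if (free_count[c] < STACK_CAP)
        free_stack[c][free_count[c]++] = ptr;
    /* else: the block is simply dropped in this sketch */
}

/* Usage: the caller already knows the size, so it hands it back on free. */
void sized_alloc_example(void)
{
    void* a = sized_alloc(24, 8);
    sized_free(a, 24, 8);
    void* b = sized_alloc(30, 8);         /* same 32-byte class, so the block is reused */
    assert(a != NULL && a == b);
    sized_free(b, 30, 8);
}
```

There is no per-block header here at all, so freeing a block that has fallen out of the cache only touches the allocator's own small tables.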
Floresy 
And while I don't have a better API/solution to offer, "void* malloc(size_t size, size_t* allocated_size)" still requires user code to keep that allocated size around, so it's really the same as requesting an allocated size and storing it.
When has anybody ever allocated memory and not stored the size somewhere? As it is, almost all C programs basically store the size of each allocation twice because of C and malloc: once in their own data structures and once in the allocator's bookkeeping.
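Here is the kind of thing I mean, as a minimal sketch: a growable buffer already carries its capacity, so handing it back on free costs the caller nothing. sized_alloc and sized_free are hypothetical names standing in for the proposed API. In this demo they just forward to the standard malloc and free so that it compiles, which means the size is still stored twice under the hood here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for the hypothetical size-aware API (assumption, not a real library).
   They ignore the extra arguments and forward to the standard allocator. */
static void* sized_alloc(size_t size, size_t align) { (void)align; return malloc(size); }
static void  sized_free(void* p, size_t size, size_t align) { (void)size; (void)align; free(p); }

typedef struct {
    char*  data;
    size_t length;     /* bytes in use */
    size_t capacity;   /* bytes allocated: the size the program tracks anyway */
} Buffer;

/* Append n bytes, growing geometrically. Returns 1 on success, 0 on allocation failure. */
int buffer_append(Buffer* b, const char* src, size_t n)
{
    if (n == 0)
        return 1;
    if (b->length + n > b->capacity) {
        size_t new_cap = b->capacity ? b->capacity * 2 : 16;
        while (new_cap < b->length + n)
            new_cap *= 2;
        char* grown = sized_alloc(new_cap, 1);
        if (grown == NULL)
            return 0;
        if (b->length > 0)
            memcpy(grown, b->data, b->length);
        sized_free(b->data, b->capacity, 1);   /* old size comes straight from the struct */
        b->data = grown;
        b->capacity = new_cap;
    }
    memcpy(b->data + b->length, src, n);
    b->length += n;
    return 1;
}

void buffer_release(Buffer* b)
{
    sized_free(b->data, b->capacity, 1);       /* no lookup needed, capacity is right here */
    b->data = NULL;
    b->length = 0;
    b->capacity = 0;
}
```

Every call to sized_free gets its size from a field the program needed for its own purposes, so an allocator with this interface would have no reason to keep a second copy.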
The only use case for not keeping the size around after allocation is allocating null-terminated strings, and that is not a good enough reason. First of all, almost nobody really uses null-terminated strings as is. We just enforce a null at the end of strings, but everybody still keeps at least a capacity value around so that inserting or removing chars doesn't incur an allocation every time. Second of all, you don't need malloc itself to keep the size information to be able to allocate and free null-terminated strings. Just use a different API for string allocations.
We already treat null-terminated strings differently from memory blocks: there is memchr and strchr, memcpy and strcpy, and so on.
Why not have stralloc and strfree for that single use case of not needing a size?
Here is even an implementation using the improved malloc:
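```c
#include <stdalign.h> // alignof
#include <string.h>   // memcpy

// malloc( size, alignment ) and free( ptr, size, alignment ) are the improved, size-aware versions.
char* stralloc( size_t size )
{
    // Reserve room for a size_t header in front of the string data.
    char* result = (char*)malloc( size + sizeof( size_t ), alignof( char ) );
    memcpy( result, &size, sizeof( size_t ) );
    return result + sizeof( size_t );
}

void strfree( char* str )
{
    // Read the header back; memcpy avoids an unaligned size_t load.
    size_t size;
    memcpy( &size, str - sizeof( size_t ), sizeof( size_t ) );
    // Free the whole block: header plus string data.
    free( str - sizeof( size_t ), size + sizeof( size_t ), alignof( char ) );
}
```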
There you go: malloc is fixed and no longer treats every single allocation as if it were allocating a string.
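If you want to try stralloc and strfree against today's standard library, something like the following should work as a sketch. sized_alloc and sized_free are hypothetical stand-ins for the improved malloc and free (they only forward to the real ones), last_freed_size exists purely so the result can be checked, and str_copy is a helper I made up for the usage example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static size_t last_freed_size;   /* recorded only so the sketch can be verified */

/* Stand-ins for the size-aware API (assumption): forward to the standard allocator. */
static void* sized_alloc(size_t size, size_t align) { (void)align; return malloc(size); }
static void  sized_free(void* p, size_t size, size_t align)
{
    (void)align;
    last_freed_size = size;
    free(p);
}

char* stralloc(size_t size)
{
    /* One block: a size_t header followed by the string bytes. */
    char* block = sized_alloc(size + sizeof(size_t), 1);
    if (block == NULL)
        return NULL;
    memcpy(block, &size, sizeof(size_t));
    return block + sizeof(size_t);
}

void strfree(char* str)
{
    if (str == NULL)
        return;
    size_t size;
    memcpy(&size, str - sizeof(size_t), sizeof(size_t));
    /* Hand back the full block size, header included. */
    sized_free(str - sizeof(size_t), size + sizeof(size_t), 1);
}

/* Usage: duplicate a C string into a stralloc'ed block (hypothetical helper). */
char* str_copy(const char* src)
{
    size_t n = strlen(src) + 1;   /* include the terminating null */
    char* dst = stralloc(n);
    if (dst != NULL)
        memcpy(dst, src, n);
    return dst;
}
```

Only the string path pays for a stored size here. Every other allocation goes through the size-aware calls and carries no header.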