Tuesday, January 21, 2014

Nginx Architecture Deep Dive: Array implementation

In the last article we discussed about Nginx Pools. This post will go in detail on Nginx Arrays.

Nginx provides a simple interface to create, destroy arrays with flexibility to increase the array size in run time.

Here is the data structure of ngx_array_t
typedef struct {    void        *elts;
    ngx_uint_t   nelts;
    size_t       size;
    ngx_uint_t   nalloc;
    ngx_pool_t  *pool;
} ngx_array_t;

elts - a pointer to linked list of elements
nelts - number of elements
size - memory size of the array
nalloc - capacity of array in number of elements
pool - pointer to the pool for managing arrays


ngx_array_create()
  Creates an array. It takes pool to use, number of elements in the array and size of the each element.
In summary, it takes an available memory block and initialize the start of available memory with ngx_array_t. Then, allocates required memory for the elements following it.

But, there are chances based on availability in selected memory block, that array data and elements could be in different memory blocks.


ngx_array_destroy()
  Destroys an array. It takes in the pointer to an array.
It reclaims the memory given to array elements provided the elements are the last memory utilization in that block. In other words, the memory block last address is equal to end of array elements address.

It reclaims the memory given to array data provided the memory block last address is same as end of array data.

If the above conditions does not match, memory consumed by arrays is not claimed back.



ngx_array_push()
 Pushes a new element into the array.
This function verifies if number of elements reached the capacity of the array. If not, returns a free element and increases number of elements by one.
Note that, there is no equivalent to delete element. That is to put back the element into free elements.

If there is no free element available, nginx allocates a new memory block with twice the size of array and copies over all the elements into the new memory block. Say if the number of elements in array is 5, if we wanted 6th element - nginx will create an array with 10 elements. Thus, nginx foresees the array requirements and avoid more malloc which is costly.

Going into advanced details, there is a bug in nginx which will result in lot of memory wasted in one corner case.
In ngx_arrary_push(), nginx checks if the array elements allocation is the last in the memory block and if there is still space available it will expand the array elements right in that pool. 
Say, that condition hits and we expanded the array till the end of the memory block. From now on, if any new element is pushed, nginx always fall down to else condition and go on consuming new memory even though the current memory block has enough space to accommodate the double the elements.

Say we need an array of 10 elements. Lets say the memory block used for this is consumed to the end. Now if user wants a new element, array size increases to 20 and content is copied to the new memory block.  
From now on, any new element request will result in doubling of the array size and that too copied to new location always. It goes like 40, 80, 160 etc., even though there is memory available in the memory block. 
Also, the old memory is not claimed back, and it too adds to wastage.

The reason is that pool data pointer is outdated once a new memory block is created in the pool. The check given below would always fail once it starts failing, the size of memory goes way beyond.

        if ((u_char *) a->elts + size == p->d.last
            && p->d.last + a->size <= p->d.end)
The right way is to store the relevant pool data in array and not just pool pointer. We cannot use pool->current as well because that points to next available memory block but not the memory block which has array elements.

ngx_array_push_n()
It is same as ngx_array_push() but creates 'n' elements instead of just one.

No comments:

Post a Comment