Tuesday, January 21, 2014

Nginx Architecture Deep Dive: Array implementation

In the last article we discussed about Nginx Pools. This post will go in detail on Nginx Arrays.

Nginx provides a simple interface to create, destroy arrays with flexibility to increase the array size in run time.

Here is the data structure of ngx_array_t
typedef struct {    void        *elts;
    ngx_uint_t   nelts;
    size_t       size;
    ngx_uint_t   nalloc;
    ngx_pool_t  *pool;
} ngx_array_t;

elts - a pointer to linked list of elements
nelts - number of elements
size - memory size of the array
nalloc - capacity of array in number of elements
pool - pointer to the pool for managing arrays


ngx_array_create()
  Creates an array. It takes pool to use, number of elements in the array and size of the each element.
In summary, it takes an available memory block and initialize the start of available memory with ngx_array_t. Then, allocates required memory for the elements following it.

But, there are chances based on availability in selected memory block, that array data and elements could be in different memory blocks.


ngx_array_destroy()
  Destroys an array. It takes in the pointer to an array.
It reclaims the memory given to array elements provided the elements are the last memory utilization in that block. In other words, the memory block last address is equal to end of array elements address.

It reclaims the memory given to array data provided the memory block last address is same as end of array data.

If the above conditions does not match, memory consumed by arrays is not claimed back.



ngx_array_push()
 Pushes a new element into the array.
This function verifies if number of elements reached the capacity of the array. If not, returns a free element and increases number of elements by one.
Note that, there is no equivalent to delete element. That is to put back the element into free elements.

If there is no free element available, nginx allocates a new memory block with twice the size of array and copies over all the elements into the new memory block. Say if the number of elements in array is 5, if we wanted 6th element - nginx will create an array with 10 elements. Thus, nginx foresees the array requirements and avoid more malloc which is costly.

Going into advanced details, there is a bug in nginx which will result in lot of memory wasted in one corner case.
In ngx_arrary_push(), nginx checks if the array elements allocation is the last in the memory block and if there is still space available it will expand the array elements right in that pool. 
Say, that condition hits and we expanded the array till the end of the memory block. From now on, if any new element is pushed, nginx always fall down to else condition and go on consuming new memory even though the current memory block has enough space to accommodate the double the elements.

Say we need an array of 10 elements. Lets say the memory block used for this is consumed to the end. Now if user wants a new element, array size increases to 20 and content is copied to the new memory block.  
From now on, any new element request will result in doubling of the array size and that too copied to new location always. It goes like 40, 80, 160 etc., even though there is memory available in the memory block. 
Also, the old memory is not claimed back, and it too adds to wastage.

The reason is that pool data pointer is outdated once a new memory block is created in the pool. The check given below would always fail once it starts failing, the size of memory goes way beyond.

        if ((u_char *) a->elts + size == p->d.last
            && p->d.last + a->size <= p->d.end)
The right way is to store the relevant pool data in array and not just pool pointer. We cannot use pool->current as well because that points to next available memory block but not the memory block which has array elements.

ngx_array_push_n()
It is same as ngx_array_push() but creates 'n' elements instead of just one.

Thursday, January 16, 2014

Nginx Architecture Deep Dive: Pool Implementation

This is one of the series of articles that I will write on nginx Architecture. I did a google search but could not find proper documentation on nginx. Although there are few but not as much in detail as I wanted.

This series of articles I am going to write on nginx will be complimentary along with those already existing. I will start with various utilities that are built in the nginx.

In this post, I will describe about nginx memory management utility called 'Pools'

From Usage perspective,  Pools are chunks/blocks of memory that are allocated and managed internally.  Pools try to use contiguous memory and if there is requirement of memory more need than available nginx allocates a new block and adds to the pool.

Nginx can allocate a max size of Page size for each block i.e 4095(4096-1) on x86. To meet the requirements of memory of size more than 4k, nginx uses 'malloc' and manages it.

Here is the pool structure

struct ngx_pool_s {
    ngx_pool_data_t       d;
    size_t                max;
    ngx_pool_t           *current;
    ngx_chain_t          *chain;
    ngx_pool_large_t     *large;
    ngx_pool_cleanup_t   *cleanup;
    ngx_log_t            *log;
};

 ngx_pool_data_s stores all the information related to pool.  It has the following pointers

'last' : Points to the available memory to use in the block
'end': Points to the dead end of that memory block
'next': Points to the next memory block's pool data structure
'failed' : Number of failures to allocate memory in this block. 
'max' is the total size of memory block
'current' points to the first available memory block for use in the pool.  This is always updated to the block which has memory left for use and not failed more than 4 times in allocating memory.
'large' points to the chain of memory blocks allocated by malloc
'cleanup' will hold a function pointer with data to pass to it. Applications can make use of it to set callbacks during pool destruction.
'log' points to logger utility structure which I will describe in coming posts.

Note: nginx allocates more  memory blocks on demand. But it does not provide any function to delete the memory blocks except for those 'large' memory blocks that are wrapped around 'malloc'

Users of the Pools can reset the memory blocks and free up the large buffers.  After a reset of pool, each memory block will have the 'last' pointed to the start of the memory block(i.e actual start + size of (ngx_pool_s))


Following are the pool functions that are used by other nginx modules or utilities:

ngx_palloc()  - To allocate memory of a given size from a given pool
ngx_pnalloc() - similar to above but allocates zeroed memory
ngx_create_pool() - Create a new memory Pool of a given size
ngx_destroy_pool() - Destroy the memory pool - frees up both memory blocks and large buffers
ngx_pool_reset() - Cleans up Large buffers and makes all memory blocks available for use.