Tuesday, January 21, 2014

Nginx Architecture Deep Dive: Array Implementation

In the last article we discussed Nginx pools. This post goes into detail on Nginx arrays.

Nginx provides a simple interface to create and destroy arrays, with the flexibility to grow an array at run time.

Here is the data structure of ngx_array_t
typedef struct {
    void        *elts;
    ngx_uint_t   nelts;
    size_t       size;
    ngx_uint_t   nalloc;
    ngx_pool_t  *pool;
} ngx_array_t;

elts - pointer to the contiguous block of elements
nelts - number of elements currently in use
size - size of a single element, in bytes
nalloc - capacity of the array, in number of elements
pool - pointer to the pool that backs the array's memory


ngx_array_create()
  Creates an array. It takes the pool to use, the initial number of elements and the size of each element.
In summary, it takes an available memory block in the pool, initializes an ngx_array_t at the start of the available memory, and then allocates the required memory for the elements immediately following it.

Note that, depending on how much space is left in the selected memory block, the array header (ngx_array_t) and its elements may end up in different memory blocks.
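
For reference, here is a minimal usage sketch (my own illustration, not code from the nginx source), assuming the standard nginx core headers:

#include <ngx_config.h>
#include <ngx_core.h>

/* Create an array of ngx_uint_t with room for 8 elements, carved out of an
 * existing pool.  The ngx_array_t header and the element storage are both
 * taken from 'pool'; they usually sit back to back, but can land in
 * different blocks when the current block is nearly full. */
static ngx_array_t *
make_array(ngx_pool_t *pool)
{
    ngx_array_t  *a;

    a = ngx_array_create(pool, 8, sizeof(ngx_uint_t));
    if (a == NULL) {
        return NULL;
    }

    return a;
}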


ngx_array_destroy()
  Destroys an array. It takes in the pointer to an array.
It reclaims the memory given to the array elements only if the elements are the last allocation in that block; in other words, the block's last-allocated address (d.last) equals the end address of the element storage.

Similarly, it reclaims the memory given to the array header (ngx_array_t) only if the block's last-allocated address equals the end address of the header.

If these conditions are not met, the memory consumed by the array is not reclaimed.
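
The logic is short; lightly paraphrased from ngx_array_destroy() in the nginx source (treat it as a sketch, the exact code may differ between versions):

void
ngx_array_destroy(ngx_array_t *a)
{
    ngx_pool_t  *p;

    p = a->pool;

    /* Elements are the last allocation in the block: give them back. */
    if ((u_char *) a->elts + a->size * a->nalloc == p->d.last) {
        p->d.last -= a->size * a->nalloc;
    }

    /* The header itself is now the last allocation: give it back too. */
    if ((u_char *) a + sizeof(ngx_array_t) == p->d.last) {
        p->d.last = (u_char *) a;
    }
}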



ngx_array_push()
 Pushes a new element into the array.
This function checks whether the number of elements has reached the capacity of the array. If not, it returns a pointer to a free element slot and increments the element count by one.
Note that there is no delete counterpart, i.e. no way to put an element slot back into the free slots.

If there is no free slot available, nginx allocates new storage twice the size of the current array and copies all the elements into it (unless it can grow the array in place within the current block, as described below). For example, if the array capacity is 5 and we push a 6th element, nginx grows the array to a capacity of 10. By growing geometrically, nginx anticipates future pushes and avoids frequent, costly allocations.
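
The growth logic, lightly paraphrased from ngx_array_push() in the nginx source of that time (a sketch for reference, not the exact code), looks like this:

void *
ngx_array_push(ngx_array_t *a)
{
    void        *elt, *new;
    size_t       size;
    ngx_pool_t  *p;

    if (a->nelts == a->nalloc) {

        /* the array is full */

        size = a->size * a->nalloc;
        p = a->pool;

        if ((u_char *) a->elts + size == p->d.last
            && p->d.last + a->size <= p->d.end)
        {
            /* the elements are the last allocation in this block and there
             * is room for one more: grow in place */
            p->d.last += a->size;
            a->nalloc++;

        } else {
            /* allocate new storage twice as big and copy the elements over */
            new = ngx_palloc(p, 2 * size);
            if (new == NULL) {
                return NULL;
            }

            ngx_memcpy(new, a->elts, size);
            a->elts = new;
            a->nalloc *= 2;
        }
    }

    elt = (u_char *) a->elts + a->size * a->nelts;
    a->nelts++;

    return elt;
}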

Going into more detail, there is a bug in nginx that results in a lot of wasted memory in one corner case.
In ngx_array_push(), nginx checks whether the array's element allocation is the last one in the memory block and whether there is still space available; if so, it expands the element storage in place within that block.
Say that condition hits and the array is expanded up to the end of the memory block. From then on, whenever a new element needs more capacity, nginx always falls through to the else branch and keeps consuming new memory, even though the memory block that now holds the elements may have enough space to grow them in place.

Say we need an array of 10 elements, and the memory block used for it is consumed to the end. Now, when the user pushes one more element, the array capacity increases to 20 and the contents are copied to a new memory block.
From then on, every time the array fills up again, its capacity doubles and the contents are copied to a new location every time: 40, 80, 160 and so on, even when there is memory available in the block that holds the elements.
Also, the old element storage is not reclaimed, which adds to the waste.

The reason is that the pool data pointer used in the check becomes stale once a new memory block is created in the pool: a->pool still points at the first block, while the elements now live in a later block. The check given below therefore keeps failing, and once it starts failing the amount of wasted memory keeps growing.

        if ((u_char *) a->elts + size == p->d.last
            && p->d.last + a->size <= p->d.end)
A better approach would be to store the relevant pool block (the one holding the elements) in the array, not just the pool pointer. We cannot use pool->current either, because it points to the next block available for allocation, not necessarily the block that holds the array elements.
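
To illustrate the suggestion (this is purely hypothetical code, not a patch that exists in nginx), the array could remember the block that currently holds its elements and use that block in the in-place-growth check:

/* Hypothetical sketch of the suggested fix -- not nginx code. */
typedef struct {
    void         *elts;
    ngx_uint_t    nelts;
    size_t        size;
    ngx_uint_t    nalloc;
    ngx_pool_t   *pool;    /* pool that backs the array               */
    ngx_pool_t   *block;   /* block that currently holds the elements */
} ngx_array_fixed_t;

/* In push, the check would then read:
 *
 *     if ((u_char *) a->elts + size == a->block->d.last
 *         && a->block->d.last + a->size <= a->block->d.end)
 *
 * and 'block' would be updated whenever the elements move to a new block.
 */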

ngx_array_push_n()
It is the same as ngx_array_push() but reserves 'n' elements instead of just one.

Thursday, January 16, 2014

Nginx Architecture Deep Dive: Pool Implementation

This is one of a series of articles I will write on nginx architecture. I did a Google search but could not find proper documentation on nginx internals; there are a few write-ups, but not as detailed as I wanted.

This series of articles is meant to complement those that already exist. I will start with the various utilities built into nginx.

In this post, I will describe the nginx memory management utility called 'pools'.

From a usage perspective, pools are chunks/blocks of memory that are allocated and managed internally. A pool tries to serve requests from contiguous memory, and when more memory is needed than is available, nginx allocates a new block and adds it to the pool.

The largest request nginx serves directly from a pool block is capped at roughly a page size, i.e. 4095 (4096 - 1) bytes on x86. To meet requests larger than that, nginx falls back to 'malloc' and tracks those buffers through the pool.
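
The per-block limit comes from a macro in src/core/ngx_palloc.h (reproduced approximately):

/* Allocations up to this size are served from the pool's blocks;
 * anything larger goes through the malloc-backed 'large' path. */
#define NGX_MAX_ALLOC_FROM_POOL  (ngx_pagesize - 1)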

Here is the pool structure

struct ngx_pool_s {
    ngx_pool_data_t       d;
    size_t                max;
    ngx_pool_t           *current;
    ngx_chain_t          *chain;
    ngx_pool_large_t     *large;
    ngx_pool_cleanup_t   *cleanup;
    ngx_log_t            *log;
};

 ngx_pool_data_t (the 'd' member above) stores the per-block bookkeeping information. Its fields are described below.
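
The definition, reproduced approximately from src/core/ngx_palloc.h:

typedef struct {
    u_char               *last;
    u_char               *end;
    ngx_pool_t           *next;
    ngx_uint_t            failed;
} ngx_pool_data_t;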

'last' : Points to the available memory to use in the block
'end': Points to the dead end of that memory block
'next': Points to the next memory block's pool data structure
'failed' : Number of failures to allocate memory in this block. 
'max' : The largest allocation size served directly from the pool's blocks; requests bigger than this go to the 'large' list
'current' : Points to the first memory block to try for allocations. It is advanced past blocks that have failed to satisfy allocations more than 4 times
'large' : Points to the chain of large buffers allocated with malloc
'cleanup' : Holds a list of callbacks, each with a data pointer, that applications can register to run when the pool is destroyed
'log' : Points to the logger utility structure, which I will describe in a coming post

Note: nginx allocates additional memory blocks on demand, but it does not provide any function to free individual blocks; only the 'large' buffers, which wrap 'malloc', can be freed individually.

Users of a pool can reset its memory blocks and free up the large buffers. After a pool reset, each memory block has its 'last' pointer moved back to the start of the block's usable area (i.e. the block's start address plus sizeof(ngx_pool_t)).
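
In other words, a reset behaves roughly like the following simplified sketch (my own paraphrase, not the exact nginx code):

#include <ngx_config.h>
#include <ngx_core.h>

/* Simplified sketch of a pool reset: free the malloc-backed large buffers,
 * then rewind every block's 'last' pointer to just past the pool header. */
static void
reset_pool_sketch(ngx_pool_t *pool)
{
    ngx_pool_t        *p;
    ngx_pool_large_t  *l;

    for (l = pool->large; l; l = l->next) {
        if (l->alloc) {
            ngx_free(l->alloc);
        }
    }

    for (p = pool; p; p = p->d.next) {
        p->d.last = (u_char *) p + sizeof(ngx_pool_t);
    }

    pool->large = NULL;
}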


The following pool functions are used by other nginx modules and utilities:

ngx_palloc()  - Allocates (aligned) memory of a given size from a given pool
ngx_pnalloc() - Same as above but the returned memory is not aligned; ngx_pcalloc() is the variant that returns zero-filled memory
ngx_create_pool() - Creates a new memory pool of a given size
ngx_destroy_pool() - Destroys the memory pool - frees both the memory blocks and the large buffers
ngx_reset_pool() - Frees the large buffers and makes all memory blocks available for reuse
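
A minimal usage sketch tying these together (my own illustration, assuming the standard nginx core headers and a valid ngx_log_t):

#include <ngx_config.h>
#include <ngx_core.h>

static ngx_int_t
pool_example(ngx_log_t *log)
{
    ngx_pool_t  *pool;
    u_char      *buf;

    /* A 1024-byte pool; requests larger than pool->max go through malloc. */
    pool = ngx_create_pool(1024, log);
    if (pool == NULL) {
        return NGX_ERROR;
    }

    /* Carved out of the pool's current block. */
    buf = ngx_palloc(pool, 256);
    if (buf == NULL) {
        ngx_destroy_pool(pool);
        return NGX_ERROR;
    }

    ngx_memzero(buf, 256);

    ngx_reset_pool(pool);      /* frees large buffers, rewinds the blocks */
    ngx_destroy_pool(pool);    /* frees the blocks and any large buffers  */

    return NGX_OK;
}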




Monday, December 9, 2013

Openstack and Virtual Private Cloud

A Virtual Private Cloud (VPC) is a virtual cloud carved out of a larger cloud. The resources and infrastructure are shared with other VPC users, but to each user they appear private.

As per Wikipedia, the resources are isolated among VPC users using mechanisms such as VLANs or sets of encrypted communication channels.

Is a Project equivalent to a VPC?


IMO, it is equivalent only in a limited way. A Project/Tenant owner can create network resources specific to the project, but there are limitations:
  • Cannot manage other users within the VPC
  • Cannot isolate user resources; they are shared across all other users of that Project
In addition to the above, asking the following questions points toward a solution:

Why would any enterprise want to share its private user list with a public cloud service provider?

Why would any cloud service provider want to handle minor administrative tasks for a VPC user?

It should be a one-time effort for a cloud service provider: create a VPC, allocate resources, create a VPC Admin, and hand off.

What is needed?

Keystone in Openstack needs some changes: a new level of admin user for the VPC must be created.

The public cloud is managed by a Service Provider (SP) Admin.
An SP Admin can create new VPCs, manage VPC Admins and allocate resources to each VPC.

A VPC Admin can create new projects within the VPC, manage the resources allocated to the VPC and manage the VPC's users.

Thus, a VPC would offer the full set of cloud features under the control of the VPC Admin, and the SP Admin would not need to intervene in the management of the VPC cloud.


Keystone needs to define the VPC Admin as:
  • Admin of all resources within the VPC
  • Admin who can manage users within the VPC
  • Admin who can manage (create/delete/update) projects in the VPC
All Openstack services must define a new set of policies for the VPC Admin.

Comments

I welcome comments and suggestions on defining a VPC model for Openstack.

Monday, December 2, 2013

Openstack RabbitMQ issues and solution

Symptoms

In our deployment, we observed that whenever there is a network interruption, any Openstack operation that goes through the message queues gets stuck for a long time. For example, VM creation gets stuck in the "Build" state.

Observations

From our observations, it took 12 to 18 minutes to recover from that state. We could see a lot of messages in the Unacknowledged and Ready states.
We also saw an unusually large number of consumers per queue, which implies many TCP connections from the consumers to the RabbitMQ broker; far more than makes sense.

We checked the TCP connections on the RabbitMQ server, and there were indeed several of them in the ESTABLISHED state, while on the consumer (say, a compute node) there was only one connection.

This means the problematic connections were closed on the consumer side but still lingered on the RabbitMQ server. A health check between the RabbitMQ client and server is not implemented in the Openstack code.

This state eventually recovers, but it takes a long time depending on the number of consumers (for example, a large number of nova-compute services).

Solution

We introduced a load balancer and placed the RabbitMQ servers behind a virtual service. The load balancer acts as a kind of proxy, maintaining state for each side of the connection. Whenever there is a problem on the client side, it closes the corresponding connection on the server side.
With this in place, network interruptions such as switch reboots did not have any effect on our Openstack deployment.
We configured the load balancer with an idle timeout of 90 seconds, since the periodic updates from compute happen every 60 seconds; that way we do not close our RabbitMQ connections unnecessarily.

Update:

There are other advantages to the load balancer. The Openstack RabbitMQ client does not distribute load well: it takes a list of RabbitMQ servers and blindly picks the first active one, without any awareness of the servers' actual load. With the LB in place, we can spread the consumers across all RabbitMQ servers.
From our observations, this indeed improved the overall performance of the Openstack deployment.

Thursday, September 19, 2013

My notes for Git

These are my random notes bookmarking git commands for future reference.

Push/Pull a local repository to a remote git location

Let's take an example. My local repo points to openstack/neutron.git. I made some changes locally and want to take a backup before I send the code for review. I just need to add a 'remote' location and push the code there.

git remote add githubbackup <url>

You can verify that the remote was added as expected by looking at the .git/config file.

Now push the code from master 

git push githubbackup master

If you want to pull the code from the actual repo:

git pull origin master

Similarly, to pull from githubbackup:

git pull githubbackup master



To generate patches for the topmost commits starting from a specific SHA1 hash:
git format-patch -<n> <SHA1>
To get the last 10 commits from HEAD in a single patch file:
git format-patch -10 HEAD --stdout > 0001-last-10-commits.patch

Sunday, August 25, 2013

Openstack devstack setup with VXLAN as overlay

The good news is that OVS now supports the VXLAN tunnel protocol. This equips open-source OpenFlow controllers to enter new markets where overlay networks are preferred. Here are the release notes for Open vSwitch release 1.10.

I wanted to build an Openstack setup with VXLAN and try it out myself, but there was no proper documentation put together. I hope this post helps folks who want to build a VXLAN overlay based Openstack setup.

Setup:

I have two Ubuntu servers and a Windows desktop. I used the Ubuntu servers as compute nodes, and two VMs in VirtualBox on Windows: one VM as the controller and the second VM as the network node.
Since VXLAN support in OVS is only in the master branch, I chose to use devstack to set up Openstack.

For simplicity, I will not mention the second compute node in the config section, as its settings are the same as the first. Each system has only one NIC.

Controller: 192.168.1.121
Compute Node: 192.168.1.112
Network Node: 192.168.1.123

localrc for Controller:

#SCHEDULER=nova.scheduler.simple.SimpleScheduler
SCHEDULER=nova.scheduler.filter_scheduler.FilterScheduler
LOGFILE=/opt/stack/data/stack.log
SCREEN_LOGDIR=/opt/stack/data/log
RECLONE=yes
#disable_service n-net, n-cpu
#enable_service q-svc, q-agt, q-l3, q-meta, q-dhcp, neutron
ENABLED_SERVICES=g-api,g-reg,key,n-api,n-crt,n-obj,n-cond,cinder,c-sch,c-api,c-vol,n-sch,n-novnc,n-xvnc,n-cauth,horizon,rabbit,mysql,neutron,q-svc
Q_SRV_EXTRA_OPTS=(tenant_network_type=vxlan)
Q_AGENT_EXTRA_AGENT_OPTS=(tunnel_types=vxlan vxlan_udp_port=8472)
ENABLE_TENANT_TUNNELS=True


localrc for compute

ENABLED_SERVICES=n-cpu,rabbit,neutron,q-agt
LOGFILE=/opt/stack/data/stack.log
SCREEN_LOGDIR=/opt/stack/data/log
RECLONE=yes
# Openstack services running on controller node
SERVICE_HOST=192.168.1.121 # replace this with the IP address of the controller node
MYSQL_HOST=$SERVICE_HOST
RABBIT_HOST=$SERVICE_HOST
Q_HOST=$SERVICE_HOST
GLANCE_HOSTPORT=$SERVICE_HOST:9292
Q_AGENT_EXTRA_AGENT_OPTS=(tunnel_types=vxlan vxlan_udp_port=8472)
Q_SRV_EXTRA_OPTS=(tenant_network_type=vxlan)
ENABLE_TENANT_TUNNELS=True 

localrc for network node

SERVICE_HOST=192.168.1.121
MYSQL_HOST=$SERVICE_HOST
RABBIT_HOST=$SERVICE_HOST
Q_HOST=$SERVICE_HOST
GLANCE_HOSTPORT=$SERVICE_HOST:9292
#SCHEDULER=nova.scheduler.simple.SimpleScheduler
SCHEDULER=nova.scheduler.filter_scheduler.FilterScheduler
LOGFILE=/opt/stack/data/stack.log
SCREEN_LOGDIR=/opt/stack/data/log
RECLONE=yes
ENABLED_SERVICES=q-agt,q-l3,q-dhcp,q-meta,rabbit
Q_SRV_EXTRA_OPTS=(tenant_network_type=vxlan)
Q_AGENT_EXTRA_AGENT_OPTS=(tunnel_types=vxlan vxlan_udp_port=8472)
ENABLE_TENANT_TUNNELS=True

Other Changes

As expected, this does not work out of the box. We have to install OVS manually, since the required version is not officially packaged by Ubuntu.

Download OVS 1.10 from here on the compute nodes and the network node.
Here are the installation instructions:

./configure --prefix=/usr --localstatedir=/var  --with-linux=/lib/modules/`uname -r`/build
make
make install
sudo rmmod openvswitch
sudo  insmod datapath/linux/openvswitch.ko
sudo mkdir -p /usr/etc/openvswitch
sudo pkill ovsdb-tool
sudo pkill ovsdb-server
sudo pkill ovs-vswitchd
sudo rm -rf /usr/etc/openvswitch/conf.db
sudo  ovsdb-tool create /usr/etc/openvswitch/conf.db vswitchd/vswitch.ovsschema
sudo ovsdb-server --remote=punix:/var/run/openvswitch/db.sock \
                     --remote=db:Open_vSwitch,manager_options \
                     --private-key=db:SSL,private_key \
                     --certificate=db:SSL,certificate \
                     --bootstrap-ca-cert=db:SSL,ca_cert \
                     --pidfile --detach

sudo ovs-vsctl --no-wait init
sudo  ovs-vswitchd --pidfile --detach
sudo ovs-vsctl add-br br-int


One more issue I faced was the VNC console. The devstack multi-node scripts seem to have a problem: they do not generate a proper config for the compute nodes. The listen address is set to 127.0.0.1 and a few variables are left unset. I changed nova.conf as below and restarted nova-compute.

novnc_enabled=True
novncproxy_base_url=http://192.168.1.121:6080/vnc_auto.html
xvpvncproxy_base_url=http://192.168.1.121:6081/console
novncproxy_port=6080
vncserver_proxyclient_address=192.168.1.112
vncserver_listen=0.0.0.0


I hope you find this post useful. Let me know in the comments if you need more information.


Sunday, November 18, 2012

Setting up Dev environment for Nodejs coffeescript development


This article describes my experience selecting a development environment for Nodejs and CoffeeScript.


First Step: Choosing an OS


Most JS developers use Windows. Since I come from a networking background, I preferred Linux, specifically Ubuntu. Downloading and installing Ubuntu is simple and fast.

Second Step: Choosing an Editor


I do not have a Java background. In my career so far, I have used vim, cscope and ctags. By default, vim does not highlight CoffeeScript syntax; when I opened a coffeescript file, it was displayed as plain text. Hence I began to search for a proper editor. I understand there are some very good IDEs like Eclipse, but I wanted to jump-start and not worry about learning a new IDE.

I found one good vim extension that suits my needs: vim-coffee-script. Here is a quick setup guide for installing it. If you use another editor, check this link to see whether it needs any extensions, and add your experience in the comments of this blog.

Here are the steps to install vim-coffee-script:

1) Download the latest zipball from vim.org or github. The latest version on github is under Download Packages (don’t use the Download buttons.)

2) Extract the archive into ~/.vim/:

unzip -od ~/.vim vim-coffee-script-HASH.zip

You can also compile coffee scripts from within vim using vim-coffee-script. Read more about it here.

Third Step: Installing node and npm

Ubuntu has good package support. To install node, coffeescript and npm, just run the following:

sudo apt-get install node coffeescript npm
That's it. The development environment for Node with CoffeeScript on Ubuntu is ready to use.