Skip to main content

PHP Associative Arrays and Hashtable


Our today’s topic is how associative arrays are implemented in PHP. Let’s  discuss over it….

Basic Arrays in C 

Let’s take an example of basic array in C where index or key is always an integer. It can’t be string. All the arrays start from index 0 and ends at index (n-1), where n is the number of elements in array. All the elements are stored in continuous memory locations. For example first element at memory location 0 and 10th  element on 9th location.  This is the basic array of C. 

What is Hashtable? 

A hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number). Thus, a hash table implements an associative array.  We will go into detail of this in next few minutes..

What is Hash Function?

The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought. There is no magic formula for the creation of the hash function. It can be any mathematical transformation that produces a relatively random and unique distribution of values within the address space of the storage.


 

PHP and Hashtable (an implementation of a map):

In PHP associative arrays can be implemented using hash-maps or b-trees. This has important implications on the performance characteristics of associative arrays and how they should be used; e.g. b-trees are slow to insert but handle collisions better than hash maps.  Hashmaps are faster if there are no collisions, but are slower to retrieve when there are collisions. In PHP associative arrays are actually implemented using hashtables with ordered map or you can say hashmaps.  Following are the actual functions of PHP written in C to implement hashtable. Check here.

struct _hashtable;
typedef struct bucket {
                ulong h;                                                                                  /* Used for numeric indexing */
                uint nKeyLength;
                void *pData;
                void *pDataPtr;
                struct bucket *pListNext;
                struct bucket *pListLast;
                struct bucket *pNext;
                struct bucket *pLast;
                const char *arKey;
} Bucket;

typedef struct _hashtable {
                uint nTableSize;
                uint nTableMask;
                uint nNumOfElements;
                ulong nNextFreeElement;
                Bucket *pInternalPointer;      /* Used for element traversal */
                Bucket *pListHead;
                Bucket *pListTail;
                Bucket **arBuckets;
                dtor_func_t pDestructor;
                zend_bool persistent;
                unsigned char nApplyCount;
                zend_bool bApplyProtection;
#if ZEND_DEBUG
                int inconsistent;
#endif
} HashTable;


In PHP, associative array key(string) is first converted into index using hash function and then that value is stored at that particular index location. Here is again a problem. Hash function returns an integer that is system dependent (e.g 32bit or 64 bit).

For example say, a system with 32 bit OS, against a key named ‘dirtyhandsphp’ hash function returns :
00000000010111111010101010111000

It’s decimal value is 6269624.

This is a huge value and can’t represent the location with hashtable size of 2048(say). So here PHP use one more trick. It throws some digits away by using table mask.
 
(ht)->nTableMask = (ht)->nTableSize - 1;

It’s one less than the size of hash table. Because if table size is 2048(210) then maximum index could be 2047 starting from 0.

PHP performs & operation between both of them
    
          00000000010111111010101010111000
   &    00000000000000000000001111111111
______________________________________________________
          00000000000000000000001010111000 = 676

So the element would be stored at 676th location.

Collision: Implementing hash table in this way produces some problems called collision. For example hash function can generate same integer value for two different strings or you can say for associative array keys.  Two locations may refer to same index. Such situation is called collision. In PHP to solve such type of problem, it uses linked list. So if same index is generated then it would stored at different location but would linked to the same index generated using hash function.

To read or access the element again same process will follow :


1) Hash is calculated
2) Use table mask
3) Locate the memory address and
4) Get the data or if required traverse the linked list to find the element we needed.

Thanks !!!!!!!!!!! Enjoy Programming :)


Comments

Popular posts from this blog

Odoo/OpenERP: one2one relational field example

one2one relational field is deprecated in OpenERP version>5 but you can achieve the same using many2one relational field. You can achieve it in following two ways : 1) using many2one field in both the objects ( http://tutorialopenerp.wordpress.com/2014/04/23/one2one/ ) 2)  using inheritance by deligation You can easily find the first solution with little search over internet so let's start with 2nd solution. Scenario :  I want to create a one2one relation between two objects of openerp hr.employee and hr.employee.medical.details What I should do  i. Add _inherits section in hr_employee class ii. Add field medical_detail_id in hr_employee class class hr_employee(osv.osv):     _name = 'hr.employee'     _inherits = {' hr.employee.medical.details ': "medical_detail_id"}     _inherit = 'hr.employee'         _columns = {              'emp_code':fields.char('Employee Code', si

Odoo: Download Binary File in v10

To download any binary file in Odoo10 following is the link: http://127.0.0.1:8069/web/content?model=<module_name>&field=<field_name>&filename_field=<field_filename>&id=<object_id> module_name    - the name of the model with the Binary field field_name         - the name of the Binary field object_id            - id of the record containing particular file. field_filename    - name of a Char field containing file's name (optional). So if you want to call a function on button click and download the file, code is as follow: file_url = "http://127.0.0.1:8069/web/content?model=<module_name>&field=<field_name>&filename_field=<field_filename>&id=<object_id>" return {     'type': 'ir.actions.act_url',     'url': file_url,     'target': 'new' } In Reports or web page, you can use it as: <t t-foreach="files&qu

Odoo/ OpenERP: Customize one2many relational field

I was working on one2many relational field to integration one functionality. Worked on python code and then wrote an xml file for the same. Restarted my server. It showed me the row with headers and 'Add an item' button. When clicked on that it opened a pop-up to provide inputs. It shows all the fields with in sequence of it's own. Q1. I need input fields in other sequence or don't need all fields. A.   In xml you would have wrote code like this : <field name="dirtyhandsphp" widget="one2many_list" colspan="4" nolabel="1"> <tree string="Details">              <field name="field1" />              <field name="field2" />                      ---------------------------------------------              <field name="designation" />         </tree>                                                     </field> Add form section inside and