Skip to main content

PHP Associative Arrays and Hashtable


Our today’s topic is how associative arrays are implemented in PHP. Let’s  discuss over it….

Basic Arrays in C 

Let’s take an example of basic array in C where index or key is always an integer. It can’t be string. All the arrays start from index 0 and ends at index (n-1), where n is the number of elements in array. All the elements are stored in continuous memory locations. For example first element at memory location 0 and 10th  element on 9th location.  This is the basic array of C. 

What is Hashtable? 

A hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys (e.g., a person's name), to their associated values (e.g., their telephone number). Thus, a hash table implements an associative array.  We will go into detail of this in next few minutes..

What is Hash Function?

The hash function is used to transform the key into the index (the hash) of an array element (the slot or bucket) where the corresponding value is to be sought. There is no magic formula for the creation of the hash function. It can be any mathematical transformation that produces a relatively random and unique distribution of values within the address space of the storage.


 

PHP and Hashtable (an implementation of a map):

In PHP associative arrays can be implemented using hash-maps or b-trees. This has important implications on the performance characteristics of associative arrays and how they should be used; e.g. b-trees are slow to insert but handle collisions better than hash maps.  Hashmaps are faster if there are no collisions, but are slower to retrieve when there are collisions. In PHP associative arrays are actually implemented using hashtables with ordered map or you can say hashmaps.  Following are the actual functions of PHP written in C to implement hashtable. Check here.

struct _hashtable;
typedef struct bucket {
                ulong h;                                                                                  /* Used for numeric indexing */
                uint nKeyLength;
                void *pData;
                void *pDataPtr;
                struct bucket *pListNext;
                struct bucket *pListLast;
                struct bucket *pNext;
                struct bucket *pLast;
                const char *arKey;
} Bucket;

typedef struct _hashtable {
                uint nTableSize;
                uint nTableMask;
                uint nNumOfElements;
                ulong nNextFreeElement;
                Bucket *pInternalPointer;      /* Used for element traversal */
                Bucket *pListHead;
                Bucket *pListTail;
                Bucket **arBuckets;
                dtor_func_t pDestructor;
                zend_bool persistent;
                unsigned char nApplyCount;
                zend_bool bApplyProtection;
#if ZEND_DEBUG
                int inconsistent;
#endif
} HashTable;


In PHP, associative array key(string) is first converted into index using hash function and then that value is stored at that particular index location. Here is again a problem. Hash function returns an integer that is system dependent (e.g 32bit or 64 bit).

For example say, a system with 32 bit OS, against a key named ‘dirtyhandsphp’ hash function returns :
00000000010111111010101010111000

It’s decimal value is 6269624.

This is a huge value and can’t represent the location with hashtable size of 2048(say). So here PHP use one more trick. It throws some digits away by using table mask.
 
(ht)->nTableMask = (ht)->nTableSize - 1;

It’s one less than the size of hash table. Because if table size is 2048(210) then maximum index could be 2047 starting from 0.

PHP performs & operation between both of them
    
          00000000010111111010101010111000
   &    00000000000000000000001111111111
______________________________________________________
          00000000000000000000001010111000 = 676

So the element would be stored at 676th location.

Collision: Implementing hash table in this way produces some problems called collision. For example hash function can generate same integer value for two different strings or you can say for associative array keys.  Two locations may refer to same index. Such situation is called collision. In PHP to solve such type of problem, it uses linked list. So if same index is generated then it would stored at different location but would linked to the same index generated using hash function.

To read or access the element again same process will follow :


1) Hash is calculated
2) Use table mask
3) Locate the memory address and
4) Get the data or if required traverse the linked list to find the element we needed.

Thanks !!!!!!!!!!! Enjoy Programming :)


Comments

Popular posts from this blog

Odoo/OpenERP: one2one relational field example

one2one relational field is deprecated in OpenERP version>5 but you can achieve the same using many2one relational field. You can achieve it in following two ways : 1) using many2one field in both the objects ( http://tutorialopenerp.wordpress.com/2014/04/23/one2one/ ) 2)  using inheritance by deligation You can easily find the first solution with little search over internet so let's start with 2nd solution. Scenario :  I want to create a one2one relation between two objects of openerp hr.employee and hr.employee.medical.details What I should do  i. Add _inherits section in hr_employee class ii. Add field medical_detail_id in hr_employee class class hr_employee(osv.osv):     _name = 'hr.employee'     _inherits = {' hr.employee.medical.details ': "medical_detail_id"}     _inherit = 'hr.employee'         _columns = {              'emp_code':fields.char('Employee Code', si

How to draw Dynamic Line or Timeseries Chart in Java using jfreechart library?

Today we are going to write a code to draw a dynamic timeseries-cum-line chart in java.   The only difference between simple and dynamic chart is that a dynamic event is used to create a new series and update the graph. In out example we are using timer which automatically calls a funtion after every 1/4 th second and graph is updated with random data. Let's try with the code : Note : I had tried my best to provide complete documentation along with code. If at any time anyone have any doubt or question please post in comments section. DynamicLineAndTimeSeriesChart.java import java.awt.BorderLayout; import java.awt.Color; import java.awt.event.ActionEvent; import java.awt.event.ActionListener; import javax.swing.Timer; import javax.swing.JPanel; import org.jfree.chart.ChartFactory; import org.jfree.chart.ChartPanel; import org.jfree.chart.JFreeChart; import org.jfree.chart.axis.ValueAxis; import org.jfree.chart.plot.XYPlot; import

pyodbc.OperationalError: ('08001', '[08001] [Microsoft][ODBC Driver 17 for SQL Server]

Recently, I faced this error in our Docker-Container environment. All the necessary packages were already installed but still, I was facing this clueless error. I search a bit and after an hour and so I found the exact reason and solution for this error. To know more about this error in detail. Please follow this Github thread. https://github.com/mkleehammer/pyodbc/issues/610 https://github.com/mkleehammer/pyodbc/issues/610#issuecomment-587523802 Solution: It's because the   server's certificate has too weak a key. In case you are using Linux env directly/not the Docker one.  Just edited /etc/ssl/openssl.cnf and change these 2 lines. MinProtocol = TLSv1.0 CipherString = DEFAULT@SECLEVEL=1 In case you are also using a container, please add these three lines to your Docker file. RUN chmod +rwx /etc/ssl/openssl.cnf RUN sed -i ' s/TLSv1.2/TLSv1/g ' /etc/ssl/openssl.cnf RUN sed -i ' s/SECLEVEL=2/SECLEVEL=1/g ' /etc/ssl/openssl.cnf Thanks!! Enjoy Programming! Refer