Saturday, February 24, 2007

Differences between CHAR and VARCHAR

Many people believe that VARCHAR is better because it stores only actual data, while CHAR is stored in full length. It is not true. In fact, both CHAR and VARCHAR are stored in memory buffer in their full, declared length; when the row is stored on disk, then RLE compression algorithm is used to compress whole row, i.e. CHARs, VARCHARs, INTEGERs, DATEs, etc. all together. So if you want to save space, CHARs are slightly better than VARCHARs (the differenece is that VARCHAR stores string length in two bytes).
There is also a bug that causes that VARCHAR does not properly clean string tail if you assign shorter string, thus causing worse compression. (this problem is fixed in Firebird-0.9.4)

Many people also believe that VARCHAR sends over network only actual data, while CHAR is sent in full length. It is not true either. Communication between client and server is done via messages of fixed length. For this reason both CHAR and VARCHAR are sent in their full declared length. (this problem is fixed in IB-6.5)

So decision whether use CHAR or VARCHAR should be based solely on application's requirements. E.g. store fixed length codes in CHAR, store names in VARCHAR (to allow correct concatenating).

No comments: