JPEG Baseline Compression Page-4

Huffman Coding in JPEG

Fortunately, the JPEG standard provides a number of tables of Huffman codes to describe the different ways that the input quantised data is to be encrypted. The quantized DCT coefficients tend to have two different styles of values within them. The first is the DC coefficient or the first value in the zig-zag ordered array. This value is always the average value and as such is likely to be any number within the range permitted. The second is all of the AC values (the second through to last coefficient in the zig-zag ordered array). The values of the coefficients are likely to be close to zero, with the likelihood of being zero increasing as the coefficients near the end of the array.These two types of information are different enough to warrant separating them and applying different methods of coding to get the optimal compression efficiency.

Additionally, coding in JPEG also distinguishes between whether the data is Luminance or Chrominance information. To do this it allows for different code tables to be specified for each of the components, much the same as how the quantisation tables are defined.This means that a total of four Huffman code tables are generally defined in colour images, two for DC information and 2 for AC information. Again, if the image was grey scale only two tables would be required (1 DC and 1 AC).

Coding the DC Coefficients

While the DC coefficient of the input array is fairly unpredictable by itself, the property of real world images to not have a great deal of change in a localised area can be used to predict the current DC coefficient value. By using a differential prediction model (DPCM) we can increase the probability that the value we encode will be small. To obtain this value we simply subtract the DC coefficient from the previously processed 8x8 block in the same component from the DC coefficient of the current block. This value is called the "DPCM difference" or the "DIFF".

Once this DIFF value is calculated, it is compared to the following table to identify the symbol group that it belongs to (SG in the following diagram). The Huffman code that this symbol is described by is defined in the code tables that the JPEG standard provides (these tables are provided as an attachment to this report).

jpeg-baseline-compression-5

Huffman Coding of DPCM Differences

However this does not completely describe the DIFF coefficient as the SG code is only for a range of absolute values, except of course if the DIFF is zero. The full code is created when a number of additional bits are appended to the SG code stream to identify the sign and fully specify the magnitude of the DIFF. These additional bits are defined in the following diagram.

Additional bits for Huffman DPCM coding [1]

As can be seen the code to be appended is found by taking the SG low order bits of the DIFF if the DIFF is positive. If the DIFF is negative then one is subtracted from the DIFF (the twos complement representation of the negative DIFF) and taking the SG low order bits is the code to be appended.

Note that the JPEG baseline standard only allows for 12 bit DPCM values (the DCT output was 12 bit, following from an 8 bit input sample), so a value of SG of 11 is the maximum supported. In the special case of the first 8x8 pixel section of the component being processed, the DIFF is assigned the value of the current DC coefficient value (ie the previous section DC coefficient is zero).

Note that the Huffman code tables for SG can be different depending on which component is currently being processed. If the JPEG standard libraries are being used then there is one each for luminance and chrominance components. The additional bits that get appended to the SG codes are always the same, irrespective of the component being processed.

Coding the AC Coefficients

The AC coefficients are coded using a substantially different procedure then the DC coefficient version. Because the value of the AC coefficients tends to be close to zero after the quantisation step of the JPEG algorithm, the coding of these coefficients manipulates long runs of zeroes.

The selection of a symbol is based on a combination of the number of zeroes (NZ) that preceded the current coefficient and the range of the current coefficient. The range is calculated by the same method as it was in the DC coefficient case, selecting the SG symbol from the following figure.

Huffman Coding of AC Coefficients [1]

The "RUN-SIZE" symbol is then selected by cross referencing the NZ and SG values in the following figure, once found the code for this symbol can be found by searching through the tables that the JPEG standard provides.

		SG
		0	1	2	3	4	5	6	7	8	9	10
	0	EOB	01	02	03	04	05	06	07	08	09	0A
	1	N/A	11	12	13	14	15	16	17	18	19	1A
	2	N/A	21	22	23	24	25	26	27	28	29	2A
	3	N/A	31	32	33	34	35	36	37	38	39	3A
	4	N/A	41	42	43	44	45	46	47	48	49	4A
	5	N/A	51	52	53	54	55	56	57	58	59	5A
N	6	N/A	61	62	63	64	65	66	67	68	69	6A
Z	7	N/A	71	72	73	74	75	76	77	78	79	7A
	8	N/A	81	82	83	84	85	86	87	88	89	8A
	9	N/A	91	92	93	94	95	96	97	98	99	9A
	10	N/A	A1	A2	A3	A4	A5	A6	A7	A8	A9	AA
	11	N/A	B1	B2	B3	B4	B5	B6	B7	B8	B9	BA
	12	N/A	C1	C2	C3	C4	C5	C6	C7	C8	C9	CA
	13	N/A	D1	D2	D3	D4	D5	D6	D7	D8	D9	DA
	14	N/A	E1	E2	E3	E4	E5	E6	E7	E8	E9	EA
	15	ZRL	F1	F2	F3	F4	F5	F6	F7	F8	F9	FA

Enroll now!

Theoretical Paper

Practical Paper

Industrial Training

JPEG Baseline Compression Page-4

Huffman Coding of DPCM Differences

SG

N

Z

		SG
		0	1	2	3	4	5	6	7	8	9	10
	0	EOB	01	02	03	04	05	06	07	08	09	0A
	1	N/A	11	12	13	14	15	16	17	18	19	1A
	2	N/A	21	22	23	24	25	26	27	28	29	2A
	3	N/A	31	32	33	34	35	36	37	38	39	3A
	4	N/A	41	42	43	44	45	46	47	48	49	4A
	5	N/A	51	52	53	54	55	56	57	58	59	5A
N	6	N/A	61	62	63	64	65	66	67	68	69	6A
Z	7	N/A	71	72	73	74	75	76	77	78	79	7A
	8	N/A	81	82	83	84	85	86	87	88	89	8A
	9	N/A	91	92	93	94	95	96	97	98	99	9A
	10	N/A	A1	A2	A3	A4	A5	A6	A7	A8	A9	AA
	11	N/A	B1	B2	B3	B4	B5	B6	B7	B8	B9	BA
	12	N/A	C1	C2	C3	C4	C5	C6	C7	C8	C9	CA
	13	N/A	D1	D2	D3	D4	D5	D6	D7	D8	D9	DA
	14	N/A	E1	E2	E3	E4	E5	E6	E7	E8	E9	EA
	15	ZRL	F1	F2	F3	F4	F5	F6	F7	F8	F9	FA

		SG
		0	1	2	3	4	5	6	7	8	9	10
	0	EOB	01	02	03	04	05	06	07	08	09	0A
	1	N/A	11	12	13	14	15	16	17	18	19	1A
	2	N/A	21	22	23	24	25	26	27	28	29	2A
	3	N/A	31	32	33	34	35	36	37	38	39	3A
	4	N/A	41	42	43	44	45	46	47	48	49	4A
	5	N/A	51	52	53	54	55	56	57	58	59	5A
N	6	N/A	61	62	63	64	65	66	67	68	69	6A
Z	7	N/A	71	72	73	74	75	76	77	78	79	7A
	8	N/A	81	82	83	84	85	86	87	88	89	8A
	9	N/A	91	92	93	94	95	96	97	98	99	9A
	10	N/A	A1	A2	A3	A4	A5	A6	A7	A8	A9	AA
	11	N/A	B1	B2	B3	B4	B5	B6	B7	B8	B9	BA
	12	N/A	C1	C2	C3	C4	C5	C6	C7	C8	C9	CA
	13	N/A	D1	D2	D3	D4	D5	D6	D7	D8	D9	DA
	14	N/A	E1	E2	E3	E4	E5	E6	E7	E8	E9	EA
	15	ZRL	F1	F2	F3	F4	F5	F6	F7	F8	F9	FA

		SG
		0	1	2	3	4	5	6	7	8	9	10
	0	EOB	01	02	03	04	05	06	07	08	09	0A
	1	N/A	11	12	13	14	15	16	17	18	19	1A
	2	N/A	21	22	23	24	25	26	27	28	29	2A
	3	N/A	31	32	33	34	35	36	37	38	39	3A
	4	N/A	41	42	43	44	45	46	47	48	49	4A
	5	N/A	51	52	53	54	55	56	57	58	59	5A
N	6	N/A	61	62	63	64	65	66	67	68	69	6A
Z	7	N/A	71	72	73	74	75	76	77	78	79	7A
	8	N/A	81	82	83	84	85	86	87	88	89	8A
	9	N/A	91	92	93	94	95	96	97	98	99	9A
	10	N/A	A1	A2	A3	A4	A5	A6	A7	A8	A9	AA
	11	N/A	B1	B2	B3	B4	B5	B6	B7	B8	B9	BA
	12	N/A	C1	C2	C3	C4	C5	C6	C7	C8	C9	CA
	13	N/A	D1	D2	D3	D4	D5	D6	D7	D8	D9	DA
	14	N/A	E1	E2	E3	E4	E5	E6	E7	E8	E9	EA
	15	ZRL	F1	F2	F3	F4	F5	F6	F7	F8	F9	FA