Property | Type | Description | |
---|---|---|---|
forward_DCT | forward_DCT_ptr[] |
Method | Description | |
---|---|---|
jpeg_forward_dct ( jpeg_compress_struct cinfo ) | ||
start_pass ( ) : void |
Initialize for a processing pass. Verify that all referenced quantization tables (Q-tables) are present, and set up the divisor table for each one. In the current implementation, the DCT of all components is done during the first pass, even if only some components will be output in the first scan. Hence all components should be examined here.
|
Method | Description | |
---|---|---|
FAST_INTEGER_MULTIPLY ( int var, int c ) : int |
Multiply a DCTELEM variable by an int constant, and immediately descale to yield a DCTELEM result.
|
|
SLOW_INTEGER_FIX ( double x ) : int | ||
forwardDCTFloatImpl ( jpeg_component_info compptr, byte sample_data, JBLOCK coef_blocks, int start_row, int start_col, int num_blocks ) : void | ||
forwardDCTImpl ( jpeg_component_info compptr, byte sample_data, JBLOCK coef_blocks, int start_row, int start_col, int num_blocks ) : void | ||
jpeg_fdct_10x10 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_10x5 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_11x11 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_12x12 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_12x6 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_13x13 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_14x14 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_14x7 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_15x15 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_16x16 ( int data1, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_16x8 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_1x1 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_1x2 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_2x1 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_2x2 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_2x4 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_3x3 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_3x6 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_4x2 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_4x4 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_4x8 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_5x10 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_5x5 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_6x12 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_6x3 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_6x6 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_7x14 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_7x7 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_8x16 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_8x4 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_9x9 ( int data, byte sample_data, int start_row, int start_col ) : void | ||
jpeg_fdct_float ( float data, byte sample_data, int start_row, int start_col ) : void |
Perform the forward DCT on one block of samples. NOTE: this code only copes with 8x8 DCTs. A floating-point implementation of the forward DCT (Discrete Cosine Transform). This implementation should be more accurate than either of the integer DCT implementations. However, it may not give the same results on all machines because of differences in roundoff behavior. Speed will depend on the hardware's floating-point capability. A 2-D DCT can be done by 1-D DCT on each row followed by 1-D DCT on each column. Direct algorithms are also available, but they are much more complex and seem not to be any faster when reduced to code. This implementation is based on Arai, Agui, and Nakajima's algorithm for scaled DCT. Their original paper (Trans. IEICE E-71(11):1095) is in Japanese, but the algorithm is described in the Pennebaker & Mitchell JPEG textbook (see REFERENCES section in file README). The code is based directly on figure 4-8 in P&M. While an 8-point DCT cannot be done in fewer than 11 multiplies, it is possible to arrange the computation so that many of the multiplies are simple scalings of the final outputs. These multiplies can then be folded into the multiplications or divisions by the JPEG quantization table entries. The AA&N method leaves only 5 multiplies and 29 adds to be done in the DCT itself. The primary disadvantage of this method is that with a fixed-point implementation, accuracy is lost due to imprecise representation of the scaled quantization values. However, that problem does not arise if we use floating-point arithmetic.
|
|
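The row-column decomposition mentioned above can be checked numerically. The following is a minimal sketch in Python, not the library's code: it uses a naive orthonormal 1-D DCT-II instead of the AA&N algorithm, and the names `dct_1d`, `dct_2d_row_column`, `dct_2d_direct` and `sample_block` are hypothetical helpers introduced only for this illustration.

```python
import math

N = 8  # block size; the documented routine handles 8x8 blocks only


def dct_1d(v):
    """Naive orthonormal 1-D DCT-II of a length-N sequence (O(N^2), for clarity only)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(scale * sum(v[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                               for n in range(N)))
    return out


def dct_2d_row_column(block):
    """Apply the 1-D DCT to every row, then to every column of the result."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d([rows[r][c] for r in range(N)]) for c in range(N)]
    # cols[c][u] holds coefficient (u, c); transpose back to row-major order.
    return [[cols[c][u] for c in range(N)] for u in range(N)]


def dct_2d_direct(block):
    """Direct evaluation of the 2-D DCT-II definition, used only as a reference."""
    def a(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return [[a(u) * a(v) * sum(block[y][x]
                               * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                               * math.cos((2 * x + 1) * v * math.pi / (2 * N))
                               for y in range(N) for x in range(N))
             for v in range(N)] for u in range(N)]


def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two NxN blocks."""
    return max(abs(a[i][j] - b[i][j]) for i in range(N) for j in range(N))


# Deterministic sample block, level-shifted so values are centred on zero.
sample_block = [[((r * 8 + c) * 37) % 256 - 128 for c in range(N)] for r in range(N)]

if __name__ == "__main__":
    diff = max_abs_diff(dct_2d_row_column(sample_block), dct_2d_direct(sample_block))
    print("largest difference between the two methods:", diff)
```

The row-column version needs 16 one-dimensional transforms per block, which is why the cost of the 1-D routine dominates the design discussion above.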
jpeg_fdct_ifast ( int data, byte sample_data, int start_row, int start_col ) : void |
Perform the forward DCT on one block of samples. NOTE: this code only copes with 8x8 DCTs. A fast but less accurate integer implementation of the forward DCT (Discrete Cosine Transform). A 2-D DCT can be done by 1-D DCT on each row followed by 1-D DCT on each column. Direct algorithms are also available, but they are much more complex and seem not to be any faster when reduced to code. This implementation is based on Arai, Agui, and Nakajima's algorithm for scaled DCT. Their original paper (Trans. IEICE E-71(11):1095) is in Japanese, but the algorithm is described in the Pennebaker & Mitchell JPEG textbook (see REFERENCES section in file README). The code is based directly on figure 4-8 in P&M. While an 8-point DCT cannot be done in fewer than 11 multiplies, it is possible to arrange the computation so that many of the multiplies are simple scalings of the final outputs. These multiplies can then be folded into the multiplications or divisions by the JPEG quantization table entries. The AA&N method leaves only 5 multiplies and 29 adds to be done in the DCT itself. The primary disadvantage of this method is that with fixed-point math, accuracy is lost due to imprecise representation of the scaled quantization values. The smaller the quantization table entry, the less precise the scaled value, so this implementation does worse with high-quality-setting files than with low-quality ones. Scaling decisions are generally the same as in the LL&M algorithm; see jpeg_fdct_islow for more details. However, we choose to descale (right shift) multiplication products as soon as they are formed, rather than carrying additional fractional bits into subsequent additions. This compromises accuracy slightly, but it lets us save a few shifts. More importantly, 16-bit arithmetic is then adequate (for 8-bit samples) everywhere except in the multiplications proper; this saves a good deal of work on 16-bit-int machines.
Again to save a few shifts, the intermediate results between pass 1 and pass 2 are not upscaled, but are represented only to integral precision. A final compromise is to represent the multiplicative constants to only 8 fractional bits, rather than 13. This saves some shifting work on some machines, and may also reduce the cost of multiplication (since there are fewer one-bits in the constants).
|
|
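To make the trade-off between 8 and 13 fractional bits concrete, here is a small Python sketch. It is an illustration under stated assumptions rather than the library's code: `fix` and `multiply_descale` are hypothetical helpers, the constant 0.382683433 is chosen only as an example, and the descale is assumed to be a plain right shift without rounding.

```python
def fix(x, bits):
    """Convert a fractional constant to an integer scaled by 2**bits."""
    return int(round(x * (1 << bits)))


def multiply_descale(var, const, bits):
    """Multiply by a scaled constant and descale at once.

    Assumption: the descale is a plain right shift (no rounding term).
    """
    return (var * const) >> bits


C = 0.382683433  # example constant, chosen only for illustration

c8 = fix(C, 8)     # constant held with 8 fractional bits
c13 = fix(C, 13)   # constant held with 13 fractional bits

# Error of the integer representation relative to the real constant.
err8 = abs(c8 / 256.0 - C)
err13 = abs(c13 / 8192.0 - C)

# Number of one-bits in each constant; fewer one-bits can mean a cheaper
# shift-and-add multiplication on some machines.
ones8 = bin(c8).count("1")
ones13 = bin(c13).count("1")

if __name__ == "__main__":
    print("8-bit constant :", c8, "error", err8, "one-bits", ones8)
    print("13-bit constant:", c13, "error", err13, "one-bits", ones13)
    print("1000 * C with 8-bit constant:", multiply_descale(1000, c8, 8))
```

With this constant the 8-bit form has fewer one-bits but a larger representation error, which matches the compromise described above.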
jpeg_fdct_islow ( int data, byte sample_data, int start_row, int start_col ) : void |
Perform the forward DCT on one block of samples. NOTE: this code only copes with 8x8 DCTs. A slow-but-accurate integer implementation of the forward DCT (Discrete Cosine Transform). A 2-D DCT can be done by 1-D DCT on each row followed by 1-D DCT on each column. Direct algorithms are also available, but they are much more complex and seem not to be any faster when reduced to code. This implementation is based on an algorithm described in C. Loeffler, A. Ligtenberg and G. Moschytz, "Practical Fast 1-D DCT Algorithms with 11 Multiplications", Proc. Int'l. Conf. on Acoustics, Speech, and Signal Processing 1989 (ICASSP '89), pp. 988-991. The primary algorithm described there uses 11 multiplies and 29 adds. We use their alternate method with 12 multiplies and 32 adds. The advantage of this method is that no data path contains more than one multiplication; this allows a very simple and accurate implementation in scaled fixed-point arithmetic, with a minimal number of shifts. The scaling works as follows: Each 1-D DCT step produces outputs which are a factor of sqrt(N) larger than the true DCT outputs. The final outputs are therefore a factor of N larger than desired; since N=8 this can be cured by a simple right shift at the end of the algorithm. The advantage of this arrangement is that we save two multiplications per 1-D DCT, because the y0 and y4 outputs need not be divided by sqrt(N). In the IJG code, this factor of 8 is removed by the quantization step, NOT here. We have to do addition and subtraction of the integer inputs, which is no problem, and multiplication by fractional constants, which is awkward in integer arithmetic. We multiply all the constants by CONST_SCALE and convert them to integer constants (thus retaining SLOW_INTEGER_CONST_BITS bits of precision in the constants). After doing a multiplication we have to divide the product by CONST_SCALE, with proper rounding, to produce the correct output.
This division can be done cheaply as a right shift of SLOW_INTEGER_CONST_BITS bits. We postpone shifting as long as possible so that partial sums can be added together with full fractional precision. The outputs of the first pass are scaled up by SLOW_INTEGER_PASS1_BITS bits so that they are represented to better-than-integral precision. These outputs require BITS_IN_JSAMPLE + SLOW_INTEGER_PASS1_BITS + 3 bits; this fits in a 16-bit word with the recommended scaling. (For 12-bit sample data, the intermediate array is int anyway.) To avoid overflow of the 32-bit intermediate results in pass 2, we must have BITS_IN_JSAMPLE + SLOW_INTEGER_CONST_BITS + SLOW_INTEGER_PASS1_BITS <= 26. Error analysis shows that the values used in the code are the most effective.
|
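The bit budget and the "scale, multiply, then descale with rounding" step described for jpeg_fdct_islow can be sketched as follows. This is a hedged Python illustration, not the library's implementation: the values 13 and 2 for SLOW_INTEGER_CONST_BITS and SLOW_INTEGER_PASS1_BITS are assumptions taken from the IJG reference code for 8-bit samples, and `fix`, `descale` and `weighted_sum` are hypothetical helper names.

```python
BITS_IN_JSAMPLE = 8            # assumption: 8-bit samples
SLOW_INTEGER_CONST_BITS = 13   # assumption: value from the IJG reference code
SLOW_INTEGER_PASS1_BITS = 2    # assumption: IJG value for 8-bit samples
CONST_SCALE = 1 << SLOW_INTEGER_CONST_BITS


def fix(x):
    """Convert a fractional constant to an integer scaled by CONST_SCALE."""
    return int(round(x * CONST_SCALE))


def descale(x, n):
    """Divide by 2**n with rounding, done as add-half-then-right-shift."""
    return (x + (1 << (n - 1))) >> n


def weighted_sum(a, b, ca, cb):
    """Compute a*ca + b*cb in fixed point.

    Both products keep their full fractional precision and are added
    first; a single descale is applied to the sum (shifting postponed).
    """
    return descale(a * fix(ca) + b * fix(cb), SLOW_INTEGER_CONST_BITS)


# Bit budgets quoted in the description above.
pass1_output_bits = BITS_IN_JSAMPLE + SLOW_INTEGER_PASS1_BITS + 3
pass2_budget = BITS_IN_JSAMPLE + SLOW_INTEGER_CONST_BITS + SLOW_INTEGER_PASS1_BITS

if __name__ == "__main__":
    print("pass 1 outputs need", pass1_output_bits, "bits (limit 16)")
    print("pass 2 budget is", pass2_budget, "bits (limit 26)")
    # Real-valued result is about 25.80, so the rounded fixed-point answer is 26.
    print(weighted_sum(100, -37, 0.541196100, 0.765366865))
```

Under these assumed values both constraints hold with room to spare: 13 bits against the 16-bit word, and 23 against the limit of 26.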
public jpeg_forward_dct ( jpeg_compress_struct cinfo ) | ||
cinfo | jpeg_compress_struct |