C# Class BitMiracle.LibJpeg.Classic.Internal.jpeg_inverse_dct

An inverse DCT routine is given a pointer to the input JBLOCK and a pointer to an output sample array. The routine must dequantize the input data as well as perform the IDCT; for dequantization, it uses the multiplier table pointed to by componentInfo.dct_table. The output data is to be placed into the sample array starting at a specified column. (Any row offset needed will be applied to the array pointer before it is passed to the IDCT code.) Note that the number of samples emitted by the IDCT routine is DCT_scaled_size * DCT_scaled_size.

Each IDCT routine has its own ideas about the best dct_table element type.

The decompressor input side saves away the appropriate quantization table for each component at the start of the first scan involving that component. (This is necessary in order to correctly decode files that reuse Q-table slots.) When we are ready to make an output pass, the saved Q-table is converted to a multiplier table that will actually be used by the IDCT routine. The multiplier table contents are IDCT-method-dependent. To support application changes in IDCT method between scans, we can remake the multiplier tables if necessary.

In buffered-image mode, the first output pass may occur before any data has been seen for some components, and thus before their Q-tables have been saved away. To handle this case, multiplier tables are preset to zeroes; the result of the IDCT will be a neutral gray level.
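
The zero-multiplier case described above can be illustrated with the simplest situation, a DC-only reconstruction in the style of a 1x1 routine. This is a minimal sketch, not the library's code: the class and method names are hypothetical, 8-bit samples with a center value of 128 are assumed, and a rounding right shift by 3 stands in for the division by 8.

```csharp
using System;

// Hypothetical helper, for illustration only (not part of BitMiracle.LibJpeg).
public static class DcOnlyIdctSketch
{
    // Assumption: 8-bit samples, so the level shift is 128 and the range is 0..255.
    public const int CenterSample = 128;
    public const int MaxSample = 255;

    // Dequantize the DC coefficient, divide by 8 with rounding,
    // re-center around mid-gray, and clamp to the legal sample range.
    public static int ReconstructDcSample(short dcCoefficient, int multiplier)
    {
        int dequantized = dcCoefficient * multiplier;
        int scaled = (dequantized + 4) >> 3; // arithmetic shift on int in C#
        int sample = scaled + CenterSample;
        return Math.Max(0, Math.Min(MaxSample, sample));
    }
}
```

With a multiplier of zero the coefficient value no longer matters: every call returns 128, which is the neutral gray level the description refers to.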


Public Methods

Method	Description
inverse ( int component_index, short coef_block, ComponentBuffer output_buf, int output_row, int output_col ) : void
jpeg_inverse_dct ( jpeg_decompress_struct cinfo )
start_pass ( ) : void	Prepare for an output pass. Here we select the proper IDCT routine for each component and build a matching multiplier table.

Private Methods

Method	Description
FAST_INTEGER_DEQUANTIZE ( short coef, int quantval ) : int	Dequantize a coefficient by multiplying it by the multiplier-table entry; produce a DCTELEM result. For 8-bit data a 16x16->16 multiplication will do. For 12-bit data, the multiplier table is declared int, so a 32-bit multiply will be used.
FAST_INTEGER_IDESCALE ( int x, int n ) : int
FAST_INTEGER_IRIGHT_SHIFT ( int x, int shft ) : int	Like DESCALE, but applies to a DCTELEM and produces an int. The original C comment assumed that int right shift is unsigned if INT32 right shift is; in this port both types are int.
FAST_INTEGER_MULTIPLY ( int var, int c ) : int	Multiply a DCTELEM variable by an int constant, and immediately descale to yield a DCTELEM result.
FLOAT_DEQUANTIZE ( short coef, float quantval ) : float	Dequantize a coefficient by multiplying it by the multiplier-table entry; produce a float result.
REDUCED_DEQUANTIZE ( short coef, int quantval ) : int	Dequantize a coefficient by multiplying it by the multiplier-table entry; produce an int result. In this module, both inputs and result are 16 bits or less, so either int or short multiply will work.
SLOW_INTEGER_DEQUANTIZE ( int coef, int quantval ) : int	Dequantize a coefficient by multiplying it by the multiplier-table entry; produce an int result. In this module, both inputs and result are 16 bits or less, so either int or short multiply will work.
SLOW_INTEGER_FIX ( double x ) : int
jpeg_idct_10x10 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_10x5 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_11x11 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_12x12 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_12x6 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_13x13 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_14x14 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_14x7 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_15x15 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_16x16 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_16x8 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_1x1 ( int component_index, short coef_block, int output_row, int output_col ) : void	Perform dequantization and inverse DCT on one block of coefficients, producing a reduced-size 1x1 output block.
jpeg_idct_1x2 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_2x1 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_2x2 ( int component_index, short coef_block, int output_row, int output_col ) : void	Perform dequantization and inverse DCT on one block of coefficients, producing a reduced-size 2x2 output block.
jpeg_idct_2x4 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_3x3 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_3x6 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_4x2 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_4x4 ( int component_index, short coef_block, int output_row, int output_col ) : void	Inverse-DCT routines that produce reduced-size output: either 4x4, 2x2, or 1x1 pixels from an 8x8 DCT block. NOTE: this code only copes with 8x8 DCTs. The implementation is based on the Loeffler, Ligtenberg and Moschytz (LL&M) algorithm. We simply replace each 8-to-8 1-D IDCT step with an 8-to-4 step that produces the four averages of two adjacent outputs (or an 8-to-2 step producing two averages of four outputs, for 2x2 output). These steps were derived by computing the corresponding values at the end of the normal LL&M code, then simplifying as much as possible. 1x1 is trivial: just take the DC coefficient divided by 8. Perform dequantization and inverse DCT on one block of coefficients, producing a reduced-size 4x4 output block.
jpeg_idct_4x8 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_5x10 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_5x5 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_6x12 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_6x3 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_6x6 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_7x14 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_7x7 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_8x16 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_8x4 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_9x9 ( int component_index, short coef_block, int output_row, int output_col ) : void
jpeg_idct_float ( int component_index, short coef_block, int output_row, int output_col ) : void	Perform dequantization and inverse DCT on one block of coefficients. NOTE: this code only copes with 8x8 DCTs. A floating-point implementation of the inverse DCT (Discrete Cosine Transform). In the IJG code, this routine must also perform dequantization of the input coefficients. This implementation should be more accurate than either of the integer IDCT implementations. However, it may not give the same results on all machines because of differences in roundoff behavior. Speed will depend on the hardware's floating point capacity. A 2-D IDCT can be done by 1-D IDCT on each column followed by 1-D IDCT on each row (or vice versa, but it's more convenient to emit a row at a time). Direct algorithms are also available, but they are much more complex and seem not to be any faster when reduced to code. This implementation is based on Arai, Agui, and Nakajima's algorithm for scaled DCT. Their original paper (Trans. IEICE E-71(11):1095) is in Japanese, but the algorithm is described in the Pennebaker & Mitchell JPEG textbook (see REFERENCES section in file README). The following code is based directly on figure 4-8 in P&M. While an 8-point DCT cannot be done in less than 11 multiplies, it is possible to arrange the computation so that many of the multiplies are simple scalings of the final outputs. These multiplies can then be folded into the multiplications or divisions by the JPEG quantization table entries. The AA&N method leaves only 5 multiplies and 29 adds to be done in the DCT itself. The primary disadvantage of this method is that with a fixed-point implementation, accuracy is lost due to imprecise representation of the scaled quantization values. However, that problem does not arise if we use floating point arithmetic.
jpeg_idct_ifast ( int component_index, short coef_block, int output_row, int output_col ) : void	Perform dequantization and inverse DCT on one block of coefficients. NOTE: this code only copes with 8x8 DCTs. A fast, not so accurate integer implementation of the inverse DCT (Discrete Cosine Transform). In the IJG code, this routine must also perform dequantization of the input coefficients. A 2-D IDCT can be done by 1-D IDCT on each column followed by 1-D IDCT on each row (or vice versa, but it's more convenient to emit a row at a time). Direct algorithms are also available, but they are much more complex and seem not to be any faster when reduced to code. This implementation is based on Arai, Agui, and Nakajima's algorithm for scaled DCT. Their original paper (Trans. IEICE E-71(11):1095) is in Japanese, but the algorithm is described in the Pennebaker & Mitchell JPEG textbook (see REFERENCES section in file README). The following code is based directly on figure 4-8 in P&M. While an 8-point DCT cannot be done in less than 11 multiplies, it is possible to arrange the computation so that many of the multiplies are simple scalings of the final outputs. These multiplies can then be folded into the multiplications or divisions by the JPEG quantization table entries. The AA&N method leaves only 5 multiplies and 29 adds to be done in the DCT itself. The primary disadvantage of this method is that with fixed-point math, accuracy is lost due to imprecise representation of the scaled quantization values. The smaller the quantization table entry, the less precise the scaled value, so this implementation does worse with high-quality-setting files than with low-quality ones. Scaling decisions are generally the same as in the LL&M algorithm; however, we choose to descale (right shift) multiplication products as soon as they are formed, rather than carrying additional fractional bits into subsequent additions. This compromises accuracy slightly, but it lets us save a few shifts. More importantly, 16-bit arithmetic is then adequate (for 8-bit samples) everywhere except in the multiplications proper; this saves a good deal of work on 16-bit-int machines. The dequantized coefficients are not integers because the AA&N scaling factors have been incorporated. We represent them scaled up by FAST_INTEGER_PASS1_BITS, so that the first and second IDCT rounds have the same input scaling. For 8-bit JSAMPLEs, we choose IFAST_SCALE_BITS = FAST_INTEGER_PASS1_BITS so as to avoid a descaling shift; this compromises accuracy rather drastically for small quantization table entries, but it saves a lot of shifts. For 12-bit JSAMPLEs, there's no hope of using 16x16 multiplies anyway, so we use a much larger scaling factor to preserve accuracy. A final compromise is to represent the multiplicative constants to only 8 fractional bits, rather than 13. This saves some shifting work on some machines, and may also reduce the cost of multiplication (since there are fewer one-bits in the constants).
jpeg_idct_islow ( int component_index, short coef_block, int output_row, int output_col ) : void	Perform dequantization and inverse DCT on one block of coefficients. NOTE: this code only copes with 8x8 DCTs. A slow-but-accurate integer implementation of the inverse DCT (Discrete Cosine Transform). In the IJG code, this routine must also perform dequantization of the input coefficients. A 2-D IDCT can be done by 1-D IDCT on each column followed by 1-D IDCT on each row (or vice versa, but it's more convenient to emit a row at a time). Direct algorithms are also available, but they are much more complex and seem not to be any faster when reduced to code. This implementation is based on an algorithm described in C. Loeffler, A. Ligtenberg and G. Moschytz, "Practical Fast 1-D DCT Algorithms with 11 Multiplications", Proc. Int'l. Conf. on Acoustics, Speech, and Signal Processing 1989 (ICASSP '89), pp. 988-991. The primary algorithm described there uses 11 multiplies and 29 adds. We use their alternate method with 12 multiplies and 32 adds. The advantage of this method is that no data path contains more than one multiplication; this allows a very simple and accurate implementation in scaled fixed-point arithmetic, with a minimal number of shifts. The scaling works as follows: Each 1-D IDCT step produces outputs which are a factor of sqrt(N) larger than the true IDCT outputs. The final outputs are therefore a factor of N larger than desired; since N=8 this can be cured by a simple right shift at the end of the algorithm. The advantage of this arrangement is that we save two multiplications per 1-D IDCT, because the y0 and y4 inputs need not be divided by sqrt(N). We have to do addition and subtraction of the integer inputs, which is no problem, and multiplication by fractional constants, which is a problem to do in integer arithmetic. We multiply all the constants by CONST_SCALE and convert them to integer constants (thus retaining SLOW_INTEGER_CONST_BITS bits of precision in the constants). After doing a multiplication we have to divide the product by CONST_SCALE, with proper rounding, to produce the correct output. This division can be done cheaply as a right shift of SLOW_INTEGER_CONST_BITS bits. We postpone shifting as long as possible so that partial sums can be added together with full fractional precision. The outputs of the first pass are scaled up by SLOW_INTEGER_PASS1_BITS bits so that they are represented to better-than-integral precision. These outputs require BITS_IN_JSAMPLE + SLOW_INTEGER_PASS1_BITS + 3 bits; this fits in a 16-bit word with the recommended scaling. (To scale up 12-bit sample data further, an intermediate int array would be needed.) To avoid overflow of the 32-bit intermediate results in pass 2, we must have BITS_IN_JSAMPLE + SLOW_INTEGER_CONST_BITS + SLOW_INTEGER_PASS1_BITS <= 26. Error analysis shows that the values given below are the most effective.
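
The fixed-point scheme described for the integer routines (scale a fractional constant to an integer, multiply, then divide by the scale with a rounding right shift) can be sketched as below. The names are hypothetical and the bodies are an assumption about what helpers such as SLOW_INTEGER_FIX, FAST_INTEGER_IDESCALE and FAST_INTEGER_MULTIPLY do, inferred from their descriptions rather than copied from the library. The 13-bit and 8-bit constant precisions come from the text; the pass-1 value of 2 bits used in the usage note is an assumed default.

```csharp
// Hypothetical helpers, for illustration only (not the library's private methods).
public static class FixedPointSketch
{
    // Convert a non-negative fractional constant to an integer scaled by 2^constBits,
    // rounding to nearest.
    public static int Fix(double x, int constBits)
    {
        return (int)(x * (1 << constBits) + 0.5);
    }

    // Divide by 2^n with rounding. C# uses an arithmetic right shift for int.
    public static int Descale(int x, int n)
    {
        return (x + (1 << (n - 1))) >> n;
    }

    // Multiply by a pre-scaled constant and descale immediately,
    // which is the "fast" strategy; the "slow" routine postpones the shift.
    public static int MultiplyAndDescale(int value, int scaledConstant, int constBits)
    {
        return Descale(value * scaledConstant, constBits);
    }

    // The overflow condition quoted for pass 2 of the slow integer IDCT.
    public static bool Pass2FitsIn32Bits(int bitsInSample, int constBits, int pass1Bits)
    {
        return bitsInSample + constBits + pass1Bits <= 26;
    }
}
```

For example, Fix(0.541196100, 13) yields 4433 while Fix(0.541196100, 8) yields 139, which shows how much coarser the 8-bit constants are. Pass2FitsIn32Bits(8, 13, 2) holds because 8 + 13 + 2 = 23.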

Method Details

inverse() public method

public inverse ( int component_index, short coef_block, ComponentBuffer output_buf, int output_row, int output_col ) : void
component_index	int
coef_block	short
output_buf	ComponentBuffer
output_row	int
output_col	int
return	void
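
The routines that inverse() dispatches to rely on the row-column property noted in the method descriptions: a 2-D IDCT can be computed as a 1-D IDCT on each column followed by a 1-D IDCT on each row. The following is a naive floating-point reference sketch of that property only. It is not any of the library's routines (those use the LL&M or AA&N factorizations and also dequantize), the names are hypothetical, and it assumes an 8x8 block stored in row-major order with orthonormal scaling.

```csharp
using System;

// Hypothetical reference implementation, for illustration only.
public static class SeparableIdctSketch
{
    public const int N = 8;

    // In-place 1-D inverse DCT over N values starting at 'start', spaced by 'stride'.
    private static void Idct1D(double[] data, int start, int stride)
    {
        double[] result = new double[N];
        for (int n = 0; n < N; n++)
        {
            double sum = 0.0;
            for (int k = 0; k < N; k++)
            {
                double scale = (k == 0) ? Math.Sqrt(1.0 / N) : Math.Sqrt(2.0 / N);
                sum += scale * data[start + k * stride]
                     * Math.Cos((2 * n + 1) * k * Math.PI / (2 * N));
            }
            result[n] = sum;
        }
        for (int n = 0; n < N; n++)
        {
            data[start + n * stride] = result[n];
        }
    }

    // 2-D IDCT of a 64-entry row-major block: columns first, then rows.
    public static double[] Idct2D(double[] coefficients)
    {
        if (coefficients == null || coefficients.Length != N * N)
        {
            throw new ArgumentException("Expected 64 coefficients.", nameof(coefficients));
        }
        double[] block = (double[])coefficients.Clone();
        for (int col = 0; col < N; col++)
        {
            Idct1D(block, col, N);
        }
        for (int row = 0; row < N; row++)
        {
            Idct1D(block, row * N, 1);
        }
        return block;
    }
}
```

A block whose only nonzero entry is a DC coefficient of 8 comes out as a flat block of 1.0, consistent with the remark that the 1x1 case is just the DC coefficient divided by 8.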

jpeg_inverse_dct() public method

public jpeg_inverse_dct ( jpeg_decompress_struct cinfo )
cinfo	jpeg_decompress_struct

start_pass() public method

Prepare for an output pass. Here we select the proper IDCT routine for each component and build a matching multiplier table.

public start_pass ( ) : void
return	void
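
As a rough illustration of what "build a matching multiplier table" can mean for the floating-point method, the sketch below folds the AA&N output scalings into the quantization values, as the jpeg_idct_float description suggests. It is an assumption-laden sketch, not the body of start_pass: the names are hypothetical, the quantization table is assumed to hold 64 entries in natural (row-major) order, the scale factors follow the usual AA&N convention (1 for index 0, cos(k*PI/16) * sqrt(2) otherwise), and any further constant factor that a particular library version folds in is omitted.

```csharp
using System;

// Hypothetical helper, for illustration only (not the library's start_pass).
public static class MultiplierTableSketch
{
    public const int N = 8;

    // AA&N scale factor for row or column index k.
    public static double AanScaleFactor(int k)
    {
        return k == 0 ? 1.0 : Math.Cos(k * Math.PI / 16.0) * Math.Sqrt(2.0);
    }

    // Combine each quantization value with the row and column scale factors.
    public static float[] BuildFloatMultipliers(int[] quantTable)
    {
        if (quantTable == null || quantTable.Length != N * N)
        {
            throw new ArgumentException("Expected 64 quantization values.", nameof(quantTable));
        }
        float[] multipliers = new float[N * N];
        for (int row = 0; row < N; row++)
        {
            for (int col = 0; col < N; col++)
            {
                int i = row * N + col;
                multipliers[i] = (float)(quantTable[i] * AanScaleFactor(row) * AanScaleFactor(col));
            }
        }
        return multipliers;
    }
}
```

An all-zero quantization table produces an all-zero multiplier table, which corresponds to the preset state the class description mentions for components whose Q-tables have not been seen yet.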