C# Class BitMiracle.LibJpeg.Classic.Internal.jpeg_forward_dct

Forward DCT (also controls coefficient quantization).

A forward DCT routine is given a pointer to an input sample array and a pointer to a work area of type DCTELEM[]; the DCT is to be performed in-place in that buffer. Type DCTELEM is int for 8-bit samples, INT32 for 12-bit samples. (NOTE: Floating-point DCT implementations use an array of type FAST_FLOAT instead.)

The input data is to be fetched from the sample array starting at a specified column. (Any row offset needed will be applied to the array pointer before it is passed to the FDCT code.) Note that the number of samples fetched by the FDCT routine is DCT_h_scaled_size * DCT_v_scaled_size.

The DCT outputs are returned scaled up by a factor of 8; they therefore have a range of +-8K for 8-bit data, +-128K for 12-bit data. This convention improves accuracy in integer implementations and saves some work in floating-point ones.

Each IDCT routine has its own ideas about the best dct_table element type.
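The scale-by-8 output convention can be checked against a direct evaluation of the 2-D DCT definition. The following is a minimal sketch, not the library's code: ReferenceFdct and Forward8x8Scaled are hypothetical names, and the input is assumed to be 8-bit samples already level-shifted to the range -128..127.

```csharp
using System;

// Hypothetical reference implementation, not part of LibJpeg.NET.
public static class ReferenceFdct
{
    // Direct (slow) 8x8 forward DCT whose outputs are multiplied by 8,
    // matching the output convention described above.
    // Input: 64 level-shifted samples in row-major order.
    public static double[] Forward8x8Scaled(double[] block)
    {
        const int N = 8;
        var result = new double[N * N];
        for (int v = 0; v < N; v++)
        {
            for (int u = 0; u < N; u++)
            {
                double sum = 0.0;
                for (int y = 0; y < N; y++)
                    for (int x = 0; x < N; x++)
                        sum += block[y * N + x]
                             * Math.Cos((2 * x + 1) * u * Math.PI / 16.0)
                             * Math.Cos((2 * y + 1) * v * Math.PI / 16.0);

                double cu = (u == 0) ? 1.0 / Math.Sqrt(2.0) : 1.0;
                double cv = (v == 0) ? 1.0 / Math.Sqrt(2.0) : 1.0;
                // The unscaled DCT uses a factor of 1/4; multiplying by 8 gives 2.
                result[v * N + u] = 2.0 * cu * cv * sum;
            }
        }
        return result;
    }
}
```

With this scaling the DC output is simply the sum of the 64 samples. A block in which every level-shifted sample is 127 therefore yields a DC value of 8128, which is where the roughly +-8K range for 8-bit data comes from.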

Public Properties

Property Type Description
forward_DCT forward_DCT_ptr[]

Public Methods

Method Description
jpeg_forward_dct ( jpeg_compress_struct cinfo )
start_pass ( ) : void

Initialize for a processing pass. Verify that all referenced Q-tables are present, and set up the divisor table for each one. In the current implementation, DCT of all components is done during the first pass, even if only some components will be output in the first scan. Hence all components should be examined here.
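Because the DCT outputs arrive scaled up by 8, one plausible way to build a divisor table is to multiply each quantization value by 8, so that a single division both quantizes a coefficient and removes the scale factor. The sketch below assumes that scheme. DivisorSketch, BuildDivisors and Quantize are hypothetical names, and the tables the library actually builds may differ by DCT method.

```csharp
// Hypothetical illustration; these names are not LibJpeg.NET APIs.
public static class DivisorSketch
{
    // Fold the DCT's factor of 8 into each quantization step.
    public static int[] BuildDivisors(int[] quantTable)
    {
        var divisors = new int[quantTable.Length];
        for (int i = 0; i < quantTable.Length; i++)
            divisors[i] = quantTable[i] << 3; // q * 8
        return divisors;
    }

    // Divide with rounding to nearest, treating negative values symmetrically
    // (C# integer division truncates toward zero).
    public static int Quantize(int coef, int divisor)
    {
        if (coef < 0)
            return -((-coef + (divisor >> 1)) / divisor);
        return (coef + (divisor >> 1)) / divisor;
    }
}
```

For a quantization value of 16 the divisor is 128, so a scaled coefficient of 8128 quantizes to 64 and one of -8128 quantizes to -64.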

Private Methods

Method Description
FAST_INTEGER_MULTIPLY ( int var, int c ) : int

Multiply a DCTELEM variable by an int constant, and immediately descale to yield a DCTELEM result.
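A multiply-and-descale step of this kind can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the constants are assumed to carry 8 fractional bits (as the jpeg_fdct_ifast description states), the descale is assumed to be a plain right shift without rounding, and FastMultiplySketch, Fix and Multiply are hypothetical names.

```csharp
// Hypothetical illustration; these names are not LibJpeg.NET APIs.
public static class FastMultiplySketch
{
    // Assumed number of fractional bits in the fixed-point constants.
    public const int ConstBits = 8;

    // Convert a positive fractional constant to fixed point, rounding to nearest.
    public static int Fix(double x)
    {
        return (int)(x * (1 << ConstBits) + 0.5);
    }

    // Multiply by a fixed-point constant and descale immediately.
    // The shift is arithmetic, so negative products round toward minus infinity.
    public static int Multiply(int value, int fixedConstant)
    {
        return (value * fixedConstant) >> ConstBits;
    }
}
```

For example, Fix(0.707106781) gives 181, and Multiply(100, 181) gives 70, against an exact product of about 70.71. The lost fraction is the accuracy cost of descaling immediately.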

SLOW_INTEGER_FIX ( double x ) : int
forwardDCTFloatImpl ( jpeg_component_info compptr, byte sample_data, JBLOCK coef_blocks, int start_row, int start_col, int num_blocks ) : void
forwardDCTImpl ( jpeg_component_info compptr, byte sample_data, JBLOCK coef_blocks, int start_row, int start_col, int num_blocks ) : void
jpeg_fdct_10x10 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_10x5 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_11x11 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_12x12 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_12x6 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_13x13 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_14x14 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_14x7 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_15x15 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_16x16 ( int data1, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_16x8 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_1x1 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_1x2 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_2x1 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_2x2 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_2x4 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_3x3 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_3x6 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_4x2 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_4x4 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_4x8 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_5x10 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_5x5 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_6x12 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_6x3 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_6x6 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_7x14 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_7x7 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_8x16 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_8x4 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_9x9 ( int data, byte sample_data, int start_row, int start_col ) : void
jpeg_fdct_float ( float data, byte sample_data, int start_row, int start_col ) : void

Perform the forward DCT on one block of samples. NOTE: this code only copes with 8x8 DCTs.

A floating-point implementation of the forward DCT (Discrete Cosine Transform). This implementation should be more accurate than either of the integer DCT implementations. However, it may not give the same results on all machines because of differences in roundoff behavior. Speed will depend on the hardware's floating-point capacity.

A 2-D DCT can be done by 1-D DCT on each row followed by 1-D DCT on each column. Direct algorithms are also available, but they are much more complex and seem not to be any faster when reduced to code.

This implementation is based on Arai, Agui, and Nakajima's algorithm for scaled DCT. Their original paper (Trans. IEICE E-71(11):1095) is in Japanese, but the algorithm is described in the Pennebaker & Mitchell JPEG textbook (see REFERENCES section in file README). The following code is based directly on figure 4-8 in P&M.

While an 8-point DCT cannot be done in fewer than 11 multiplies, it is possible to arrange the computation so that many of the multiplies are simple scalings of the final outputs. These multiplies can then be folded into the multiplications or divisions by the JPEG quantization table entries. The AA&N method leaves only 5 multiplies and 29 adds to be done in the DCT itself.

The primary disadvantage of this method is that with a fixed-point implementation, accuracy is lost due to imprecise representation of the scaled quantization values. However, that problem does not arise if we use floating-point arithmetic.
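Folding the output scalings into the quantization step can be sketched as below. This is an assumption-laden illustration rather than the library's code: it assumes each output at (row, col) is the true coefficient times 8 times the product of two AA&N scale factors, with the factor for index k taken as 1 for k = 0 and cos(k*PI/16)*sqrt(2) otherwise. AanDivisorSketch, AanScale and BuildFloatDivisors are hypothetical names.

```csharp
using System;

// Hypothetical illustration; these names are not LibJpeg.NET APIs.
public static class AanDivisorSketch
{
    // Assumed AA&N output scale factor for index k (0..7).
    public static double AanScale(int k)
    {
        return (k == 0) ? 1.0 : Math.Cos(k * Math.PI / 16.0) * Math.Sqrt(2.0);
    }

    // Build reciprocal divisors so that quantization, the AA&N scalings and
    // the factor of 8 all collapse into one multiply per coefficient.
    // quantTable holds 64 entries in natural (row-major) order.
    public static double[] BuildFloatDivisors(int[] quantTable)
    {
        var divisors = new double[64];
        for (int row = 0; row < 8; row++)
        {
            for (int col = 0; col < 8; col++)
            {
                int i = row * 8 + col;
                divisors[i] = 1.0 /
                    (quantTable[i] * AanScale(row) * AanScale(col) * 8.0);
            }
        }
        return divisors;
    }
}
```

A quantized value is then obtained by multiplying the DCT output by the matching divisor entry and rounding, so no per-coefficient scaling remains inside the transform itself.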

jpeg_fdct_ifast ( int data, byte sample_data, int start_row, int start_col ) : void

Perform the forward DCT on one block of samples. NOTE: this code only copes with 8x8 DCTs.

This file contains a fast, not so accurate integer implementation of the forward DCT (Discrete Cosine Transform).

A 2-D DCT can be done by 1-D DCT on each row followed by 1-D DCT on each column. Direct algorithms are also available, but they are much more complex and seem not to be any faster when reduced to code.

This implementation is based on Arai, Agui, and Nakajima's algorithm for scaled DCT. Their original paper (Trans. IEICE E-71(11):1095) is in Japanese, but the algorithm is described in the Pennebaker & Mitchell JPEG textbook (see REFERENCES section in file README). The following code is based directly on figure 4-8 in P&M.

While an 8-point DCT cannot be done in fewer than 11 multiplies, it is possible to arrange the computation so that many of the multiplies are simple scalings of the final outputs. These multiplies can then be folded into the multiplications or divisions by the JPEG quantization table entries. The AA&N method leaves only 5 multiplies and 29 adds to be done in the DCT itself.

The primary disadvantage of this method is that with fixed-point math, accuracy is lost due to imprecise representation of the scaled quantization values. The smaller the quantization table entry, the less precise the scaled value, so this implementation does worse with high-quality-setting files than with low-quality ones.

Scaling decisions are generally the same as in the LL&M algorithm; see jpeg_fdct_islow for more details. However, we choose to descale (right shift) multiplication products as soon as they are formed, rather than carrying additional fractional bits into subsequent additions. This compromises accuracy slightly, but it lets us save a few shifts. More importantly, 16-bit arithmetic is then adequate (for 8-bit samples) everywhere except in the multiplications proper; this saves a good deal of work on 16-bit-int machines.

Again to save a few shifts, the intermediate results between pass 1 and pass 2 are not upscaled, but are represented only to integral precision.

A final compromise is to represent the multiplicative constants to only 8 fractional bits, rather than 13. This saves some shifting work on some machines, and may also reduce the cost of multiplication (since there are fewer one-bits in the constants).

jpeg_fdct_islow ( int data, byte sample_data, int start_row, int start_col ) : void

Perform the forward DCT on one block of samples. NOTE: this code only copes with 8x8 DCTs.

A slow-but-accurate integer implementation of the forward DCT (Discrete Cosine Transform).

A 2-D DCT can be done by 1-D DCT on each row followed by 1-D DCT on each column. Direct algorithms are also available, but they are much more complex and seem not to be any faster when reduced to code.

This implementation is based on an algorithm described in C. Loeffler, A. Ligtenberg and G. Moschytz, "Practical Fast 1-D DCT Algorithms with 11 Multiplications", Proc. Int'l. Conf. on Acoustics, Speech, and Signal Processing 1989 (ICASSP '89), pp. 988-991. The primary algorithm described there uses 11 multiplies and 29 adds. We use their alternate method with 12 multiplies and 32 adds. The advantage of this method is that no data path contains more than one multiplication; this allows a very simple and accurate implementation in scaled fixed-point arithmetic, with a minimal number of shifts.

The scaling works as follows. Each 1-D DCT step produces outputs which are a factor of sqrt(N) larger than the true DCT outputs. The final outputs are therefore a factor of N larger than desired; since N=8 this can be cured by a simple right shift at the end of the algorithm. The advantage of this arrangement is that we save two multiplications per 1-D DCT, because the y0 and y4 outputs need not be divided by sqrt(N). In the IJG code, this factor of 8 is removed by the quantization step, NOT here.

We have to do addition and subtraction of the integer inputs, which is no problem, and multiplication by fractional constants, which is a problem to do in integer arithmetic. We multiply all the constants by CONST_SCALE and convert them to integer constants (thus retaining SLOW_INTEGER_CONST_BITS bits of precision in the constants). After doing a multiplication we have to divide the product by CONST_SCALE, with proper rounding, to produce the correct output. This division can be done cheaply as a right shift of SLOW_INTEGER_CONST_BITS bits. We postpone shifting as long as possible so that partial sums can be added together with full fractional precision.

The outputs of the first pass are scaled up by SLOW_INTEGER_PASS1_BITS bits so that they are represented to better-than-integral precision. These outputs require BITS_IN_JSAMPLE + SLOW_INTEGER_PASS1_BITS + 3 bits; this fits in a 16-bit word with the recommended scaling. (For 12-bit sample data, the intermediate array is int anyway.)

To avoid overflow of the 32-bit intermediate results in pass 2, we must have BITS_IN_JSAMPLE + SLOW_INTEGER_CONST_BITS + SLOW_INTEGER_PASS1_BITS <= 26. Error analysis shows that the values given below are the most effective.
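The fixed-point bookkeeping described above can be sketched as follows. The concrete values are assumptions, since this page does not list them: 13 constant bits is inferred from the jpeg_fdct_ifast description, while 2 pass-1 bits and 8-bit samples are illustrative choices. SlowFixedPointSketch and its members are hypothetical names, not the library's private helpers.

```csharp
// Hypothetical illustration; these names are not LibJpeg.NET APIs.
public static class SlowFixedPointSketch
{
    public const int ConstBits = 13;   // assumed SLOW_INTEGER_CONST_BITS
    public const int Pass1Bits = 2;    // assumed SLOW_INTEGER_PASS1_BITS
    public const int BitsInSample = 8; // assumed BITS_IN_JSAMPLE

    // Scale a positive fractional constant by 2^ConstBits, rounding to nearest.
    public static int Fix(double x)
    {
        return (int)(x * (1 << ConstBits) + 0.5);
    }

    // Divide by 2^n with rounding to nearest, done as an add and a right shift.
    public static int Descale(int x, int n)
    {
        return (x + (1 << (n - 1))) >> n;
    }

    // The pass-2 overflow condition stated in the description above.
    public static bool FitsIn32Bits()
    {
        return BitsInSample + ConstBits + Pass1Bits <= 26;
    }
}
```

With these values Fix(0.541196100) is 4433, and Descale(1000 * 4433, 13) is 541, which agrees with the exact product of about 541.2. The bit budget is 8 + 13 + 2 = 23, within the limit of 26.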

Method Details

jpeg_forward_dct() public method

public jpeg_forward_dct ( jpeg_compress_struct cinfo )
cinfo jpeg_compress_struct

start_pass() public method

Initialize for a processing pass. Verify that all referenced Q-tables are present, and set up the divisor table for each one. In the current implementation, DCT of all components is done during the first pass, even if only some components will be output in the first scan. Hence all components should be examined here.
public start_pass ( ) : void
return void

Property Details

forward_DCT public property

public forward_DCT_ptr[] forward_DCT
return forward_DCT_ptr[]