C# (CSharp) CSJ2K.j2k.entropy.encoder Namespace

Classes

Name Description
EBCOTRateAllocator This implements the EBCOT post compression rate allocation algorithm. This algorithm finds the most suitable truncation points for the set of code-blocks, for each layer target bitrate. It works by first collecting the rate distortion info from all code-blocks, in all tiles and all components, and then running the rate-allocation on the whole image at once, for each layer.
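The idea of picking truncation points for the whole image at once can be sketched as follows. This is a minimal illustration, not the code of EBCOTRateAllocator: the `RateAllocSketch` class, its `Block` type, and the bisection over a single rate-distortion slope threshold are assumptions made for the example. It also assumes each block's candidate truncation points already have strictly decreasing slopes.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified model of slope-threshold rate allocation. */
class RateAllocSketch {

    /** One code-block: data at each candidate truncation point. */
    static final class Block {
        final int[] rates;      // cumulative bytes, increasing
        final double[] slopes;  // distortion decrease per byte, strictly decreasing
        Block(int[] rates, double[] slopes) {
            this.rates = rates;
            this.slopes = slopes;
        }
    }

    /** Bytes kept from one block when all points with slope >= threshold are included. */
    static int truncatedRate(Block b, double threshold) {
        int kept = 0;
        for (int i = 0; i < b.rates.length && b.slopes[i] >= threshold; i++) {
            kept = b.rates[i];
        }
        return kept;
    }

    /** Total bytes over all blocks for one threshold. */
    static int totalRate(List<Block> blocks, double threshold) {
        int sum = 0;
        for (Block b : blocks) {
            sum += truncatedRate(b, threshold);
        }
        return sum;
    }

    /** Bisection for a low threshold whose total rate still fits a budget >= 0. */
    static double findThreshold(List<Block> blocks, int budgetBytes) {
        double lo = 0.0;
        if (totalRate(blocks, lo) <= budgetBytes) {
            return lo; // everything fits
        }
        double hi = 0.0;
        for (Block b : blocks) {
            for (double s : b.slopes) {
                hi = Math.max(hi, s);
            }
        }
        hi += 1.0; // above every slope: nothing is kept, so it always fits
        for (int i = 0; i < 60; i++) {
            double mid = 0.5 * (lo + hi);
            if (totalRate(blocks, mid) <= budgetBytes) {
                hi = mid;   // fits: try to include more data
            } else {
                lo = mid;   // too many bytes: raise the threshold
            }
        }
        return hi;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
            new Block(new int[] {10, 20, 30}, new double[] {8.0, 4.0, 1.0}),
            new Block(new int[] {5, 15}, new double[] {6.0, 2.0}));
        double t = findThreshold(blocks, 40);
        System.out.println(totalRate(blocks, t)); // prints 35
    }
}
```

For several layers, the same search would be repeated once per layer target, each time over the data of every code-block.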

This implementation also provides some timing features. They can be enabled by setting the 'DO_TIMING' constant of this class to true and recompiling. The timing uses the 'System.currentTimeMillis()' Java API call, which returns wall clock time, not the actual CPU time used. The timing results will be printed on the message output. Since the times reported are wall clock times and not CPU usage times, they cannot be added to find the total used time (i.e. some time might be counted in several places). When timing is disabled ('DO_TIMING' is false) there is no penalty if the compiler performs some basic optimizations. Even if it does not, the penalty should be negligible.
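The timing pattern described above might look roughly like the sketch below. The `TimingSketch` class and its `timedSum` method are hypothetical and only illustrate the technique. Because 'DO_TIMING' is a compile-time constant, a compiler is free to drop the guarded statements when it is false.

```java
/** Hypothetical illustration of compile-time switchable wall clock timing. */
class TimingSketch {
    // Set to true and recompile to enable timing output.
    static final boolean DO_TIMING = false;

    /** Sums 0..n-1; stands in for the real work being timed. */
    static long timedSum(int n) {
        long start = 0;
        if (DO_TIMING) {
            start = System.currentTimeMillis(); // wall clock time, not CPU time
        }
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        if (DO_TIMING) {
            long elapsed = System.currentTimeMillis() - start;
            System.out.println("timedSum: " + elapsed + " ms (wall clock)");
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(timedSum(1000));
    }
}
```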

MQCoder This class implements the MQ arithmetic coder. When initialized a specific state can be specified for each context, which may be adapted to the probability distribution that is expected for that context.

The type of length calculation and termination can be chosen at construction time.

---- Tricks that have been tried to improve speed ----

1) Merging Qe and mPS and doubling the lookup tables
Merge the mPS into Qe, as the sign bit (if Qe>=0 the sense of MPS is 0, if Qe<0 the sense is 1), and double the lookup tables. The first half of the lookup tables corresponds to Qe>=0 (i.e. the sense of MPS is 0) and the second half to Qe<0 (i.e. the sense of MPS is 1). The nLPS lookup table is modified to incorporate the changes in the sense of MPS, by making it jump from the first to the second half and vice-versa, when a change is specified by the switchLM lookup table. See JPEG book, section 13.2, page 225.
There is NO speed improvement in doing this; there is actually a slight decrease, probably because Q often has to be negated. The fact that a branch of the type "if (bit==mPS[li])" is replaced by two simpler branches of the type "if (bit==0)" and "if (q<0)" may also contribute to that.

2) Removing cT
It is possible to remove the cT counter by setting a flag bit in the high bits of the C register. This bit will be automatically shifted left whenever a renormalization shift occurs, which is equivalent to decreasing cT. When the flag bit reaches the sign bit (leftmost bit), which is equivalent to cT==0, the byteOut() procedure is called. This test can be done efficiently with "c<0" since C is a signed quantity. Care must be taken in byteOut() to reset the bit in order to not interfere with other bits in the C register. See JPEG book, page 228.
There is NO speed improvement in doing this. I don't really know why since the number of operations whenever a renormalization occurs is decreased. Maybe it is due to the number of extra operations in the byteOut(), terminate() and getNumCodedBytes() procedures.

3) Change the convention of MPS and LPS.
Making the LPS interval be above the MPS interval (MQ coder convention is the opposite) can reduce the number of operations along the MPS path. In order to generate the same bit stream as with the MQ convention the output bytes need to be modified accordingly. The basic rule for this is that C = (C'^0xFF...FF)-A, where C is the codestream for the MQ convention and C' is the codestream generated by this other convention. Note that this affects bit-stuffing as well.
This has not been tested yet.

4) Removing normalization while loop on MPS path
Since on the MPS path the interval kept for the MPS (the larger of A-Q and Q once the conditional exchange is applied) is guaranteed to be at least 0x4000 (decimal 0.375), it is never necessary to do more than 1 renormalization shift. Therefore the test of the while loop, and the loop itself, can be removed.

5) Simplifying test on A register
Since A is always less than or equal to 0xFFFF, the test "(a & 0x8000)==0" can be replaced by the simpler test "a < 0x8000". This test is simpler in Java since it involves only 1 operation (although the original test can be converted to only one operation by smart Just-In-Time compilers).
This change has been integrated in the decoding procedures.

6) Speedup mode
Implemented a method that uses the speedup mode of the MQ-coder if possible. This should greatly improve performance when coding long runs of MPS symbols that have high probability. However, to take advantage of this, the entropy coder implementation has to explicitly use it. The generated bit stream is the same as if the speedup mode had not been used.
Implemented but performance not tested yet.

7) Multiple-symbol coding
Since the time spent in a method call is non-negligible, coding several symbols with one method call reduces the overhead per coded symbol. The decodeSymbols() method implements this. However, to take advantage of it, the implementation of the entropy coder has to explicitly use it.
Implemented but performance not tested yet.
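The numeric claims behind tricks 2, 4 and 5 above can be checked with a small sketch. The `MqTricksSketch` class and its method names are hypothetical and are not the MQCoder implementation. The sketch assumes that cT is between 1 and 12, that the code bits sit in bits 0..18 of C, and that Q never exceeds 0x5601. It samples Q values rather than using the real probability table.

```java
/** Hypothetical checks of the arithmetic behind tricks 2, 4 and 5. */
class MqTricksSketch {

    // Trick 2: number of shifts until "c < 0" fires when a flag bit replaces cT.
    static int shiftsUntilByteOut(int cT, int codeBits) {
        int c = (1 << (31 - cT)) | (codeBits & 0x7FFFF);
        int shifts = 0;
        while (c >= 0) {   // flag has not reached the sign bit yet
            c <<= 1;       // one renormalization shift
            shifts++;
        }
        return shifts;     // equals cT
    }

    // Trick 4: interval kept for the MPS, including the conditional exchange.
    static int mpsInterval(int a, int q) {
        int rest = a - q;
        if (rest >= 0x8000) {
            return rest;           // no renormalization needed
        }
        return Math.max(rest, q);  // MPS keeps the larger sub-interval
    }

    // Left shifts needed to bring a positive interval back to >= 0x8000.
    static int shiftsNeeded(int a) {
        int n = 0;
        while (a < 0x8000) {
            a <<= 1;
            n++;
        }
        return n;
    }

    // Worst case over every A in [0x8000, 0xFFFF] and the sampled Q values.
    static int maxShiftsOnMpsPath() {
        int worst = 0;
        for (int a = 0x8000; a <= 0xFFFF; a++) {
            for (int q = 1; q <= 0x5601; q += 0x100) {
                worst = Math.max(worst, shiftsNeeded(mpsInterval(a, q)));
            }
        }
        return worst;
    }

    // Trick 5: both renormalization tests agree for every 16-bit value of A.
    static boolean testsAgree() {
        for (int a = 0; a <= 0xFFFF; a++) {
            if (((a & 0x8000) == 0) != (a < 0x8000)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(shiftsUntilByteOut(8, 0x12345)); // prints 8
        System.out.println(maxShiftsOnMpsPath());           // prints 1
        System.out.println(testsAgree());                   // prints true
    }
}
```

Without the conditional exchange, A = 0x8000 and Q = 0x5601 would leave A-Q = 0x29FF, which needs two shifts. This is why the single-shift shortcut applies only on the MPS path.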

PostCompRateAllocator This is the abstract class from which post-compression rate allocators which generate layers should inherit. The source of data is a 'CodedCBlkDataSrcEnc' which delivers entropy coded blocks with rate-distortion statistics.

The post-compression rate allocator implementation should create the layers according to a rate allocation policy and send the packets to a CodestreamWriter. Because the rate allocator itself writes the packets to the bit stream, it must output them in the order imposed by the bit stream profiles.

StdEntropyCoder This class implements the JPEG 2000 entropy coder, which codes stripes in code-blocks. This entropy coding engine can function in a single-threaded mode where one code-block is encoded at a time, or in a multi-threaded mode where multiple code-blocks are entropy coded in parallel. The interface presented by this class is the same in both modes.

The number of threads used by this entropy coder is specified by the "jj2000.j2k.entropy.encoder.StdEntropyCoder.nthreads" Java system property. If set to "0" the single threaded implementation is used. If set to 'n' ('n' larger than 0) then 'n' extra threads are started by this class which are used to encode the code-blocks in parallel (i.e. ideally 'n' code-blocks will be encoded in parallel at a time). On multiprocessor machines under a "native threads" Java Virtual Machine implementation each one of these threads can run on a separate processor speeding up the encoding time. By default the single-threaded implementation is used. The multi-threaded implementation currently assumes that the vast majority of consecutive calls to 'getNextCodeBlock()' will be done on the same component. If this is not the case, the speed-up that can be expected on multiprocessor machines might be significantly decreased.
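Reading such a setting in Java might look like the sketch below. The `ThreadCountSketch` class and its validation rules are assumptions for illustration; only the property name comes from the text above. How the C# port exposes this setting is not covered here.

```java
/** Hypothetical reader for the thread-count system property. */
class ThreadCountSketch {
    static final String PROP =
        "jj2000.j2k.entropy.encoder.StdEntropyCoder.nthreads";

    /** Returns 0 (single-threaded) when the property is not set. */
    static int threadCount() {
        String value = System.getProperty(PROP, "0");
        int n = Integer.parseInt(value.trim()); // NumberFormatException if malformed
        if (n < 0) {
            throw new IllegalArgumentException(PROP + " must be >= 0, got " + n);
        }
        return n;
    }

    public static void main(String[] args) {
        // Run with -Djj2000.j2k.entropy.encoder.StdEntropyCoder.nthreads=4 to change it.
        System.out.println("extra threads: " + threadCount());
    }
}
```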

The code-blocks are rectangular, with dimensions which must be powers of 2. Each dimension has to be no smaller than 4 and no larger than 256. The product of the two dimensions (i.e. area of the code-block) may not exceed 4096.
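The size constraints above can be expressed as a short check. The `CodeBlockSizeSketch` class is hypothetical; the limits are the ones stated in this description.

```java
/** Hypothetical validator for the code-block size rules stated above. */
class CodeBlockSizeSketch {

    static boolean isPowerOfTwo(int v) {
        return v > 0 && (v & (v - 1)) == 0;
    }

    /** True if both dimensions are powers of 2 in [4, 256] and the area is <= 4096. */
    static boolean isValid(int width, int height) {
        return isPowerOfTwo(width) && isPowerOfTwo(height)
            && width >= 4 && width <= 256
            && height >= 4 && height <= 256
            && width * height <= 4096;
    }

    public static void main(String[] args) {
        System.out.println(isValid(64, 64));   // prints true
        System.out.println(isValid(128, 64));  // prints false (area 8192)
    }
}
```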

Context 0 of the MQ-coder is used as the uniform one (uniform, non-adaptive probability distribution). Context 1 is used for RLC coding. Contexts 2-10 are used for zero-coding (ZC), contexts 11-15 are used for sign-coding (SC) and contexts 16-18 are used for magnitude-refinement (MR).
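The context assignment above amounts to a fixed partition of the 19 context indices. A minimal sketch, in which the `ContextMapSketch` class and the `Kind` names are illustrative only:

```java
/** Hypothetical lookup from MQ context index to the coding primitive using it. */
class ContextMapSketch {

    enum Kind { UNIFORM, RUN_LENGTH, ZERO_CODING, SIGN_CODING, MAGNITUDE_REFINEMENT }

    static Kind kindOf(int ctx) {
        if (ctx == 0) return Kind.UNIFORM;
        if (ctx == 1) return Kind.RUN_LENGTH;
        if (ctx >= 2 && ctx <= 10) return Kind.ZERO_CODING;
        if (ctx >= 11 && ctx <= 15) return Kind.SIGN_CODING;
        if (ctx >= 16 && ctx <= 18) return Kind.MAGNITUDE_REFINEMENT;
        throw new IllegalArgumentException("context out of range: " + ctx);
    }

    public static void main(String[] args) {
        for (int ctx = 0; ctx <= 18; ctx++) {
            System.out.println(ctx + " -> " + kindOf(ctx));
        }
    }
}
```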

This implementation buffers the symbols and calls the MQ coder only once per stripe and per coding pass, to reduce the method call overhead.
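The buffering idea can be illustrated with a small sketch. The `SymbolBufferSketch` class and its `CountingCoder` stand-in are hypothetical; the stand-in only counts calls and symbols, and performs no arithmetic coding.

```java
/** Hypothetical symbol buffer that hands a whole batch to the coder in one call. */
class SymbolBufferSketch {

    /** Stand-in for an arithmetic coder; it only counts calls and symbols. */
    static final class CountingCoder {
        int calls;
        int symbols;
        void codeSymbols(int[] bits, int[] contexts, int n) {
            calls++;
            symbols += n;
        }
    }

    private final int[] bits;
    private final int[] contexts;
    private int count;

    SymbolBufferSketch(int capacity) {
        bits = new int[capacity];
        contexts = new int[capacity];
    }

    /** Queues one symbol and its context; the caller must not exceed the capacity. */
    void add(int bit, int context) {
        bits[count] = bit;
        contexts[count] = context;
        count++;
    }

    /** Sends all queued symbols in a single call, e.g. at the end of a stripe. */
    void flush(CountingCoder coder) {
        if (count > 0) {
            coder.codeSymbols(bits, contexts, count);
            count = 0;
        }
    }

    public static void main(String[] args) {
        SymbolBufferSketch buffer = new SymbolBufferSketch(16);
        CountingCoder coder = new CountingCoder();
        buffer.add(1, 2);
        buffer.add(0, 11);
        buffer.add(1, 16);
        buffer.flush(coder);
        System.out.println(coder.calls + " call, " + coder.symbols + " symbols");
    }
}
```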

This implementation also provides some timing features. They can be enabled by setting the 'DO_TIMING' constant of this class to true and recompiling. The timing uses the 'System.currentTimeMillis()' Java API call, which returns wall clock time, not the actual CPU time used. The timing results will be printed on the message output. Since the times reported are wall clock times and not CPU usage times, they cannot be added to find the total used time (i.e. some time might be counted in several places). When timing is disabled ('DO_TIMING' is false) there is no penalty if the compiler performs some basic optimizations. Even if it does not, the penalty should be negligible.

The source module must implement the CBlkQuantDataSrcEnc interface, and code-block data is received in a CBlkWTData instance. This module sends code-block information in a CBlkRateDistStats instance.