C# Class Algorithmix.Forensics.OCR

Inheritance: DisposableObject

Project: Algorithmix/Papyrus

Public Methods

Method	Description
Charactors ( ) : Tesseract.Charactor[]	Returns an array of Tesseract Charactors after running the Scan Method
Cost ( Emgu.CV.OCR.Tesseract chars ) : long	Calculates the cost by summing the unique cost of each word
IsEmpty ( Shred shreds ) : void	Runs OCR and determines whether a shred is empty
OCR ( Accuracy accuracy = Accuracy.High, string language = "eng", bool enableTimer = false ) : System	Initializes a new OCR object. This object wraps the Emgu Tesseract wrapper to provide the level of abstraction needed for scanning shreds.
OverallCost ( ) : long	Returns the overall cost
ParallelDetectOrientation ( Bitmap regs, Bitmap revs, Accuracy mode = Accuracy.High, string lang = "eng", bool enableTimer = false ) : Tuple[]	Parallelized OCR orientation confidence detector. Runs OCR on each image and its corresponding reversed image, and returns the confidence together with both OcrData objects.
ParallelRecognize ( IEnumerable images, int length, Accuracy mode = Accuracy.High, string lang = "eng", bool enableTimer = false ) : OcrData[]	Parallelized Recognize function. Takes a list or array of images and a specified length, and returns an OcrData object for each image.
Preprocess ( byte>.Image image ) : byte>.Image	OCR preprocessing. Currently this involves binary thresholding of a grayscale image using Otsu's method.
Recognize ( Bitmap original, Accuracy mode = Accuracy.High, string lang = "eng", bool enableTimer = false ) : OcrData	Executes OCR on a given image. This static member processes the image, safely opens, executes, and disposes a Tesseract object, and stores the result in a new OcrData object.
Scan ( byte>.Image image ) : string	Given a color image, converts it to grayscale, runs OCR on it, and returns the result
Scan ( byte>.Image image ) : string	Invokes Tesseract OCR Recognize on the given image and stores the resulting data in the Text, Confidence, and ScanTime data members
ShredOcr ( Shred shreds, string lang = "eng" ) : void	Given an array of shreds, runs OCR on each one and saves the results to the corresponding shred object
StripNewLine ( string text ) : string
Text ( ) : string	Getter for the text generated after running the Scan Method

Protected Methods

Method	Description
DisposeObject ( ) : void	Disposes all the necessary objects

Private Methods

Method	Description
Elapsed ( ) : long	Retrieve the time elapsed from the diagnostics Timer
Start ( ) : bool	Explicitly starts the diagnostics timer
Stop ( ) : void	Stops the diagnostics timer

Method Details

Charactors() public method

Returns an array of Tesseract Charactors after running the Scan Method

public Charactors ( ) : Tesseract.Charactor[]
return	Tesseract.Charactor[]

Cost() public static method

Calculates the cost by summing the unique cost of each word

public static Cost ( Emgu.CV.OCR.Tesseract chars ) : long
chars	Emgu.CV.OCR.Tesseract	Tesseract OCR Charactor results
return	long

DisposeObject() protected method

Disposes all the necessary objects

protected DisposeObject ( ) : void
return	void

IsEmpty() public static method

Runs OCR and determines whether a shred is empty

public static IsEmpty ( Shred shreds ) : void
shreds	Shred	A list of Shred Objects
return	void

OCR() public method

Initializes a new OCR object. This object wraps the Emgu Tesseract wrapper to provide the level of abstraction needed for scanning shreds.

public OCR ( Accuracy accuracy = Accuracy.High, string language = "eng", bool enableTimer = false ) : System
accuracy	Accuracy	Desired Accuracy setting
language	string	Language of the text in the image, used to select the OCR model
enableTimer	bool	Set to true to measure scan time for diagnostic purposes
return	System

OverallCost() public method

Returns the overall cost

public OverallCost ( ) : long
return	long

ParallelDetectOrientation() public static method

Parallelized OCR orientation confidence detector. Runs OCR on each image and its corresponding reversed image, and returns the confidence together with both OcrData objects.

public static ParallelDetectOrientation ( Bitmap regs, Bitmap revs, Accuracy mode = Accuracy.High, string lang = "eng", bool enableTimer = false ) : Tuple[]
regs	System.Drawing.Bitmap	Images with default regular orientation
revs	System.Drawing.Bitmap	Images with reversed orientation to default
mode	Accuracy	OCR accuracy mode
lang	string	OCR languages
enableTimer	bool	Enable timer for diagnostic purposes
return	Tuple[]
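
The pairing of regular and reversed images described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: `OrientationSketch`, `DetectOrientation`, the `recognize` and `confidenceOf` delegates, and the tuple's element types are all hypothetical stand-ins, and the choice to report the difference between the two confidences is an assumption, since the page does not state how the confidence is computed.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical sketch; none of these names come from Algorithmix.Forensics.
public static class OrientationSketch
{
    // Runs a caller-supplied recognizer on each regular/reversed pair in parallel.
    // ASSUMPTION: the score is (regular confidence - reversed confidence),
    // so a positive score favors the regular orientation.
    public static Tuple<double, TResult, TResult>[] DetectOrientation<TImage, TResult>(
        TImage[] regs,
        TImage[] revs,
        Func<TImage, TResult> recognize,
        Func<TResult, double> confidenceOf)
    {
        if (regs.Length != revs.Length)
            throw new ArgumentException("regs and revs must have the same length");

        var results = new Tuple<double, TResult, TResult>[regs.Length];

        // Each iteration writes only to its own slot, so no locking is needed.
        Parallel.For(0, regs.Length, i =>
        {
            TResult regular = recognize(regs[i]);
            TResult reversed = recognize(revs[i]);
            double score = confidenceOf(regular) - confidenceOf(reversed);
            results[i] = Tuple.Create(score, regular, reversed);
        });

        return results;
    }
}
```

With a real recognizer, `recognize` would wrap a call such as the static Recognize method documented on this page, and `confidenceOf` would read whatever confidence value the resulting OcrData exposes.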

ParallelRecognize() public static method

Parallelized Recognize function. Takes a list or array of images and a specified length, and returns an OcrData object for each image.

public static ParallelRecognize ( IEnumerable images, int length, Accuracy mode = Accuracy.High, string lang = "eng", bool enableTimer = false ) : OcrData[]
images	IEnumerable	Array or List of Bitmaps
length	int	Number of items to be Recognized from the array
mode	Accuracy	Accuracy Mode
lang	string	Desired OCR Language
enableTimer	bool	Enables OCR Scan Timer if true
return	OcrData[]
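
As a rough illustration of the pattern this method describes (take the first `length` items, recognize each one in parallel, return one result per image), here is a minimal sketch. It assumes nothing about the actual implementation: `ParallelRecognizeSketch`, `RecognizeAll`, and the `recognize` delegate are hypothetical names, and the generic type parameters stand in for the Bitmap and OcrData types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical sketch; none of these names come from Algorithmix.Forensics.
public static class ParallelRecognizeSketch
{
    // Applies a caller-supplied recognizer to the first `length` images in parallel
    // and returns the results in the same order as the inputs.
    public static TResult[] RecognizeAll<TImage, TResult>(
        IEnumerable<TImage> images,
        int length,
        Func<TImage, TResult> recognize)
    {
        // Materialize the inputs once so they can be indexed from worker threads.
        TImage[] inputs = images.Take(length).ToArray();
        var results = new TResult[inputs.Length];

        // Each iteration writes only to its own slot, so no locking is needed.
        Parallel.For(0, inputs.Length, i =>
        {
            results[i] = recognize(inputs[i]);
        });

        return results;
    }
}
```

Writing into a preallocated array by index keeps the output order stable even though the iterations may finish in any order.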

Preprocess() public method

OCR preprocessing. Currently this involves binary thresholding of a grayscale image using Otsu's method.

public Preprocess ( byte>.Image image ) : byte>.Image
image	byte>.Image	Image to be preprocessed
return	byte>.Image
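
For readers unfamiliar with Otsu's method, the following is a minimal sketch of the technique on a flat array of 8-bit grayscale pixels. It is only an illustration: the real Preprocess method operates on an Emgu image type and may delegate the thresholding to the underlying library, and `OtsuSketch`, `ComputeThreshold`, and `Binarize` are hypothetical names.

```csharp
using System;

// Hypothetical sketch of Otsu thresholding; not the library's implementation.
public static class OtsuSketch
{
    // Returns the threshold t in [0, 255] that maximizes the between-class
    // variance of the two groups {pixel <= t} and {pixel > t}.
    public static int ComputeThreshold(byte[] pixels)
    {
        if (pixels == null || pixels.Length == 0)
            throw new ArgumentException("pixels must be non-empty", nameof(pixels));

        var histogram = new long[256];
        foreach (byte p in pixels) histogram[p]++;

        long total = pixels.Length;
        double sumAll = 0;
        for (int i = 0; i < 256; i++) sumAll += (double)i * histogram[i];

        double sumBackground = 0;
        long weightBackground = 0;
        double bestVariance = -1;
        int bestThreshold = 0;

        for (int t = 0; t < 256; t++)
        {
            weightBackground += histogram[t];
            if (weightBackground == 0) continue;   // no pixels at or below t yet

            long weightForeground = total - weightBackground;
            if (weightForeground == 0) break;      // every pixel is at or below t

            sumBackground += (double)t * histogram[t];
            double meanBackground = sumBackground / weightBackground;
            double meanForeground = (sumAll - sumBackground) / weightForeground;
            double diff = meanBackground - meanForeground;

            // Between-class variance, up to a constant factor of 1 / total^2.
            double variance = (double)weightBackground * weightForeground * diff * diff;
            if (variance > bestVariance)
            {
                bestVariance = variance;
                bestThreshold = t;
            }
        }

        return bestThreshold;
    }

    // Maps pixels above the Otsu threshold to 255 and all others to 0.
    public static byte[] Binarize(byte[] pixels)
    {
        int threshold = ComputeThreshold(pixels);
        var result = new byte[pixels.Length];
        for (int i = 0; i < pixels.Length; i++)
            result[i] = pixels[i] > threshold ? (byte)255 : (byte)0;
        return result;
    }
}
```

Because the threshold is derived from the image's own histogram, no fixed cutoff has to be tuned per scan, which is the usual reason for choosing this method ahead of OCR.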

Recognize() public static method

Executes OCR on a given image. This static member processes the image, safely opens, executes, and disposes a Tesseract object, and stores the result in a new OcrData object.

public static Recognize ( Bitmap original, Accuracy mode = Accuracy.High, string lang = "eng", bool enableTimer = false ) : OcrData
original	System.Drawing.Bitmap	Image to be OCR-ed
mode	Accuracy	Accuracy setting
lang	string	Language of the text, used to select the OCR language model
enableTimer	bool	Measures the scan time for diagnostic purposes
return	OcrData

Scan() public method

Given a color image, converts it to grayscale, runs OCR on it, and returns the result

public Scan ( byte>.Image image ) : string
image	byte>.Image	Source Image to be OCR-ed
return	string

Scan() public method

Invokes Tesseract OCR Recognize on the given image and stores the resulting data in the Text, Confidence, and ScanTime data members

public Scan ( byte>.Image image ) : string
image	byte>.Image	Source Image to be OCR-ed
return	string

ShredOcr() public static method

Given an array of shreds, runs OCR on each one and saves the results to the corresponding shred object

public static ShredOcr ( Shred shreds, string lang = "eng" ) : void
shreds	Shred	Array of initialized shred objects
lang	string	Desired OCR language
return	void

StripNewLine() public static method

public static StripNewLine ( string text ) : string
text	string
return	string

Text() public method

Getter for the text generated after running the Scan Method

public Text ( ) : string
return	string