C# Class WikipediaAvsAnTrieExtractor.WhitespaceNormalizer

Project: EamonNerbonne/a-vs-an

Public Methods

Method Description
Normalize ( string text ) : string

Normalizes a string so that each run of consecutive spaces and tabs is replaced by a single space, and any leading or trailing whitespace on a line is trimmed. Sequences of empty lines are replaced by a single empty line. After the last normal character, at most one line break is permitted. Implementation limitation: a single whitespace character before the first character on the first line is not removed. This implementation is somewhat odd, but the regex implementation is surprisingly slow due to backtracking issues that arise when matching consecutive empty lines (which might contain whitespace). The purpose is essentially to remove superfluous spaces, namely those that lead or trail any line, and superfluous empty lines, while still permitting a single empty line (a Wikipedia paragraph break). Details: carriage returns are not processed as whitespace (Wikipedia doesn't contain these), and it is possible, though odd, for a single paragraph break to precede the text.
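The rules above can be illustrated with a small single-pass scanner that avoids regular expressions. This is only a sketch written from the description, not the project's actual code: the class name `WhitespaceNormalizerSketch` is hypothetical, and its handling of edge cases is an assumption. In particular, it trims all leading whitespace on the first line (so it does not reproduce the documented limitation), and it treats a run of line breaks before any text like any other run.

```csharp
using System;
using System.Text;

// Hypothetical sketch of the documented rules; NOT the library's implementation.
public static class WhitespaceNormalizerSketch
{
    public static string Normalize(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        var sb = new StringBuilder(text.Length);
        int pendingBreaks = 0;     // '\n' characters seen since the last emitted character
        bool pendingSpace = false; // space or tab seen on the current line since the last emitted character

        foreach (char c in text)
        {
            if (c == '\n')
            {
                pendingBreaks++;
                pendingSpace = false; // trailing whitespace on the line is dropped
            }
            else if (c == ' ' || c == '\t')
            {
                pendingSpace = true;  // only spaces and tabs count; '\r' is treated as a normal character
            }
            else
            {
                if (pendingBreaks > 0)
                {
                    // One break ends the line; two or more collapse to a single empty line.
                    // Assumption: the same applies to breaks that precede any text.
                    sb.Append(pendingBreaks >= 2 ? "\n\n" : "\n");
                    pendingBreaks = 0;
                    // Any pending space here is leading whitespace and is discarded.
                }
                else if (pendingSpace && sb.Length > 0)
                {
                    sb.Append(' '); // a run of spaces/tabs inside a line becomes one space
                }
                pendingSpace = false;
                sb.Append(c);
            }
        }

        // At most one line break is kept after the last normal character.
        if (pendingBreaks > 0) sb.Append('\n');
        return sb.ToString();
    }
}
```

For example, under these assumptions `WhitespaceNormalizerSketch.Normalize("First  line \t \n \n\n  Second\tline\n\n\n")` returns `"First line\n\nSecond line\n"`. Because each character is examined exactly once and nothing is re-scanned, the running time is linear in the input length, which is the property the description contrasts with the backtracking regex approach.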

Method Details

Normalize() public static method

public static Normalize ( string text ) : string
text string
Returns string