public interface TokenProcessor
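|Modifier and Type|Method and Description|
|int|eot(int offset): Notify that the end of the scanned buffer was found.|
|void|nextBuffer(char[] buffer, int offset, int len, int startPos, int preScan, boolean lastBuffer): Notify that the following buffer will be scanned.|
|boolean|token(TokenID tokenID, TokenContextPath tokenContextPath, int tokenBufferOffset, int tokenLength): Notify that the token was found.|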
boolean token(TokenID tokenID, TokenContextPath tokenContextPath, int tokenBufferOffset, int tokenLength)
tokenID- ID of the token found
tokenContextPath- Context path in which the token was found.
tokenBufferOffset- Offset of the token in the buffer. The buffer is provided in the nextBuffer() method.
tokenLength- Length of the token found
int eot(int offset)
offset- offset of the rest of the characters
void nextBuffer(char[] buffer, int offset, int len, int startPos, int preScan, boolean lastBuffer)
buffer- buffer that will be scanned. To get the text of the tokens later, store the buffer in an instance variable.
offset- offset in the buffer of the first character to be scanned. It doesn't reflect a possible preScan. If preScan is non-zero, the first buffer offset that contains valid data is offset - preScan.
len- count of the characters that will be scanned. It doesn't reflect a possible preScan.
startPos- starting position of the scanning in the document. It corresponds logically to offset, because the buffer and the document hold the same text data. Like offset, it doesn't reflect a possible preScan; startPos - preScan gives the real start of the first token. If the position of each token is needed, store the value startPos - offset in an instance variable, for example one called bufferStartPos. The position of a token can then be computed as bufferStartPos + tokenBufferOffset.
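preScan- number of characters before offset that are needed for the scanning. When it is non-zero, the valid data in the buffer starts at offset - preScan.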
lastBuffer- whether this is the last buffer to scan in the document, that is, there are no more characters in the document after this buffer.
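
The bookkeeping described for buffer and startPos can be sketched as follows. This is a minimal illustration, not the library's own code: TokenID, TokenContextPath and TokenProcessor are declared here as local stand-ins that only mirror the signatures listed above so the sketch compiles on its own, PositionRecordingProcessor is a hypothetical name, and the values returned from token() and eot() are assumptions because this page does not describe them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins so the sketch compiles without the editor library.
interface TokenID {}
interface TokenContextPath {}

// Local mirror of the three methods documented above.
interface TokenProcessor {
    boolean token(TokenID tokenID, TokenContextPath tokenContextPath,
                  int tokenBufferOffset, int tokenLength);
    int eot(int offset);
    void nextBuffer(char[] buffer, int offset, int len,
                    int startPos, int preScan, boolean lastBuffer);
}

// Hypothetical processor that records the document position and text of each token.
class PositionRecordingProcessor implements TokenProcessor {
    private char[] buffer;       // kept so token text can be read in token()
    private int bufferStartPos;  // document position of buffer index 0

    final List<Integer> positions = new ArrayList<>();
    final List<String> texts = new ArrayList<>();

    @Override
    public void nextBuffer(char[] buffer, int offset, int len,
                           int startPos, int preScan, boolean lastBuffer) {
        this.buffer = buffer;
        // startPos and offset refer to the same character, so their
        // difference maps any buffer offset to a document position.
        this.bufferStartPos = startPos - offset;
    }

    @Override
    public boolean token(TokenID tokenID, TokenContextPath tokenContextPath,
                         int tokenBufferOffset, int tokenLength) {
        positions.add(bufferStartPos + tokenBufferOffset);
        texts.add(new String(buffer, tokenBufferOffset, tokenLength));
        return true; // assumption: the meaning of the return value is not documented here
    }

    @Override
    public int eot(int offset) {
        return 0; // assumption: the meaning of the return value is not documented here
    }
}
```

For example, if nextBuffer() is called with offset 4, startPos 102 and preScan 2, then bufferStartPos is 98, and a first token reported at tokenBufferOffset 2 lands at document position 100, which equals startPos - preScan.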