TokenSequence (Lexer)

java.lang.Object
- org.netbeans.api.lexer.TokenSequence<T>

```
public final class TokenSequence<T extends TokenId>
extends Object
```
A token sequence allows iterating over the tokens of a token hierarchy.
Token sequence for top-level language of a token hierarchy may be obtained by TokenHierarchy.tokenSequence().
Use of token sequence is a two-step operation:
1. Position token sequence before token that should first be retrieved (or behind desired token when iterating backwards).
  One of the following ways may be used:
  - TokenSequence.move(int) positions TS before token that either starts at the given offset or "contains" it.
  - TokenSequence.moveIndex(int) positions TS before n-th token in the underlying token list.
  - TokenSequence.moveStart() positions TS before the first token.
  - TokenSequence.moveEnd() positions TS behind the last token.
  - Do nothing - TS is positioned before the first token automatically by default.
  Token sequence will always be positioned between tokens when using one of the operations above (TokenSequence.token() will return null to signal between-tokens location).
2. Start iterating through the tokens in forward/backward direction by using TokenSequence.moveNext() or TokenSequence.movePrevious().
  If moveNext() or movePrevious() returned true then TS is positioned over a concrete token retrievable by TokenSequence.token().
  Its offset can be retrieved by TokenSequence.offset().
An example of forward iteration through the tokens:
```
   TokenSequence ts = tokenHierarchy.tokenSequence();
   // Possible positioning by ts.move(offset) or ts.moveIndex(index)
   while (ts.moveNext()) {
       Token t = ts.token();
       if (t.id() == ...) { ... }
       if (TokenUtilities.equals(t.text(), "mytext")) { ... }
       if (ts.offset() == ...) { ... }
   }
 
```
This object should be used by a single thread only. For token hierarchies over mutable input sources, the token sequence must be both obtained and used under a read lock of the input source.

Method Summary

Modifier and Type	Method and Description
`boolean`	`createEmbedding(Language<?> embeddedLanguage, int startSkipLength, int endSkipLength)` Create language embedding without joining of the embedded sections.
`boolean`	`createEmbedding(Language<?> embeddedLanguage, int startSkipLength, int endSkipLength, boolean joinSections)` Create language embedding described by the given parameters.
`TokenSequence<?>`	`embedded()` Get embedded token sequence if the token to which this token sequence is currently positioned has a language embedding.
`<ET extends TokenId> TokenSequence<ET>`	`embedded(Language<ET> embeddedLanguage)` Get embedded token sequence if the token to which this token sequence is currently positioned has a language embedding.
`TokenSequence<?>`	`embeddedJoined()` Get embedded token sequence that possibly joins multiple embeddings with the same language paths (if the embeddings allow it - see `LanguageEmbedding.joinSections()`) into a single input text which is then lexed as a single continuous text.
`<ET extends TokenId> TokenSequence<ET>`	`embeddedJoined(Language<ET> embeddedLanguage)` Get embedded token sequence if the token to which this token sequence is currently positioned has a language embedding.
`int`	`index()` Get an index of token to which (or before which) this TS is currently positioned.
`boolean`	`isEmpty()` Check whether this TS contains zero tokens.
`boolean`	`isValid()` Check whether this token sequence is valid and can be iterated.
`Language<T>`	`language()` Get the language describing token ids used by tokens in this token sequence.
`LanguagePath`	`languagePath()` Get the complete language path of the tokens contained in this token sequence.
`int`	`move(int offset)` Move token sequence to be positioned between `index-1` and `index` tokens where Token[index] either starts at offset or "contains" the offset.
`void`	`moveEnd()` Move the token sequence to be positioned behind the last token.
`int`	`moveIndex(int index)` Position token sequence between `index-1` and `index` tokens.
`boolean`	`moveNext()` Move to the next token in this token sequence.
`boolean`	`movePrevious()` Move to a previous token in this token sequence.
`void`	`moveStart()` Move the token sequence to be positioned before the first token.
`int`	`offset()` Get the offset of the current token in the underlying input.
`Token<T>`	`offsetToken()` Similar to `TokenSequence.token()` but always returns a non-flyweight token with the appropriate offset.
`boolean`	`removeEmbedding(Language<?> embeddedLanguage)` Remove previously created language embedding.
`TokenSequence<T>`	`subSequence(int startOffset)` Create sub sequence of this token sequence that only returns tokens above the given offset.
`TokenSequence<T>`	`subSequence(int startOffset, int endOffset)` Create sub sequence of this token sequence that only returns tokens between the given offsets.
`Token<T>`	`token()` Get the token to which this token sequence points, or null if the TS is positioned between tokens (neither `TokenSequence.moveNext()` nor `TokenSequence.movePrevious()` has been called yet).
`int`	`tokenCount()` Return total count of tokens in this sequence.
`String`	`toString()`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Method Detail
  - language
```
public Language<T> language()
```
    Get the language describing token ids used by tokens in this token sequence.
  - languagePath
```
public LanguagePath languagePath()
```
    Get the complete language path of the tokens contained in this token sequence.
  - token
```
public Token<T> token()
```
    Get the token to which this token sequence points, or null if the TS is positioned between tokens (neither TokenSequence.moveNext() nor TokenSequence.movePrevious() has been called yet).
    A typical iteration usage:
```
   TokenSequence ts = tokenHierarchy.tokenSequence();
   // Possible positioning by ts.move(offset) or ts.moveIndex(index)
   while (ts.moveNext()) {
       Token t = ts.token();
       if (t.id() == ...) { ... }
       if (TokenUtilities.equals(t.text(), "mytext")) { ... }
       if (ts.offset() == ...) { ... }
   }
 
```
    The returned token instance may be flyweight (Token.isFlyweight() returns true) which means that its Token.offset(TokenHierarchy) will return -1.
    To find a correct offset use TokenSequence.offset().
    If it is necessary to obtain a regular non-flyweight token instead, TokenSequence.offsetToken() may be used.
    The lifetime of the returned token instance may be limited for mutable inputs. The token instance should not be held across the input source modifications.
    Returns:
    
    token instance to which this token sequence is currently positioned or null if this token sequence is not positioned to any token which may happen after TS creation or after use of TokenSequence.move(int) or TokenSequence.moveIndex(int).
    
    See Also:
    
    TokenSequence.offsetToken()
  - offsetToken
```
public Token<T> offsetToken()
```
    Similar to TokenSequence.token() but always returns a non-flyweight token with the appropriate offset.
    If the current token is flyweight then this method replaces it with the corresponding non-flyweight token which it then returns.
    Subsequent calls to TokenSequence.token() will also return this non-flyweight token.
    This method may be handy if the token instance is referenced in a standalone way (e.g. in an expression node of a parse tree) and it's necessary to get the appropriate offset from the token itself later when a token sequence will not be available.
    
    Throws:
    
    IllegalStateException - if TokenSequence.token() returns null.
  - offset
```
public int offset()
```
    Get the offset of the current token in the underlying input.
    The token's offset should never be computed by a client of the token sequence by adding/subtracting tokens' length to a client's variable because in case of the immutable token sequences there can be gaps between tokens if some tokens get filtered out.
    Instead this method should always be used because it offers best performance with a constant time complexity.
    
    Returns:
    
    >=0 absolute offset of the current token in the underlying input.
    
    Throws:
    
    IllegalStateException - if TokenSequence.token() returns null.
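    The warning above about adding token lengths can be illustrated with a small standalone model. It does not use the Lexer API: the class `OffsetGapDemo` and its arrays are hypothetical, and the filtered whitespace token is an assumed scenario.

```java
// Simplified model, not the NetBeans implementation: shows why summing
// token lengths gives a wrong offset once some tokens are filtered out.
final class OffsetGapDemo {

    /** Naive offset of the token at tokenIndex: sum of the lengths of all preceding tokens. */
    static int summedOffset(int[] lengths, int tokenIndex) {
        int offset = 0;
        for (int i = 0; i < tokenIndex; i++) {
            offset += lengths[i];
        }
        return offset;
    }

    public static void main(String[] args) {
        // Assumed input text "int  x": tokens "int" (offset 0), "  " (offset 3), "x" (offset 5).
        // The whitespace token is filtered out, leaving a gap between "int" and "x".
        int[] starts = {0, 5};
        int[] lengths = {3, 1};
        System.out.println("summed: " + summedOffset(lengths, 1)); // 3, which is wrong
        System.out.println("actual: " + starts[1]);                // 5, what offset() would report
    }
}
```

    In the model the stored start offset plays the role of TokenSequence.offset(); the summed value is off by the length of the filtered token.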
  - index
```
public int index()
```
    Get an index of token to which (or before which) this TS is currently positioned.
    
    Initially or after TokenSequence.move(int) or TokenSequence.moveIndex(int) token sequence is positioned between tokens:
```
          Token[0]   Token[1]   ...   Token[n]
        ^          ^                ^
 Index: 0          1                n
 
```
    After use of TokenSequence.moveNext() or TokenSequence.movePrevious() the token sequence is positioned over one of the actual tokens:
```
          Token[0]   Token[1]   ...   Token[n]
             ^          ^                ^
 Index:      0          1                n
 
```
    Returns:
    
    >=0 index of token to which (or before which) this TS is currently positioned.
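    The two diagrams above can be read as a cursor that is either between tokens or over a token. The following is a minimal sketch of that behavior over a plain list of strings; `CursorModel` is a hypothetical class written only for illustration and is not how the Lexer module implements TokenSequence.

```java
import java.util.List;

// Simplified model of the "between tokens" / "over a token" positions.
final class CursorModel {
    private final List<String> tokens;
    private int index;         // plays the role of index()
    private boolean overToken; // false means "positioned between tokens"

    CursorModel(List<String> tokens) {
        this.tokens = tokens;  // initially positioned before the first token
    }

    int index() { return index; }

    /** Current token, or null when positioned between tokens. */
    String token() { return overToken ? tokens.get(index) : null; }

    void moveStart() { index = 0; overToken = false; }

    void moveEnd() { index = tokens.size(); overToken = false; }

    boolean moveNext() {
        int next = overToken ? index + 1 : index;
        if (next >= tokens.size()) {
            return false;      // no more tokens; position stays unchanged
        }
        index = next;
        overToken = true;
        return true;
    }

    boolean movePrevious() {
        if (index == 0) {
            return false;      // nothing before the first token
        }
        index--;
        overToken = true;
        return true;
    }

    public static void main(String[] args) {
        CursorModel ts = new CursorModel(List.of("public", "static", "int"));
        ts.moveEnd();
        while (ts.movePrevious()) { // backward iteration
            System.out.println(ts.index() + ": " + ts.token());
        }
    }
}
```

    Note that in the model, as in the diagrams, a cursor at index i between tokens and a cursor over Token[i] report the same index.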
  - embedded
```
public TokenSequence<?> embedded()
```
    Get embedded token sequence if the token to which this token sequence is currently positioned has a language embedding.
    If there is a custom embedding created by TokenSequence.createEmbedding(Language,int,int) it will be returned instead of the default embedding (the one created by LanguageHierarchy.embedding() or LanguageProvider).
    
    Returns:
    
    embedded sequence or null if no embedding exists for this token.
    
    Throws:
    
    IllegalStateException - if TokenSequence.token() returns null.
  - embedded
```
public <ET extends TokenId> TokenSequence<ET> embedded(Language<ET> embeddedLanguage)
```
    Get embedded token sequence if the token to which this token sequence is currently positioned has a language embedding.
    
    Throws:
    
    IllegalStateException - if TokenSequence.token() returns null.
  - embeddedJoined
```
public TokenSequence<?> embeddedJoined()
```
    Get embedded token sequence that possibly joins multiple embeddings with the same language paths (if the embeddings allow it - see LanguageEmbedding.joinSections()) into a single input text which is then lexed as a single continuous text.
    If any of the resulting tokens crosses embedding's boundaries then the token is split into multiple part tokens.
    If the embedding does not join sections then this method behaves like TokenSequence.embedded().
    
    Returns:
    
    embedded sequence or null if no embedding exists for this token. The token sequence will be positioned before first token of this embedding or to a join token in case the first token of this embedding is part of the join token.
  - embeddedJoined
```
public <ET extends TokenId> TokenSequence<ET> embeddedJoined(Language<ET> embeddedLanguage)
```
    Get embedded token sequence if the token to which this token sequence is currently positioned has a language embedding.
    
    Throws:
    
    IllegalStateException - if TokenSequence.token() returns null.
  - createEmbedding
```
public boolean createEmbedding(Language<?> embeddedLanguage,
                               int startSkipLength,
                               int endSkipLength)
```
    Create language embedding without joining of the embedded sections.
    
    Parameters:
    
    startSkipLength - number of characters to be skipped at token's beginning.
    
    endSkipLength - number of characters to be skipped at token's end.
    
    Throws:
    
    IllegalStateException - if TokenSequence.token() returns null.
    
    See Also:
    
    TokenSequence.createEmbedding(Language, int, int, boolean)
  - createEmbedding
```
public boolean createEmbedding(Language<?> embeddedLanguage,
                               int startSkipLength,
                               int endSkipLength,
                               boolean joinSections)
```
    Create language embedding described by the given parameters.
    If the underlying text input is mutable then this method should only be called within a write lock over the text input.
    Parameters:
    
    embeddedLanguage - non-null embedded language
    
    startSkipLength - >=0 number of characters in an initial part of the token for which the language embedding is defined that should be excluded from the embedded section. The excluded characters will not be lexed and there will be no tokens created for them.
    
    endSkipLength - >=0 number of characters at the end of the token for which the language embedding is defined that should be excluded from the embedded section. The excluded characters will not be lexed and there will be no tokens created for them.
    joinSections - whether sections with this embedding should be joined across the input source or whether they should stay separate.
    For example for HTML sections embedded in JSP this flag should be true:
    <!-- HTML comment start <% System.out.println("Hello"); %> still in HTML comment -->
    
    Only the embedded sections with the same language path can be joined.
    If preceding embeddings requested sections joining for the particular language path then this parameter will be updated from false to true automatically by the method.
    Returns:
    
    true if the embedding was created successfully or false if an embedding with the given language already exists for this token.
    
    Throws:
    
    IllegalStateException - if TokenSequence.token() returns null.
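    The effect of startSkipLength and endSkipLength on the embedded section can be sketched with plain string operations. `EmbeddingSkipModel` and the sample comment token are hypothetical; the argument check is an assumption of this sketch rather than documented behavior of createEmbedding().

```java
// Simplified model: which part of a token's text becomes the embedded section.
final class EmbeddingSkipModel {

    /** Returns the token text without its first startSkipLength and last endSkipLength characters. */
    static String embeddedSection(String tokenText, int startSkipLength, int endSkipLength) {
        if (startSkipLength < 0 || endSkipLength < 0
                || startSkipLength + endSkipLength > tokenText.length()) {
            // Assumption of this model only; the real method's validation is not described here.
            throw new IllegalArgumentException("skip lengths do not fit the token");
        }
        return tokenText.substring(startSkipLength, tokenText.length() - endSkipLength);
    }

    public static void main(String[] args) {
        // A documentation-comment token whose delimiters should not be lexed
        // by the embedded language: skip "/**" at the start and "*/" at the end.
        String token = "/** doc */";
        System.out.println("[" + embeddedSection(token, 3, 2) + "]"); // prints "[ doc ]"
    }
}
```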
  - removeEmbedding
```
public boolean removeEmbedding(Language<?> embeddedLanguage)
```
    Remove previously created language embedding.
    If the underlying text input is mutable then this method should only be called within a write lock over the text input.
  - moveNext
```
public boolean moveNext()
```
    Move to the next token in this token sequence.
    The next token may not necessarily start at the offset where the previous token ends (there may be gaps between tokens caused by token filtering). TokenSequence.offset() should be used for offset retrieval.
    
    Returns:
    
    true if the sequence was successfully moved to the next token or false if it was not moved because there are no more tokens in the forward direction.
    
    Throws:
    
    ConcurrentModificationException - if this token sequence is no longer valid because of an underlying mutable input source modification.
  - movePrevious
```
public boolean movePrevious()
```
    Move to a previous token in this token sequence.
    The previous token may not necessarily end at the offset where the token that was current before the move starts (there may be gaps between tokens caused by token filtering). TokenSequence.offset() should be used for offset retrieval.
    
    Returns:
    
    true if the sequence was successfully moved to the previous token or false if it was not moved because there are no more tokens in the backward direction.
    
    Throws:
    
    ConcurrentModificationException - if this token sequence is no longer valid because of an underlying mutable input source modification.
  - moveIndex
```
public int moveIndex(int index)
```
    Position token sequence between index-1 and index tokens.
    TS will be positioned in the following way:
```
          Token[0]   ...   Token[index-1]   Token[index] ...
        ^                ^                ^
 Index: 0             index-1           index
 
```
    Subsequent TokenSequence.moveNext() or TokenSequence.movePrevious() is needed to fetch a concrete token in the desired direction.
    Subsequent TokenSequence.moveNext() will position TS over Token[index] (or TokenSequence.movePrevious() will position TS over Token[index-1]) so that TokenSequence.token() != null.
    Parameters:
    
    index - index of the token to which this sequence should be positioned.
    If index >= TokenSequence.tokenCount() then the TS will be positioned to TokenSequence.tokenCount().
    If index < 0 then the TS will be positioned to index 0.
    
    Returns:
    
    difference between requested index and the index to which TS is really set.
    
    Throws:
    
    ConcurrentModificationException - if this token sequence is no longer valid because of an underlying mutable input source modification.
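    The clamping rules and the return value described above can be written as a short arithmetic sketch. `MoveIndexModel` is a hypothetical helper; the sign convention (requested index minus the index actually set) is this sketch's reading of the Returns text.

```java
// Simplified model of the index clamping described for moveIndex(int).
final class MoveIndexModel {

    /** Index the sequence ends up at: the request clamped into [0, tokenCount]. */
    static int clampedIndex(int requestedIndex, int tokenCount) {
        return Math.max(0, Math.min(requestedIndex, tokenCount));
    }

    /** Modeled return value: requested index minus the index actually set. */
    static int moveIndexResult(int requestedIndex, int tokenCount) {
        return requestedIndex - clampedIndex(requestedIndex, tokenCount);
    }

    public static void main(String[] args) {
        int tokenCount = 3;
        System.out.println(moveIndexResult(1, tokenCount));  // 0: index 1 exists
        System.out.println(moveIndexResult(10, tokenCount)); // 7: clamped to 3
        System.out.println(moveIndexResult(-2, tokenCount)); // -2: clamped to 0
    }
}
```

    A result of 0 therefore means the sequence was positioned exactly at the requested index.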
  - moveStart
```
public void moveStart()
```
    Move the token sequence to be positioned before the first token.
    This is equivalent to moveIndex(0).
  - moveEnd
```
public void moveEnd()
```
    Move the token sequence to be positioned behind the last token.
    This is equivalent to moveIndex(tokenCount()).
  - move
```
public int move(int offset)
```
    Move token sequence to be positioned between index-1 and index tokens where Token[index] either starts at offset or "contains" the offset.
```
        +----------+-----+----------------+--------------+------
        | Token[0] | ... | Token[index-1] | Token[index] | ...
        | "public" | ... | "static"       | "int"        | ...
        +----------+-----+----------------+--------------+------
        ^                ^                ^
 Index: 0             index-1           index
 Offset:                                  ---^ (if offset points to 'i','n' or 't')
 
```
    Subsequent TokenSequence.moveNext() or TokenSequence.movePrevious() is needed to fetch a concrete token.
    If the offset is too big then the token sequence will be positioned behind the last token.
    
    If token filtering is used there may be gaps that are not covered by any tokens and if the offset is contained in such gap then the token sequence will be positioned before the token that precedes the gap.
    Parameters:
    
    offset - absolute offset to which the token sequence should be moved.
    
    Returns:
    
    difference between the requested offset and the start offset of the token before which the token sequence gets positioned.
    If positioned right after the last token then (offset - last-token-end-offset) is returned.
    
    Throws:
    
    ConcurrentModificationException - if this token sequence is no longer valid because of an underlying mutable input source modification.
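    The positioning and return value of move(int) can be modeled over arrays of token start offsets and lengths. This is a minimal sketch, not the Lexer implementation: `MoveOffsetModel` is hypothetical, and its handling of an empty token list and of offsets before the first token is an assumption that the text above does not specify.

```java
// Simplified model of move(offset): returns {index, offsetDifference}.
final class MoveOffsetModel {

    static int[] move(int[] starts, int[] lengths, int offset) {
        int n = starts.length;
        if (n == 0) {
            return new int[] {0, offset}; // assumption: not covered by the description
        }
        // Find the last token that starts at or before the offset.
        int i = 0;
        while (i + 1 < n && starts[i + 1] <= offset) {
            i++;
        }
        int end = starts[i] + lengths[i];
        if (i == n - 1 && offset >= end) {
            // Offset too big: positioned behind the last token.
            return new int[] {n, offset - end};
        }
        // Offset is inside token i, or in a gap right after it:
        // positioned before token i in both cases.
        return new int[] {i, offset - starts[i]};
    }

    public static void main(String[] args) {
        // "public static int" with the whitespace tokens filtered out.
        int[] starts = {0, 7, 14};
        int[] lengths = {6, 6, 3};
        int[] r = move(starts, lengths, 15); // offset points to 'n' of "int"
        System.out.println("index=" + r[0] + " diff=" + r[1]); // index=2 diff=1
    }
}
```

    In this model, an offset of 6 (inside the gap after "public") yields index 0 with a difference of 6, matching the rule that the sequence is positioned before the token that precedes the gap.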
  - isEmpty
```
public boolean isEmpty()
```
    Check whether this TS contains zero tokens.
    This check is strongly preferred over tokenCount() == 0.
    
    See Also:
    
    TokenSequence.tokenCount()
  - tokenCount
```
public int tokenCount()
```
    Return total count of tokens in this sequence.
    Note: Calling this method will lead to creation of all the remaining tokens in the sequence if they were not yet created.
    
    Returns:
    
    total number of tokens in this token sequence.
  - subSequence
```
public TokenSequence<T> subSequence(int startOffset)
```
    Create sub sequence of this token sequence that only returns tokens above the given offset.
    
    Parameters:
    
    startOffset - only tokens satisfying tokenStartOffset + tokenLength > startOffset will be present in the returned sequence.
    
    Returns:
    
    non-null sub sequence of this token sequence.
  - subSequence
```
public TokenSequence<T> subSequence(int startOffset,
                                    int endOffset)
```
    Create sub sequence of this token sequence that only returns tokens between the given offsets.
    
    Parameters:
    
    startOffset - only tokens satisfying tokenStartOffset + tokenLength > startOffset will be present in the returned sequence.
    
    endOffset - >=startOffset only tokens satisfying tokenStartOffset < endOffset will be present in the returned sequence.
    
    Returns:
    
    non-null sub sequence of this token sequence.
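    The two boundary conditions for subSequence(int, int) can be checked with a small filter over token start offsets and lengths. `SubSequenceModel` is a hypothetical illustration of the stated predicates only; it says nothing about how the real sub sequence is built.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the token selection rules stated for subSequence().
final class SubSequenceModel {

    /** Indexes of tokens with start + length > startOffset and start < endOffset. */
    static List<Integer> visibleIndexes(int[] starts, int[] lengths, int startOffset, int endOffset) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < starts.length; i++) {
            boolean endsAfterStart = starts[i] + lengths[i] > startOffset;
            boolean startsBeforeEnd = starts[i] < endOffset;
            if (endsAfterStart && startsBeforeEnd) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Tokens "public" (0..6), "static" (7..13), "int" (14..17).
        int[] starts = {0, 7, 14};
        int[] lengths = {6, 6, 3};
        System.out.println(visibleIndexes(starts, lengths, 3, 14)); // [0, 1]
        System.out.println(visibleIndexes(starts, lengths, 6, 15)); // [1, 2]
    }
}
```

    Under these predicates a token that only partially overlaps the requested range is still included, as the first call shows for "public".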
  - isValid
```
public boolean isValid()
```
    Check whether this token sequence is valid and can be iterated.
    If this method returns false then the underlying token hierarchy was modified and this token sequence should be abandoned.
    
    Returns:
    
    true if this token sequence is ready for use or false if it should be abandoned.
  - toString
```
public String toString()
```
    Overrides:
    
    toString in class Object
