public final class TokenHierarchy<I> extends Object
Modifier and Type | Method and Description |
---|---|
void | addTokenHierarchyListener(TokenHierarchyListener listener): Add listener for token changes inside this hierarchy. |
static <I extends CharSequence,T extends TokenId> TokenHierarchy<I> | create(I inputText, boolean copyInputText, Language<T> language, Set<T> skipTokenIds, InputAttributes inputAttributes): Create token hierarchy for the given input text. |
static <I extends CharSequence> TokenHierarchy<I> | create(I inputText, Language<?> language): Create token hierarchy for the given non-mutating input text (for example java.lang.String). |
static <I extends Reader,T extends TokenId> TokenHierarchy<I> | create(I inputReader, Language<T> language, Set<T> skipTokenIds, InputAttributes inputAttributes): Create token hierarchy for the given reader. |
List<TokenSequence<?>> | embeddedTokenSequences(int offset, boolean backwardBias): Gets the list of all embedded TokenSequences at the given offset. |
static <D extends Document> TokenHierarchy<D> | get(D doc): Get or create mutable token hierarchy for the given swing document. |
I | inputSource(): Get input source providing text over which this token hierarchy was constructed. |
boolean | isActive(): Token hierarchy may be set inactive to release resources consumed by tokens. |
boolean | isMutable(): Whether input text of this token hierarchy is mutable or not. |
Set<LanguagePath> | languagePaths(): Get a set of language paths used by this token hierarchy. |
void | removeTokenHierarchyListener(TokenHierarchyListener listener): Remove listener for token changes inside this hierarchy. |
TokenSequence<?> | tokenSequence(): Get token sequence of the top level language of the token hierarchy. |
<T extends TokenId> TokenSequence<T> | tokenSequence(Language<T> language): Get token sequence of the top level of the language hierarchy only if it's of the given language. |
List<TokenSequence<?>> | tokenSequenceList(LanguagePath languagePath, int startOffset, int endOffset): Get immutable list of token sequences with the given language path from this hierarchy. |
String | toString() |
public static <D extends Document> TokenHierarchy<D> get(D doc)
Get or create mutable token hierarchy for the given swing document.
The document can define its top-level language either by calling doc.putProperty("mimeType", mimeType) (a language defined for the given mime type will be searched and used) or by calling doc.putProperty(Language.class, language).
Otherwise the returned hierarchy will be inactive and tokenSequence() will return null.
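As a minimal sketch of the first option, the following sets the "mimeType" property on a plain swing document. It uses only javax.swing.text classes. The class name MimeTypeSetup, the helper newDocument, and the example value "text/x-java" are illustrative assumptions, and the TokenHierarchy.get(doc) call is left as a comment because it needs the Lexer module on the classpath.

```java
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;

final class MimeTypeSetup {
    /** Creates a document tagged with the given MIME type (hypothetical helper). */
    static Document newDocument(String mimeType) {
        Document doc = new PlainDocument();
        // "mimeType" is the property key named in the description of get(D doc).
        doc.putProperty("mimeType", mimeType);
        return doc;
    }

    public static void main(String[] args) {
        Document doc = newDocument("text/x-java"); // assumed example value
        System.out.println(doc.getProperty("mimeType"));
        // With the Lexer API available, the hierarchy would then be obtained as:
        // TokenHierarchy<Document> hierarchy = TokenHierarchy.get(doc);
    }
}
```

Whether the resulting hierarchy is active still depends on a language being registered for that mime type, so callers would typically check for a null tokenSequence().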
doc - non-null swing text document for which the token hierarchy should be obtained.
public static <I extends CharSequence> TokenHierarchy<I> create(I inputText, Language<?> language)
Create token hierarchy for the given non-mutating input text (for example java.lang.String).
public static <I extends CharSequence,T extends TokenId> TokenHierarchy<I> create(I inputText, boolean copyInputText, Language<T> language, Set<T> skipTokenIds, InputAttributes inputAttributes)
Create token hierarchy for the given input text.
inputText - input text containing the characters to tokenize.
copyInputText - true in case the content of the input will not be modified in the future so the created tokens can reference it. false means that the text can change in the future and the tokens should not directly reference it; instead, a copy of the necessary text from the input should be made and the original text should not be referenced.
language - language defining how the input will be tokenized.
skipTokenIds - set containing the token ids for which the tokens should not be created in the created token hierarchy. null may be passed, which means that no tokens will be skipped. See also Language.tokenCategoryMembers(String) or Language.merge(Collection,Collection).
inputAttributes - additional properties related to the input that may influence token creation or lexer operation for the particular language (such as version of the language to be used).
public static <I extends Reader,T extends TokenId> TokenHierarchy<I> create(I inputReader, Language<T> language, Set<T> skipTokenIds, InputAttributes inputAttributes)
Create token hierarchy for the given reader.
inputReader - input reader containing the characters to tokenize.
language - language defining how the input will be tokenized.
skipTokenIds - set containing the token ids for which the tokens should not be created in the created token hierarchy. null may be passed, which means that no tokens will be skipped. See also Language.tokenCategoryMembers(String) or Language.merge(Collection,Collection).
inputAttributes - additional properties related to the input that may influence token creation or lexer operation for the particular language (such as version of the language to be used).
public TokenSequence<?> tokenSequence()
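Get token sequence of the top level language of the token hierarchy.
Token sequences for embedded levels can be obtained by calling TokenSequence.embedded().
Returns: token sequence of the top level language, or null if the token hierarchy is inactive (isActive() returns false).
public <T extends TokenId> TokenSequence<T> tokenSequence(Language<T> language)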
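Get token sequence of the top level of the language hierarchy only if it's of the given language, that is, only if (tokenSequence().language() == language).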
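public List<TokenSequence<?>> tokenSequenceList(LanguagePath languagePath, int startOffset, int endOffset)
Get immutable list of token sequences with the given language path from this hierarchy.
A list obtained earlier can become obsolete; ConcurrentModificationException may be thrown when iterating over (or retrieving items from) the obsolete list.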
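The startOffset and endOffset parameters of tokenSequenceList can be read as an overlap test: a sequence is kept when it ends after startOffset and starts before endOffset. The standalone sketch below models that rule with a hypothetical Range class instead of real TokenSequence objects. The names OffsetWindow, Range, and select are assumptions made for illustration, and this is not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OffsetWindow {
    /** Hypothetical stand-in for the span covered by one token sequence. */
    static final class Range {
        final int start; // inclusive start offset
        final int end;   // exclusive end offset
        Range(int start, int end) { this.start = start; this.end = end; }
        @Override public String toString() { return "[" + start + "," + end + ")"; }
    }

    /** Keeps the ranges that end after startOffset and start before endOffset. */
    static List<Range> select(List<Range> all, int startOffset, int endOffset) {
        List<Range> result = new ArrayList<>();
        for (Range r : all) {
            if (r.end > startOffset && r.start < endOffset) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Range> all = Arrays.asList(new Range(0, 10), new Range(10, 20), new Range(30, 40));
        // No limits: 0 and Integer.MAX_VALUE keep every range.
        System.out.println(select(all, 0, Integer.MAX_VALUE));
        // A window of 15..30 keeps only the range that overlaps it.
        System.out.println(select(all, 15, 30));
    }
}
```

Note that a range merely touching the window (ending exactly at startOffset or starting exactly at endOffset) is excluded under this reading.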
languagePath - non-null language path that the obtained token sequences will all have.
startOffset - starting offset of the token sequences to get. Use 0 for no limit. If a particular token sequence ends after this offset then it will be returned.
endOffset - ending offset of the token sequences to get. Use Integer.MAX_VALUE for no limit. If a particular token sequence starts before this offset then it will be returned.
Returns: list of TokenSequences, or null if the token hierarchy is inactive (isActive() returns false).
public List<TokenSequence<?>> embeddedTokenSequences(int offset, boolean backwardBias)
Gets the list of all embedded TokenSequences at the given offset.
This method will use the top level TokenSequence in this hierarchy to drill down through the token at the specified offset and all its possible embedded sub-sequences.
If the offset lies at the border between two tokens, the backwardBias parameter will be used to choose either the token on the left hand side of the offset (backwardBias == true) or the token on the right hand side (backwardBias == false).
For token hierarchies over mutable input sources this method must only be invoked within a read-lock over the mutable input source.
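The border rule for backwardBias can be illustrated without the Lexer API. The sketch below is a standalone model, assuming non-empty tokens laid out back to back from offset 0; the class BiasLookup and the method tokenIndexAt are hypothetical names, and the real method returns token sequences rather than an index.

```java
final class BiasLookup {
    /**
     * Returns the index of the token chosen for the offset, or -1 if there is none.
     * tokenEnds[i] is the exclusive end offset of token i; token 0 starts at offset 0.
     */
    static int tokenIndexAt(int[] tokenEnds, int offset, boolean backwardBias) {
        int start = 0;
        for (int i = 0; i < tokenEnds.length; i++) {
            int end = tokenEnds[i];
            if (backwardBias) {
                // Backward bias: the token must start before the offset and reach it.
                if (start < offset && offset <= end) return i;
            } else {
                // Forward bias: the token must start at or before the offset and extend past it.
                if (start <= offset && offset < end) return i;
            }
            start = end;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] ends = {3, 4, 5}; // models the text "int x" split as "int", " ", "x"
        System.out.println(tokenIndexAt(ends, 3, true));  // border: token on the left
        System.out.println(tokenIndexAt(ends, 3, false)); // border: token on the right
        System.out.println(tokenIndexAt(ends, 1, true));  // inside a token: bias does not matter
    }
}
```

In this model, offset 0 with backward bias and the end of the text with forward bias have no token in the requested direction, so the lookup reports -1.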
offset - The offset to look at.
backwardBias - If true the backward lying token will be used when the offset specifies a position between two tokens. If false the forward lying token will be used.
Returns: the list of all sequences embedded at the given offset. The list may be empty if there are no tokens in the top level TokenSequence at the given offset and in the specified direction or if the token hierarchy is inactive (isActive() returns false). The sequences in the list are ordered from the top level sequence to the bottom one.
public Set<LanguagePath> languagePaths()
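Get a set of language paths used by this token hierarchy.
The paths may include embeddings created by LanguageHierarchy.embedding(Token,LanguagePath,InputAttributes).
For token hierarchies over mutable input sources this method must only be invoked within a read-lock over the mutable input source.
Returns: the set of language paths, with a special case when the token hierarchy is inactive (isActive() returns false).
public boolean isMutable()
Whether input text of this token hierarchy is mutable or not.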
public I inputSource()
Get input source providing text over which this token hierarchy was constructed.
It may be a CharSequence, a Reader, or a mutable input source such as a swing text document (Document).
public boolean isActive()
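Token hierarchy may be set inactive to release resources consumed by tokens.
When the hierarchy is inactive, tokenSequence() returns null.
For token hierarchies over mutable input sources this method must only be invoked within a read-lock over the mutable input source.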
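For a swing document, one common way to hold a read lock is Document.render(Runnable), which runs the given code while the document cannot change. The sketch below shows only that locking pattern with javax.swing.text classes; the class ReadLockedAccess and the method lengthUnderReadLock are hypothetical names, and the TokenHierarchy calls are left as a comment because they need the Lexer module on the classpath.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;

final class ReadLockedAccess {
    /** Reads the document length while holding the document's read lock. */
    static int lengthUnderReadLock(Document doc) {
        int[] length = new int[1];
        // Document.render runs the Runnable so that the document cannot change during it.
        doc.render(() -> {
            // With the Lexer API available, calls such as
            // TokenHierarchy.get(doc).isActive() or tokenSequence() would go here.
            length[0] = doc.getLength();
        });
        return length[0];
    }

    public static void main(String[] args) throws BadLocationException {
        Document doc = new PlainDocument();
        doc.insertString(0, "int x", null);
        System.out.println(lengthUnderReadLock(doc));
    }
}
```

Keeping every hierarchy access inside one render call means the offsets read there stay consistent with each other for the duration of the Runnable.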
public void addTokenHierarchyListener(TokenHierarchyListener listener)
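Add listener for token changes inside this hierarchy.
listener - token change listener to be added.
public void removeTokenHierarchyListener(TokenHierarchyListener listener)
Remove listener for token changes inside this hierarchy.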
listener - token change listener to be removed.
Built on June 4 2024. | Copyright © 2017-2024 Apache Software Foundation. All Rights Reserved.