LanguageHierarchy (Lexer)

java.lang.Object
- org.netbeans.spi.lexer.LanguageHierarchy<T>

public abstract class LanguageHierarchy<T extends TokenId>
extends Object

Definition of a language, its lexer and its embedded languages.
It is the SPI-level counterpart of Language and carries the additional information that the lexer infrastructure needs in order to operate.
Language hierarchies should be implemented by SPI providers, who should expose the resulting languages for public use (the language hierarchy classes themselves do not need to be public).
A typical implementation may look like this:


 public enum MyTokenId implements TokenId {

     ERROR(null, "error"),
     IDENTIFIER(null, "identifier"),
     ABSTRACT("abstract", "keyword"),
     ...
     SEMICOLON(";", "separator"),
     ...


     private final String fixedText; // Used by lexer for production of flyweight tokens

     private final String primaryCategory;

     MyTokenId(String fixedText, String primaryCategory) {
         this.fixedText = fixedText;
         this.primaryCategory = primaryCategory;
     }

     public String fixedText() {
         return fixedText;
     }

     public String primaryCategory() {
         return primaryCategory;
     }


     private static final Language<MyTokenId> language = new LanguageHierarchy<MyTokenId>() {
         @Override
         protected String mimeType() {
             return "text/x-my";
         }

         @Override
         protected Collection<MyTokenId> createTokenIds() {
             return EnumSet.allOf(MyTokenId.class);
         }

         @Override
         protected Lexer<MyTokenId> createLexer(LexerRestartInfo<MyTokenId> info) {
             return new MyLexer(info);
         }

     }.language();

     public static Language<MyTokenId> language() {
         return language;
     }

 }

Constructor Summary

Constructors
Constructor and Description
`LanguageHierarchy()`

Method Summary

| Modifier and Type | Method and Description |
| --- | --- |
| `protected abstract Lexer<T>` | `createLexer(LexerRestartInfo<T> info)` Create lexer prepared for returning tokens from subsequent calls to `Lexer.nextToken()`. |
| `protected Map<String,Collection<T>>` | `createTokenCategories()` Provide map of token category names to collection of its members. |
| `protected abstract Collection<T>` | `createTokenIds()` Provide a collection of token ids that comprise the language. |
| `protected TokenValidator<T>` | `createTokenValidator(T tokenId)` Create token validator for the given token id. |
| `protected LanguageEmbedding<?>` | `embedding(Token<T> token, LanguagePath languagePath, InputAttributes inputAttributes)` Get language embedding (if exists) for a particular token of the language at this level of language hierarchy. |
| `protected EmbeddingPresence` | `embeddingPresence(T id)` Determine whether embedding may be present for a token with the given token id. |
| `boolean` | `equals(Object o)` Enforce default implementation of `equals()`. |
| `int` | `hashCode()` Enforce default implementation of `hashCode()`. |
| `protected boolean` | `isRetainTokenText(T tokenId)` This feature is currently not supported - Token.text() will return null for non-flyweight tokens. |
| `Language<T>` | `language()` Get language constructed for this language hierarchy based on token ids and token categories provided. |
| `protected abstract String` | `mimeType()` Gets the mime type of the language constructed from this language hierarchy. |
| `static TokenId` | `newId(String name, int ordinal)` Create a default token id instance in case the token ids are generated (not created by enum class). |
| `static TokenId` | `newId(String name, int ordinal, String primaryCategory)` Create a default token id instance in case the token ids are generated (not created by enum class). |
| `String` | `toString()` |

Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - LanguageHierarchy
```
public LanguageHierarchy()
```
- Method Detail
  - newId
```
public static TokenId newId(String name,
                            int ordinal)
```
    Create a default token id instance in case the token ids are generated (not created by enum class).
  - newId
```
public static TokenId newId(String name,
                            int ordinal,
                            String primaryCategory)
```
    Create a default token id instance in case the token ids are generated (not created by enum class).
  - createTokenIds
```
protected abstract Collection<T> createTokenIds()
```
    Provide a collection of token ids that comprise the language.
    If token ids are defined as enums then this method should simply return EnumSet.allOf(MyTokenId.class).
    This method is only called once by the infrastructure (when constructing the language), so it does not need to cache its result.
    This method is called in a synchronized section. If its implementation uses any synchronization, care must be taken to prevent deadlocks.
    
    Returns:
    
    non-null collection of TokenId instances.
  - createTokenCategories
```
protected Map<String,Collection<T>> createTokenCategories()
```
    Provide map of token category names to collection of its members.
    The results of this method will be merged with the primary-category information found in token ids.
    This method is only called once by the infrastructure (when constructing the language), so it does not need to cache its result.
    This method is called in a synchronized section. If its implementation uses any synchronization, care must be taken to prevent deadlocks.
    By convention, category names should consist only of lowercase letters, numbers and hyphens.
    
    Returns:
    
    mapping of category name to collection of its ids. It may return null to signal no mappings.
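
    The following is a minimal sketch of a body that a createTokenCategories() override could return. To stay compilable without the Lexer module, the hypothetical DemoTokenId enum does not implement TokenId (a real token id enum would), and the DemoCategories helper and its category names are assumptions made for illustration only. The name check simply enforces the convention stated above.

```java
import java.util.Collection;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical token ids; a real enum would implement org.netbeans.api.lexer.TokenId.
enum DemoTokenId { ERROR, IDENTIFIER, ABSTRACT, SEMICOLON, PLUS, MINUS }

class DemoCategories {

    // Convention from the text: lowercase letters, numbers and hyphens only.
    private static final Pattern CATEGORY_NAME = Pattern.compile("[a-z0-9-]+");

    // Candidate body for a createTokenCategories() override (illustrative only).
    static Map<String, Collection<DemoTokenId>> createTokenCategories() {
        Map<String, Collection<DemoTokenId>> cats = new LinkedHashMap<>();
        cats.put("operator", EnumSet.of(DemoTokenId.PLUS, DemoTokenId.MINUS));
        // Every id except IDENTIFIER.
        cats.put("non-identifier",
                EnumSet.complementOf(EnumSet.of(DemoTokenId.IDENTIFIER)));
        for (String name : cats.keySet()) {
            if (!CATEGORY_NAME.matcher(name).matches()) {
                throw new IllegalStateException("Unconventional category name: " + name);
            }
        }
        return cats;
    }
}
```

    Since the method is called only once, building a fresh map on each call, as done here, needs no caching.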
  - createLexer
```
protected abstract Lexer<T> createLexer(LexerRestartInfo<T> info)
```
    Create lexer prepared for returning tokens from subsequent calls to Lexer.nextToken().
    
    Parameters:
    
    info - non-null lexer restart info containing the information necessary for lexer restarting.
  - mimeType
```
protected abstract String mimeType()
```
    Gets the mime type of the language constructed from this language hierarchy.
    
    Returns:
    
    non-null language's mime type.
    
    See Also:
    
    LanguagePath.mimePath()
  - embedding
```
protected LanguageEmbedding<?> embedding(Token<T> token,
                                         LanguagePath languagePath,
                                         InputAttributes inputAttributes)
```
    Get language embedding (if exists) for a particular token of the language at this level of language hierarchy.
    This method is only called if the given token is neither a flyweight token nor a token with custom text, that is, when token.isFlyweight() == false && token.isCustomText() == false.
    This restriction exists because the child token list is constructed lazily and the infrastructure needs to access the token's parent token list, which would not be possible if the token were flyweight.
    
    Parameters:
    
    token - non-null token for which the language embedding will be resolved.
    The token may have a zero length (Token.length() == 0) in case the language infrastructure performs a poll for all embedded languages for the
    
    languagePath - non-null language path at which the language embedding is being created. It may be used for obtaining appropriate information from inputAttributes.
    
    inputAttributes - input attributes that could affect the embedding creation. It may be null if there are no extra attributes.
    
    Returns:
    
    language embedding instance or null if there is no language embedding for this token.
  - embeddingPresence
```
protected EmbeddingPresence embeddingPresence(T id)
```
    Determine whether embedding may be present for a token with the given token id. The embedding for the particular token may either never be present, always present or sometimes present (depending on token's text or properties).
    By default the method returns EmbeddingPresence.CACHED_FIRST_QUERY, so LanguageHierarchy.embedding(Token,LanguagePath,InputAttributes) will be called once (for the first token instance with the given token id); if there is no embedding, embedding creation will not be attempted for any other token with the same token id. This should be appropriate for most cases.
    This method makes it possible to avoid frequent queries checking whether a particular token might contain an embedding.
    
    Parameters:
    
    id - non-null token id.
    
    Returns:
    
    embedding presence for the given token id.
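
    The caching behavior described above can be pictured with a small toy model. This is not NetBeans code: FirstQueryCache, its query method and the resolver function are hypothetical names, and the handling of a successful first query (subsequent tokens keep being queried) is an assumption, since the text above only specifies what happens when the first query finds no embedding.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model of a "cached first query" policy; all names are hypothetical.
class FirstQueryCache<K, E> {

    // TRUE: the first query found an embedding; FALSE: it returned null.
    private final Map<K, Boolean> firstResult = new HashMap<>();
    private int resolverCalls;

    E query(K id, Function<K, E> resolver) {
        Boolean present = firstResult.get(id);
        if (Boolean.FALSE.equals(present)) {
            return null; // first query found nothing: do not ask again
        }
        resolverCalls++;
        E embedding = resolver.apply(id);
        if (present == null) {
            // Remember the outcome of the first query for this id.
            firstResult.put(id, embedding != null);
        }
        return embedding;
    }

    int resolverCalls() {
        return resolverCalls;
    }
}
```

    In this model a token id whose first query returns null is never resolved again, which is the saving that the default presence value is meant to provide.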
  - createTokenValidator
```
protected TokenValidator<T> createTokenValidator(T tokenId)
```
    Create token validator for the given token id.
    
    Parameters:
    
    tokenId - token id for which the token validator should be returned.
    
    Returns:
    
    valid token validator or null if there is no validator for the given token id.
  - isRetainTokenText
```
protected boolean isRetainTokenText(T tokenId)
```
    This feature is currently not supported - Token.text() will return null for non-flyweight tokens.
    Determine whether the text of a token with the particular id should be retained after the token has been removed from the token list because the underlying mutable input source was modified.
    If the text is retained, Token.text() will continue to return the value that it had right before the token's removal.
    This may be useful if the tokens are held directly in parse trees and the parser queries the tokens for text.
    Retaining text in the tokens has performance and memory implications and should only be done selectively for tokens where it is desired (such as identifiers).
    The extra performance and memory penalty is only incurred during the token's removal from the token list for the given input. Token creation performance and memory consumption during the token's lifetime stay unaffected.
    Retaining will only work if the input source is capable of providing the removed text right after the modification has been performed.
    
    Returns:
    
    true if the text should be retained or false if not.
  - language
```
public final Language<T> language()
```
    Get language constructed for this language hierarchy based on token ids and token categories provided.
    
    Returns:
    
    non-null language.
  - hashCode
```
public final int hashCode()
```
    Enforce default implementation of hashCode().
    
    Overrides:
    
    hashCode in class Object
  - equals
```
public final boolean equals(Object o)
```
    Enforce default implementation of equals().
    
    Overrides:
    
    equals in class Object
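
    As an illustration of what enforcing the default equals() and hashCode() implies, the sketch below uses a hypothetical IdentityBased stand-in class rather than LanguageHierarchy itself. Because both methods are final and delegate to Object, two instances are equal only if they are the same object, and subclasses cannot change that.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a class that pins equals()/hashCode() to Object's behavior.
abstract class IdentityBased {
    @Override
    public final boolean equals(Object o) {
        return super.equals(o); // reference equality
    }

    @Override
    public final int hashCode() {
        return super.hashCode(); // identity hash code
    }
}

// A subclass cannot redefine equals() or hashCode(); attempting to is a compile error.
class DemoHierarchy extends IdentityBased {
}

class IdentityDemo {
    static int distinctCount() {
        Set<IdentityBased> set = new HashSet<>();
        DemoHierarchy a = new DemoHierarchy();
        set.add(a);
        set.add(a);                   // same instance: ignored
        set.add(new DemoHierarchy()); // different instance: kept
        return set.size();
    }
}
```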
  - toString
```
public String toString()
```
    Overrides:
    
    toString in class Object
