public class CaseSensitiveContextualAnalyzer
extends org.apache.lucene.analysis.Analyzer
Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words.
You must specify the required Version compatibility (the matchVersion constructor argument) when creating the analyzer.
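The chain described above (tokenize, lower-case, then drop stop words) can be sketched in plain Java without any Lucene dependency. This is only an illustration of the order of the steps, not the analyzer's implementation: the class name `FilterChainSketch`, the method `analyze`, the regular-expression tokenizer, and the four-word stop list are all assumptions made for the example, and the normalization performed by StandardFilter is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class FilterChainSketch {
    // Hypothetical stop list for illustration; NOT the contents of STOP_WORDS_SET.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "and", "the", "of"));

    // Applies the three steps in order: tokenize, lower-case, remove stop words.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        // Crude stand-in for a tokenizer: split on runs of non-letter, non-digit characters.
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // a leading delimiter produces one empty string
            }
            // Lower-casing happens before the stop-word check, so "The" matches "the".
            String lower = raw.toLowerCase(Locale.ROOT);
            if (!stopWords.contains(lower)) {
                tokens.add(lower);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [quick, fox, lazy, dog]
        System.out.println(analyze("The Quick Fox and the Lazy Dog", STOP_WORDS));
    }
}
```

Because the lower-casing step runs before the stop-word check, a stop list only needs lower-case entries.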
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_MAX_TOKEN_LENGTH: Default maximum allowed token length. |
protected static Set<?> | STOP_WORDS_SET: An unmodifiable set containing some common English words that are usually not useful for searching. |
Constructor and Description |
---|
CaseSensitiveContextualAnalyzer(org.apache.lucene.util.Version matchVersion): Builds an analyzer with the default stop words (STOP_WORDS_SET). |
CaseSensitiveContextualAnalyzer(org.apache.lucene.util.Version matchVersion, File stopwords): Builds an analyzer with the stop words from the given file. |
CaseSensitiveContextualAnalyzer(org.apache.lucene.util.Version matchVersion, Reader stopwords): Builds an analyzer with the stop words from the given reader. |
CaseSensitiveContextualAnalyzer(org.apache.lucene.util.Version matchVersion, Set<?> stopWords): Builds an analyzer with the given stop words. |
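The File and Reader constructors take their stop words from a word list. A minimal sketch of that kind of loading, assuming the list holds one word per line with surrounding whitespace ignored (check the WordlistLoader documentation for the exact format it accepts). The class `StopWordReaderSketch` and the method `readStopWords` are hypothetical names for this example, and skipping blank lines is a choice made here rather than documented behavior.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;

public class StopWordReaderSketch {
    // Reads one stop word per line (assumed format), trimming surrounding whitespace.
    static Set<String> readStopWords(Reader reader) throws IOException {
        Set<String> words = new HashSet<>();
        try (BufferedReader lines = new BufferedReader(reader)) {
            String line;
            while ((line = lines.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) { // blank lines are skipped in this sketch
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        Set<String> words = readStopWords(new StringReader("the\n  and \n\nof\n"));
        System.out.println(words.size());          // prints 3
        System.out.println(words.contains("and")); // prints true
    }
}
```

A set built this way could then be passed to the constructor that accepts `Set<?> stopWords`.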
Modifier and Type | Method and Description |
---|---|
int | getMaxTokenLength() |
org.apache.lucene.analysis.TokenStream | reusableTokenStream(String fieldName, Reader reader) |
void | setMaxTokenLength(int length): Set maximum allowed token length. |
org.apache.lucene.analysis.TokenStream | tokenStream(String fieldName, Reader reader): Constructs a StandardTokenizer filtered by a StandardFilter, a LowerCaseFilter and a StopFilter. |
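The getMaxTokenLength()/setMaxTokenLength(int) pair can be pictured with the following plain-Java sketch. It assumes that tokens longer than the limit are discarded rather than truncated, which is how Lucene's StandardTokenizer of the same era behaves but is not stated on this page. The class `MaxTokenLengthSketch`, its whitespace tokenizer, and the initial value of 255 (the default used by Lucene's StandardAnalyzer) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MaxTokenLengthSketch {
    // Assumed default; the value of DEFAULT_MAX_TOKEN_LENGTH is not given on this page.
    private int maxTokenLength = 255;

    public int getMaxTokenLength() {
        return maxTokenLength;
    }

    public void setMaxTokenLength(int length) {
        this.maxTokenLength = length;
    }

    // Splits on whitespace and drops (does not truncate) over-long tokens.
    public List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty() && token.length() <= maxTokenLength) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        MaxTokenLengthSketch sketch = new MaxTokenLengthSketch();
        sketch.setMaxTokenLength(5);
        // prints [short, long, ok]
        System.out.println(sketch.tokenize("short extraordinarily long ok"));
    }
}
```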
protected static final Set<?> STOP_WORDS_SET

public static final int DEFAULT_MAX_TOKEN_LENGTH

public CaseSensitiveContextualAnalyzer(org.apache.lucene.util.Version matchVersion)
Builds an analyzer with the default stop words (STOP_WORDS_SET).
Parameters:
matchVersion - Lucene version to match. See above.

public CaseSensitiveContextualAnalyzer(org.apache.lucene.util.Version matchVersion, Set<?> stopWords)
Parameters:
matchVersion - Lucene version to match. See above.
stopWords - stop words

public CaseSensitiveContextualAnalyzer(org.apache.lucene.util.Version matchVersion, File stopwords) throws IOException
Parameters:
matchVersion - Lucene version to match. See above.
stopwords - File to read stop words from
Throws:
IOException
See Also:
WordlistLoader.getWordSet(File)

public CaseSensitiveContextualAnalyzer(org.apache.lucene.util.Version matchVersion, Reader stopwords) throws IOException
Parameters:
matchVersion - Lucene version to match. See above.
stopwords - Reader to read stop words from
Throws:
IOException
See Also:
WordlistLoader.getWordSet(Reader)

public org.apache.lucene.analysis.TokenStream tokenStream(String fieldName, Reader reader)
Constructs a StandardTokenizer filtered by a StandardFilter, a LowerCaseFilter and a StopFilter.
tokenStream in class org.apache.lucene.analysis.Analyzer

public int getMaxTokenLength()
See Also:
setMaxTokenLength(int)

public void setMaxTokenLength(int length)

public org.apache.lucene.analysis.TokenStream reusableTokenStream(String fieldName, Reader reader) throws IOException
Throws:
IOException
This work is licensed under a Creative Commons Attribution 4.0 International License.