Class NGramFingerprintKeyer
- java.lang.Object
  - org.openrefine.clustering.binning.Keyer
    - org.openrefine.clustering.binning.FingerprintKeyer
      - org.openrefine.clustering.binning.NGramFingerprintKeyer
-
public class NGramFingerprintKeyer extends FingerprintKeyer
Fingerprint keyer which generates a fingerprint from a sorted list of unique character N-grams after removing all whitespace, control characters, and punctuation. N-grams are concatenated to form a single output key.
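The description above can be illustrated with a minimal standalone sketch. It is not the OpenRefine implementation: the class name `NGramKeySketch`, the helpers `clean` and `ngramKey`, the regular expression, and the lowercasing step are all assumptions made for illustration. It only shows the general idea of stripping characters, collecting sorted unique N-grams, and concatenating them.

```java
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class NGramKeySketch {

    // Hypothetical cleaning step (an assumption, not OpenRefine's normalize):
    // lowercases the input, then removes ASCII punctuation, control
    // characters, and whitespace.
    static String clean(String s) {
        return s.toLowerCase(Locale.ROOT)
                .replaceAll("[\\p{Punct}\\p{Cntrl}\\s]", "");
    }

    // Builds a key from the sorted, de-duplicated N-grams of the cleaned
    // string, concatenated without a separator. In this sketch, an input
    // shorter than `size` after cleaning yields an empty key.
    static String ngramKey(String s, int size) {
        String cleaned = clean(s);
        return IntStream.rangeClosed(0, cleaned.length() - size)
                .mapToObj(i -> cleaned.substring(i, i + size))
                .sorted()
                .distinct()
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(ngramKey("banana", 2));    // prints anbana
        System.out.println(ngramKey("Ba-na na!", 2)); // prints anbana
    }
}
```

Both inputs produce the same key because they contain the same set of bigrams (`an`, `ba`, `na`) once punctuation and whitespace are removed, which is what lets a binning keyer group near-duplicate values together.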
-
-
Field Summary
-
Fields inherited from class org.openrefine.clustering.binning.FingerprintKeyer
DIACRITICS_AND_FRIENDS
-
-
Constructor Summary
Constructor                Description
NGramFingerprintKeyer()
-
Method Summary
Modifier and Type            Method                               Description
String                       key(String s, Object... o)
protected TreeSet<String>    ngram_split(String s, int size)      Deprecated. 2020-10-17 by tfmorris.
protected Stream<String>     sorted_ngrams(String s, int size)    Generate a stream of sorted unique character N-grams from a string
-
Methods inherited from class org.openrefine.clustering.binning.FingerprintKeyer
asciify, normalize, normalize, stripDiacritics
-
Method Detail
-
key
public String key(String s, Object... o)
- Overrides:
key in class FingerprintKeyer
-
sorted_ngrams
protected Stream<String> sorted_ngrams(String s, int size)
Generate a stream of sorted unique character N-grams from a string.
- Parameters:
s - String to generate N-grams from
size - number of characters per N-gram
- Returns:
a stream of sorted unique N-gram Strings
-
ngram_split
@Deprecated protected TreeSet<String> ngram_split(String s, int size)
Deprecated. 2020-10-17 by tfmorris. Use sorted_ngrams(String, int) instead.
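Code that relied on the `TreeSet<String>` return type of the deprecated method may be able to migrate by collecting the stream. The sketch below uses a hypothetical stand-in, `sortedNgrams`, with the same shape as `sorted_ngrams(String, int)`; it is not the OpenRefine implementation, and the class and method names are assumptions for illustration only.

```java
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class NGramMigrationSketch {

    // Hypothetical stand-in for a method returning sorted unique N-grams
    // as a Stream<String>; not taken from OpenRefine.
    static Stream<String> sortedNgrams(String s, int size) {
        return IntStream.rangeClosed(0, s.length() - size)
                .mapToObj(i -> s.substring(i, i + size))
                .sorted()
                .distinct();
    }

    // Callers that still need a TreeSet can collect the stream into one.
    static TreeSet<String> asTreeSet(String s, int size) {
        return sortedNgrams(s, size)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        System.out.println(asTreeSet("banana", 2)); // prints [an, ba, na]
    }
}
```

Because `sorted_ngrams` is declared `protected`, a real caller would need to be a subclass or live in the same package.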
-
-