Package org.openrefine.browsing
Class Engine
- java.lang.Object
-
- org.openrefine.browsing.Engine
-
public class Engine extends Object
Faceted browsing engine. Given a Grid and facet configurations, it can be used to compute facet statistics and obtain a filtered view of the grid according to the facets.
It also computes datatype statistics for each column, serialized in the "columnStats" JSON field.
-
-
Nested Class Summary
Nested Classes (modifier and type, class)
static class Engine.Mode
-
Field Summary
Fields (modifier and type, field)
protected EngineConfig _config
protected List<Facet> _facets
protected Grid.PartialAggregation<AllFacetsState> _facetsState
protected Grid _state
static String INCLUDE_DEPENDENT
static String MODE
static String MODE_RECORD_BASED
static String MODE_ROW_BASED
-
Constructor Summary
Constructors
Engine(Grid state, EngineConfig config, long projectId)
-
Method Summary
Methods (modifier and type, method, description)
<T extends Serializable> T aggregateFilteredRecords(RecordAggregator<T> aggregator, T initialState)
    Runs an aggregator only on the records that are selected by facets.
<T extends Serializable> T aggregateFilteredRows(RowAggregator<T> aggregator, T initialState)
    Runs an aggregator only on the rows that are selected by facets.
RecordFilter combinedRecordFilters()
RowFilter combinedRowFilters()
long getAggregatedCount()
    Number of rows on which the facets were actually checked.
List<ColumnStats> getColumnStats()
    Some statistics for each column: reconciliation and data type statistics.
EngineConfig getConfig()
com.google.common.collect.ImmutableList<FacetResult> getFacetResults()
    The state of the computed facets.
protected Grid.PartialAggregation<AllFacetsState> getFacetsState()
long getFilteredCount()
    Number of rows in the filtered grid, among those which have been checked (getAggregatedCount()).
Grid getGrid()
CloseableIterator<Record> getMatchingRecords(SortingConfig sortingConfig)
    Iterates over the records matched by the given filters.
CloseableIterator<IndexedRow> getMatchingRows(SortingConfig sortingConfig)
    Iterates over the rows matched by the given filters.
Engine.Mode getMode()
long getTotalCount()
    Total number of rows/records in the unfiltered grid.
long getTotalRows()
    Total number of rows in the unfiltered grid (needed to provide a link to the last page).
boolean isNeutral()
    True when all the facets are in a neutral state, meaning that they do not filter out any row or record.
boolean limitReached()
    True when the aggregation stopped because of the imposed aggregation limit, in which case statistics such as getFilteredCount() should be understood relative to getAggregatedCount().
static String modeToString(Engine.Mode mode)
static Engine.Mode stringToMode(String s)
-
-
-
Field Detail
-
INCLUDE_DEPENDENT
public static final String INCLUDE_DEPENDENT
- See Also:
- Constant Field Values
-
MODE
public static final String MODE
- See Also:
- Constant Field Values
-
MODE_ROW_BASED
public static final String MODE_ROW_BASED
- See Also:
- Constant Field Values
-
MODE_RECORD_BASED
public static final String MODE_RECORD_BASED
- See Also:
- Constant Field Values
-
_state
protected final Grid _state
-
_config
protected final EngineConfig _config
-
_facetsState
protected Grid.PartialAggregation<AllFacetsState> _facetsState
-
-
Constructor Detail
-
Engine
public Engine(Grid state, EngineConfig config, long projectId)
-
-
Method Detail
-
modeToString
public static String modeToString(Engine.Mode mode)
-
stringToMode
public static Engine.Mode stringToMode(String s)
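A minimal sketch of what such a string/enum round trip might look like. The enum constant names (RowBased, RecordBased) and the string values ("row-based", "record-based") are assumptions made for illustration, since this page does not list them; check Engine.Mode and the Constant Field Values page before relying on them. ModeSketch is a hypothetical stand-in class, not part of OpenRefine, and rejecting unknown strings is a choice made for this sketch only.

```java
// Hypothetical stand-in; not OpenRefine's Engine class.
class ModeSketch {
    // Assumed constant names, meant to mirror Engine.Mode.
    enum Mode { RowBased, RecordBased }

    // Assumed values for MODE_ROW_BASED and MODE_RECORD_BASED.
    static final String MODE_ROW_BASED = "row-based";
    static final String MODE_RECORD_BASED = "record-based";

    // Maps a mode to its string form.
    static String modeToString(Mode mode) {
        return mode == Mode.RowBased ? MODE_ROW_BASED : MODE_RECORD_BASED;
    }

    // Maps a string back to a mode; unknown strings are rejected (sketch-only behaviour).
    static Mode stringToMode(String s) {
        if (MODE_ROW_BASED.equals(s)) {
            return Mode.RowBased;
        }
        if (MODE_RECORD_BASED.equals(s)) {
            return Mode.RecordBased;
        }
        throw new IllegalArgumentException("Unknown engine mode: " + s);
    }
}
```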
-
getMode
public Engine.Mode getMode()
-
getGrid
public Grid getGrid()
-
getConfig
public EngineConfig getConfig()
-
getFacetsState
protected Grid.PartialAggregation<AllFacetsState> getFacetsState()
-
getFacetResults
public com.google.common.collect.ImmutableList<FacetResult> getFacetResults()
The state of the computed facets.
-
isNeutral
public boolean isNeutral()
True when all the facets are in a neutral state, meaning that they do not filter out any row or record.
-
getColumnStats
public List<ColumnStats> getColumnStats()
Some statistics for each column: reconciliation and data type statistics.
-
getTotalCount
public long getTotalCount()
Total number of rows or records in the unfiltered grid.
-
getTotalRows
public long getTotalRows()
Total number of rows in the unfiltered grid (needed to provide a link to the last page)
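The last-page link mentioned here comes down to simple arithmetic on the total row count. A sketch, assuming a zero-based start offset and a fixed page size; PagingSketch, lastPageStart and pageSize are hypothetical names, not part of the Engine API.

```java
// Hypothetical helper; not part of OpenRefine.
class PagingSketch {
    // Zero-based index of the first row shown on the last page.
    static long lastPageStart(long totalRows, long pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        if (totalRows <= 0) {
            return 0; // an empty grid has a single, empty page
        }
        // Index of the last row, rounded down to a multiple of the page size.
        return ((totalRows - 1) / pageSize) * pageSize;
    }
}
```

For instance, with 103 rows and 10 rows per page the last page starts at offset 100.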
-
getAggregatedCount
public long getAggregatedCount()
Number of rows on which the facets were actually checked. Can be less than getTotalCount() if the engine configuration capped the number of rows to process.
-
getFilteredCount
public long getFilteredCount()
Number of rows in the filtered grid, among those which have been checked (getAggregatedCount()).
-
limitReached
public boolean limitReached()
True when the aggregation stopped because of the imposed aggregation limit, in which case statistics such as getFilteredCount() should be understood relative to getAggregatedCount().
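To make the relationship between the counts concrete, here is a sketch that turns them into a matching fraction and, when the limit was reached, a rough extrapolation to the whole grid. The extrapolation assumes the checked rows are representative of the rest, which this page does not promise. CountsSketch and its methods are hypothetical helpers that take the values returned by the getters as plain arguments.

```java
// Hypothetical helper; not part of OpenRefine.
class CountsSketch {
    // Fraction of the checked rows that passed the facets.
    static double matchingFraction(long filteredCount, long aggregatedCount) {
        return aggregatedCount == 0 ? 0.0 : (double) filteredCount / aggregatedCount;
    }

    // Exact when every row was checked; otherwise a proportional estimate.
    static long estimatedMatches(long filteredCount, long aggregatedCount,
                                 long totalCount, boolean limitReached) {
        if (!limitReached) {
            return filteredCount;
        }
        return Math.round(matchingFraction(filteredCount, aggregatedCount) * totalCount);
    }
}
```

With 250 matches among 1000 checked rows in a grid of 4000 rows, the fraction is 0.25 and the proportional estimate is 1000 matching rows.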
-
getMatchingRows
public CloseableIterator<IndexedRow> getMatchingRows(SortingConfig sortingConfig)
Iterates over the rows matched by the given filters. If the engine is in records mode, the rows corresponding to the matching records are returned.
- Parameters:
- sortingConfig - in which order to iterate over rows
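Since the return type is a CloseableIterator, callers are presumably expected to close it once they are done, including when they stop early. A sketch of that pattern with try-with-resources; CloseableIter is a hypothetical stand-in that assumes the real type implements AutoCloseable, and plain list items stand in for IndexedRow values.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins; not OpenRefine types.
class IterationSketch {
    // Assumed shape of a closeable iterator: an Iterator that is also AutoCloseable.
    interface CloseableIter<T> extends Iterator<T>, AutoCloseable {
        @Override
        void close(); // no checked exception, to keep call sites simple
    }

    // Wraps a list and records whether close() was called.
    static final class ListIter<T> implements CloseableIter<T> {
        private final Iterator<T> delegate;
        boolean closed = false;

        ListIter(List<T> items) {
            this.delegate = items.iterator();
        }

        @Override
        public boolean hasNext() {
            return delegate.hasNext();
        }

        @Override
        public T next() {
            return delegate.next();
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    // Reads at most `limit` items; the iterator is closed even though iteration stops early.
    static <T> List<T> firstItems(CloseableIter<T> rows, int limit) {
        List<T> page = new ArrayList<>();
        try (CloseableIter<T> it = rows) {
            while (it.hasNext() && page.size() < limit) {
                page.add(it.next());
            }
        }
        return page;
    }
}
```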
-
getMatchingRecords
public CloseableIterator<Record> getMatchingRecords(SortingConfig sortingConfig)
Iterates over the records matched by the given filters. If the engine is in records mode, the rows corresponding to the matching records are returned.
- Parameters:
- sortingConfig - in which order to iterate over records
-
combinedRowFilters
public RowFilter combinedRowFilters()
- Returns:
- a row filter obtained from all applied facets
-
combinedRecordFilters
public RecordFilter combinedRecordFilters()
- Returns:
- a record filter obtained from all applied facets
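One way to picture a combined filter is as the conjunction of the individual facet filters, so that an item passes only if every facet accepts it. That reading is an assumption here, as are the names below: Predicate stands in for RowFilter and RecordFilter and does not reflect their real signatures.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper; not part of OpenRefine.
class FilterSketch {
    // An empty list yields a filter that accepts everything, like a neutral engine.
    static <T> Predicate<T> combine(List<Predicate<T>> filters) {
        Predicate<T> combined = item -> true;
        for (Predicate<T> filter : filters) {
            combined = combined.and(filter); // every filter must accept the item
        }
        return combined;
    }
}
```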
-
aggregateFilteredRows
public <T extends Serializable> T aggregateFilteredRows(RowAggregator<T> aggregator, T initialState)
Runs an aggregator only on the rows that are selected by facets.
- Parameters:
- aggregator - the aggregator to run on the selected rows
- initialState - the initial state of the aggregator (which should act as a neutral element)
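The neutral-element requirement matters if the grid is processed in chunks whose partial results are merged afterwards: a non-neutral initial state would then be counted once per chunk. The sketch below illustrates this with a hypothetical Aggregator interface (the withRow and sum method names are assumptions, not taken from this page) and plain strings in place of rows. Chunked evaluation is likewise an assumption about how an implementation might run.

```java
import java.util.List;

// Hypothetical stand-ins; not OpenRefine types.
class AggregationSketch {
    // Assumed aggregator shape: fold one row into a state, and merge two states.
    interface Aggregator<T> {
        T withRow(T state, String row);

        T sum(T first, T second);
    }

    // Counts the non-empty rows; 0 is the neutral element of its sum.
    static final Aggregator<Long> NON_EMPTY_COUNT = new Aggregator<Long>() {
        @Override
        public Long withRow(Long state, String row) {
            return row.isEmpty() ? state : state + 1;
        }

        @Override
        public Long sum(Long first, Long second) {
            return first + second;
        }
    };

    // Folds a single chunk of rows, starting from the given state.
    static <T> T fold(Aggregator<T> aggregator, T initial, List<String> rows) {
        T state = initial;
        for (String row : rows) {
            state = aggregator.withRow(state, row);
        }
        return state;
    }

    // Folds each chunk from the same initial state, then merges the partial results.
    static <T> T aggregateInChunks(Aggregator<T> aggregator, T initial, List<List<String>> chunks) {
        T total = initial;
        for (List<String> chunk : chunks) {
            total = aggregator.sum(total, fold(aggregator, initial, chunk));
        }
        return total;
    }
}
```

Starting from 0, two chunks holding three non-empty rows in total yield 3. Starting from any other value, that value would be added once per chunk plus once more for the running total, inflating the result.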
-
aggregateFilteredRecords
public <T extends Serializable> T aggregateFilteredRecords(RecordAggregator<T> aggregator, T initialState)
Runs an aggregator only on the records that are selected by facets.
- Parameters:
- aggregator - the aggregator to run on the selected records
- initialState - the initial state of the aggregator (which should act as a neutral element)
-
-