Package org.openrefine.model.changes
Interface RecordChangeDataProducer<T>
-
- Type Parameters:
T
-
- All Superinterfaces:
Serializable
- All Known Implementing Classes:
ColumnAdditionByFetchingURLsOperation.URLFetchingChangeProducer, ExtendDataOperation.DataExtensionProducer, ReconOperation.ReconChangeDataProducer, RowInRecordChangeDataProducer
public interface RecordChangeDataProducer<T> extends Serializable
A function which computes change data to be persisted to disk, to be later joined back to the project to produce the new grid. This data might be serialized because it is volatile or expensive to compute. This is the record-wise equivalent to RowChangeDataProducer.
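As a rough illustration of the contract described above, the following sketch implements a stand-in interface with the same shape. SketchRecord and SketchProducer are hypothetical simplifications written for this example and are not the OpenRefine types; only the method name call and the Serializable superinterface are taken from this page. A record is modelled here, as an assumption, as a plain list of rows.

```java
import java.io.Serializable;
import java.util.List;

class RecordProducerSketch {

    // Hypothetical stand-in for a record: a group of rows, each row a list of cell values.
    static final class SketchRecord implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<List<String>> rows;

        SketchRecord(List<List<String>> rows) {
            this.rows = rows;
        }
    }

    // Stand-in mirroring the documented shape: one abstract method, Serializable superinterface.
    interface SketchProducer<T> extends Serializable {
        T call(SketchRecord record);
    }

    // Example producer: the change data computed for a record is the number of rows it spans.
    static final class RowCountProducer implements SketchProducer<Integer> {
        private static final long serialVersionUID = 1L;

        @Override
        public Integer call(SketchRecord record) {
            return record.rows.size();
        }
    }

    public static void main(String[] args) {
        SketchRecord record = new SketchRecord(List.of(
                List.of("Paris", "France"),
                List.of("", "Lyon")));
        System.out.println(new RowCountProducer().call(record));
    }
}
```

The value returned by call is what would be persisted and later joined back to the grid; how that join is performed is outside the scope of this sketch.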
Method Summary
Modifier and Type | Method | Description
T | call(Record record) | Compute the change data on a given record.
default List<T> | callRecordBatch(List<Record> records) | Compute the change data on a batch of consecutive records.
default int | getBatchSize() | The size of batches this producer would like to be called on.
default int | getMaxConcurrency() | The maximum number of concurrent calls to this change data producer.
Method Detail
-
callRecordBatch
default List<T> callRecordBatch(List<Record> records)
Compute the change data on a batch of consecutive records. This defaults to individual calls if the method is not overridden.
- Parameters:
records - the list of records to fetch change data on
- Returns:
a list of the same size
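The default behaviour described above, and a reason to override it, can be sketched as follows. This is an assumption about the general shape, not the OpenRefine source: SketchProducer is a hypothetical stand-in in which records are plain strings, and BulkUpperCaseProducer stands for any producer that can handle a whole batch in a single operation (for instance one bulk request instead of many single ones).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class BatchDefaultSketch {

    // Hypothetical stand-in for the interface documented here; records are plain strings.
    interface SketchProducer<T> {
        T call(String record);

        // One plausible shape of the documented default: one call per record, in order.
        default List<T> callRecordBatch(List<String> records) {
            List<T> results = new ArrayList<>(records.size());
            for (String record : records) {
                results.add(call(record));
            }
            return results;
        }
    }

    // Relies on the default batch method.
    static final class LengthProducer implements SketchProducer<Integer> {
        @Override
        public Integer call(String record) {
            return record.length();
        }
    }

    // Overrides the batch method so that a whole batch is processed in one operation.
    static final class BulkUpperCaseProducer implements SketchProducer<String> {
        int bulkCalls = 0; // counts how many bulk operations were issued

        @Override
        public String call(String record) {
            return callRecordBatch(List.of(record)).get(0);
        }

        @Override
        public List<String> callRecordBatch(List<String> records) {
            bulkCalls++;
            List<String> results = new ArrayList<>(records.size());
            for (String record : records) {
                results.add(record.toUpperCase(Locale.ROOT));
            }
            return results; // same size and order as the input
        }
    }

    public static void main(String[] args) {
        List<String> batch = List.of("a", "bb", "ccc");
        System.out.println(new LengthProducer().callRecordBatch(batch));
        System.out.println(new BulkUpperCaseProducer().callRecordBatch(batch));
    }
}
```

In both variants the returned list has one entry per input record, in the same order, which matches the "list of the same size" requirement stated above.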
-
getBatchSize
default int getBatchSize()
The size of batches this producer would like to be called on. Smaller batches can be submitted (for instance at the end of a partition). Defaults to 1.
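To illustrate why smaller batches can occur at the end of a partition, here is a minimal sketch of how a caller might cut a partition into batches. The helper toBatches is hypothetical and written for this example; the way OpenRefine actually forms batches may differ.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSizeSketch {

    // Hypothetical caller-side helper: split one partition into consecutive
    // batches of at most batchSize elements. The last batch may be smaller.
    static <R> List<List<R>> toBatches(List<R> partition, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<R>> batches = new ArrayList<>();
        for (int start = 0; start < partition.size(); start += batchSize) {
            int end = Math.min(start + batchSize, partition.size());
            batches.add(new ArrayList<>(partition.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> partition = List.of("r0", "r1", "r2", "r3", "r4", "r5", "r6");
        // With a preferred batch size of 3, seven records give batches of 3, 3 and 1.
        for (List<String> batch : toBatches(partition, 3)) {
            System.out.println(batch);
        }
    }
}
```

With the default batch size of 1, every batch produced this way contains a single record.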
-
getMaxConcurrency
default int getMaxConcurrency()
The maximum number of concurrent calls to this change data producer. If 0, there is no limit to the concurrency.
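One way a caller could honour this limit is with a semaphore, as in the sketch below. The runAll helper is hypothetical and is not how OpenRefine necessarily schedules calls; it only illustrates the documented convention that a positive value caps the number of calls in flight and 0 means no cap.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class MaxConcurrencySketch {

    // Hypothetical caller: applies `call` to every record with at most
    // maxConcurrency calls in flight; 0 is treated as "no limit".
    static <R, T> List<T> runAll(List<R> records, Function<R, T> call, int maxConcurrency) {
        Semaphore permits = maxConcurrency > 0 ? new Semaphore(maxConcurrency) : null;
        // A cached pool keeps the sketch short; it creates one thread per pending task.
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            List<Future<T>> futures = new ArrayList<>();
            for (R record : records) {
                futures.add(pool.submit(() -> {
                    if (permits != null) permits.acquire();
                    try {
                        return call.apply(record);
                    } finally {
                        if (permits != null) permits.release();
                    }
                }));
            }
            List<T> results = new ArrayList<>();
            for (Future<T> future : futures) {
                try {
                    results.add(future.get()); // results keep the order of the input
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<Integer> squares = runAll(List.of(1, 2, 3, 4, 5, 6), n -> {
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            try {
                Thread.sleep(20); // simulate a slow call
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            inFlight.decrementAndGet();
            return n * n;
        }, 2);
        System.out.println(squares + ", peak concurrency observed: " + peak.get());
    }
}
```

A limit of this kind is typically useful when each call reaches an external service that would be overloaded by unbounded parallel requests.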