Package org.openrefine.runners.local.pll
Class CroppedPLL<T>
- java.lang.Object
-
- org.openrefine.runners.local.pll.PLL<T>
-
- org.openrefine.runners.local.pll.CroppedPLL<T>
-
- Type Parameters:
T
-
public class CroppedPLL<T> extends PLL<T>
A PLL obtained by removing some rows at the beginning or the end of a PLL.
- Author:
- Antonin Delpeuch
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
protected static class | CroppedPLL.CroppedPartition |
-
Nested classes/interfaces inherited from class org.openrefine.runners.local.pll.PLL
PLL.LastFlush, PLL.PLLExecutionError
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected boolean | atEnd |
protected long | itemsToDrop |
protected io.vavr.collection.Array<CroppedPLL.CroppedPartition> | partitions |
protected io.vavr.collection.Array<Long> | partitionSizes |
protected PLL<T> | pll |
-
Fields inherited from class org.openrefine.runners.local.pll.PLL
cachedPartitions, context, id, name
-
-
Constructor Summary
Constructors
Constructor | Description
CroppedPLL(PLL<T> parent, io.vavr.collection.Array<Long> newPartitionSizes, int partitionsToDrop, long dropItems, boolean atEnd) | Constructs a cropped PLL by removing rows at the beginning or the end of a PLL.
-
Method Summary
Modifier and Type | Method | Description
protected CloseableIterator<T> | compute(Partition partition) | Iterate over the elements of the given partition.
io.vavr.collection.Array<Long> | computePartitionSizes() |
int | getDroppedPartitions() | The difference between the parent's number of partitions and the new number of partitions in this PLL.
List<PLL<?>> | getParents() | Returns the PLLs that this PLL depends on, to compute its contents.
io.vavr.collection.Array<? extends Partition> | getPartitions() |
boolean | hasCachedPartitionSizes() | Is this PLL aware of the size of its partitions?
-
Methods inherited from class org.openrefine.runners.local.pll.PLL
aggregate, batchPartitions, cacheAsync, collect, collectPartitionsAsync, concatenate, concatenate, count, dropFirstElements, dropLastElements, filter, flatMap, getContext, getId, getPartitionSizes, getQueryTree, isCached, isEmpty, iterate, iterateFromPartition, iterator, limitPartitions, map, mapPartitions, mapToPair, mapToPair, numPartitions, retainPartitions, runOnPartitions, runOnPartitions, runOnPartitionsAsync, runOnPartitionsAsync, runOnPartitionsWithoutInterruption, runOnPartitionsWithoutInterruption, saveAsTextFile, saveAsTextFileAsync, scanMap, scanMapStream, sort, take, toString, uncache, withCachedPartitionSizes, writeOriginalPartition, writePartition, writePlannedPartition, zipWithIndex
-
-
-
-
Field Detail
-
itemsToDrop
protected final long itemsToDrop
-
atEnd
protected final boolean atEnd
-
partitions
protected final io.vavr.collection.Array<CroppedPLL.CroppedPartition> partitions
-
partitionSizes
protected final io.vavr.collection.Array<Long> partitionSizes
-
-
Constructor Detail
-
CroppedPLL
public CroppedPLL(PLL<T> parent, io.vavr.collection.Array<Long> newPartitionSizes, int partitionsToDrop, long dropItems, boolean atEnd)
Constructs a cropped PLL by removing rows at the beginning or the end of a PLL.
- Parameters:
parent - the PLL which should be cropped
newPartitionSizes - the resulting partition sizes after the cropping. This must be provided.
partitionsToDrop - the number of partitions to be dropped entirely
dropItems - the number of items to drop in the first partition that is not dropped
atEnd - false if the partitions and items should be dropped at the beginning, true if at the end
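The relationship between newPartitionSizes, partitionsToDrop and dropItems can be sketched in plain Java for the atEnd = false case. This is only an illustration under stated assumptions: CropPlan and dropFirst are hypothetical names that do not exist in OpenRefine, partition sizes are modelled as a java.util.List<Long> rather than an io.vavr.collection.Array<Long>, and the way the runner actually derives these values may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it is not part of OpenRefine.
class CropPlan {
    final List<Long> newPartitionSizes;
    final int partitionsToDrop;
    final long dropItems;

    CropPlan(List<Long> newPartitionSizes, int partitionsToDrop, long dropItems) {
        this.newPartitionSizes = newPartitionSizes;
        this.partitionsToDrop = partitionsToDrop;
        this.dropItems = dropItems;
    }

    // Plans the removal of the first n items (the atEnd = false case).
    static CropPlan dropFirst(List<Long> parentSizes, long n) {
        int partitionsToDrop = 0;
        long remaining = n;
        // Leading partitions that are consumed entirely are dropped as a whole.
        while (partitionsToDrop < parentSizes.size()
                && remaining > 0
                && remaining >= parentSizes.get(partitionsToDrop)) {
            remaining -= parentSizes.get(partitionsToDrop);
            partitionsToDrop++;
        }
        if (partitionsToDrop == parentSizes.size()) {
            remaining = 0; // every partition was dropped, nothing is left to trim
        }
        List<Long> newSizes = new ArrayList<>();
        for (int i = partitionsToDrop; i < parentSizes.size(); i++) {
            long size = parentSizes.get(i);
            // Only the first retained partition loses the remaining items.
            newSizes.add(i == partitionsToDrop ? size - remaining : size);
        }
        return new CropPlan(newSizes, partitionsToDrop, remaining);
    }
}
```

With parent partition sizes 4, 3 and 5, dropping the first 5 items drops the first partition entirely, drops 1 item from the next one, and leaves partitions of sizes 2 and 5.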
-
-
Method Detail
-
compute
protected CloseableIterator<T> compute(Partition partition)
Description copied from class: PLL
Iterate over the elements of the given partition. This is the method that should be implemented by subclasses. As this method forces computation, ignoring any caching, consumers should not call it directly but rather use PLL.iterate(Partition). Once the iterator is not needed anymore, it should be closed. This makes it possible to release the underlying resources supporting it, such as open files or sockets.
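The closing discipline described above, combined with skipping leading items, can be sketched in plain Java. The types below (ClosingIterator, SkippingIterator, ComputeSketch) are hypothetical stand-ins written for this example. They are not OpenRefine's CloseableIterator or the actual body of compute, and they only show one plausible way to skip items and release resources with try-with-resources.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a closeable iterator; not OpenRefine's own type.
interface ClosingIterator<T> extends Iterator<T>, AutoCloseable {
    @Override
    void close(); // narrowed so that closing throws no checked exception
}

// Wraps a source iterator and discards its first itemsToSkip elements.
class SkippingIterator<T> implements ClosingIterator<T> {
    private final Iterator<T> source;
    boolean closed = false;

    SkippingIterator(Iterator<T> source, long itemsToSkip) {
        this.source = source;
        // Discard the leading items eagerly, stopping early if the source runs out.
        for (long i = 0; i < itemsToSkip && source.hasNext(); i++) {
            source.next();
        }
    }

    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public T next() {
        return source.next();
    }

    @Override
    public void close() {
        // A real implementation would release files or sockets here.
        closed = true;
    }
}

class ComputeSketch {
    // Concatenates the items that remain after skipping, closing the iterator afterwards.
    static String joinAfterSkipping(List<String> items, long itemsToSkip) {
        StringBuilder sb = new StringBuilder();
        try (SkippingIterator<String> it = new SkippingIterator<>(items.iterator(), itemsToSkip)) {
            while (it.hasNext()) {
                sb.append(it.next());
            }
        }
        return sb.toString();
    }
}
```

The try-with-resources statement guarantees that close() runs even if iteration throws, which is the property the description asks consumers to preserve.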
-
getPartitions
public io.vavr.collection.Array<? extends Partition> getPartitions()
- Specified by:
getPartitions
in class PLL<T>
- Returns:
- the partitions in this list
-
getParents
public List<PLL<?>> getParents()
Description copied from class: PLL
Returns the PLLs that this PLL depends on, to compute its contents. This is used for debugging purposes, to display the tree of dependencies of a given PLL.
- Specified by:
getParents
in class PLL<T>
- See Also:
PLL.getQueryTree()
-
hasCachedPartitionSizes
public boolean hasCachedPartitionSizes()
Description copied from class: PLL
Is this PLL aware of the size of its partitions?
- Overrides:
hasCachedPartitionSizes
in class PLL<T>
-
computePartitionSizes
public io.vavr.collection.Array<Long> computePartitionSizes()
- Overrides:
computePartitionSizes
in class PLL<T>
-
getDroppedPartitions
public int getDroppedPartitions()
The difference between the parent's number of partitions and the new number of partitions in this PLL.
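As a worked example of this difference, the sketch below crops items from the end of a list of partition sizes and then compares partition counts. EndCrop and its methods are hypothetical names introduced for illustration, sizes are modelled as a java.util.List<Long>, and the real class may compute its partition sizes differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it is not part of OpenRefine.
class EndCrop {
    // Partition sizes left after removing the last n items (the atEnd = true case).
    static List<Long> sizesAfterDropLast(List<Long> parentSizes, long n) {
        List<Long> sizes = new ArrayList<>(parentSizes);
        long remaining = n;
        // Trailing partitions that are consumed entirely disappear as a whole.
        while (!sizes.isEmpty()
                && remaining > 0
                && remaining >= sizes.get(sizes.size() - 1)) {
            remaining -= sizes.remove(sizes.size() - 1);
        }
        if (!sizes.isEmpty()) {
            // The last retained partition loses whatever is still to be dropped.
            int last = sizes.size() - 1;
            sizes.set(last, sizes.get(last) - remaining);
        }
        return sizes;
    }

    // Number of partitions removed entirely by the cropping.
    static int droppedPartitions(List<Long> parentSizes, List<Long> croppedSizes) {
        return parentSizes.size() - croppedSizes.size();
    }
}
```

With parent partition sizes 4, 3 and 5, dropping the last 6 items leaves partitions of sizes 4 and 2, so one partition has been dropped.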
-
-