Package org.openrefine.runners.local.pll
Class InMemoryPLL<T>
- java.lang.Object
-
- org.openrefine.runners.local.pll.PLL<T>
-
- org.openrefine.runners.local.pll.InMemoryPLL<T>
-
- Type Parameters:
T
-
public class InMemoryPLL<T> extends PLL<T>
A PLL created from a regular Java collection. The collection is split into contiguous partitions, each of which can be enumerated independently.
- Author:
- Antonin Delpeuch
-
-
Nested Class Summary
Modifier and Type, Class
protected static class InMemoryPLL.InMemoryPartition
-
Nested classes/interfaces inherited from class org.openrefine.runners.local.pll.PLL
PLL.LastFlush, PLL.PLLExecutionError
-
-
Field Summary
Modifier and Type, Field
protected ArrayList<T> list
protected io.vavr.collection.Array<InMemoryPLL.InMemoryPartition> partitions
-
Fields inherited from class org.openrefine.runners.local.pll.PLL
cachedPartitions, context, id, name
-
-
Constructor Summary
InMemoryPLL(PLLContext context, Collection<T> elements, int nbPartitions)
-
Method Summary
Modifier and Type, Method, Description
ProgressingFuture<Void> cacheAsync()
    Loads the contents of all partitions in memory.
CloseableIterator<T> compute(Partition partition)
    Iterate over the elements of the given partition.
protected io.vavr.collection.Array<Long> computePartitionSizes()
protected static io.vavr.collection.Array<InMemoryPLL.InMemoryPartition> createPartitions(int size, int nbPartitions)
    Computes the list of partitions given the size of the collection and the desired number of partitions.
List<PLL<?>> getParents()
    Returns the PLLs that this PLL depends on, to compute its contents.
io.vavr.collection.Array<? extends Partition> getPartitions()
boolean hasCachedPartitionSizes()
    Is this PLL aware of the size of its partitions?
boolean isCached()
    Are the contents of this collection loaded in memory?
void uncache()
    Unloads the partition contents from memory.
-
Methods inherited from class org.openrefine.runners.local.pll.PLL
aggregate, batchPartitions, collect, collectPartitionsAsync, concatenate, concatenate, count, dropFirstElements, dropLastElements, filter, flatMap, getContext, getId, getPartitionSizes, getQueryTree, isEmpty, iterate, iterateFromPartition, iterator, limitPartitions, map, mapPartitions, mapToPair, mapToPair, numPartitions, retainPartitions, runOnPartitions, runOnPartitions, runOnPartitionsAsync, runOnPartitionsAsync, runOnPartitionsWithoutInterruption, runOnPartitionsWithoutInterruption, saveAsTextFile, saveAsTextFileAsync, scanMap, scanMapStream, sort, take, toString, withCachedPartitionSizes, writeOriginalPartition, writePartition, writePlannedPartition, zipWithIndex
-
Field Detail
-
partitions
protected final io.vavr.collection.Array<InMemoryPLL.InMemoryPartition> partitions
-
-
Constructor Detail
-
InMemoryPLL
public InMemoryPLL(PLLContext context, Collection<T> elements, int nbPartitions)
-
-
Method Detail
-
compute
public CloseableIterator<T> compute(Partition partition)
Description copied from class: PLL
Iterate over the elements of the given partition. This is the method that should be implemented by subclasses. As this method forces computation, ignoring any caching, consumers should not call it directly but rather use PLL.iterate(Partition). Once the iterator is not needed anymore, it should be closed. This makes it possible to release the underlying resources supporting it, such as open files or sockets.
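The contract described above (enumerate one partition, then close the iterator) can be illustrated with a minimal, self-contained sketch. It does not use the OpenRefine classes: `ClosingIterator`, `iteratePartition` and `drain` are hypothetical names introduced only for this example, and a partition is assumed to be a half-open range of positions in the backing list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class PartitionIterationSketch {

    /** Hypothetical stand-in for a closeable iterator; not the OpenRefine CloseableIterator type. */
    public interface ClosingIterator<T> extends Iterator<T>, AutoCloseable {
        @Override
        void close(); // narrowed so that callers need not handle a checked exception
    }

    /** Iterates over positions [start, end) of the backing list (assumed partition layout). */
    public static <T> ClosingIterator<T> iteratePartition(List<T> list, int start, int end) {
        final Iterator<T> inner = list.subList(start, end).iterator();
        return new ClosingIterator<T>() {
            private boolean closed = false;

            @Override
            public boolean hasNext() {
                return !closed && inner.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return inner.next();
            }

            @Override
            public void close() {
                // Nothing to release for an in-memory list; a file-backed
                // iterator would close its stream here.
                closed = true;
            }
        };
    }

    /** Collects one partition, closing the iterator via try-with-resources. */
    public static <T> List<T> drain(List<T> list, int start, int end) {
        List<T> out = new ArrayList<>();
        try (ClosingIterator<T> it = iteratePartition(list, start, end)) {
            while (it.hasNext()) {
                out.add(it.next());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c", "d", "e");
        // Each partition is enumerated independently of the other.
        System.out.println(drain(data, 0, 3));
        System.out.println(drain(data, 3, 5));
    }
}
```

Using try-with-resources keeps the close call on every exit path, which matters more for iterators backed by files or sockets than for the in-memory case sketched here.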
-
computePartitionSizes
protected io.vavr.collection.Array<Long> computePartitionSizes()
- Overrides: computePartitionSizes in class PLL<T>
-
hasCachedPartitionSizes
public boolean hasCachedPartitionSizes()
Description copied from class: PLL
Is this PLL aware of the size of its partitions?
- Overrides: hasCachedPartitionSizes in class PLL<T>
-
getPartitions
public io.vavr.collection.Array<? extends Partition> getPartitions()
- Specified by: getPartitions in class PLL<T>
- Returns: the partitions in this list
-
cacheAsync
public ProgressingFuture<Void> cacheAsync()
Description copied from class: PLL
Loads the contents of all partitions in memory.
- Overrides: cacheAsync in class PLL<T>
-
isCached
public boolean isCached()
Description copied from class: PLL
Are the contents of this collection loaded in memory?
-
uncache
public void uncache()
Description copied from class: PLL
Unloads the partition contents from memory.
-
getParents
public List<PLL<?>> getParents()
Description copied from class: PLL
Returns the PLLs that this PLL depends on, to compute its contents. This is used for debugging purposes, to display the tree of dependencies of a given PLL.
- Specified by: getParents in class PLL<T>
- See Also: PLL.getQueryTree()
-
createPartitions
protected static io.vavr.collection.Array<InMemoryPLL.InMemoryPartition> createPartitions(int size, int nbPartitions)
Computes the list of partitions given the size of the collection and the desired number of partitions.
- Parameters:
size - the size of the collection
nbPartitions - the desired number of partitions
- Returns: the list of partitions
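As an illustration of how a collection size and a desired partition count can be turned into contiguous partitions, here is a minimal sketch. It is not the OpenRefine implementation: the `Range` class is a hypothetical stand-in for a partition, and spreading the remainder over the first partitions is an assumed policy that the actual method may not follow.

```java
import java.util.ArrayList;
import java.util.List;

public class PartitionSketch {

    /** Hypothetical stand-in for a partition: positions [start, end) of the backing list. */
    public static final class Range {
        public final int index;
        public final int start;
        public final int end;

        public Range(int index, int start, int end) {
            this.index = index;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "partition " + index + ": [" + start + ", " + end + ")";
        }
    }

    /**
     * Splits positions 0..size-1 into nbPartitions contiguous ranges.
     * Assumed policy: sizes differ by at most one, larger ranges come first.
     */
    public static List<Range> createPartitions(int size, int nbPartitions) {
        if (size < 0 || nbPartitions < 1) {
            throw new IllegalArgumentException("size must be >= 0 and nbPartitions >= 1");
        }
        int base = size / nbPartitions;      // minimum length of every range
        int remainder = size % nbPartitions; // this many ranges get one extra element
        List<Range> ranges = new ArrayList<>(nbPartitions);
        int start = 0;
        for (int i = 0; i < nbPartitions; i++) {
            int length = base + (i < remainder ? 1 : 0);
            ranges.add(new Range(i, start, start + length));
            start += length;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (Range r : createPartitions(10, 3)) {
            System.out.println(r);
        }
    }
}
```

Under this assumed policy, a collection of 10 elements split into 3 partitions gives the ranges [0, 4), [4, 7) and [7, 10). Because each range ends where the next begins, every element belongs to exactly one partition.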
-