github.com/searKing/golang/go@v1.2.74/util/spliterator/spliterator.go

// Copyright 2020 The searKing Author. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

// Package spliterator declares Spliterator, an object for traversing and
// partitioning the elements of a source.
// SEE java/util/Spliterator.java
package spliterator

import (
	"context"

	"github.com/searKing/golang/go/util/function/consumer"
	"github.com/searKing/golang/go/util/object"
)

/**
 * An object for traversing and partitioning elements of a source. The source
 * of elements covered by a Spliterator could be, for example, an array, a
 * {@link Collection}, an IO channel, or a generator function.
 *
 * <p>A Spliterator may traverse elements individually ({@link
 * #tryAdvance tryAdvance()}) or sequentially in bulk
 * ({@link #forEachRemaining forEachRemaining()}).
 *
 * <p>A Spliterator may also partition off some of its elements (using
 * {@link #trySplit}) as another Spliterator, to be used in
 * possibly-parallel operations. Operations using a Spliterator that
 * cannot split, or does so in a highly imbalanced or inefficient
 * manner, are unlikely to benefit from parallelism. Traversal
 * and splitting exhaust elements; each Spliterator is useful for only a single
 * bulk computation.
 *
 * <p>A Spliterator also reports a set of {@link #characteristics()} of its
 * structure, source, and elements from among {@link #ORDERED},
 * {@link #SpliteratorDISTINCT}, {@link #SORTED}, {@link #SpliteratorSIZED}, {@link #NONNULL},
 * {@link #IMMUTABLE}, {@link #CONCURRENT}, and {@link #SUBSIZED}. These may
 * be employed by Spliterator clients to control, specialize or simplify
 * computation.
 * For example, a Spliterator for a {@link Collection} would
 * report {@code SpliteratorSIZED}, a Spliterator for a {@link Set} would report
 * {@code SpliteratorDISTINCT}, and a Spliterator for a {@link SortedSet} would also
 * report {@code SORTED}. Characteristics are reported as a simple unioned bit
 * set.
 *
 * Some characteristics additionally constrain method behavior; for example if
 * {@code ORDERED}, traversal methods must conform to their documented ordering.
 * New characteristics may be defined in the future, so implementors should not
 * assign meanings to unlisted values.
 *
 * <p><a id="binding">A Spliterator that does not report {@code IMMUTABLE} or
 * {@code CONCURRENT} is expected to have a documented policy concerning:
 * when the spliterator <em>binds</em> to the element source; and detection of
 * structural interference of the element source detected after binding.</a> A
 * <em>late-binding</em> Spliterator binds to the source of elements at the
 * point of first traversal, first split, or first query for estimated size,
 * rather than at the time the Spliterator is created. A Spliterator that is
 * not <em>late-binding</em> binds to the source of elements at the point of
 * construction or first invocation of any method. Modifications made to the
 * source prior to binding are reflected when the Spliterator is traversed.
 * After binding a Spliterator should, on a best-effort basis, throw
 * {@link ConcurrentModificationException} if structural interference is
 * detected. Spliterators that do this are called <em>fail-fast</em>. The
 * bulk traversal method ({@link #forEachRemaining forEachRemaining()}) of a
 * Spliterator may optimize traversal and check for structural interference
 * after all elements have been traversed, rather than checking per-element and
 * failing immediately.
 *
 * <p>Spliterators can provide an estimate of the number of remaining elements
 * via the {@link #estimateSize} method. Ideally, as reflected in characteristic
 * {@link #SpliteratorSIZED}, this value corresponds exactly to the number of elements
 * that would be encountered in a successful traversal. However, even when not
 * exactly known, an estimated value may still be useful to operations
 * being performed on the source, such as helping to determine whether it is
 * preferable to split further or traverse the remaining elements sequentially.
 *
 * <p>Despite their obvious utility in parallel algorithms, spliterators are not
 * expected to be thread-safe; instead, implementations of parallel algorithms
 * using spliterators should ensure that the spliterator is only used by one
 * thread at a time. This is generally easy to attain via <em>serial
 * thread-confinement</em>, which often is a natural consequence of typical
 * parallel algorithms that work by recursive decomposition. A thread calling
 * {@link #trySplit()} may hand over the returned Spliterator to another thread,
 * which in turn may traverse or further split that Spliterator. The behaviour
 * of splitting and traversal is undefined if two or more threads operate
 * concurrently on the same spliterator. If the original thread hands a
 * spliterator off to another thread for processing, it is best if that handoff
 * occurs before any elements are consumed with {@link #tryAdvance(Consumer)
 * tryAdvance()}, as certain guarantees (such as the accuracy of
 * {@link #estimateSize()} for {@code SpliteratorSIZED} spliterators) are only valid before
 * traversal has begun.
 *
 * <p>Primitive subtype specializations of {@code Spliterator} are provided for
 * {@link OfInt int}, {@link OfLong long}, and {@link OfDouble double} values.
 * The subtype default implementations of
 * {@link Spliterator#tryAdvance(java.util.function.Consumer)}
 * and {@link Spliterator#forEachRemaining(java.util.function.Consumer)} box
 * primitive values to instances of their corresponding wrapper class. Such
 * boxing may undermine any performance advantages gained by using the primitive
 * specializations. To avoid boxing, the corresponding primitive-based methods
 * should be used. For example,
 * {@link Spliterator.OfInt#tryAdvance(java.util.function.IntConsumer)}
 * and {@link Spliterator.OfInt#forEachRemaining(java.util.function.IntConsumer)}
 * should be used in preference to
 * {@link Spliterator.OfInt#tryAdvance(java.util.function.Consumer)} and
 * {@link Spliterator.OfInt#forEachRemaining(java.util.function.Consumer)}.
 * Traversal of primitive values using boxing-based methods
 * {@link #tryAdvance tryAdvance()} and
 * {@link #forEachRemaining(java.util.function.Consumer) forEachRemaining()}
 * does not affect the order in which the values, transformed to boxed values,
 * are encountered.
 *
 * @apiNote
 * <p>Spliterators, like {@code Iterator}s, are for traversing the elements of
 * a source. The {@code Spliterator} API was designed to support efficient
 * parallel traversal in addition to sequential traversal, by supporting
 * decomposition as well as single-element iteration. In addition, the
 * protocol for accessing elements via a Spliterator is designed to impose
 * smaller per-element overhead than {@code Iterator}, and to avoid the inherent
 * race involved in having separate methods for {@code hasNext()} and
 * {@code next()}.
 *
 * <p>For mutable sources, arbitrary and non-deterministic behavior may occur if
 * the source is structurally interfered with (elements added, replaced, or
 * removed) between the time that the Spliterator binds to its data source and
 * the end of traversal.
 * For example, such interference will produce arbitrary,
 * non-deterministic results when using the {@code java.util.stream} framework.
 *
 * <p>Structural interference of a source can be managed in the following ways
 * (in approximate order of decreasing desirability):
 * <ul>
 * <li>The source cannot be structurally interfered with.
 * <br>For example, an instance of
 * {@link java.util.concurrent.CopyOnWriteArrayList} is an immutable source.
 * A Spliterator created from the source reports a characteristic of
 * {@code IMMUTABLE}.</li>
 * <li>The source manages concurrent modifications.
 * <br>For example, a key set of a {@link java.util.concurrent.ConcurrentHashMap}
 * is a concurrent source. A Spliterator created from the source reports a
 * characteristic of {@code CONCURRENT}.</li>
 * <li>The mutable source provides a late-binding and fail-fast Spliterator.
 * <br>Late binding narrows the window during which interference can affect
 * the calculation; fail-fast detects, on a best-effort basis, that structural
 * interference has occurred after traversal has commenced and throws
 * {@link ConcurrentModificationException}. For example, {@link ArrayList},
 * and many other non-concurrent {@code Collection} classes in the JDK, provide
 * a late-binding, fail-fast spliterator.</li>
 * <li>The mutable source provides a non-late-binding but fail-fast Spliterator.
 * <br>The source increases the likelihood of throwing
 * {@code ConcurrentModificationException} since the window of potential
 * interference is larger.</li>
 * <li>The mutable source provides a late-binding and non-fail-fast Spliterator.
 * <br>The source risks arbitrary, non-deterministic behavior after traversal
 * has commenced since interference is not detected.
 * </li>
 * <li>The mutable source provides a non-late-binding and non-fail-fast
 * Spliterator.
 * <br>The source increases the risk of arbitrary, non-deterministic behavior
 * since non-detected interference may occur after construction.
 * </li>
 * </ul>
 *
 * <p><b>Example.</b> Here is a class (not a very useful one, except
 * for illustration) that maintains an array in which the actual data
 * are held in even locations, and unrelated tag data are held in odd
 * locations. Its Spliterator ignores the tags.
 *
 * <pre> {@code
 * class TaggedArray<T> {
 *   private final Object[] elements; // immutable after construction
 *   TaggedArray(T[] data, Object[] tags) {
 *     int size = data.length;
 *     if (tags.length != size) throw new IllegalArgumentException();
 *     this.elements = new Object[2 * size];
 *     for (int i = 0, j = 0; i < size; ++i) {
 *       elements[j++] = data[i];
 *       elements[j++] = tags[i];
 *     }
 *   }
 *
 *   public Spliterator<T> spliterator() {
 *     return new TaggedArraySpliterator<>(elements, 0, elements.length);
 *   }
 *
 *   static class TaggedArraySpliterator<T> implements Spliterator<T> {
 *     private final Object[] array;
 *     private int origin; // current index, advanced on split or traversal
 *     private final int fence; // one past the greatest index
 *
 *     TaggedArraySpliterator(Object[] array, int origin, int fence) {
 *       this.array = array; this.origin = origin; this.fence = fence;
 *     }
 *
 *     public void forEachRemaining(Consumer<? super T> action) {
 *       for (; origin < fence; origin += 2)
 *         action.accept((T) array[origin]);
 *     }
 *
 *     public boolean tryAdvance(Consumer<? super T> action) {
 *       if (origin < fence) {
 *         action.accept((T) array[origin]);
 *         origin += 2;
 *         return true;
 *       }
 *       else // cannot advance
 *         return false;
 *     }
 *
 *     public Spliterator<T> trySplit() {
 *       int lo = origin; // divide range in half
 *       int mid = ((lo + fence) >>> 1) & ~1; // force midpoint to be even
 *       if (lo < mid) { // split out left half
 *         origin = mid; // reset this Spliterator's origin
 *         return new TaggedArraySpliterator<>(array, lo, mid);
 *       }
 *       else // too small to split
 *         return null;
 *     }
 *
 *     public long estimateSize() {
 *       return (long)((fence - origin) / 2);
 *     }
 *
 *     public int characteristics() {
 *       return ORDERED | SpliteratorSIZED | IMMUTABLE | SUBSIZED;
 *     }
 *   }
 * }}</pre>
 *
 * <p>As an example of how a parallel computation framework, such as the
 * {@code java.util.stream} package, would use Spliterator in a parallel
 * computation, here is one way to implement an associated parallel forEach,
 * that illustrates the primary usage idiom of splitting off subtasks until
 * the estimated amount of work is small enough to perform
 * sequentially. Here we assume that the order of processing across
 * subtasks doesn't matter; different (forked) tasks may further split
 * and process elements concurrently in undetermined order. This
 * example uses a {@link java.util.concurrent.CountedCompleter};
 * similar usages apply to other parallel task constructions.
 *
 * <pre>{@code
 * static <T> void parEach(TaggedArray<T> a, Consumer<T> action) {
 *   Spliterator<T> s = a.spliterator();
 *   long targetBatchSize = s.estimateSize() / (ForkJoinPool.getCommonPoolParallelism() * 8);
 *   new ParEach(null, s, action, targetBatchSize).invoke();
 * }
 *
 * static class ParEach<T> extends CountedCompleter<Void> {
 *   final Spliterator<T> spliterator;
 *   final Consumer<T> action;
 *   final long targetBatchSize;
 *
 *   ParEach(ParEach<T> parent, Spliterator<T> spliterator,
 *           Consumer<T> action, long targetBatchSize) {
 *     super(parent);
 *     this.spliterator = spliterator; this.action = action;
 *     this.targetBatchSize = targetBatchSize;
 *   }
 *
 *   public void compute() {
 *     Spliterator<T> sub;
 *     while (spliterator.estimateSize() > targetBatchSize &&
 *            (sub = spliterator.trySplit()) != null) {
 *       addToPendingCount(1);
 *       new ParEach<>(this, sub, action, targetBatchSize).fork();
 *     }
 *     spliterator.forEachRemaining(action);
 *     propagateCompletion();
 *   }
 * }}</pre>
 *
 * @implNote
 * If the boolean system property {@code org.openjdk.java.util.stream.tripwire}
 * is set to {@code true} then diagnostic warnings are reported if boxing of
 * primitive values occurs when operating on primitive subtype specializations.
 *
 * @param <T> the type of elements returned by this Spliterator
 *
 * @see Collection
 * @since 1.8
 */
type Spliterator interface {
	/**
	 * If a remaining element exists, performs the given action on it,
	 * returning {@code true}; else returns {@code false}. If this
	 * Spliterator is {@link #ORDERED} the action is performed on the
	 * next element in encounter order. Exceptions thrown by the
	 * action are relayed to the caller.
	 *
	 * @param action The action
	 * @return {@code false} if no remaining elements existed
	 * upon entry to this method, else {@code true}.
	 * @throws NullPointerException if the specified action is null
	 */
	TryAdvance(ctx context.Context, consumer consumer.Consumer) bool

	/**
	 * Performs the given action for each remaining element, sequentially in
	 * the current thread, until all elements have been processed or the action
	 * throws an exception. If this Spliterator is {@link #ORDERED}, actions
	 * are performed in encounter order. Exceptions thrown by the action
	 * are relayed to the caller.
	 *
	 * @implSpec
	 * The default implementation repeatedly invokes {@link #tryAdvance} until
	 * it returns {@code false}. It should be overridden whenever possible.
	 *
	 * @param action The action
	 * @throws NullPointerException if the specified action is null
	 */
	ForEachRemaining(ctx context.Context, consumer consumer.Consumer)

	/**
	 * If this spliterator can be partitioned, returns a Spliterator
	 * covering elements, that will, upon return from this method, not
	 * be covered by this Spliterator.
	 *
	 * <p>If this Spliterator is {@link #ORDERED}, the returned Spliterator
	 * must cover a strict prefix of the elements.
	 *
	 * <p>Unless this Spliterator covers an infinite number of elements,
	 * repeated calls to {@code trySplit()} must eventually return {@code null}.
	 * Upon non-null return:
	 * <ul>
	 * <li>the value reported for {@code estimateSize()} before splitting,
	 * must, after splitting, be greater than or equal to {@code estimateSize()}
	 * for this and the returned Spliterator; and</li>
	 * <li>if this Spliterator is {@code SUBSIZED}, then {@code estimateSize()}
	 * for this spliterator before splitting must be equal to the sum of
	 * {@code estimateSize()} for this and the returned Spliterator after
	 * splitting.</li>
	 * </ul>
	 *
	 * <p>This method may return {@code null} for any reason,
	 * including emptiness, inability to split after traversal has
	 * commenced, data structure constraints, and efficiency
	 * considerations.
	 *
	 * @apiNote
	 * An ideal {@code trySplit} method efficiently (without
	 * traversal) divides its elements exactly in half, allowing
	 * balanced parallel computation. Many departures from this ideal
	 * remain highly effective; for example, only approximately
	 * splitting an approximately balanced tree, or for a tree in
	 * which leaf nodes may contain either one or two elements,
	 * failing to further split these nodes. However, large
	 * deviations in balance and/or overly inefficient {@code
	 * trySplit} mechanics typically result in poor parallel
	 * performance.
	 *
	 * @return a {@code Spliterator} covering some portion of the
	 * elements, or {@code null} if this spliterator cannot be split
	 */
	TrySplit() Spliterator

	/**
	 * Returns an estimate of the number of elements that would be
	 * encountered by a {@link #forEachRemaining} traversal, or returns {@link
	 * Long#MAX_VALUE} if infinite, unknown, or too expensive to compute.
	 *
	 * <p>If this Spliterator is {@link #SpliteratorSIZED} and has not yet been partially
	 * traversed or split, or this Spliterator is {@link #SUBSIZED} and has
	 * not yet been partially traversed, this estimate must be an accurate
	 * count of elements that would be encountered by a complete traversal.
	 * Otherwise, this estimate may be arbitrarily inaccurate, but must decrease
	 * as specified across invocations of {@link #trySplit}.
	 *
	 * @apiNote
	 * Even an inexact estimate is often useful and inexpensive to compute.
	 * For example, a sub-spliterator of an approximately balanced binary tree
	 * may return a value that estimates the number of elements to be half of
	 * that of its parent; if the root Spliterator does not maintain an
	 * accurate count, it could estimate size to be the power of two
	 * corresponding to its maximum depth.
	 *
	 * @return the estimated size, or {@code Long.MAX_VALUE} if infinite,
	 * unknown, or too expensive to compute.
	 */

	EstimateSize() int

	/**
	 * Convenience method that returns {@link #estimateSize()} if this
	 * Spliterator is {@link #SpliteratorSIZED}, else {@code -1}.
	 * @implSpec
	 * The default implementation returns the result of {@code estimateSize()}
	 * if the Spliterator reports a characteristic of {@code SpliteratorSIZED}, and
	 * {@code -1} otherwise.
	 *
	 * @return the exact size, if known, else {@code -1}.
	 */
	GetExactSizeIfKnown() int

	/**
	 * Returns a set of characteristics of this Spliterator and its
	 * elements. The result is represented as ORed values from {@link
	 * #ORDERED}, {@link #SpliteratorDISTINCT}, {@link #SORTED}, {@link #SpliteratorSIZED},
	 * {@link #NONNULL}, {@link #IMMUTABLE}, {@link #CONCURRENT},
	 * {@link #SUBSIZED}. Repeated calls to {@code characteristics()} on
	 * a given spliterator, prior to or in-between calls to {@code trySplit},
	 * should always return the same result.
	 *
	 * <p>If a Spliterator reports an inconsistent set of
	 * characteristics (either those returned from a single invocation
	 * or across multiple invocations), no guarantees can be made
	 * about any computation using this Spliterator.
	 *
	 * @apiNote The characteristics of a given spliterator before splitting
	 * may differ from the characteristics after splitting. For specific
	 * examples see the characteristic values {@link #SpliteratorSIZED}, {@link #SUBSIZED}
	 * and {@link #CONCURRENT}.
	 *
	 * @return a representation of characteristics
	 */

	Characteristics() Characteristic

	/**
	 * Returns {@code true} if this Spliterator's {@link
	 * #characteristics} contain all of the given characteristics.
	 *
	 * @implSpec
	 * The default implementation returns true if the corresponding bits
	 * of the given characteristics are set.
	 *
	 * @param characteristics the characteristics to check for
	 * @return {@code true} if all the specified characteristics are present,
	 * else {@code false}
	 */
	HasCharacteristics(characteristics Characteristic) bool

	/**
	 * If this Spliterator's source is {@link #SORTED} by a {@link Comparator},
	 * returns that {@code Comparator}. If the source is {@code SORTED} in
	 * {@linkplain Comparable natural order}, returns {@code null}. Otherwise,
	 * if the source is not {@code SORTED}, throws {@link IllegalStateException}.
	 *
	 * @implSpec
	 * The default implementation always throws {@link IllegalStateException}.
	 *
	 * @return a Comparator, or {@code null} if the elements are sorted in the
	 * natural order.
	 * @throws IllegalStateException if the spliterator does not report
	 * a characteristic of {@code SORTED}.
	 */
	GetComparator() object.Comparator
}
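
// Illustrative sketch, not part of this package: a Go counterpart of the
// TaggedArray example in the comment above, reduced to a plain slice. It
// implements only the traversal and splitting methods, and then applies the
// split-until-small idiom from the ParEach example sequentially. Everything
// in it is an assumption made for illustration: Consumer is a local stand-in
// for consumer.Consumer, sliceSpliterator, sum and sumInts are hypothetical
// names, and TrySplit returns the concrete type rather than Spliterator.

```go
package main

import (
	"context"
	"fmt"
)

// Consumer is a hypothetical stand-in for consumer.Consumer, kept local so
// the sketch has no external dependencies.
type Consumer func(v interface{})

// sliceSpliterator is a hypothetical type covering elements[origin:fence].
type sliceSpliterator struct {
	elements []interface{}
	origin   int // current index, advanced on split or traversal
	fence    int // one past the greatest index
}

func newSliceSpliterator(elements []interface{}) *sliceSpliterator {
	return &sliceSpliterator{elements: elements, fence: len(elements)}
}

// TryAdvance performs action on the next element, if any, and reports
// whether an element was consumed. It also stops once ctx is done.
func (s *sliceSpliterator) TryAdvance(ctx context.Context, action Consumer) bool {
	if ctx.Err() != nil || s.origin >= s.fence {
		return false
	}
	action(s.elements[s.origin])
	s.origin++
	return true
}

// ForEachRemaining performs action on each remaining element in order.
func (s *sliceSpliterator) ForEachRemaining(ctx context.Context, action Consumer) {
	for s.TryAdvance(ctx, action) {
	}
}

// TrySplit hands off the left half as a new spliterator, or returns nil
// when the remaining range is too small to split.
func (s *sliceSpliterator) TrySplit() *sliceSpliterator {
	lo, mid := s.origin, (s.origin+s.fence)/2
	if lo >= mid {
		return nil
	}
	s.origin = mid // this spliterator keeps the right half
	return &sliceSpliterator{elements: s.elements, origin: lo, fence: mid}
}

// EstimateSize returns the exact number of remaining elements.
func (s *sliceSpliterator) EstimateSize() int { return s.fence - s.origin }

// sum splits s until each piece holds at most batch elements, then
// traverses each piece sequentially and adds up its int elements.
func sum(ctx context.Context, s *sliceSpliterator, batch int) int {
	total := 0
	for s.EstimateSize() > batch {
		sub := s.TrySplit()
		if sub == nil {
			break
		}
		total += sum(ctx, sub, batch)
	}
	s.ForEachRemaining(ctx, func(v interface{}) { total += v.(int) })
	return total
}

// sumInts is a convenience wrapper that uses a background context.
func sumInts(data []interface{}, batch int) int {
	return sum(context.Background(), newSliceSpliterator(data), batch)
}

func main() {
	fmt.Println(sumInts([]interface{}{1, 2, 3, 4, 5, 6, 7, 8}, 2)) // prints 36
}
```

// In this sketch every piece is processed on the calling goroutine. Handing
// each result of TrySplit to its own goroutine would follow the same shape,
// provided each spliterator is used by only one goroutine at a time, as the
// documentation above requires.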