github.com/NebulousLabs/Sia@v1.3.7/modules/renter/download.go

package renter

// The download code follows a hopefully clean/intuitive flow for achieving
// highly parallel, computationally efficient downloads. When a download is
// requested, it gets split into its respective chunks (which are downloaded
// individually) and then put into the download heap. The primary purpose of
// the download heap is to keep downloads on standby until there is enough
// memory available to send the downloads off to the workers. The heap is
// sorted first by priority, and then by a few other criteria as well.
//
// Some downloads, in particular downloads issued by the repair code, have
// already had their memory allocated. These downloads get to skip the heap and
// go straight to the workers.
//
// When a download is distributed to workers, it is given to every single
// worker without checking whether that worker is appropriate for the download.
// Each worker has its own queue, which is bottlenecked by the fact that a
// worker can only process one item at a time. When the worker gets to a
// download request, it determines whether it is suited for downloading that
// particular file. The criteria it uses include whether or not it has a piece
// of that chunk, how many other workers are currently downloading pieces or
// have completed pieces for that chunk, and finally things like worker latency
// and worker price.
//
// If the worker chooses to download a piece, it will register itself with that
// piece, so that other workers know how many workers are downloading each
// piece. This keeps everything cleanly coordinated and prevents too many
// workers from downloading a given piece, while at the same time you don't
// need a giant messy coordinator tracking everything. If a worker chooses not
// to download a piece, it will add itself to the list of standby workers, so
// that in the event of a failure, the worker can be returned to and used again
// as a backup worker. The worker may also decide that it is not suitable at
// all (for example, if the worker has recently had some consecutive failures,
// or if the worker doesn't have access to a piece of that chunk), in which
// case it will mark itself as unavailable to the chunk.
//
// As workers complete, they will release memory and check on the overall state
// of the chunk. If some workers fail, they will enlist the standby workers to
// pick up the slack.
//
// When the final required piece finishes downloading, the worker who completed
// the final piece will spin up a separate thread to decrypt, decode, and write
// out the download. That thread will then clean up any remaining resources,
// and if this was the final unfinished chunk in the download, it'll mark the
// download as complete.
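
// As a sketch of the heap ordering described above (illustrative only; the
// canonical comparison lives in the renter's downloadheap code and may use
// additional tie-breakers), priority dominates and ties fall through to the
// download's start time:
//
//	// downloadChunkHeap orders queued chunks for the download loop.
//	func (dch downloadChunkHeap) Less(i, j int) bool {
//		// Chunks with a higher priority are popped first.
//		if dch[i].staticPriority != dch[j].staticPriority {
//			return dch[i].staticPriority > dch[j].staticPriority
//		}
//		// For equal priority, older downloads are served first.
//		return dch[i].download.staticStartTime.Before(dch[j].download.staticStartTime)
//	}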

// The download process has a slightly complicating factor, which is overdrive
// workers. Traditionally, if you need 10 pieces to recover a file, you will
// use 10 workers. But if you have an overdrive of '2', you will actually use
// 12 workers, meaning you download 2 more pieces than you need. This means
// that up to two of the workers can be slow or fail and the download can still
// complete quickly. This complicates resource handling, because not all memory
// can be released as soon as a download completes - there may be overdrive
// workers still out fetching the file. To handle this, a catchall 'cleanUp'
// function is used which gets called every time a worker finishes, and every
// time recovery completes. The result is that memory gets cleaned up as
// required, and no overarching coordination is needed between the overdrive
// workers (who do not even know that they are overdrive workers) and the
// recovery function.
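
// A minimal sketch of that catchall cleanup pattern (hypothetical field and
// method names; the real logic lives on unfinishedDownloadChunk). Memory is
// returned only once recovery has finished and every worker - overdrive or
// not - has reported back, with no central coordinator:
//
//	func (udc *unfinishedDownloadChunk) cleanUp() {
//		// Called under udc.mu by every finishing worker and by recovery.
//		if udc.workersRemaining > 0 || !udc.recoveryComplete {
//			return // Someone may still be using the chunk's memory.
//		}
//		udc.physicalChunkData = nil // Drop references so the GC can reclaim them.
//		udc.returnMemory()          // Hand the allocation back to the memoryManager.
//	}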

// By default, the download code organizes itself around having maximum
// possible throughput. That is, it is highly parallel, and exploits that
// parallelism as efficiently and effectively as possible. The hostdb does a
// good job of selecting for hosts that have good traits, so we can generally
// assume that every host or worker at our disposal is reasonably effective in
// all dimensions, and that the overall selection is generally geared towards
// the user's preferences.
//
// We can leverage the standby workers in each unfinishedDownloadChunk to
// emphasize various traits. For example, if we want to prioritize latency,
// we'll put a filter in the 'managedProcessDownloadChunk' function that has a
// worker go on standby instead of accepting a chunk if the latency is higher
// than the targeted latency. These filters can target other traits as well,
// such as price and total throughput.

// TODO: One of the most requested features from users is to improve the
// latency of the system. The lowest-hanging fruit actually isn't here: right
// now the hostdb doesn't discriminate based on latency at all, and simply
// adding some sort of latency scoring will probably be the biggest thing that
// we can do to improve overall file latency.
//
// After we do that, the second most important thing that we can do is enable
// partial downloads. It's hard to have low latency when you need to download a
// full 40 MiB before getting any data back at all. If we can leverage partial
// downloads to drop that to something like 256 KiB, we'll get much better
// overall latency for small files and for starting video streams.
//
// After both of those, we can leverage worker latency discrimination. We can
// add code to 'managedProcessDownloadChunk' to put a worker on standby
// initially instead of having it grab a piece if the latency of the worker is
// higher than that of the faster workers. This will prevent the slow workers
// from bottlenecking a chunk that we are trying to download quickly, though it
// will harm overall system throughput because it means that the slower
// workers will idle some of the time.
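
// A minimal sketch of that standby filter (hypothetical; a field like
// 'staticLatency' on the worker is an assumption, not the package's actual
// API). Inside managedProcessDownloadChunk, a slow worker parks itself on
// standby rather than claiming a piece:
//
//	// If this worker is slower than the chunk's latency target, hold it in
//	// reserve instead of letting it grab a piece.
//	if w.staticLatency > udc.staticLatencyTarget {
//		udc.mu.Lock()
//		udc.workersStandby = append(udc.workersStandby, w)
//		udc.mu.Unlock()
//		return // No piece claimed; the worker can be recalled on failure.
//	}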

// TODO: Currently the number of overdrive workers is set to '2' for the first
// 2 chunks of any user-initiated download. But really, this should be a
// parameter of downloading that gets set by the user through the API on a
// per-file basis instead of set by default.

// TODO: I tried to write the code such that the transition to true partial
// downloads would be as seamless as possible, but there's a lot of work that
// still needs to be done to make that fully possible. The most disruptive
// thing probably is the place where we call 'Sector' in
// worker.managedDownload. That's going to need to be changed to a partial
// sector. This is probably going to result in downloading that's 64-byte
// aligned instead of perfectly byte-aligned. Further, the encryption and
// erasure coding may also have alignment requirements which interfere with
// how the call to Sector can work. So you need to make sure that in
// 'managedDownload' you download at least enough data to fit the alignment
// requirements of all 3 steps (download from host, encryption, erasure
// coding). After the logical data has been recovered, we slice it down to
// whatever is meant to be written to the underlying downloadWriter; that code
// is going to need to be adjusted as well to slice things in the right way.
//
// Overall I don't think it's going to be all that difficult, but it's not
// nearly as clean-cut as some of the other potential extensions that we can
// do.
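
// A minimal sketch of the alignment expansion the TODO above calls for
// (hypothetical helper, not part of this package): widen a requested byte
// range so that both ends land on the strictest boundary required by the
// download, encryption, and erasure-coding steps.
//
//	// expandToAlignment widens [offset, offset+length) to align-byte
//	// boundaries. E.g. offset=100, length=50, align=64 yields start=64,
//	// length=128.
//	func expandToAlignment(offset, length, align uint64) (uint64, uint64) {
//		start := offset - offset%align            // Round the start down.
//		end := offset + length                    // One past the last requested byte.
//		end = ((end + align - 1) / align) * align // Round the end up.
//		return start, end - start
//	}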

// TODO: Right now the whole download will build and send off chunks even if
// there are not enough hosts to download the file, and even if there are not
// enough hosts to download a particular chunk. For the downloads and chunks
// which are doomed from the outset, we can skip some computation by checking
// and failing earlier. Another optimization we can make is to not count a
// worker for a chunk if the worker's contract does not appear in the chunk
// heap.

import (
	"fmt"
	"os"
	"path/filepath"
	"sync"
	"sync/atomic"
	"time"

	"github.com/NebulousLabs/Sia/modules"
	"github.com/NebulousLabs/Sia/persist"
	"github.com/NebulousLabs/Sia/types"

	"github.com/NebulousLabs/errors"
)

type (
	// A download is a file download that has been queued by the renter.
	download struct {
		// Data progress variables.
		atomicDataReceived         uint64 // Incremented as data completes, will stop at 100% file progress.
		atomicTotalDataTransferred uint64 // Incremented as data arrives, includes overdrive, contract negotiation, etc.

		// Other progress variables.
		chunksRemaining uint64        // Number of chunks whose downloads are incomplete.
		completeChan    chan struct{} // Closed once the download is complete.
		err             error         // Only set if there was an error which prevented the download from completing.

		// Timestamp information.
		endTime         time.Time // Set immediately before closing 'completeChan'.
		staticStartTime time.Time // Set immediately when the download object is created.

		// Basic information about the file.
		destination           downloadDestination
		destinationString     string // The string reported to the user to indicate the download's destination.
		staticDestinationType string // "memory buffer", "http stream", "file", etc.
		staticLength          uint64 // Length to download starting from the offset.
		staticOffset          uint64 // Offset within the file to start the download.
		staticSiaPath         string // The path of the siafile at the time the download started.

		// Retrieval settings for the file.
		staticLatencyTarget time.Duration // A lower latency target results in lower total system throughput.
		staticOverdrive     int           // How many extra pieces to download to prevent slow hosts from being a bottleneck.
		staticPriority      uint64        // Downloads with higher priority will complete first.

		// Utilities.
		log           *persist.Logger // Same log as the renter.
		memoryManager *memoryManager  // Same memoryManager used across the renter.
		mu            sync.Mutex      // Unique to the download object.
	}

	// downloadParams is the set of parameters to use when downloading a file.
	downloadParams struct {
		destination       downloadDestination // The place to write the downloaded data.
		destinationType   string              // "file", "buffer", "http stream", etc.
		destinationString string              // The string to report to the user for the destination.
		file              *file               // The file to download.

		latencyTarget time.Duration // Workers above this latency will be automatically put on standby initially.
		length        uint64        // Length of download. Cannot be 0.
		needsMemory   bool          // Whether new memory needs to be allocated to perform the download.
		offset        uint64        // Offset within the file to start the download. Must be less than the total filesize.
		overdrive     int           // How many extra pieces to download to prevent slow hosts from being a bottleneck.
		priority      uint64        // Files with a higher priority will be downloaded first.
	}
)
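
// A minimal sketch (hypothetical helper, not part of this package) of reading
// the progress counters above, assuming atomicDataReceived is measured against
// staticLength. The atomic fields must be loaded with sync/atomic because
// workers update them concurrently:
//
//	func downloadProgress(d *download) float64 {
//		received := atomic.LoadUint64(&d.atomicDataReceived)
//		if d.staticLength == 0 {
//			return 1 // A zero-length download is trivially complete.
//		}
//		return float64(received) / float64(d.staticLength)
//	}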

// managedFail will mark the download as complete, but with the provided
// error. If the download has already failed, the new error is dropped and the
// original error is kept.
func (d *download) managedFail(err error) {
	d.mu.Lock()
	defer d.mu.Unlock()

	// If the download is already complete, do not overwrite the outcome.
	complete := d.staticComplete()
	if complete && d.err != nil {
		return
	} else if complete && d.err == nil {
		d.log.Critical("download is marked as completed without error, but then managedFail was called with err:", err)
		return
	}

	// Mark the download as complete and set the error.
	d.err = err
	close(d.completeChan)
	if d.destination != nil {
		if closeErr := d.destination.Close(); closeErr != nil {
			d.log.Println("unable to close download destination:", closeErr)
		}
		d.destination = nil
	}
}

// staticComplete is a helper function to indicate whether or not the download
// has completed.
func (d *download) staticComplete() bool {
	select {
	case <-d.completeChan:
		return true
	default:
		return false
	}
}

// Err returns the error encountered by a download, if it exists.
func (d *download) Err() (err error) {
	d.mu.Lock()
	err = d.err
	d.mu.Unlock()
	return err
}

// Download performs a file download using the passed parameters and blocks
// until the download is finished.
func (r *Renter) Download(p modules.RenterDownloadParameters) error {
	d, err := r.managedDownload(p)
	if err != nil {
		return err
	}
	// Block until the download has completed.
	select {
	case <-d.completeChan:
		return d.Err()
	case <-r.tg.StopChan():
		return errors.New("download interrupted by shutdown")
	}
}

// DownloadAsync performs a file download using the passed parameters without
// blocking until the download is finished.
func (r *Renter) DownloadAsync(p modules.RenterDownloadParameters) error {
	_, err := r.managedDownload(p)
	return err
}
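
// A minimal usage sketch of the blocking entry point above (hypothetical
// siapath and destination values):
//
//	err := r.Download(modules.RenterDownloadParameters{
//		SiaPath:     "movies/siacon.mp4",
//		Destination: "/home/user/siacon.mp4", // Must be an absolute path.
//		Offset:      0,
//		Length:      0, // 0 is a sentinel for 'download the whole file'.
//	})
//	if err != nil {
//		// The download failed or the renter shut down mid-download.
//	}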

// managedDownload performs a file download using the passed parameters and
// returns the download object and an error that indicates if the download
// setup was successful.
func (r *Renter) managedDownload(p modules.RenterDownloadParameters) (*download, error) {
	// Look up the file associated with the given siapath.
	lockID := r.mu.RLock()
	file, exists := r.files[p.SiaPath]
	r.mu.RUnlock(lockID)
	if !exists {
		return nil, fmt.Errorf("no file with that path: %s", p.SiaPath)
	}

	// Validate download parameters.
	isHTTPResp := p.Httpwriter != nil
	if p.Async && isHTTPResp {
		return nil, errors.New("cannot async download to http response")
	}
	if isHTTPResp && p.Destination != "" {
		return nil, errors.New("destination cannot be specified when downloading to http response")
	}
	if !isHTTPResp && p.Destination == "" {
		return nil, errors.New("destination not supplied")
	}
	if p.Destination != "" && !filepath.IsAbs(p.Destination) {
		return nil, errors.New("destination must be an absolute path")
	}
	if p.Offset == file.size {
		return nil, errors.New("offset equals filesize")
	}
	// Sentinel: if length == 0, download the entire file.
	if p.Length == 0 {
		p.Length = file.size - p.Offset
	}
	// Check whether the offset and length are valid.
	if p.Offset < 0 || p.Offset+p.Length > file.size {
		return nil, fmt.Errorf("offset and length combination invalid, max byte is at index %d", file.size-1)
	}

	// Instantiate the correct downloadWriter implementation.
	var dw downloadDestination
	var destinationType string
	if isHTTPResp {
		dw = newDownloadDestinationWriteCloserFromWriter(p.Httpwriter)
		destinationType = "http stream"
	} else {
		osFile, err := os.OpenFile(p.Destination, os.O_CREATE|os.O_WRONLY, os.FileMode(file.mode))
		if err != nil {
			return nil, err
		}
		dw = osFile
		destinationType = "file"
	}

	// Create the download object.
	d, err := r.managedNewDownload(downloadParams{
		destination:       dw,
		destinationType:   destinationType,
		destinationString: p.Destination,
		file:              file,

		latencyTarget: 25e3 * time.Millisecond, // TODO: high default until full latency support is added.
		length:        p.Length,
		needsMemory:   true,
		offset:        p.Offset,
		overdrive:     3, // TODO: moderate default until full overdrive support is added.
		priority:      5, // TODO: moderate default until full priority support is added.
	})
	if err != nil {
		return nil, err
	}

	// Add the download object to the download history.
	r.downloadHistoryMu.Lock()
	r.downloadHistory = append(r.downloadHistory, d)
	r.downloadHistoryMu.Unlock()

	// Return the download object.
	return d, nil
}

// managedNewDownload creates and initializes a download based on the provided
// parameters.
func (r *Renter) managedNewDownload(params downloadParams) (*download, error) {
	// Input validation.
	if params.file == nil {
		return nil, errors.New("no file provided when requesting download")
	}
	if params.length <= 0 {
		return nil, errors.New("download length must be a positive whole number")
	}
	if params.offset < 0 {
		return nil, errors.New("download offset cannot be a negative number")
	}
	if params.offset+params.length > params.file.size {
		return nil, errors.New("download is requesting data past the boundary of the file")
	}

	// Create the download object.
	d := &download{
		completeChan: make(chan struct{}),

		staticStartTime: time.Now(),

		destination:           params.destination,
		destinationString:     params.destinationString,
		staticDestinationType: params.destinationType,
		staticLatencyTarget:   params.latencyTarget,
		staticLength:          params.length,
		staticOffset:          params.offset,
		staticOverdrive:       params.overdrive,
		staticSiaPath:         params.file.name,
		staticPriority:        params.priority,

		log:           r.log,
		memoryManager: r.memoryManager,
	}

	// Determine which chunks to download.
	minChunk := params.offset / params.file.staticChunkSize()
	maxChunk := (params.offset + params.length - 1) / params.file.staticChunkSize()

	// For each chunk, assemble a mapping from the contract id to the index of
	// the piece within the chunk that the contract is responsible for.
	chunkMaps := make([]map[string]downloadPieceInfo, maxChunk-minChunk+1)
	for i := range chunkMaps {
		chunkMaps[i] = make(map[string]downloadPieceInfo)
	}
	params.file.mu.Lock()
	for id, contract := range params.file.contracts {
		resolvedKey := r.hostContractor.ResolveIDToPubKey(id)
		for _, piece := range contract.Pieces {
			if piece.Chunk >= minChunk && piece.Chunk <= maxChunk {
				// Sanity check - the same worker should not have two pieces
				// for the same chunk.
				_, exists := chunkMaps[piece.Chunk-minChunk][string(resolvedKey.Key)]
				if exists {
					r.log.Println("ERROR: Worker has multiple pieces uploaded for the same chunk.")
				}
				chunkMaps[piece.Chunk-minChunk][string(resolvedKey.Key)] = downloadPieceInfo{
					index: piece.Piece,
					root:  piece.MerkleRoot,
				}
			}
		}
	}
	params.file.mu.Unlock()

	// Queue the downloads for each chunk.
	writeOffset := int64(0) // Where to write a chunk within the download destination.
	d.chunksRemaining += maxChunk - minChunk + 1
	for i := minChunk; i <= maxChunk; i++ {
		udc := &unfinishedDownloadChunk{
			destination: params.destination,
			erasureCode: params.file.erasureCode,
			masterKey:   params.file.masterKey,

			staticChunkIndex: i,
			staticCacheID:    fmt.Sprintf("%v:%v", d.staticSiaPath, i),
			staticChunkMap:   chunkMaps[i-minChunk],
			staticChunkSize:  params.file.staticChunkSize(),
			staticPieceSize:  params.file.pieceSize,

			// TODO: 25ms is just a guess for a good default. Really, we want
			// to set the latency target such that slower workers will pick up
			// the later chunks, but only if there's a very strong chance that
			// they'll finish before the earlier chunks finish, so that they
			// do not hurt the overall latency.
			//
			// TODO: There is some sane minimum latency that should actually
			// be set based on the number of pieces 'n', and the 'n' fastest
			// workers that we have.
			staticLatencyTarget: params.latencyTarget + (25 * time.Millisecond * time.Duration(i-minChunk)), // Increase target by 25ms per chunk.
			staticNeedsMemory:   params.needsMemory,
			staticPriority:      params.priority,

			physicalChunkData: make([][]byte, params.file.erasureCode.NumPieces()),
			pieceUsage:        make([]bool, params.file.erasureCode.NumPieces()),

			download:          d,
			staticStreamCache: r.staticStreamCache,
		}

		// Set the fetchOffset - the offset within the chunk at which to start
		// downloading.
		if i == minChunk {
			udc.staticFetchOffset = params.offset % params.file.staticChunkSize()
		} else {
			udc.staticFetchOffset = 0
		}
		// Set the fetchLength - the number of bytes to fetch within this
		// chunk.
		if i == maxChunk && (params.length+params.offset)%params.file.staticChunkSize() != 0 {
			udc.staticFetchLength = ((params.length + params.offset) % params.file.staticChunkSize()) - udc.staticFetchOffset
		} else {
			udc.staticFetchLength = params.file.staticChunkSize() - udc.staticFetchOffset
		}
		// Set the writeOffset within the destination for where the data
		// should be written.
		udc.staticWriteOffset = writeOffset
		writeOffset += int64(udc.staticFetchLength)

		// TODO: Currently all chunks are given overdrive. This should
		// probably be changed once the hostdb knows how to measure host
		// speed/latency and once we can assign overdrive dynamically.
		udc.staticOverdrive = params.overdrive

		// Add this chunk to the chunk heap, and notify the download loop that
		// there is work to do.
		r.managedAddChunkToDownloadHeap(udc)
		select {
		case r.newDownloads <- struct{}{}:
		default:
		}
	}
	return d, nil
}
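
// A worked example of the chunk arithmetic above (illustrative numbers only).
// Suppose staticChunkSize() is 100 bytes, offset is 150, and length is 120,
// so bytes [150, 270) are wanted:
//
//	minChunk = 150 / 100         = 1
//	maxChunk = (150 + 120 - 1) / 100 = 2
//
//	chunk 1: fetchOffset = 150 % 100 = 50, fetchLength = 100 - 50 = 50
//	chunk 2: fetchOffset = 0,              fetchLength = (270 % 100) - 0 = 70
//
// Chunk 1 contributes bytes [150, 200) and chunk 2 contributes bytes
// [200, 270), written at writeOffsets 0 and 50 respectively.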

// DownloadHistory returns the list of downloads that have been performed. Will
// include downloads that have not yet completed. Downloads will be roughly,
// but not precisely, sorted according to start time.
//
// TODO: Currently the DownloadHistory only contains downloads from this
// session, does not contain downloads that were executed for the purposes of
// repairing, and has no way to clear the download history if it gets long or
// unwieldy. It's not entirely certain which of the missing features are
// actually desirable, please consult core team + app dev community before
// deciding what to implement.
func (r *Renter) DownloadHistory() []modules.DownloadInfo {
	r.downloadHistoryMu.Lock()
	defer r.downloadHistoryMu.Unlock()

	downloads := make([]modules.DownloadInfo, len(r.downloadHistory))
	for i := range r.downloadHistory {
		// Order from most recent to least recent.
		d := r.downloadHistory[len(r.downloadHistory)-i-1]
		d.mu.Lock() // Lock required for d.endTime only.
		downloads[i] = modules.DownloadInfo{
			Destination:     d.destinationString,
			DestinationType: d.staticDestinationType,
			Length:          d.staticLength,
			Offset:          d.staticOffset,
			SiaPath:         d.staticSiaPath,

			Completed:            d.staticComplete(),
			EndTime:              d.endTime,
			Received:             atomic.LoadUint64(&d.atomicDataReceived),
			StartTime:            d.staticStartTime,
			StartTimeUnix:        d.staticStartTime.UnixNano(),
			TotalDataTransferred: atomic.LoadUint64(&d.atomicTotalDataTransferred),
		}
		// Release the download lock before calling d.Err(), which will
		// acquire the lock. The error needs to be checked separately because
		// we need to know if it's 'nil' before grabbing the error string.
		d.mu.Unlock()
		if d.Err() != nil {
			downloads[i].Error = d.Err().Error()
		} else {
			downloads[i].Error = ""
		}
	}
	return downloads
}

// ClearDownloadHistory clears the renter's download history, inclusive of the
// provided after and before timestamps.
//
// TODO: This function can be improved by implementing a binary search; the
// trick will be making the binary search just as readable while handling all
// the edge cases.
func (r *Renter) ClearDownloadHistory(after, before time.Time) error {
	if err := r.tg.Add(); err != nil {
		return err
	}
	defer r.tg.Done()
	r.downloadHistoryMu.Lock()
	defer r.downloadHistoryMu.Unlock()

	// Check to confirm there are downloads to clear.
	if len(r.downloadHistory) == 0 {
		return nil
	}

	// Timestamp validation.
	if before.Before(after) {
		return errors.New("before timestamp cannot be earlier than the after timestamp")
	}

	// Clear the entire download history if the parameters are at their
	// sentinel values (after is the zero time and before is EndOfTime).
	if before.Equal(types.EndOfTime) && after.IsZero() {
		r.downloadHistory = r.downloadHistory[:0]
		return nil
	}

	// Keep only the downloads that fall outside the given range, filtering
	// the history in place.
	withinTimespan := func(t time.Time) bool {
		return (t.After(after) || t.Equal(after)) && (t.Before(before) || t.Equal(before))
	}
	filtered := r.downloadHistory[:0]
	for _, d := range r.downloadHistory {
		if !withinTimespan(d.staticStartTime) {
			filtered = append(filtered, d)
		}
	}
	r.downloadHistory = filtered
	return nil
}
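
// A minimal usage sketch of ClearDownloadHistory (hypothetical window values).
// The zero time and types.EndOfTime act as sentinels for an unbounded range:
//
//	// Drop every download that started within the given window.
//	err := r.ClearDownloadHistory(windowStart, windowEnd)
//
//	// Drop the entire history.
//	err = r.ClearDownloadHistory(time.Time{}, types.EndOfTime)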