// github.com/fozzysec/SiaPrime@v0.0.0-20190612043147-66c8e8d11fe3/modules/renter/download.go

package renter

// The download code follows a hopefully clean/intuitive flow for getting super
// high and computationally efficient parallelism on downloads. When a download
// is requested, it gets split into its respective chunks (which are downloaded
// individually) and then put into the download heap. The primary purpose of the
// download heap is to keep downloads on standby until there is enough memory
// available to send the downloads off to the workers. The heap is sorted first
// by priority, but then a few other criteria as well.
//
// Some downloads, in particular downloads issued by the repair code, have
// already had their memory allocated. These downloads get to skip the heap and
// go straight for the workers.
//
// When a download is distributed to workers, it is given to every single worker
// without checking whether that worker is appropriate for the download. Each
// worker has their own queue, which is bottlenecked by the fact that a worker
// can only process one item at a time. When the worker gets to a download
// request, it determines whether it is suited for downloading that particular
// file. The criteria it uses include whether or not it has a piece of that
// chunk, how many other workers are currently downloading pieces or have
// completed pieces for that chunk, and finally things like worker latency and
// worker price.
//
// If the worker chooses to download a piece, it will register itself with that
// piece, so that other workers know how many workers are downloading each
// piece. This keeps everything cleanly coordinated and prevents too many
// workers from downloading a given piece, while at the same time you don't need
// a giant messy coordinator tracking everything.
// If a worker chooses not to download a piece, it will add itself to the list
// of standby workers, so that in the event of a failure, the worker can be
// returned to and used again as a backup worker. The worker may also decide
// that it is not suitable at all (for example, if the worker has recently had
// some consecutive failures, or if the worker doesn't have access to a piece
// of that chunk), in which case it will mark itself as unavailable to the
// chunk.
//
// As workers complete, they will release memory and check on the overall state
// of the chunk. If some workers fail, they will enlist the standby workers to
// pick up the slack.
//
// When the final required piece finishes downloading, the worker who completed
// the final piece will spin up a separate thread to decrypt, decode, and write
// out the download. That thread will then clean up any remaining resources, and
// if this was the final unfinished chunk in the download, it'll mark the
// download as complete.

// The download process has a slightly complicating factor, which is overdrive
// workers. Traditionally, if you need 10 pieces to recover a file, you will use
// 10 workers. But if you have an overdrive of '2', you will actually use 12
// workers, meaning you download 2 more pieces than you need. This means that up
// to two of the workers can be slow or fail and the download can still complete
// quickly. This complicates resource handling, because not all memory can be
// released as soon as a download completes - there may be overdrive workers
// still out fetching the file. To handle this, a catchall 'cleanUp' function is
// used which gets called every time a worker finishes, and every time recovery
// completes.
// The result is that memory gets cleaned up as required, and no overarching
// coordination is needed between the overdrive workers (who do not even know
// that they are overdrive workers) and the recovery function.

// By default, the download code organizes itself around having maximum possible
// throughput. That is, it is highly parallel, and exploits that parallelism as
// efficiently and effectively as possible. The hostdb does a good job of
// selecting for hosts that have good traits, so we can generally assume that
// every host or worker at our disposal is reasonably effective in all
// dimensions, and that the overall selection is generally geared towards the
// user's preferences.
//
// We can leverage the standby workers in each unfinishedDownloadChunk to
// emphasize various traits. For example, if we want to prioritize latency,
// we'll put a filter in the 'managedProcessDownloadChunk' function that has a
// worker go standby instead of accepting a chunk if the latency is higher than
// the targeted latency. These filters can target other traits as well, such as
// price and total throughput.

// TODO: One of the most requested features from users is to improve the
// latency of the system. The biggest win actually doesn't happen here: right
// now the hostdb doesn't discriminate based on latency at all, and simply
// adding some sort of latency scoring will probably be the biggest thing that
// we can do to improve overall file latency.
//
// After we do that, the second most important thing that we can do is enable
// partial downloads. It's hard to have a low latency when to get any data back
// at all you need to download a full 40 MiB. If we can leverage partial
// downloads to drop that to something like 256kb, we'll get much better overall
// latency for small files and for starting video streams.
//
// After both of those, we can leverage worker latency discrimination. We can
// add code to 'managedProcessDownloadChunk' to put a worker on standby
// initially instead of having it grab a piece if the latency of the worker is
// higher than that of the faster workers. This will prevent the slow workers
// from bottlenecking a chunk that we are trying to download quickly, though it
// will harm overall system throughput because it means that the slower workers
// will idle some of the time.

// TODO: Currently the number of overdrive workers is hardcoded to '3' for
// every chunk of any user-initiated download (see 'managedDownload'). But
// really, this should be a parameter of downloading that gets set by the user
// through the API on a per-file basis instead of set by default.

// TODO: I tried to write the code such that the transition to true partial
// downloads would be as seamless as possible, but there's a lot of work that
// still needs to be done to make that fully possible. The most disruptive thing
// probably is the place where we call 'Sector' in worker.managedDownload.
// That's going to need to be changed to a partial sector. This is probably
// going to result in downloading that's 64-byte aligned instead of perfectly
// byte-aligned. Further, the encryption and erasure coding may also have
// alignment requirements which interfere with how the call to Sector can work.
// So you need to make sure that in 'managedDownload' you download at least
// enough data to fit the alignment requirements of all 3 steps (download from
// host, encryption, erasure coding). After the logical data has been recovered,
// we slice it to whatever is meant to be written to the underlying
// downloadWriter. That code is going to need to be adjusted as well to slice
// things in the right way.
//
// Overall I don't think it's going to be all that difficult, but it's not
// nearly as clean-cut as some of the other potential extensions that we can do.

// TODO: Right now the whole download will build and send off chunks even if
// there are not enough hosts to download the file, and even if there are not
// enough hosts to download a particular chunk. For the downloads and chunks
// which are doomed from the outset, we can skip some computation by checking
// and failing earlier. Another optimization we can make is to not count a
// worker for a chunk if the worker's contract does not appear in the chunk
// heap.

import (
	"fmt"
	"net/http"
	"os"
	"path/filepath"
	"sync"
	"sync/atomic"
	"time"

	"SiaPrime/modules"
	"SiaPrime/persist"
	"SiaPrime/types"

	"gitlab.com/NebulousLabs/errors"
)

type (
	// A download is a file download that has been queued by the renter.
	download struct {
		// Data progress variables.
		atomicDataReceived         uint64 // Incremented as data completes, will stop at 100% file progress.
		atomicTotalDataTransferred uint64 // Incremented as data arrives, includes overdrive, contract negotiation, etc.

		// Other progress variables.
		chunksRemaining uint64        // Number of chunks whose downloads are incomplete.
		completeChan    chan struct{} // Closed once the download is complete.
		err             error         // Only set if there was an error which prevented the download from completing.

		// Timestamp information.
		endTime         time.Time // Set immediately before closing 'completeChan'.
		staticStartTime time.Time // Set immediately when the download object is created.

		// Basic information about the file.
		destination           downloadDestination
		destinationString     string // The string reported to the user to indicate the download's destination.
		staticDestinationType string // "memory buffer", "http stream", "file", etc.
		staticLength          uint64 // Length to download starting from the offset.
		staticOffset          uint64 // Offset within the file to start the download.
		staticSiaPath         string // The path of the siafile at the time the download started.

		// Retrieval settings for the file.
		staticLatencyTarget time.Duration // Latency target. Lower latency results in lower total system throughput.
		staticPriority      uint64        // Downloads with higher priority will complete first.

		// Utilities.
		log           *persist.Logger // Same log as the renter.
		memoryManager *memoryManager  // Same memoryManager used across the renter.
		mu            sync.Mutex      // Unique to the download object.
	}

	// downloadParams is the set of parameters to use when downloading a file.
	downloadParams struct {
		destination       downloadDestination // The place to write the downloaded data.
		destinationType   string              // "file", "buffer", "http stream", etc.
		destinationString string              // The string to report to the user for the destination.
		file              *file               // The file to download.

		latencyTarget time.Duration // Workers above this latency will be automatically put on standby initially.
		length        uint64        // Length of download. Cannot be 0.
		needsMemory   bool          // Whether new memory needs to be allocated to perform the download.
		offset        uint64        // Offset within the file to start the download. Must be less than the total filesize.
		overdrive     int           // How many extra pieces to download to prevent slow hosts from being a bottleneck.
		priority      uint64        // Files with a higher priority will be downloaded first.
	}
)

// managedFail will mark the download as complete, but with the provided error.
// If the download has already failed, the new error is dropped and the
// original error is kept.
func (d *download) managedFail(err error) {
	d.mu.Lock()
	defer d.mu.Unlock()

	// If the download is already complete, there is nothing left to do.
	complete := d.staticComplete()
	if complete && d.err != nil {
		return
	} else if complete && d.err == nil {
		d.log.Critical("download is marked as completed without error, but then managedFail was called with err:", err)
		return
	}

	// Mark the download as complete and set the error.
	d.err = err
	close(d.completeChan)

	// Close the destination. The close error is kept in its own variable so
	// that the download error is not misreported as a close failure when
	// there is no destination to close.
	if d.destination != nil {
		closeErr := d.destination.Close()
		d.destination = nil
		if closeErr != nil {
			d.log.Println("unable to close download destination:", closeErr)
		}
	}
}

// staticComplete is a helper function to indicate whether or not the download
// has completed.
func (d *download) staticComplete() bool {
	select {
	case <-d.completeChan:
		return true
	default:
		return false
	}
}

// Err returns the error encountered by a download, if it exists.
func (d *download) Err() (err error) {
	d.mu.Lock()
	err = d.err
	d.mu.Unlock()
	return err
}

// Download performs a file download using the passed parameters and blocks
// until the download is finished.
func (r *Renter) Download(p modules.RenterDownloadParameters) error {
	d, err := r.managedDownload(p)
	if err != nil {
		return err
	}
	// Block until the download has completed.
	select {
	case <-d.completeChan:
		return d.Err()
	case <-r.tg.StopChan():
		return errors.New("download interrupted by shutdown")
	}
}

// DownloadAsync performs a file download using the passed parameters without
// blocking until the download is finished.
func (r *Renter) DownloadAsync(p modules.RenterDownloadParameters) error {
	_, err := r.managedDownload(p)
	return err
}

// managedDownload performs a file download using the passed parameters and
// returns the download object and an error that indicates if the download
// setup was successful.
func (r *Renter) managedDownload(p modules.RenterDownloadParameters) (*download, error) {
	// Look up the file associated with the SiaPath.
	lockID := r.mu.RLock()
	file, exists := r.files[p.SiaPath]
	r.mu.RUnlock(lockID)
	if !exists {
		return nil, fmt.Errorf("no file with that path: %s", p.SiaPath)
	}

	// Validate download parameters.
	isHTTPResp := p.Httpwriter != nil
	if p.Async && isHTTPResp {
		return nil, errors.New("cannot async download to http response")
	}
	if isHTTPResp && p.Destination != "" {
		return nil, errors.New("destination cannot be specified when downloading to http response")
	}
	if !isHTTPResp && p.Destination == "" {
		return nil, errors.New("destination not supplied")
	}
	if p.Destination != "" && !filepath.IsAbs(p.Destination) {
		return nil, errors.New("destination must be an absolute path")
	}
	if p.Offset == file.size && file.size != 0 {
		return nil, errors.New("offset equals filesize")
	}
	// Sentinel: if length == 0, download the rest of the file starting at the
	// offset.
	if p.Length == 0 {
		if p.Offset > file.size {
			return nil, errors.New("offset cannot be greater than file size")
		}
		p.Length = file.size - p.Offset
	}
	// Check whether offset and length are valid.
	if p.Offset < 0 || p.Offset+p.Length > file.size {
		return nil, fmt.Errorf("offset and length combination invalid, max byte is at index %d", file.size-1)
	}

	// Instantiate the correct downloadWriter implementation.
	var dw downloadDestination
	var destinationType string
	if isHTTPResp {
		dw = newDownloadDestinationWriteCloserFromWriter(p.Httpwriter)
		destinationType = "http stream"
	} else {
		osFile, err := os.OpenFile(p.Destination, os.O_CREATE|os.O_WRONLY, os.FileMode(file.mode))
		if err != nil {
			return nil, err
		}
		dw = osFile
		destinationType = "file"
	}

	// If the destination is an http.ResponseWriter, set the Content-Length
	// header.
	if isHTTPResp {
		w, ok := p.Httpwriter.(http.ResponseWriter)
		if ok {
			w.Header().Set("Content-Length", fmt.Sprint(p.Length))
		}
	}

	// Create the download object.
	d, err := r.managedNewDownload(downloadParams{
		destination:       dw,
		destinationType:   destinationType,
		destinationString: p.Destination,
		file:              file,

		latencyTarget: 25e3 * time.Millisecond, // TODO: high default until full latency support is added.
		length:        p.Length,
		needsMemory:   true,
		offset:        p.Offset,
		overdrive:     3, // TODO: moderate default until full overdrive support is added.
		priority:      5, // TODO: moderate default until full priority support is added.
	})
	if err != nil {
		return nil, err
	}

	// Add the download object to the download history.
	r.downloadHistoryMu.Lock()
	r.downloadHistory = append(r.downloadHistory, d)
	r.downloadHistoryMu.Unlock()

	// Return the download object.
	return d, nil
}

// managedNewDownload creates and initializes a download based on the provided
// parameters.
func (r *Renter) managedNewDownload(params downloadParams) (*download, error) {
	// Input validation.
	if params.file == nil {
		return nil, errors.New("no file provided when requesting download")
	}
	if params.length < 0 {
		return nil, errors.New("download length must be zero or a positive whole number")
	}
	if params.offset < 0 {
		return nil, errors.New("download offset cannot be a negative number")
	}
	if params.offset+params.length > params.file.size {
		return nil, errors.New("download is requesting data past the boundary of the file")
	}

	// Create the download object.
	d := &download{
		completeChan: make(chan struct{}),

		staticStartTime: time.Now(),

		destination:           params.destination,
		destinationString:     params.destinationString,
		staticDestinationType: params.destinationType,
		staticLatencyTarget:   params.latencyTarget,
		staticLength:          params.length,
		staticOffset:          params.offset,
		staticSiaPath:         params.file.name,
		staticPriority:        params.priority,

		log:           r.log,
		memoryManager: r.memoryManager,
	}

	// Determine which chunks to download.
	minChunk := params.offset / params.file.staticChunkSize()
	maxChunk := (params.offset + params.length - 1) / params.file.staticChunkSize()
	// Protect against maxChunk underflow on tiny files.
	if params.file.size < 4096 {
		maxChunk = 0
	}

	// For each chunk, assemble a mapping from the host public key (resolved
	// from the contract id) to the index and Merkle root of the piece within
	// the chunk that the contract is responsible for.
	chunkMaps := make([]map[string]downloadPieceInfo, maxChunk-minChunk+1)
	for i := range chunkMaps {
		chunkMaps[i] = make(map[string]downloadPieceInfo)
	}
	params.file.mu.Lock()
	for id, contract := range params.file.contracts {
		resolvedKey := r.hostContractor.ResolveIDToPubKey(id)
		for _, piece := range contract.Pieces {
			if piece.Chunk >= minChunk && piece.Chunk <= maxChunk {
				// Sanity check - the same worker should not have two pieces for
				// the same chunk.
				_, exists := chunkMaps[piece.Chunk-minChunk][string(resolvedKey.Key)]
				if exists {
					r.log.Println("ERROR: Worker has multiple pieces uploaded for the same chunk.")
				}
				chunkMaps[piece.Chunk-minChunk][string(resolvedKey.Key)] = downloadPieceInfo{
					index: piece.Piece,
					root:  piece.MerkleRoot,
				}
			}
		}
	}
	params.file.mu.Unlock()

	// Queue the downloads for each chunk.
	writeOffset := int64(0) // where to write a chunk within the download destination.
	d.chunksRemaining += maxChunk - minChunk + 1
	for i := minChunk; i <= maxChunk; i++ {
		udc := &unfinishedDownloadChunk{
			destination: params.destination,
			erasureCode: params.file.erasureCode,
			masterKey:   params.file.masterKey,

			staticChunkIndex: i,
			staticCacheID:    fmt.Sprintf("%v:%v", d.staticSiaPath, i),
			staticChunkMap:   chunkMaps[i-minChunk],
			staticChunkSize:  params.file.staticChunkSize(),
			staticPieceSize:  params.file.pieceSize,

			// TODO: 25ms is just a guess for a good default. Really, we want to
			// set the latency target such that slower workers will pick up the
			// later chunks, but only if there's a very strong chance that
			// they'll finish before the earlier chunks finish, so that they do
			// not worsen latency.
			//
			// TODO: There is some sane minimum latency that should actually be
			// set based on the number of pieces 'n', and the 'n' fastest
			// workers that we have.
			//
			// The per-chunk increment is scaled by time.Millisecond because a
			// bare time.Duration counts nanoseconds.
			staticLatencyTarget: params.latencyTarget + (25 * time.Millisecond * time.Duration(i-minChunk)), // Increase target by 25ms per chunk.
			staticNeedsMemory:   params.needsMemory,
			staticPriority:      params.priority,

			physicalChunkData: make([][]byte, params.file.erasureCode.NumPieces()),
			pieceUsage:        make([]bool, params.file.erasureCode.NumPieces()),

			download:          d,
			staticStreamCache: r.staticStreamCache,
		}

		// Set the fetchOffset - the offset within the chunk that we start
		// downloading from.
		if i == minChunk {
			udc.staticFetchOffset = params.offset % params.file.staticChunkSize()
		} else {
			udc.staticFetchOffset = 0
		}
		// Set the fetchLength - the number of bytes to fetch within the chunk,
		// starting from the fetchOffset.
		if i == maxChunk && (params.length+params.offset)%params.file.staticChunkSize() != 0 {
			udc.staticFetchLength = ((params.length + params.offset) % params.file.staticChunkSize()) - udc.staticFetchOffset
		} else {
			udc.staticFetchLength = params.file.staticChunkSize() - udc.staticFetchOffset
		}
		// Set the writeOffset within the destination for where the data should
		// be written.
		udc.staticWriteOffset = writeOffset
		writeOffset += int64(udc.staticFetchLength)

		// TODO: Currently all chunks are given overdrive. This should probably
		// be changed once the hostdb knows how to measure host speed/latency
		// and once we can assign overdrive dynamically.
		udc.staticOverdrive = params.overdrive

		// Add this chunk to the chunk heap, and notify the download loop that
		// there is work to do.
		r.managedAddChunkToDownloadHeap(udc)
		select {
		case r.newDownloads <- struct{}{}:
		default:
		}
	}
	return d, nil
}

// DownloadHistory returns the list of downloads that have been performed. Will
// include downloads that have not yet completed. Downloads will be roughly,
// but not precisely, sorted according to start time.
//
// TODO: Currently the DownloadHistory only contains downloads from this
// session and does not contain downloads that were executed for the purposes
// of repairing. It's not entirely certain which of the missing features are
// actually desirable; please consult core team + app dev community before
// deciding what to implement.
func (r *Renter) DownloadHistory() []modules.DownloadInfo {
	r.downloadHistoryMu.Lock()
	defer r.downloadHistoryMu.Unlock()

	downloads := make([]modules.DownloadInfo, len(r.downloadHistory))
	for i := range r.downloadHistory {
		// Order from most recent to least recent.
		d := r.downloadHistory[len(r.downloadHistory)-i-1]
		d.mu.Lock() // Lock required for d.endTime only.
		downloads[i] = modules.DownloadInfo{
			Destination:     d.destinationString,
			DestinationType: d.staticDestinationType,
			Length:          d.staticLength,
			Offset:          d.staticOffset,
			SiaPath:         d.staticSiaPath,

			Completed:            d.staticComplete(),
			EndTime:              d.endTime,
			Received:             atomic.LoadUint64(&d.atomicDataReceived),
			StartTime:            d.staticStartTime,
			StartTimeUnix:        d.staticStartTime.UnixNano(),
			TotalDataTransferred: atomic.LoadUint64(&d.atomicTotalDataTransferred),
		}
		// Release download lock before calling d.Err(), which will acquire the
		// lock. The error needs to be checked separately because we need to
		// know if it's 'nil' before grabbing the error string.
		d.mu.Unlock()
		if err := d.Err(); err != nil {
			downloads[i].Error = err.Error()
		} else {
			downloads[i].Error = ""
		}
	}
	return downloads
}

// ClearDownloadHistory clears the entries of the renter's download history
// whose start time falls between the provided after and before timestamps,
// inclusive.
//
// TODO: This function can be improved by implementing a binary search. The
// trick will be making the binary search just as readable while handling all
// the edge cases.
func (r *Renter) ClearDownloadHistory(after, before time.Time) error {
	if err := r.tg.Add(); err != nil {
		return err
	}
	defer r.tg.Done()
	r.downloadHistoryMu.Lock()
	defer r.downloadHistoryMu.Unlock()

	// Check to confirm there are downloads to clear.
	if len(r.downloadHistory) == 0 {
		return nil
	}

	// Timestamp validation.
	if before.Before(after) {
		return errors.New("before timestamp cannot be earlier than after timestamp")
	}

	// Clear the whole download history if the range covers all time, i.e.
	// after is the zero time and before is types.EndOfTime.
	if before.Equal(types.EndOfTime) && after.IsZero() {
		r.downloadHistory = r.downloadHistory[:0]
		return nil
	}

	// Keep only the downloads that are not within the given range.
	withinTimespan := func(t time.Time) bool {
		return (t.After(after) || t.Equal(after)) && (t.Before(before) || t.Equal(before))
	}
	filtered := r.downloadHistory[:0]
	for _, d := range r.downloadHistory {
		if !withinTimespan(d.staticStartTime) {
			filtered = append(filtered, d)
		}
	}
	r.downloadHistory = filtered
	return nil
}