# Leakybuckets

## Bucket concepts

The Leakybucket is used for decision making. Under certain conditions,
enriched events are poured into these buckets. When a bucket is
full, we raise a new event. After this event is raised, the bucket is
destroyed. There are many types of buckets, and we welcome any new
useful design of buckets.

Usually, a single bucket configuration leads to the creation of many
bucket instances. They are differentiated by a field called stackkey. When two
events arrive with the same stackkey, they go into the same matching
bucket.

The very purpose of these buckets is to detect clients that exceed a
certain rate of attempts to do something (ssh connection, http
authentication failure, etc.). Thus, the most commonly used stackkey field is
the source_ip.

## Standard leaky buckets

Default buckets have two main configuration options:

 * capacity: number of events the bucket can hold. When the capacity
   is reached and a new event is poured, a new event is raised. We
   call this type of event overflow. This is an int.

 * leakspeed: duration needed for an event to leak. When an event
   leaks, it disappears from the bucket.

## Trigger

A Trigger is a special type of bucket with a capacity of zero. Thus, when an
event is poured into a trigger, it always raises an overflow.

## Uniq

A Uniq is a bucket that works like the standard leaky bucket except for one
thing: a filter returns a property for each event and only one
occurrence of this property is allowed in the bucket, hence the name
uniq.

## Counter

A Counter is a special type of bucket with an infinite capacity and an
infinite leakspeed (it never overflows, nor leaks). Nevertheless,
the event is raised after a fixed duration. The option is called
duration.

## Bayesian

A Bayesian is a special bucket that runs bayesian inference instead of
counting events. Each event must have its likelihoods specified in the
yaml file under `prob_given_benign` and `prob_given_evil`. The bucket
will continue evaluating events until the posterior goes above the
threshold (triggering the overflow) or the duration (specified by leakspeed)
expires.

## Available configuration options for buckets

### Fields for standard buckets

 * type: mandatory field. Must be one of "leaky", "trigger", "uniq" or
   "counter"

 * name: mandatory field, but the value is totally open. Nevertheless,
   this value will tag the events raised by the bucket.

 * filter: mandatory field. It's a filter that is run to decide whether
   an event matches the bucket or not. The filter has to return
   a boolean. As a filter implementation we use
   https://github.com/antonmedv/expr

 * capacity: [mandatory for now, shouldn't be mandatory in the final
   version] it's the size of the bucket. When pouring into a bucket
   that already holds `capacity` events, it overflows.

 * leakspeed: leakspeed is a time duration (it has to be parsed by
   https://golang.org/pkg/time/#ParseDuration). After each interval, an
   event is leaked from the bucket.

 * stackkey: mandatory field. This field is used to determine into which
   instance of the bucket the matching events will be poured.
   When an unknown stackkey is seen in an event, a new bucket is created.
 * on_overflow: optional field that tells what to do when the
   bucket returns an overflow event. As of today, the possibilities
   are "ban,1h", "Reprocess" or "Delete".
   Reprocess is used to send the raised event back to the event pool to
   be matched against buckets again.

### Fields for special buckets

#### Uniq

 * uniq_filter: an expression that must comply with the syntax defined
   in https://github.com/antonmedv/expr and must return a string.
   All strings returned by this filter in the same bucket have to be different.
   Thus, if a string is seen twice, the event is dismissed.

#### Trigger

Capacity and leakspeed are not relevant for this kind of bucket.

#### Counter

 * duration: the Counter will be destroyed after this interval
   has elapsed since its creation. The duration must be parsed
   by https://golang.org/pkg/time/#ParseDuration.
   Nevertheless, this kind of bucket is often used with an infinite
   leakspeed and an infinite capacity [capacity set to -1 for now].

#### Bayesian

 * bayesian_prior: the prior to start with
 * bayesian_threshold: the threshold for the posterior to trigger the overflow
 * bayesian_conditions: list of Bayesian conditions with likelihoods

Bayesian conditions are built from:

 * condition: the expr expression that must evaluate to true for this
   condition to hold
 * prob_given_evil: the likelihood that an IP satisfies the condition given
   the fact that it is a malicious IP
 * prob_given_benign: the likelihood that an IP satisfies the condition given
   the fact that it is a benign IP
 * guillotine: bool to stop the condition from being evaluated again once it
   has evaluated to true. This should be used if evaluating the condition is
   computationally expensive.

## Examples

```
# ssh bruteforce
- type: leaky
  name: ssh_bruteforce
  filter: "Meta.log_type == 'ssh_failed-auth'"
  leakspeed: "10s"
  capacity: 5
  stackkey: "source_ip"
  on_overflow: ban,1h

# reporting of src_ip,dest_port seen
- type: counter
  name: counter
  filter: "Meta.service == 'tcp' && Event.new_connection == 'true'"
  distinct: "Meta.source_ip + ':' + Meta.dest_port"
  duration: 5m
  capacity: -1

- type: trigger
  name: "New connection"
  filter: "Meta.service == 'tcp' && Event.new_connection == 'true'"
  on_overflow: Reprocess
```

## Note on leakybuckets implementation

[This is not dry enough to have many details here, but:]

The bucket code is triggered by runPour in pour.go, which calls the
`leaky.PourItemToHolders` function.
There is one struct called buckets, which is for now a
`map[string]interface{}` that holds all buckets. The key of this map
is derived from the filter configured for the bucket and its
stackkey. This looks complicated, but it allows us to use
only one struct. This is done in buckets.go.

On top of that, the implementation defines only the standard leaky
bucket. A goroutine is launched for every bucket (`bucket.go`). This
goroutine manages the life of the bucket.

For special buckets, hooks are defined at initialization time in
manager.go. Hooks are called by the bucket goroutine when relevant,
i.e. when events are poured and/or when the bucket overflows.
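To make the goroutine-per-bucket idea above more concrete, here is a minimal,
illustrative Go sketch. It is not the actual crowdsec implementation and every
name in it is hypothetical: one goroutine owns a single bucket instance,
queues poured events, leaks one event per leakspeed tick, and emits an
overflow (then terminates) once more than capacity events are held.

```go
package main

import (
	"fmt"
	"time"
)

// event is a stand-in for the enriched events poured into buckets.
type event struct{ source string }

// bucket is a simplified leaky bucket instance (hypothetical, for
// illustration only), keyed in practice by its stackkey.
type bucket struct {
	capacity  int
	leakspeed time.Duration
	in        chan event   // events poured into this instance
	overflow  chan []event // emitted once, then the bucket dies
}

// live manages the life of one bucket instance: it queues poured events,
// leaks the oldest event on every leakspeed tick, and overflows when the
// capacity is exceeded.
func (b *bucket) live() {
	var held []event
	ticker := time.NewTicker(b.leakspeed)
	defer ticker.Stop()
	for {
		select {
		case evt := <-b.in:
			held = append(held, evt)
			if len(held) > b.capacity { // capacity exceeded: overflow and die
				b.overflow <- held
				return
			}
		case <-ticker.C: // leak: the oldest event disappears from the bucket
			if len(held) > 0 {
				held = held[1:]
			}
		}
	}
}

func main() {
	b := &bucket{
		capacity:  5,
		leakspeed: 10 * time.Second,
		in:        make(chan event),
		overflow:  make(chan []event, 1),
	}
	go b.live()

	// Pour 6 events sharing the same stackkey faster than they can leak.
	for i := 0; i < 6; i++ {
		b.in <- event{source: "192.0.2.1"}
	}
	fmt.Printf("overflow with %d events\n", len(<-b.overflow))
}
```

Confining each instance's state to its own goroutine is what makes a single
map of instances (as described above) sufficient for routing poured events;
in the real code, hooks for the special bucket types are attached to that
lifecycle at initialization time.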