
---
layout: post
title:  Building Resilient and Fault Tolerant Applications with Micro
date:   2016-05-15 00:00:00
---
<br>
It's been a little while since the last blog post but we've been hard at work on Micro and it's definitely starting
to pay off. Let's dive into it all now!

If you want to read up on the [**Micro**](https://github.com/tickoalcantara12/micro) toolkit first, check out the previous blog post
[here]({{ site.baseurl }}/2016/03/20/micro.html) or if you would like to learn more about the concept of microservices look [here]({{ site.baseurl }}/2016/03/17/introduction.html).

It's no secret that building distributed systems can be challenging. While we've solved a lot of problems as an industry along
the way, we still go through cycles of rebuilding many of the building blocks, whether it's the move to the next level of
abstraction, from virtual machines to containers, the adoption of new languages, the leveraging of cloud based services or
the coming shift to microservices. There's always something that seems to require us to relearn
how to build performant and fault tolerant systems for the next wave of technology.

It's a never ending battle between iteration and innovation but we need to do something to help alleviate a lot
of the pains as the shift to Cloud, Containers and Microservices continues.

### The Motivations

So why are we doing this? Why do we keep rebuilding the building blocks and why do we keep attempting to solve the same
scale, fault tolerance and distributed systems problems?

The terms that come to mind are <i>"bigger, stronger, faster"</i>, or perhaps even <i>"speed, scale, agility"</i>. You'll
hear these a lot from C-level executives but the key takeaway is really that there's always a need for us to build more performant
and resilient systems.

In the early days of the internet, there were only thousands or maybe even hundreds of thousands of people coming online. Over time
we saw that accelerate and we're now in the order of billions. Billions of people and billions of devices.
We've had to learn how to build systems for this.

For the older generation, you may remember the [C10K problem](http://www.kegel.com/c10k.html). I'm not sure where we are with this now
but I think we're talking about solving the issue of millions of concurrent connections if not more. The biggest technology players in the world
really solved this a decade ago and have patterns for building systems at scale but the rest of us are still learning.

The likes of Amazon, Google and Microsoft now provide us with Cloud Computing platforms to leverage significant scale but we're
still trying to figure out how to write applications that can effectively leverage it. You're hearing the terms container
orchestration, microservices and cloud native a lot these days. The work is underway on a multitude of levels and it's going
to be a while before we as an industry have really nailed down the patterns and solutions needed moving forward.

A lot of companies are now helping with the question of, "how do I run my applications in a scalable and fault tolerant manner?", but
there are still very few helping with the more important question...

How do I actually <i>write</i> applications in a scalable and fault tolerant manner?

Micro looks to address these problems by focusing on the key software development requirements for microservices. We'll run through
some of what can help you build resilient and fault tolerant applications now, starting with the client side.

### The Client

The client is a building block for making requests in go-micro. If you've built microservices or SOA architectures before then you'll know that
a significant portion of time and execution is spent on calling other services for relevant information.

Whereas in a monolithic application the focus is mainly on serving content, in a microservices world it's more about retrieving or publishing content.

Here's a cut down version of the go-micro client interface with the three most important methods: Call, Publish and Stream.

```
type Client interface {
	Call(ctx context.Context, req Request, rsp interface{}, opts ...CallOption) error
	Publish(ctx context.Context, p Publication, opts ...PublishOption) error
	Stream(ctx context.Context, req Request, opts ...CallOption) (Streamer, error)
}

type Request interface {
	Service() string
	Method() string
	ContentType() string
	Request() interface{}
	Stream() bool
}
```

Call and Stream are used to make synchronous requests. Call returns a single result whereas Stream is a bidirectional streaming connection maintained
with another service, over which messages can be sent back and forth. Publish is used to publish asynchronous messages via the broker but we're
not going to discuss that today.
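
To make this concrete, here's a minimal sketch of a synchronous request using the client package. The service name, endpoint and request/response types are hypothetical placeholders, so substitute your own; the package level helpers assume the default client.

```
import (
	"log"

	"github.com/micro/go-micro/client"
	"golang.org/x/net/context"
)

// Hypothetical request/response types; substitute your own generated types.
type ExampleRequest struct{ Name string }
type ExampleResponse struct{ Msg string }

func call() {
	// NewRequest takes the service name, the method and the request body.
	req := client.NewRequest("go.micro.srv.example", "Example.Call", &ExampleRequest{Name: "john"})
	rsp := &ExampleResponse{}

	// Call resolves the name via the registry, selects a node and makes the request.
	if err := client.Call(context.Background(), req, rsp); err != nil {
		log.Println("call failed:", err)
	}
}
```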

How the client works behind the scenes was addressed in a couple of previous blog posts which you can find [here]({{ site.baseurl }}/2016/03/20/micro.html) and
[here]({{ site.baseurl }}/2016/04/18/micro-architecture.html). Check those out if you want to learn about the details.

We'll just briefly mention some important internal details.

The client deals with the RPC layer while leveraging the broker, codec, registry, selector and transport packages
for various pieces of functionality. The layered architecture is important as we separate the concerns of each component, reducing the
complexity and providing pluggability.

###### Why Does The Client Matter?

The client essentially abstracts away the details of providing resilient and fault tolerant communication between services. Making a
call to another service seems fairly straightforward but there are all sorts of ways in which it could potentially fail.

Let's start to walk through some of the functionality and how it helps.

#### Service Discovery

In a distributed system, instances of a service could be coming and going for any number of reasons: network partitions, machine failure,
rescheduling, etc. We don't really want to have to care about this.

When making a call to another service, we do it by name and allow the client to use service discovery to resolve the name to a list of
instances with their address and port. Services register with discovery on startup and deregister on shutdown.

<p align="center">
  <img src="{{ site.baseurl }}/blog/images/discovery.png" />
</p>
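
Under the hood that resolution is just a registry lookup. Here's a hedged sketch using the go-micro registry package directly; the service name is a placeholder and the node fields follow the registry API of the time.

```
import (
	"fmt"

	"github.com/micro/go-micro/registry"
)

func lookup() {
	// GetService returns the named service with its nodes grouped by version.
	services, err := registry.GetService("go.micro.srv.example")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}

	// Each node carries the address and port needed to make a request.
	for _, service := range services {
		for _, node := range service.Nodes {
			fmt.Println(service.Version, node.Address, node.Port)
		}
	}
}
```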

As we mentioned though, any number of issues can occur in a distributed system and service discovery is no exception. So we rely on battle
tested distributed service discovery systems such as consul, etcd and zookeeper to store the information about services.

Each of these uses either the Raft or Paxos consensus algorithm, which gives us consistency and partition tolerance from the CAP theorem.
By running a cluster of 3 or 5 nodes, we can tolerate most system failures and get reliable service discovery for the client.

#### Node Selection

So now we can reliably resolve service names to a list of addresses. How do we actually select which one to call? This is where the go-micro Selector
comes into play. It builds on the registry and provides load balancing strategies such as round robin or random hashing while also providing
methods of filtering, caching and blacklisting failed nodes.

Here's a cut down interface.

```
type Selector interface {
	Select(service string, opts ...SelectOption) (Next, error)
	Mark(service string, node *registry.Node, err error)
	Reset(service string)
}

type Next func() (*registry.Node, error)
type Filter func([]*registry.Service) []*registry.Service
type Strategy func([]*registry.Service) Next
```

###### Balancing Strategies

The current strategies are fairly straightforward. When Select is called the Selector will retrieve the service from the Registry
and create a Next function that encapsulates the pool of nodes, using the default strategy or the one passed in as an option if overridden.

The client will call the Next function to retrieve the next node in the list based on the load balancing strategy and make the request.
If the request fails and retries are set above 1, it will go through the same process, retrieving the next node to call.
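
Here's a minimal sketch of that flow using the Selector directly; the service name is a hypothetical placeholder and the constructor assumes the selector package defaults.

```
import (
	"log"

	"github.com/micro/go-micro/selector"
)

func pick() {
	s := selector.NewSelector()

	// Select returns a Next function encapsulating the service's nodes.
	next, err := s.Select("go.micro.srv.example")
	if err != nil {
		log.Fatal(err)
	}

	// Each call to Next yields a node according to the strategy in use.
	node, err := next()
	if err != nil {
		log.Fatal(err)
	}
	log.Println("selected", node.Id, node.Address)
}
```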

There's a variety of strategies that can be used here, such as round robin, random hashing, leastconn, weighted, etc. Load balancing strategies
are essential for distributing requests evenly across services.

###### Selection Caching

While it's great to have a robust service discovery system, it can be inefficient and costly to do a lookup on every request.
If you imagine a large scale system in which every service is doing this, it can be quite easy to overload the discovery system. There may
be cases in which it becomes completely unavailable.

To avoid this we can use caching. Most discovery systems provide a way to listen for updates, normally known as a Watcher. Rather
than polling discovery we wait for events to be sent to us. The go-micro Registry provides a Watch abstraction for this.

We've written a caching selector which maintains an in memory cache of services. On a cache miss it looks up discovery for the info, caches
it and then uses this for subsequent requests. If watch events are received for services we know about then the cache will be updated accordingly.
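
As a rough sketch, consuming the Watch abstraction looks something like this; treat it as illustrative since the exact result fields follow the registry package of the time.

```
import (
	"log"

	"github.com/micro/go-micro/registry"
)

func watch() {
	// Watch returns a stream of registry events rather than requiring polling.
	w, err := registry.Watch()
	if err != nil {
		log.Fatal(err)
	}

	for {
		// Next blocks until an event occurs, e.g. a service registering.
		res, err := w.Next()
		if err != nil {
			log.Fatal(err)
		}
		// A caching selector would update its entry for this service here.
		log.Println(res.Action, res.Service.Name)
	}
}
```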

Firstly, this drastically improves performance by removing the service lookup. It also provides some fault tolerance in the case of
service discovery being down. We are a little paranoid though, and since the cache could go stale because of some failure scenario, nodes are TTLed appropriately.

###### Blacklisting Nodes

Next on the list, blacklisting. Notice the Selector interface has Mark and Reset methods. We can never really guarantee that healthy
nodes are registered with discovery so something needs to be done about it.

Whenever a request is made we'll keep track of the result. If a service instance fails multiple
times we can essentially blacklist the node and filter it out the next time a Select request is made.

A node is blacklisted for a set period of time before being put back in the pool. It's really critical that if a particular node
of a service is failing we remove it from the list so that we can continue to serve successful requests without delay.
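
To illustrate the feedback loop, here's a hedged sketch: after every attempt the result is reported via Mark, which is what drives the blacklisting. The doRequest helper and service name are hypothetical.

```
// s is a selector.Selector, next is the function returned by s.Select.
node, err := next()
if err != nil {
	log.Fatal(err)
}

// doRequest is a hypothetical helper performing the actual RPC.
callErr := doRequest(node)

// Mark feeds the outcome back; repeated failures blacklist the node
// so subsequent Select calls filter it out for a period of time.
s.Mark("go.micro.srv.example", node, callErr)
```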

#### Timeouts & Retries

Adrian Cockcroft has recently started to talk about the missing components of microservice architectures. One of the very
interesting things that came up is how classic timeout and retry strategies can lead to cascading failures. I implore you
to go look at his slides [here](http://www.slideshare.net/adriancockcroft/microservices-whats-missing-oreilly-software-architecture-new-york#24).
I've linked directly to where it starts to cover timeouts and retries. Thanks to Adrian for letting me use the slides.

This slide really summarises the problem quite well.

<p align="center">
  <img src="{{ site.baseurl }}/blog/images/timeouts.png" />
</p>

What Adrian describes above is the common case in which a slow response can lead to a timeout, causing the client to retry.
Since a request is actually a chain of requests downstream, this creates a whole new set of requests through the system
while old work may still be going on. The misconfiguration can result in overloading services in the call chain and creating a failure
scenario that's difficult to recover from.

In a microservices world, we need to rethink the strategy around handling timeouts and retries. Adrian goes on to discuss potential solutions
to this problem, one of which is timeout budgets and retrying against new nodes.

<p align="center">
  <img src="{{ site.baseurl }}/blog/images/good-timeouts.png" />
</p>

On the retries side, we've been doing this in Micro for a while. The number of retries can be configured as an option to the Client.
If a call fails the Client will retrieve a new node and attempt to make the request again.

Timeouts were something we'd been considering more thoughtfully, having started with the classic static timeout setting. It wasn't until
Adrian presented his thoughts that it became clear what the strategy should be.

Budgeted timeouts are now built into Micro. Let's run through how that works.

The first caller sets the timeout, which usually happens at the edge. On every request in the chain the timeout is decreased to account for
the amount of time that has passed. When zero time is left we stop processing any further requests or retries and return up the call stack.
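
Both knobs are exposed as client options. Here's a hedged sketch of setting them per call; the option names assume the client API of the time and may differ across versions.

```
import (
	"time"

	"github.com/micro/go-micro/client"
	"golang.org/x/net/context"
)

func callWithBudget(c client.Client, req client.Request, rsp interface{}) error {
	// Allow up to 3 retries, each against a newly selected node, and set
	// the overall request timeout which is budgeted down the call chain.
	return c.Call(
		context.Background(),
		req,
		rsp,
		client.WithRetries(3),
		client.WithRequestTimeout(time.Second*5),
	)
}
```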

As Adrian mentions, this is a great way to provide dynamic timeout budgeting and remove any unnecessary work occurring downstream.

Further to this, the next step should really be to remove any kind of static timeout. How services respond will differ based on environment,
request load, etc. The timeout should really be a dynamic SLA that changes based on current state, but that's something to be left for another day.

#### What About Connection Pooling?

Connection pooling is an important part of building scalable systems. We've very quickly seen the limitations posed
without it, usually hitting file descriptor limits and port exhaustion.

There's currently a [PR](https://github.com/micro/go-micro/pull/86) in the works to add connection pooling to go-micro. Given the pluggable
nature of Micro, it was important to address this a layer above the [Transport](https://godoc.org/github.com/micro/go-micro/transport#Transport)
so that any implementation, whether it be HTTP, NATS, RabbitMQ, etc, would benefit.

You might be thinking, well, this is implementation specific and some transports may already support it. While this is true,
it's not always guaranteed to work the same way across each transport. By addressing this specific problem a layer up,
we reduce the complexity and needs of the transport itself.

### What Else?

Those are some pretty useful things built into go-micro, but what else?

I'm glad you asked... or well, I assume you're asking... anyway.

#### Service Version Canarying?

We have it! It was actually discussed in a previous blog post on architecture and design patterns for microservices which
you can check out [here]({{ site.baseurl }}/2016/04/18/micro-architecture.html).

Services contain Name and Version as a pair in service discovery. When a service is retrieved from the registry, its nodes
are grouped by version. The selector can then be leveraged to distribute traffic across the nodes of each version using
various load balancing strategies.

<p align="center">
  <img src="{{ site.baseurl }}/blog/images/selector.png" />
</p>
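
Setting the version is a one-liner when declaring the service. As a minimal sketch, the bulk of the fleet runs the stable version while a small canary pool registers with a newer one; the name and versions here are hypothetical.

```
import (
	"log"

	"github.com/micro/go-micro"
)

func main() {
	// Stable nodes register as 1.0.0; a small canary pool would register
	// with e.g. micro.Version("1.0.1") and receive a share of the traffic.
	service := micro.NewService(
		micro.Name("go.micro.srv.example"),
		micro.Version("1.0.0"),
	)
	service.Init()

	if err := service.Run(); err != nil {
		log.Fatal(err)
	}
}
```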

###### Why Is Canarying Important?

This is really quite useful when releasing new versions of a service and ensuring everything is functioning correctly before
rolling out to the entire fleet. The new version can be deployed to a small pool of nodes with the client automatically
distributing a percentage of traffic to the new service. In combination with an orchestration system such as Kubernetes,
you can canary the deployment with confidence and roll back if there are any issues.

#### What About Filtering?

We have it! The selector is very powerful and includes the ability to pass in filters at the time of selection to filter nodes. These can be
passed in as Call Options to the client when making a request. Some existing filters can be found
[here](https://github.com/micro/go-micro/blob/master/selector/filter.go) for metadata, endpoint or version filtering.

###### Why Is Filtering Important?

You might have some functionality that only exists across a set of versions of services. Pinning the request flow between
the services to those particular versions ensures you always hit the right services. This is great where multiple
versions are running in the system at the same time.

The other useful case is where you want to route to services based on locality. By setting a datacenter label on each
service you can apply a filter that will only return local nodes. Filtering based on metadata is pretty powerful and has
much broader applications, which we hope to hear more about from usage in the wild.
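
As a hedged sketch of both cases, filters can be attached per call via a select option; the option and filter names assume the client and selector APIs of the time, and the "dc" label is hypothetical node metadata.

```
import (
	"github.com/micro/go-micro/client"
	"github.com/micro/go-micro/selector"
	"golang.org/x/net/context"
)

func filteredCall(c client.Client, req client.Request, rsp interface{}) error {
	// Pin the call to a specific version and to nodes in our datacenter.
	return c.Call(
		context.Background(),
		req,
		rsp,
		client.WithSelectOption(
			selector.WithFilter(selector.FilterVersion("1.0.1")),
			selector.WithFilter(selector.FilterLabel("dc", "us-east-1")),
		),
	)
}
```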

### The Pluggable Architecture

One of the things that you'll keep hearing over and over is the pluggable nature of Micro. This was something
addressed in the design from day one. It was very important that Micro provide building blocks as opposed to
a complete system. Something that works out of the box but can be enhanced.

###### Why Does Being Pluggable Matter?

Everyone will have different ideas about what it means to build distributed systems and
we really want to provide a way for people to design the solutions they want to use. Not only that but
there are robust battle tested tools out there which we can leverage rather than writing everything from
scratch.

Technology is always evolving, new and better tools appear every day. How do we avoid lock in? A pluggable
architecture means we can use components today and switch them out tomorrow with minimal effort.

#### Plugins

Each of the features of go-micro is created as a Go interface. By doing so and only referencing the interface,
we can actually swap out the underlying implementations with minimal to zero code changes. In most cases it takes
a simple import statement and a flag specified on the command line.

There are a number of plugins in the [go-plugins](https://github.com/micro/go-plugins) repo on GitHub.

While go-micro provides some defaults such as consul for discovery and http for transport, you may want to use
something different within your architecture or even implement your own plugins. We've already had community
contributions with a [Kubernetes](https://github.com/micro/go-plugins/tree/master/registry/kubernetes) registry
plugin and a [Zookeeper](https://github.com/micro/go-plugins/pull/24) registry currently in a PR.

###### How do I use plugins?

Most of the time it's as simple as this.

```
// Import the plugin
import _ "github.com/micro/go-plugins/registry/etcd"
```

```
go run main.go --registry=etcd --registry_address=10.0.0.1:2379
```

If you want to see more of it in action, check out the post on [Micro on NATS]({{ site.baseurl }}/2016/04/11/micro-on-nats.html).

#### Wrappers

What's more, the Client and Server support the notion of middleware with something called Wrappers. By supporting
middleware we can add pre and post hooks with additional functionality around request-response handling.

Middleware is a well understood concept and something used across thousands of libraries to date. You can
immediately see the benefits in use cases such as circuit breaking, rate limiting, authentication, logging, tracing, etc.

```
// Client Wrappers
type Wrapper func(Client) Client
type StreamWrapper func(Streamer) Streamer

// Server Wrappers
type HandlerWrapper func(HandlerFunc) HandlerFunc
type SubscriberWrapper func(SubscriberFunc) SubscriberFunc
type StreamerWrapper func(Streamer) Streamer
```
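
For a sense of what implementing one looks like, here's a hedged sketch of a logging client Wrapper matching the `func(Client) Client` signature above; the names are our own.

```
import (
	"log"

	"github.com/micro/go-micro/client"
	"golang.org/x/net/context"
)

// logWrapper embeds a client and intercepts Call to log each request.
type logWrapper struct {
	client.Client
}

func (l *logWrapper) Call(ctx context.Context, req client.Request, rsp interface{}, opts ...client.CallOption) error {
	log.Printf("calling %s.%s", req.Service(), req.Method())
	return l.Client.Call(ctx, req, rsp, opts...)
}

// NewLogWrapper satisfies the Wrapper type: func(Client) Client.
func NewLogWrapper(c client.Client) client.Client {
	return &logWrapper{c}
}
```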

###### How do I use Wrappers?

This is just as straightforward as plugins.

```
import (
	"github.com/micro/go-micro"
	"github.com/micro/go-plugins/wrapper/breaker/hystrix"
)

func main() {
	service := micro.NewService(
		micro.Name("myservice"),
		micro.WrapClient(hystrix.NewClientWrapper()),
	)
}
```

Easy right? We find many companies create their own layer on top of Micro to initialise most of the default wrappers
they're looking for, so if any new wrappers need to be added it can all be done in one place.

Let's look at a couple of wrappers now for resiliency and fault tolerance.

#### Circuit Breaking

In an SOA or microservices world, a single request can actually result in a call to multiple services, and in many cases
to dozens or more, to gather the necessary information to return to the caller. In the successful case this works quite
well, but if an issue occurs it can quickly descend into cascading failures which are difficult to recover from without
resetting the entire system.

We partially solve some of these problems in the client with request retries and blacklisting of nodes that
have failed multiple times, but at some point there may be a need to stop the client from even attempting to make the
request.

This is where circuit breakers come into play.

<p align="center">
  <img src="{{ site.baseurl }}/blog/images/circuit.png" />
</p>

The concept of circuit breakers is straightforward. The execution of a function is wrapped or associated with a monitor of
some kind which tracks failures. When the number of failures exceeds a certain threshold, the breaker is tripped and
any further call attempts return an error without executing the wrapped function. After a timeout period the circuit
is put into a half open state. If a single call fails in this state the breaker is once again tripped, however if it succeeds
we reset back to the normal state of a closed circuit.

While the internals of the Micro client have some fault tolerant features built in, we shouldn't expect to be able to solve
every problem. Using Wrappers in conjunction with existing circuit breaker implementations, we can benefit greatly.
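
The hystrix wrapper shown earlier is one option. As a hedged illustration of the same idea with a different library (our own sketch, not the plugin's actual code), here's a breaker wrapper built on sony/gobreaker:

```
import (
	"github.com/micro/go-micro/client"
	"github.com/sony/gobreaker"
	"golang.org/x/net/context"
)

// breakerWrapper guards every Call with a circuit breaker.
type breakerWrapper struct {
	client.Client
	cb *gobreaker.CircuitBreaker
}

func (b *breakerWrapper) Call(ctx context.Context, req client.Request, rsp interface{}, opts ...client.CallOption) error {
	// Execute runs the call while the circuit is closed or half open;
	// once tripped it returns an error without invoking the function.
	_, err := b.cb.Execute(func() (interface{}, error) {
		return nil, b.Client.Call(ctx, req, rsp, opts...)
	})
	return err
}

// NewBreakerWrapper satisfies the Wrapper type: func(Client) Client.
func NewBreakerWrapper(c client.Client) client.Client {
	return &breakerWrapper{c, gobreaker.NewCircuitBreaker(gobreaker.Settings{})}
}
```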

#### Rate Limiting

Wouldn't it be nice if we could just serve all the requests in the world without breaking a sweat? Ah, the dream. Well, the real
world doesn't really work like that. Processing a query takes a certain period of time and given the limitations of resources
there's only so many requests we can actually serve.

At some point we need to think about limiting the number of requests we can either make or serve in parallel. This is where
rate limiting comes into play. Without rate limiting it can be very easy to run into resource exhaustion or completely cripple
the system and stop it from being able to serve any further requests. This is usually the basis for a great DDoS attack.

Everyone has heard of, used or maybe even implemented some form of rate limiting. There are quite a few different rate limiting
algorithms out there, one of which is the [Leaky Bucket](https://en.wikipedia.org/wiki/Leaky_bucket) algorithm. We're not
going to go into the specifics of the algorithm here but it's worth reading about.

Once again we can make use of Micro Wrappers and existing libraries to perform this function. An existing implementation
can be found [here](https://github.com/micro/go-plugins/blob/master/wrapper/ratelimiter/ratelimit/ratelimit.go).
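
As a hedged sketch of the pattern, here's our own illustration using golang.org/x/time/rate rather than the plugin linked above:

```
import (
	"github.com/micro/go-micro/client"
	"golang.org/x/net/context"
	"golang.org/x/time/rate"
)

// limitWrapper blocks each Call until the token bucket permits it.
type limitWrapper struct {
	client.Client
	limiter *rate.Limiter
}

func (l *limitWrapper) Call(ctx context.Context, req client.Request, rsp interface{}, opts ...client.CallOption) error {
	// Wait blocks until a token is available or the context is cancelled.
	if err := l.limiter.Wait(ctx); err != nil {
		return err
	}
	return l.Client.Call(ctx, req, rsp, opts...)
}

// NewLimitWrapper allows qps requests per second with bursts of qps.
func NewLimitWrapper(qps int) client.Wrapper {
	return func(c client.Client) client.Client {
		return &limitWrapper{c, rate.NewLimiter(rate.Limit(qps), qps)}
	}
}
```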

A system we're actually interested in seeing an implementation for is YouTube's [Doorman](https://github.com/youtube/doorman),
a global distributed client side rate limiter. We're looking for a community contribution for this, so please get in touch!

### The Server Side

All of this has covered quite a lot about the client side features and use cases. What about the server side? The first thing to note
is that Micro leverages the go-micro client for the API, CLI, Sidecar and so on. These benefits translate across the entire
architecture from the edge down to the very last backend service. We still need to address some basics for the server though.

While on the client side the registry is used to find services, the server side is where the registration actually occurs. When
an instance of a service comes up, it registers itself with the service discovery mechanism and deregisters when it exits gracefully.
The keyword being "gracefully".

<p align="center">
  <img src="{{ site.baseurl }}/blog/images/register.png" />
</p>

###### Dealing With Failure

In a distributed system we have to deal with failures; we need to be fault tolerant. The registry supports TTLs to expire or mark
nodes as unhealthy based on whatever the underlying service discovery mechanism is, e.g. consul or etcd, while the service itself
supports re-registration. The combination of the two means the service node will re-register on a set interval while it's healthy,
and the registry will expire the node if it's not refreshed. If the node fails for any reason and does not re-register, it will be
removed from the registry.
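
Both the TTL and the re-registration interval are exposed as service options. A minimal sketch, with arbitrary example durations; the option names assume the micro API of the time.

```
import (
	"log"
	"time"

	"github.com/micro/go-micro"
)

func main() {
	service := micro.NewService(
		micro.Name("myservice"),
		// Discovery expires the node if it isn't refreshed within the TTL...
		micro.RegisterTTL(time.Second*30),
		// ...so the service re-registers on a shorter interval while healthy.
		micro.RegisterInterval(time.Second*15),
	)
	service.Init()

	if err := service.Run(); err != nil {
		log.Fatal(err)
	}
}
```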

This fault tolerant behaviour was not initially included as part of go-micro, but we quickly saw from real world use that
it was very easy to fill the registry with stale nodes because of panics and other failures which cause services to exit ungracefully.

The knock on effect was that the client would be left to deal with dozens if not hundreds of stale entries. While the client
needs to be fault tolerant as well, we think this functionality eliminates a lot of issues upfront.

###### Adding Further Functionality

Another thing to note, as mentioned above: the server also provides the ability to use Wrappers, or Middleware as it's more commonly known, which means
we can use circuit breaking, rate limiting, and other features at this layer to control request flow, concurrency, etc.

The functionality of the server is purposely kept simple but pluggable so that features can be layered on top as required.
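
As a hedged sketch, a server side handler wrapper mirrors the client one; the Method accessor on the request assumes the server API of the time.

```
import (
	"log"

	"github.com/micro/go-micro"
	"github.com/micro/go-micro/server"
	"golang.org/x/net/context"
)

// logHandlerWrapper logs every request before invoking the handler.
func logHandlerWrapper(fn server.HandlerFunc) server.HandlerFunc {
	return func(ctx context.Context, req server.Request, rsp interface{}) error {
		log.Printf("serving %s", req.Method())
		return fn(ctx, req, rsp)
	}
}

func main() {
	service := micro.NewService(
		micro.Name("myservice"),
		micro.WrapHandler(logHandlerWrapper),
	)
	service.Init()

	if err := service.Run(); err != nil {
		log.Fatal(err)
	}
}
```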

### Clients vs Sidecars

Most of what's being discussed here exists in the core [go-micro](https://github.com/micro/go-micro) library. While this is great
for all the Go programmers, everyone else may be wondering, how do I get all these benefits?

From the very beginning, Micro has included the concept of a [Sidecar](https://github.com/tickoalcantara12/micro/tree/master/car), an HTTP proxy with all
the features of go-micro built in. So regardless of which language you're building your applications with, you can benefit from all
we've discussed above by using the Micro Sidecar.

<p align="center">
  <img src="{{ site.baseurl }}/blog/images/sidecar-rpc.png" style="width: 100%; height: auto;" />
</p>

The sidecar pattern is nothing new. NetflixOSS has one called [Prana](https://github.com/Netflix/Prana) which leverages the JVM based
NetflixOSS stack. Buoyant have recently entered the game with an incredibly feature rich system called [Linkerd](https://linkerd.io/),
an RPC proxy that layers on top of Twitter's [Finagle](https://finagle.github.io/blog/) library.

The Micro Sidecar uses the default go-micro Client. So if you want to add other functionality you can augment it very easily and rebuild.
We'll look to simplify this process much more in the future and provide a version prebuilt with all the nifty fault tolerant features.

### Wait, There's More

The blog post covers a lot about the core [go-micro](https://github.com/micro/go-micro) library and surrounding toolkit. These tools
are a great start but they're not enough. When you want to run at scale, when you want hundreds of microservices that serve millions of
requests, there's still a lot more to be addressed.

###### The Platform

This is where the [go-platform](https://github.com/micro/go-platform) and [platform](https://github.com/micro/platform) come into play.
Where micro addresses the fundamental building blocks, the platform goes a step further by addressing the requirements for running
at scale: authentication, distributed tracing, synchronization, healthcheck monitoring, etc.

Distributed systems require a different set of tools for observability, consensus and coordinating fault tolerance, and the micro platform
looks to help with those needs. By providing a layered architecture we can build on the primitives defined by the core tools and
enhance their functionality where needed.

It's still early days but the hope is that the micro platform will solve a lot of the problems organisations have with building
distributed systems platforms.

### How Do I Use All These Tools?

As you can gather from the blog post, most of these features are built into the Micro toolkit. You can go check out the project on
[GitHub](https://github.com/tickoalcantara12/micro) and get started writing fault tolerant Micro services almost instantly.

If you need help or have questions, come join the community on [Slack](https://slack.m3o.com). It's very active and
growing fast, with a broad range of users, from people hacking on side projects to companies already using Micro in production today.

### Summary

Technology is rapidly evolving and cloud computing now gives us access to almost unlimited scale. Trying to keep up with the pace of
change can be difficult and building scalable fault tolerant systems for the new world is still challenging.

But it doesn't have to be this way. As a community we can help each other to adapt to this new environment and build products
that will scale with our growing demands.

Micro looks to help in this journey by providing the tools to simplify building and managing distributed systems. Hopefully
this blog post has helped demonstrate some of the ways we're looking to do just that.

If you want to learn more about the services we offer or microservices, check out the [blog](/), the website
[micro.mu](https://m3o.com) or the github [repo](https://github.com/tickoalcantara12/micro).

Follow us on Twitter at [@MicroHQ](https://twitter.com/m3ocloud) or join the [Slack](https://slack.m3o.com)
community [here](http://slack.m3o.com).