1
0
Fork 0

Middleware documentation fixes

This commit is contained in:
Brendan Le Glaunec 2021-02-11 14:34:04 +01:00 committed by GitHub
parent 5ef6297daa
commit 502c88ee3f
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
24 changed files with 536 additions and 490 deletions

View file

@ -3,27 +3,24 @@
Don't Waste Time Calling Unhealthy Services
{: .subtitle }
![CircuitBreaker](../assets/img/middleware/circuitbreaker.png)
![CircuitBreaker](../assets/img/middleware/circuitbreaker.png)
The circuit breaker protects your system from stacking requests to unhealthy services (resulting in cascading failures).
The circuit breaker protects your system from stacking requests to unhealthy services, resulting in cascading failures.
When your system is healthy, the circuit is closed (normal operations).
When your system becomes unhealthy, the circuit becomes open and the requests are no longer forwarded (but handled by a fallback mechanism).
When your system becomes unhealthy, the circuit opens, and the requests are no longer forwarded, but instead are handled by a fallback mechanism.
To assess if your system is healthy, the circuit breaker constantly monitors the services.
To assess if your system is healthy, the circuit breaker constantly monitors the services.
!!! note ""
- The CircuitBreaker only analyses what happens _after_ it is positioned in the middleware chain. What happens _before_ has no impact on its state.
- The CircuitBreaker only affects the routers that use it. Routers that don't use the CircuitBreaker won't be affected by its state.
The CircuitBreaker only analyzes what happens _after_ its position within the middleware chain. What happens _before_ has no impact on its state.
!!! important
Each router will eventually gets its own instance of a given circuit breaker.
If two different routers refer to the same circuit breaker definition, they will get one instance each.
It means that one circuit breaker can be open while the other stays closed: their state is not shared.
Each router gets its own instance of a given circuit breaker.
One circuit breaker instance can be open while the other remains closed: their state is not shared.
This is the expected behavior, we want you to be able to define what makes a service healthy without having to declare a circuit breaker for each route.
## Configuration Examples
@ -90,70 +87,71 @@ There are three possible states for your circuit breaker:
While the circuit is closed, the circuit breaker only collects metrics to analyze the behavior of the requests.
At specified intervals (`checkPeriod`), it will evaluate `expression` to decide if its state must change.
At specified intervals (`checkPeriod`), the circuit breaker evaluates `expression` to decide if its state must change.
### Open
While open, the fallback mechanism takes over the normal service calls for a duration of `FallbackDuration`.
After this duration, it will enter the recovering state.
After this duration, it enters the recovering state.
### Recovering
While recovering, the circuit breaker will progressively send requests to your service again (in a linear way, for `RecoveryDuration`).
If your service fails during recovery, the circuit breaker becomes open again.
If the service operates normally during the whole recovering duration, then the circuit breaker returns to close.
While recovering, the circuit breaker sends linearly increasing amounts of requests to your service (for `RecoveryDuration`).
If your service fails during recovery, the circuit breaker opens again.
If the service operates normally during the entire recovery duration, then the circuit breaker closes.
## Configuration Options
### Configuring the Trigger
You can specify an `expression` that, once matched, will trigger the circuit breaker (and apply the fallback mechanism instead of calling your services).
You can specify an `expression` that, once matched, opens the circuit breaker and applies the fallback mechanism instead of calling your services.
The `expression` can check three different metrics:
The `expression` option can check three different metrics:
- The network error ratio (`NetworkErrorRatio`)
- The status code ratio (`ResponseCodeRatio`)
- The latency at quantile, in milliseconds (`LatencyAtQuantileMS`)
- The latency at a quantile in milliseconds (`LatencyAtQuantileMS`)
#### `NetworkErrorRatio`
If you want the circuit breaker to trigger at a 30% ratio of network errors, the expression will be `NetworkErrorRatio() > 0.30`
If you want the circuit breaker to open at a 30% ratio of network errors, the `expression` is `NetworkErrorRatio() > 0.30`
#### `ResponseCodeRatio`
You can trigger the circuit breaker based on the ratio of a given range of status codes.
You can configure the circuit breaker to open based on the ratio of a given range of status codes.
The `ResponseCodeRatio` accepts four parameters, `from`, `to`, `dividedByFrom`, `dividedByTo`.
The operation that will be computed is sum(`to` -> `from`) / sum (`dividedByFrom` -> `dividedByTo`).
!!! note ""
If sum (`dividedByFrom` -> `dividedByTo`) equals 0, then `ResponseCodeRatio` returns 0.
`from`is inclusive, `to` is exclusive.
For example, the expression `ResponseCodeRatio(500, 600, 0, 600) > 0.25` will trigger the circuit breaker if 25% of the requests returned a 5XX status (amongst the request that returned a status code from 0 to 5XX).
If sum (`dividedByFrom` -> `dividedByTo`) equals 0, then `ResponseCodeRatio` returns 0.
`from`is inclusive, `to` is exclusive.
For example, the expression `ResponseCodeRatio(500, 600, 0, 600) > 0.25` will trigger the circuit breaker if 25% of the requests returned a 5XX status (amongst the request that returned a status code from 0 to 5XX).
#### `LatencyAtQuantileMS`
You can trigger the circuit breaker when a given proportion of your requests become too slow.
You can configure the circuit breaker to open when a given proportion of your requests become too slow.
For example, the expression `LatencyAtQuantileMS(50.0) > 100` will trigger the circuit breaker when the median latency (quantile 50) reaches 100MS.
For example, the expression `LatencyAtQuantileMS(50.0) > 100` opens the circuit breaker when the median latency (quantile 50) reaches 100ms.
!!! note ""
You must provide a float number (with the trailing .0) for the quantile value
#### Using multiple metrics
You must provide a floating point number (with the trailing .0) for the quantile value
You can combine multiple metrics using operators in your expression.
#### Using Multiple Metrics
You can combine multiple metrics using operators in your `expression`.
Supported operators are:
- AND (`&&`)
- OR (`||`)
For example, `ResponseCodeRatio(500, 600, 0, 600) > 0.30 || NetworkErrorRatio() > 0.10` triggers the circuit breaker when 30% of the requests return a 5XX status code, or when the ratio of network errors reaches 10%.
For example, `ResponseCodeRatio(500, 600, 0, 600) > 0.30 || NetworkErrorRatio() > 0.10` triggers the circuit breaker when 30% of the requests return a 5XX status code, or when the ratio of network errors reaches 10%.
#### Operators
@ -168,8 +166,8 @@ Here is the list of supported operators:
### Fallback mechanism
The fallback mechanism returns a `HTTP 503 Service Unavailable` to the client (instead of calling the target service).
This behavior cannot be configured.
The fallback mechanism returns a `HTTP 503 Service Unavailable` to the client instead of calling the target service.
This behavior cannot be configured.
### `CheckPeriod`
@ -182,6 +180,6 @@ By default, `FallbackDuration` is 10 seconds. This value cannot be configured.
### `RecoveringDuration`
The duration of the recovering mode (recovering state).
The duration of the recovering mode (recovering state).
By default, `RecoveringDuration` is 10 seconds. This value cannot be configured.
By default, `RecoveringDuration` is 10 seconds. This value cannot be configured.