Middleware documentation fixes
This commit is contained in:
parent
5ef6297daa
commit
502c88ee3f
24 changed files with 536 additions and 490 deletions
|
@ -3,27 +3,24 @@
|
|||
Don't Waste Time Calling Unhealthy Services
|
||||
{: .subtitle }
|
||||
|
||||

|
||||

|
||||
|
||||
The circuit breaker protects your system from stacking requests to unhealthy services (resulting in cascading failures).
|
||||
The circuit breaker protects your system from stacking requests to unhealthy services, resulting in cascading failures.
|
||||
|
||||
When your system is healthy, the circuit is closed (normal operations).
|
||||
When your system becomes unhealthy, the circuit becomes open and the requests are no longer forwarded (but handled by a fallback mechanism).
|
||||
When your system becomes unhealthy, the circuit opens, and the requests are no longer forwarded, but instead are handled by a fallback mechanism.
|
||||
|
||||
To assess if your system is healthy, the circuit breaker constantly monitors the services.
|
||||
To assess if your system is healthy, the circuit breaker constantly monitors the services.
|
||||
|
||||
!!! note ""
|
||||
|
||||
- The CircuitBreaker only analyses what happens _after_ it is positioned in the middleware chain. What happens _before_ has no impact on its state.
|
||||
- The CircuitBreaker only affects the routers that use it. Routers that don't use the CircuitBreaker won't be affected by its state.
|
||||
The CircuitBreaker only analyzes what happens _after_ its position within the middleware chain. What happens _before_ has no impact on its state.
|
||||
|
||||
!!! important
|
||||
|
||||
Each router will eventually gets its own instance of a given circuit breaker.
|
||||
|
||||
If two different routers refer to the same circuit breaker definition, they will get one instance each.
|
||||
It means that one circuit breaker can be open while the other stays closed: their state is not shared.
|
||||
|
||||
Each router gets its own instance of a given circuit breaker.
|
||||
One circuit breaker instance can be open while the other remains closed: their state is not shared.
|
||||
|
||||
This is the expected behavior, we want you to be able to define what makes a service healthy without having to declare a circuit breaker for each route.
|
||||
|
||||
## Configuration Examples
|
||||
|
@ -90,70 +87,71 @@ There are three possible states for your circuit breaker:
|
|||
|
||||
While the circuit is closed, the circuit breaker only collects metrics to analyze the behavior of the requests.
|
||||
|
||||
At specified intervals (`checkPeriod`), it will evaluate `expression` to decide if its state must change.
|
||||
At specified intervals (`checkPeriod`), the circuit breaker evaluates `expression` to decide if its state must change.
|
||||
|
||||
### Open
|
||||
|
||||
While open, the fallback mechanism takes over the normal service calls for a duration of `FallbackDuration`.
|
||||
After this duration, it will enter the recovering state.
|
||||
After this duration, it enters the recovering state.
|
||||
|
||||
### Recovering
|
||||
|
||||
While recovering, the circuit breaker will progressively send requests to your service again (in a linear way, for `RecoveryDuration`).
|
||||
If your service fails during recovery, the circuit breaker becomes open again.
|
||||
If the service operates normally during the whole recovering duration, then the circuit breaker returns to close.
|
||||
While recovering, the circuit breaker sends linearly increasing amounts of requests to your service (for `RecoveryDuration`).
|
||||
If your service fails during recovery, the circuit breaker opens again.
|
||||
If the service operates normally during the entire recovery duration, then the circuit breaker closes.
|
||||
|
||||
## Configuration Options
|
||||
|
||||
### Configuring the Trigger
|
||||
|
||||
You can specify an `expression` that, once matched, will trigger the circuit breaker (and apply the fallback mechanism instead of calling your services).
|
||||
You can specify an `expression` that, once matched, opens the circuit breaker and applies the fallback mechanism instead of calling your services.
|
||||
|
||||
The `expression` can check three different metrics:
|
||||
The `expression` option can check three different metrics:
|
||||
|
||||
- The network error ratio (`NetworkErrorRatio`)
|
||||
- The status code ratio (`ResponseCodeRatio`)
|
||||
- The latency at quantile, in milliseconds (`LatencyAtQuantileMS`)
|
||||
- The latency at a quantile in milliseconds (`LatencyAtQuantileMS`)
|
||||
|
||||
#### `NetworkErrorRatio`
|
||||
|
||||
If you want the circuit breaker to trigger at a 30% ratio of network errors, the expression will be `NetworkErrorRatio() > 0.30`
|
||||
If you want the circuit breaker to open at a 30% ratio of network errors, the `expression` is `NetworkErrorRatio() > 0.30`
|
||||
|
||||
#### `ResponseCodeRatio`
|
||||
|
||||
You can trigger the circuit breaker based on the ratio of a given range of status codes.
|
||||
You can configure the circuit breaker to open based on the ratio of a given range of status codes.
|
||||
|
||||
The `ResponseCodeRatio` accepts four parameters, `from`, `to`, `dividedByFrom`, `dividedByTo`.
|
||||
|
||||
The operation that will be computed is sum(`to` -> `from`) / sum (`dividedByFrom` -> `dividedByTo`).
|
||||
|
||||
!!! note ""
|
||||
If sum (`dividedByFrom` -> `dividedByTo`) equals 0, then `ResponseCodeRatio` returns 0.
|
||||
|
||||
`from`is inclusive, `to` is exclusive.
|
||||
|
||||
For example, the expression `ResponseCodeRatio(500, 600, 0, 600) > 0.25` will trigger the circuit breaker if 25% of the requests returned a 5XX status (amongst the request that returned a status code from 0 to 5XX).
|
||||
If sum (`dividedByFrom` -> `dividedByTo`) equals 0, then `ResponseCodeRatio` returns 0.
|
||||
|
||||
`from`is inclusive, `to` is exclusive.
|
||||
|
||||
For example, the expression `ResponseCodeRatio(500, 600, 0, 600) > 0.25` will trigger the circuit breaker if 25% of the requests returned a 5XX status (amongst the request that returned a status code from 0 to 5XX).
|
||||
|
||||
#### `LatencyAtQuantileMS`
|
||||
|
||||
You can trigger the circuit breaker when a given proportion of your requests become too slow.
|
||||
You can configure the circuit breaker to open when a given proportion of your requests become too slow.
|
||||
|
||||
For example, the expression `LatencyAtQuantileMS(50.0) > 100` will trigger the circuit breaker when the median latency (quantile 50) reaches 100MS.
|
||||
For example, the expression `LatencyAtQuantileMS(50.0) > 100` opens the circuit breaker when the median latency (quantile 50) reaches 100ms.
|
||||
|
||||
!!! note ""
|
||||
|
||||
You must provide a float number (with the trailing .0) for the quantile value
|
||||
|
||||
#### Using multiple metrics
|
||||
You must provide a floating point number (with the trailing .0) for the quantile value
|
||||
|
||||
You can combine multiple metrics using operators in your expression.
|
||||
#### Using Multiple Metrics
|
||||
|
||||
You can combine multiple metrics using operators in your `expression`.
|
||||
|
||||
Supported operators are:
|
||||
|
||||
- AND (`&&`)
|
||||
- OR (`||`)
|
||||
|
||||
For example, `ResponseCodeRatio(500, 600, 0, 600) > 0.30 || NetworkErrorRatio() > 0.10` triggers the circuit breaker when 30% of the requests return a 5XX status code, or when the ratio of network errors reaches 10%.
|
||||
For example, `ResponseCodeRatio(500, 600, 0, 600) > 0.30 || NetworkErrorRatio() > 0.10` triggers the circuit breaker when 30% of the requests return a 5XX status code, or when the ratio of network errors reaches 10%.
|
||||
|
||||
#### Operators
|
||||
|
||||
|
@ -168,8 +166,8 @@ Here is the list of supported operators:
|
|||
|
||||
### Fallback mechanism
|
||||
|
||||
The fallback mechanism returns a `HTTP 503 Service Unavailable` to the client (instead of calling the target service).
|
||||
This behavior cannot be configured.
|
||||
The fallback mechanism returns a `HTTP 503 Service Unavailable` to the client instead of calling the target service.
|
||||
This behavior cannot be configured.
|
||||
|
||||
### `CheckPeriod`
|
||||
|
||||
|
@ -182,6 +180,6 @@ By default, `FallbackDuration` is 10 seconds. This value cannot be configured.
|
|||
|
||||
### `RecoveringDuration`
|
||||
|
||||
The duration of the recovering mode (recovering state).
|
||||
The duration of the recovering mode (recovering state).
|
||||
|
||||
By default, `RecoveringDuration` is 10 seconds. This value cannot be configured.
|
||||
By default, `RecoveringDuration` is 10 seconds. This value cannot be configured.
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue