pomerium · Jun 18, 2024
diff --git a/‎content/docs/troubleshooting.mdx
+127-47 b/‎content/docs/troubleshooting.mdx
diff --git a/‎content/docs/troubleshooting/img/zero/zero-change-port-address.png
43.5 KB b/‎content/docs/troubleshooting/img/zero/zero-change-port-address.png
@@ -14,6 +14,62 @@ import GenerateRecoveryToken from '@site/content/_generate-recovery-token.md';
 
 This article provides troubleshooting information for various tools and features in Pomerium.
 
+## Pomerium Zero
+
+### Configure port 443 to allow inbound access
+
+**Problem**
+
+Whenever you deploy a cluster, the Pomerium Zero cloud sends an inbound request to the cluster on port 443 to establish a secure connection. This is the default behavior. If the port is unavailable (for example, another process is already listening on port 443, or you haven't allowed a non-root process to bind to port 443), Pomerium Zero won't be able to establish a connection to your cluster.
+
+**Solution**
+
+Make port 443 available to Pomerium and allow inbound access on it. (For example, on Linux you can grant the `CAP_NET_BIND_SERVICE` capability so that a non-root process can bind to port 443.)
+
+If you've reserved port 443 for something else, you can change the port Pomerium sends inbound requests to by specifying a different listening port (like `:8443`) in the [**Address**](/docs/reference/address) field of the Zero Console:
+
+1. Select **Settings**
+1. Select **Advanced**
+1. Enter the preferred port address
+1. Apply your changes
+
+![Changing the default port address for incoming connections in the Zero Console](./troubleshooting/img/zero/zero-change-port-address.png)
+
+:::info
+
+Pomerium Zero also makes several outbound connections to the following `pomerium.app` domains on port `443` to fetch a cluster's configuration and status:
+
+- console.pomerium.app:443
+- connect.pomerium.app:443
+- telemetry.pomerium.app:443
+
+:::
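+
+As a rough way to confirm outbound connectivity to the domains listed above, you could probe each one over TCP from the host that runs Pomerium. The sketch below assumes bash (for its `/dev/tcp` feature) and the coreutils `timeout` command; `check_tcp` is a hypothetical helper name, not part of Pomerium.
+
+```bash
+# check_tcp HOST PORT: succeed if a TCP connection opens within 5 seconds.
+# Hypothetical helper for illustration; not a Pomerium command.
+check_tcp() {
+  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
+}
+
+# Example usage (uncomment to probe the domains listed above):
+# for host in console.pomerium.app connect.pomerium.app telemetry.pomerium.app; do
+#   if check_tcp "$host" 443; then echo "$host:443 reachable"; else echo "$host:443 unreachable"; fi
+# done
+```
+
+A successful TCP connection only shows that the port is reachable; it doesn't verify TLS or authentication.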
+
+### Delete a cluster
+
+At some point, you may want to delete a cluster. Currently, you can only delete a cluster if you have multiple clusters.
+
+To delete a cluster:
+
+1. Select the clusters dropdown in the Zero Console navigation bar
+1. Select **Manage Clusters**
+1. Select the checkbox next to the cluster you want to delete, then select the **Delete** button in the table
+1. In the popup, select **Delete** to confirm
+
+### Pomerium Zero loses configuration after upgrading
+
+If you installed Pomerium using the Linux install script during the Pomerium Zero beta, you will need to re-run the install script the first time you upgrade Pomerium. (Subsequent upgrades will not require this step.)
+
+1. First, find your current cluster token: look for a line beginning with `Environment=POMERIUM_ZERO_TOKEN=` in the file `/usr/lib/systemd/system/pomerium.service`.
+1. Copy this token into the following command and run it:
+
+```bash
+$ curl https://console.pomerium.app/install.bash | \
+  env POMERIUM_ZERO_TOKEN=<cluster_token> bash -s install
+```
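+
+If you'd rather not copy the token by hand, a small helper can print it from the unit file. This is a minimal sketch that assumes the line has exactly the form described in step 1; `zero_token_from_unit` is a hypothetical function name.
+
+```bash
+# Print the value that follows `Environment=POMERIUM_ZERO_TOKEN=` in a systemd unit file.
+# Hypothetical helper; assumes the line format described in step 1.
+zero_token_from_unit() {
+  sed -n 's/^Environment=POMERIUM_ZERO_TOKEN=//p' "$1" | head -n 1
+}
+
+# Example usage:
+# zero_token_from_unit /usr/lib/systemd/system/pomerium.service
+```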
+
+---
+
 ## Pomerium Core
 
 ### JWT Authentication
@@ -204,6 +260,8 @@ $ sudo rm /tmp/pomerium-envoy-admin.sock
 
 Then start Pomerium again.
 
+---
+
 ## Pomerium Enterprise
 
 ### Generate Recovery Token
@@ -238,53 +296,75 @@ For example, the `administrators` key allows you to specify a list of names, ema
 
 If you wanted to add an email address like `John.Admin@example.com` to the `administrators` file key, Pomerium wouldn't recognize an email like `john.admin@example` because the strings aren't an exact match.
 
-## Envoy error messages
-
-Because Pomerium relies on Envoy to manage HTTP connections, you will notice Envoy connection errors and messages at some point in your logs as you configure Pomerium.
-
-The [Envoy Response Code Details](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/response_code_details.html) provides an exhaustive list of Envoy-related message details.
-
-We've repurposed a truncated version of the Response Code Details list in the table below for your convenience:
-
-| **Name** | **Description** |
-| :-- | :-- |
-| absolute_path_rejected | The request was rejected due to using an absolute path on a route not supporting them. |
-| admin_filter_response | The response was generated by the admin filter. |
-| cluster_not_found | The request was rejected by the router filter because there was no cluster found for the selected route. |
-| downstream_local_disconnect | The client connection was locally closed for the provided reason. |
-| downstream_remote_disconnect | The client disconnected unexpectedly. |
-| duration_timeout | The max connection duration was exceeded. |
-| direct_response | A direct response was generated by the router filter. |
-| filter_added_invalid_request_data | A filter added request data at the wrong stage in the filter chain. |
-| filter_added_invalid_response_data | A filter added response data at the wrong stage in the filter chain. |
-| filter_chain_not_found | The request was rejected due to no matching filter chain. |
-| filter_removed_required_request_headers | The request was rejected in the filter manager because a configured filter removed required request headers. |
-| filter_removed_required_response_headers | The response was rejected in the filter manager because a configured filter removed required response headers or these values were invalid (e.g. overflown status). |
-| internal_redirect | The original stream was replaced with an internal redirect. |
-| low_version | The HTTP/1.0 or HTTP/0.9 request was rejected due to HTTP/1.0 support not being configured. |
-| maintenance_mode | The request was rejected by the router filter because the cluster was in maintenance mode. |
-| max_duration_timeout | The per-stream max duration timeout was exceeded. |
-| missing_host_header | The request was rejected due to a missing Host: or :authority field. |
-| missing_path_rejected | The request was rejected due to a missing Path or :path header field. |
-| no_healthy_upstream | The request was rejected by the router filter because there was no healthy upstream found. |
-| overload | The request was rejected due to the Overload Manager reaching configured resource limits. |
-| rejecting_because_detection_failed | The request was rejected because the original IP couldn’t be detected. |
-| path_normalization_failed | The request was rejected because path normalization was configured on and failed, probably due to an invalid path. |
-| request_headers_failed_strict_check | The request was rejected due to x-envoy-\* headers failing strict header validation. |
-| request_overall_timeout | The per-stream total request timeout was exceeded. |
-| request_payload_exceeded_retry_buffer_limit | Envoy is doing streaming proxying but too much data arrived while waiting to attempt a retry. |
-| request_payload_too_large | Envoy is doing non-streaming proxying and the request payload exceeded configured limits. |
-| response_payload_too_large | Envoy is doing non-streaming proxying and the response payload exceeded configured limits. |
-| route_configuration_not_found | The request was rejected because there was no route configuration found. |
-| route_not_found | The request was rejected because there was no route found. |
-| stream_idle_timeout | The per-stream keepalive timeout was exceeded. |
-| upgrade_failed | The request was rejected because it attempted an unsupported upgrade. |
-| upstream_max_stream_duration_reached | The request was destroyed because of it exceeded the configured max stream duration. |
-| upstream_per_try_timeout | The final upstream try timed out. |
-| upstream_reset_after_response_started | The upstream connection was reset after a response was started. This may include further details about the cause of the disconnect. |
-| upstream_reset_before_response_started | The upstream connection was reset before a response was started This may include further details about the cause of the disconnect. |
-| upstream_response_timeout | The upstream response timed out. |
-| via_upstream | The response code was set by the upstream. |
+---
+
+## Upstream connection errors
+
+Upstream connection errors usually indicate a problem with the upstream server rather than with Pomerium. Refer to the errors below to learn more about a specific issue and how to resolve it.
+
+:::note
+
+Configuration errors in Pomerium itself can also cause upstream connection errors. In this case, you'd need to debug your Pomerium configuration to resolve the error.
+
+:::
+
+### No healthy upstream
+
+The `no_healthy_upstream` error means that there is an issue with the upstream server that makes it unreachable from Pomerium. The error may be caused by or related to the upstream server's:
+
+- Configuration or application code
+
+  **Resolution**: Check that there are no errors in the server's configuration files or application code that prevent it from running as expected.
+
+- Network or firewall settings
+
+  **Resolution**: Check your network or firewall settings to make sure your server is reachable.
+
+- DNS records
+
+  **Resolution**: This error may be caused by unresolvable DNS records applied to the upstream server. Make sure the server's DNS records are pointing to the correct IP address.
+
+- Failing health checks configured in Pomerium
+
+  **Resolution**: If you've configured [Load Balancing Health Checks](/docs/reference/routes/load-balancing#health-checks) in Pomerium, the `no_healthy_upstream` error could be the result of a failing health check on an upstream server. Check the server's configuration for any errors.
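+
+To narrow down the DNS cause above, you could check from the Pomerium host whether the upstream's hostname resolves at all. A minimal sketch, assuming a Linux host with `getent` available; `upstream_resolves` and `upstream.internal.example` are hypothetical names.
+
+```bash
+# Succeed if HOST resolves through the system resolver (getent hosts).
+# Hypothetical helper for illustration; not a Pomerium command.
+upstream_resolves() {
+  getent hosts "$1" >/dev/null
+}
+
+# Example usage with a placeholder hostname:
+# upstream_resolves upstream.internal.example && echo "resolves" || echo "does not resolve"
+```
+
+Run the check in the same environment as Pomerium (for example, inside the same container), since name resolution can differ between environments.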
+
+### Upstream Max Stream Duration Reached
+
+The `upstream_max_stream_duration_reached` error means that Pomerium cancelled the request because it exceeded the configured maximum stream duration.
+
+**Resolution**: By default, Pomerium sets a 10-second timeout for all requests. If your requests are taking longer than expected, see the [Connections - Timeouts](/docs/internals/connection#timeouts) page to learn how timeouts work with upstream connections, and how to configure timeouts to avoid this error.
+
+### Upstream Per Try Timeout
+
+The `upstream_per_try_timeout` error means that the final attempt to connect to the upstream server timed out.
+
+**Resolution**: See the [Connections - Timeouts](/docs/internals/connection#timeouts) page to learn how timeouts work with upstream connections, and how to configure timeouts in Pomerium to avoid this error.
+
+### Upstream Reset After Response Started
+
+The `upstream_reset_after_response_started` error means that the upstream server reset the connection _after_ it began transmitting the response.
+
+**Resolution**: See the [Connections - Timeouts](/docs/internals/connection#timeouts) page to learn how timeouts work with upstream connections, and how to configure timeouts in Pomerium to avoid this error.
+
+### Upstream Reset Before Response Started
+
+The `upstream_reset_before_response_started` error means the upstream server reset the connection _before_ it began transmitting the response.
+
+**Resolution**: See the [Connections - Timeouts](/docs/internals/connection#timeouts) page to learn how timeouts work with upstream connections, and how to configure timeouts in Pomerium to avoid this error.
+
+### Upstream Response Timeout
+
+The `upstream_response_timeout` error means that the upstream server's response timed out.
+
+**Resolution**: See the [Connections - Timeouts](/docs/internals/connection#timeouts) page to learn how timeouts work with upstream connections, and how to configure timeouts in Pomerium to avoid this error.
+
+### Via Upstream
+
+The `via_upstream` error means that the upstream service set the response code.
+
+**Resolution**: Check the upstream service's application logs to learn why it returned that response status code.
+
+---
 
 ## Miscellaneous