Why does my Kubernetes HTTP probe fail with a "Broken pipe" exception?
Recently I faced a weird issue on OpenShift/Kubernetes where a Tomcat HTTP probe (a Camel health check, but the same applies to Spring Boot or MicroProfile health checks) was failing with a Broken pipe exception:
org.apache.catalina.connector.ClientAbortException: java.io.IOException: Broken pipe
at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:351)
at org.apache.catalina.connector.OutputBuffer.flushByteBuffer(OutputBuffer.java:776)
at org.apache.catalina.connector.OutputBuffer.doFlush(OutputBuffer.java:298)
at org.apache.catalina.connector.OutputBuffer.close(OutputBuffer.java:251)
The exception says the client aborted the call, but why? The impact, when the underlying endpoint is used as an HTTP probe, is that Kubernetes considers the deployment failed, so the new version/container never comes up.
After reviewing the Kubernetes HTTP probe code, I realized its HTTP client has some limitations that are not obvious upfront. The critical line in my case was this one: https://github.com/kubernetes/kubernetes/blob/c4cf0d49b1108f16d4c560954275f8a1d76175e8/pkg/probe/http/http.go#L36: the client limits the health check payload size to 10KB. Back in my application (a Camel context using Camel health checks), I realized the payload was ~50KB! So what happens? The Kubernetes HTTP probe gets an HTTP 200 response, starts reading the health check response payload, stops after 10KB of data, and considers it a probe failure. Closing the connection mid-read is exactly what the server side then sees as a broken pipe when it tries to flush the rest of the response.
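To make that concrete, here is a rough Java sketch of what the probe client effectively does (the real client is written in Go; only the 10KB cap comes from the linked source, the class and method names below are illustrative):

import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class ProbeSketch {

    // Mirrors maxRespBodyLength in the linked Go source: 10KB.
    private static final int MAX_RESP_BODY_LENGTH = 10 * 1024;

    // Returns true if the endpoint would pass a probe with a 10KB body cap.
    public static boolean probe(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            int code = conn.getResponseCode();
            if (code < 200 || code >= 400) {
                return false; // failing status code: probe fails regardless of body
            }
            try (InputStream in = conn.getInputStream()) {
                byte[] buf = new byte[1024];
                int total = 0;
                int read;
                while ((read = in.read(buf)) != -1) {
                    total += read;
                    if (total > MAX_RESP_BODY_LENGTH) {
                        // The real client stops reading here and reports a failure;
                        // abandoning the connection mid-read is what the server
                        // observes as a broken pipe.
                        return false;
                    }
                }
            }
            return true;
        } finally {
            conn.disconnect();
        }
    }
}

So an HTTP 200 with a 50KB body still fails: the status code is fine, but the body read is cut short and the probe gives up.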
The fix was obvious: make the output less verbose. It is not that nice, because when a check fails you lose the details (unless you use a "compressed" convention like ok_xxx/ko_xxx), but it works and rolling updates are functional again.
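For illustration, here is a minimal sketch of that "compressed" convention, assuming a plain servlet fronting the checks (the runChecks hook and the check names are hypothetical, not a Camel or MicroProfile API):

import java.io.IOException;
import java.util.Map;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class CompactHealthServlet extends HttpServlet {

    // Hypothetical hook returning each check's name and status.
    protected Map<String, Boolean> runChecks() {
        return Map.of("db", true, "broker", true);
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        Map<String, Boolean> checks = runChecks();
        boolean up = checks.values().stream().allMatch(Boolean::booleanValue);
        resp.setStatus(up ? 200 : 503);
        resp.setContentType("text/plain");
        // One short token per check keeps the payload far below the 10KB cap
        // while still hinting at which check failed.
        StringBuilder body = new StringBuilder();
        checks.forEach((name, ok) -> body.append(ok ? "ok_" : "ko_").append(name).append('\n'));
        resp.getWriter().print(body);
    }
}

The point is simply that the body stays a few bytes per check: the probe relies on the status code, and the short tokens are only there for a human tailing the endpoint.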