Kube Cloud Pt3 | Health Indicators

Spring offers a way to tell whether your services and the resources they depend on are up and healthy. Kubernetes can leverage this functionality through its liveness and readiness probes to determine whether pods are available to serve requests. In this session, we’re going to enable and connect those health checks.

Enable Health Endpoint on message-generator

Switch over to your message-generator project and start a new branch

$ git checkout -b health
Switched to a new branch 'health'

Update pom.xml to add the actuator dependency

        <!-- Health Checks and Metrics -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-actuator</artifactId>
        </dependency>

You may need to reload your maven dependencies to make sure that this new library is being brought in correctly.

Next, create an application.yaml in src/main/resources (remove application.properties if it exists)

spring:
  application:
    name: message-generator
management:
  endpoints:
    web:
      exposure:
        include: "*"
  endpoint:
    health:
      probes:
        enabled: true
      show-details: always
  health:
    livenessstate:
      enabled: true
    readinessstate:
      enabled: true

Here’s what’s happening:

  • Setting our Spring application name (which is just good practice).
  • Enabling all actuator endpoints. This is generally not a safe practice outside of a demo: some of these endpoints expose sensitive information that you will probably want to secure.
  • Turning on the liveness and readiness health probes and allowing the details to be shown on the health endpoint.

Some really good information about liveness and readiness is here. Some additional information about actuator and the various endpoints we enabled is here.
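To make the difference between the two probes concrete, here is a minimal plain-Java sketch of how the two availability states surface as health statuses, and what Kubernetes does when the matching probe keeps failing. The enum values mirror Spring Boot's LivenessState and ReadinessState; the class name AvailabilitySketch and its methods are hypothetical and exist only for illustration.

```java
/** Simplified model of how availability states show up as health statuses. */
final class AvailabilitySketch {

    // These mirror the values of Spring Boot's LivenessState and ReadinessState enums.
    enum LivenessState { CORRECT, BROKEN }
    enum ReadinessState { ACCEPTING_TRAFFIC, REFUSING_TRAFFIC }

    /** Status reported by the livenessState health component. */
    static String livenessStatus(LivenessState state) {
        return state == LivenessState.CORRECT ? "UP" : "DOWN";
    }

    /** Status reported by the readinessState health component. */
    static String readinessStatus(ReadinessState state) {
        return state == ReadinessState.ACCEPTING_TRAFFIC ? "UP" : "OUT_OF_SERVICE";
    }

    /** What Kubernetes does once the named probe has failed repeatedly. */
    static String kubernetesReaction(String probe) {
        switch (probe) {
            case "liveness":
                return "restart the container";
            case "readiness":
                return "stop routing traffic to the pod";
            default:
                throw new IllegalArgumentException("unknown probe: " + probe);
        }
    }

    public static void main(String[] args) {
        System.out.println(livenessStatus(LivenessState.BROKEN));
        System.out.println(readinessStatus(ReadinessState.REFUSING_TRAFFIC));
        System.out.println(kubernetesReaction("readiness"));
    }
}
```

The practical takeaway is that a failing liveness probe is destructive (the container is restarted), while a failing readiness probe only removes the pod from load balancing until it recovers.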

Now start up your application and hit the health endpoint (/actuator/health). You should see something like this:

Now let’s go back to our Helm configuration and point the liveness and readiness probes at those endpoints. Modify deployment.yaml in helm/message-generator/templates to reflect this:

          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 15
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 15
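
The probe configuration above only sets initialDelaySeconds, so the remaining settings fall back to the Kubernetes defaults (a probe every 10 seconds, failing after 3 consecutive failures). As a rough sketch of how an httpGet probe result is evaluated, the plain-Java model below treats any response code from 200 up to 399 as a success and flips to failing once the threshold is reached. ProbeSketch is a hypothetical class, and it starts in the passing state for simplicity, which is not exactly how the kubelet initializes a readiness probe.

```java
/** Simplified model of how an httpGet probe is evaluated. */
final class ProbeSketch {

    private final int failureThreshold;
    private int consecutiveFailures = 0;
    private boolean passing = true; // simplification: assume the probe starts out passing

    ProbeSketch(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    /** An httpGet probe succeeds for any status code >= 200 and < 400. */
    static boolean isSuccess(int httpStatus) {
        return httpStatus >= 200 && httpStatus < 400;
    }

    /** Records one probe attempt and returns whether the probe still counts as passing. */
    boolean record(int httpStatus) {
        if (isSuccess(httpStatus)) {
            // A single success resets the counter (default successThreshold is 1).
            consecutiveFailures = 0;
            passing = true;
        } else if (++consecutiveFailures >= failureThreshold) {
            passing = false;
        }
        return passing;
    }

    public static void main(String[] args) {
        ProbeSketch readiness = new ProbeSketch(3); // default failureThreshold
        System.out.println(readiness.record(503)); // true: first failure
        System.out.println(readiness.record(503)); // true: second failure
        System.out.println(readiness.record(503)); // false: threshold reached
        System.out.println(readiness.record(200)); // true: one success recovers
    }
}
```

The actuator health endpoints answer with 503 when the status is DOWN or OUT_OF_SERVICE and 200 when it is UP, which is what makes them usable as httpGet probe targets.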

Commit and Push

Now push up to build and test

$ git add .

$ git commit -m "added health checks"
[health f69d658] added health checks
 4 files changed, 32 insertions(+), 9 deletions(-)
 delete mode 100644 src/main/resources/application.properties
 create mode 100644 src/main/resources/application.yaml

$ git push
fatal: The current branch health has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream origin health

$ git push --set-upstream origin health
Compressing objects: 100% (9/9), done.
Writing objects: 100% (11/11), 1.00 KiB | 512.00 KiB/s, done.
Total 11 (delta 5), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (5/5), completed with 5 local objects.
remote:
remote: Create a pull request for 'health' on GitHub by visiting:
remote:      https://github.com/bullyrooks/message-generator/pull/new/health
remote:
To github.com-bullyrook:bullyrooks/message-generator.git
 * [new branch]      health -> health
Branch 'health' set up to track remote branch 'health' from 'origin'.

When the build completes successfully, merge to main

$ git checkout main
Switched to branch 'main'                    
Your branch is up to date with 'origin/main'.

$ git merge health                                                
Updating 638c8dd..f69d658
Fast-forward
 helm/message-generator/templates/deployment.yaml | 16 ++++++++--------
 pom.xml                                          |  6 ++++++
 src/main/resources/application.properties        |  1 -
 src/main/resources/application.yaml              | 18 ++++++++++++++++++
 4 files changed, 32 insertions(+), 9 deletions(-)
 delete mode 100644 src/main/resources/application.properties
 create mode 100644 src/main/resources/application.yaml

$ git push
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
To github.com-bullyrook:bullyrooks/message-generator.git
   638c8dd..f69d658  main -> main

And you should see it via Postman.

It was about here that I discovered that my application wasn’t deploying the latest version. Publishing the helm chart took longer than it took GitHub Actions to reach the helm repo update step, so the new chart version wasn’t available yet when the upgrade ran.

I added a wait and a second repo update to the helm deploy step in main.yaml to fix that:

- name: Deploy
  uses: WyriHaximus/github-action-helm3@v2
  with:
    exec: |
      helm repo add bullyrooks https://bullyrooks.github.io/helm-charts/
      helm repo update
      echo "helm upgrade message-generator bullyrooks/message-generator --install --version ${{ env.VERSION }}"
      sleep 45s
      helm repo update
      helm upgrade message-generator bullyrooks/message-generator --install --version ${{ env.VERSION }}
    kubeconfig: '${{ secrets.KUBECONFIG }}'

Health and Readiness for Cloud Application

Go ahead and do the exact same setup for cloud-application. Once you have that set up, we’re going to add some new health checks.

When you deploy and run the health check you’re going to see that you have some additional health checks:

        "discoveryComposite": {
            "status": "UP",
            "components": {
                "discoveryClient": {
                    "status": "UP",
                    "details": {
                        "services": [
                            "cloud-application"
                        ]
                    }
                }
            }
        },
...
        "kubernetes": {
            "status": "UP",
            "details": {
                "nodeName": "gke-cloud-dev-b-1-b0b413d1-c9n4",
                "podIp": "10.8.17.59",
                "hostIp": "10.255.16.52",
                "namespace": "bullyrooks",
                "podName": "cloud-application-64dd9c6557-v2grm",
                "serviceAccount": "default",
                "inside": true,
                "labels": {
                    "app.kubernetes.io/instance": "cloud-application",
                    "app.kubernetes.io/name": "cloud-application",
                    "pod-template-hash": "64dd9c6557"
                }
            }
        },
...
        "mongo": {
            "status": "UP",
            "details": {
                "version": "4.4.12"
            }
        },

The mongo check is built into the MongoDB starter. It checks the connection to the database to confirm connectivity. The others come from the fabric8 Kubernetes framework. discoveryComposite tells you which other services are available via discovery. kubernetes tells you the details of the running pod (node, namespace, pod name, labels) so you can quickly debug.

Now we’re going to create a new health check for the status of our dependent service (message-generator) and make the readiness status depend on the health of that service and the database.

Create a Health Indicator

In com.bullyrooks.cloud_application.message_generator.client.dto create a class called HealthCheckDTO with this content

package com.bullyrooks.cloud_application.message_generator.client.dto;

import lombok.Data;

@Data
public class HealthCheckDTO {
    private String status;
}

In com.bullyrooks.cloud_application.message_generator.client create an interface called MessageGeneratorHealthClient with this content

package com.bullyrooks.cloud_application.message_generator.client;

import com.bullyrooks.cloud_application.message_generator.client.dto.HealthCheckDTO;
import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;

@FeignClient(contextId = "healthClient", name = "message-generator")
public interface MessageGeneratorHealthClient {

    @GetMapping(value = "/actuator/health",
            produces = MediaType.APPLICATION_JSON_VALUE,
            consumes = MediaType.APPLICATION_JSON_VALUE
    )
    ResponseEntity<HealthCheckDTO> getHealth();
}

This will create a client that uses our message-generator service discovered through kubernetes to hit the health endpoint. We only care about status, so that’s the only field we map into the DTO.

Now add this content to application.yaml

feign:
  client:
    config:
      healthClient:
        connectTimeout: 1000
        readTimeout: 1000

and in application.yaml update this clause

management:
  endpoints:
    web:
      exposure:
        include: "*"
  endpoint:
    health:
      probes:
        enabled: true
      show-details: always
      group:
        readiness:
          include: "readinessState,mongo,messageGenerator"

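Here we’re configuring our new feign client to time out quickly (one second each for connect and read) if it can’t reach the health endpoint, and adding our new messageGenerator check, along with mongo, to the readiness group. Notice that the feign client contextId (healthClient) maps to the key under feign.client.config. This allows us to use a separate timeout configuration for this client.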
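
To show what those two timeouts amount to outside of feign, here is a minimal sketch using the JDK's own java.net.http client instead. It is not what feign does internally; the class name HealthCallSketch and the base URL passed in are assumptions for illustration, and the JDK request timeout is only an approximation of feign's readTimeout.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Plain-JDK sketch of a health call with short connect and response timeouts. */
final class HealthCallSketch {

    // Same values as connectTimeout / readTimeout in the feign configuration (milliseconds).
    static final Duration CONNECT_TIMEOUT = Duration.ofMillis(1000);
    static final Duration READ_TIMEOUT = Duration.ofMillis(1000);

    static HttpClient newClient() {
        return HttpClient.newBuilder().connectTimeout(CONNECT_TIMEOUT).build();
    }

    /** Builds a GET request for the health endpoint under a hypothetical base URL. */
    static HttpRequest healthRequest(String baseUrl) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/actuator/health"))
                .timeout(READ_TIMEOUT)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    /** Returns true only when the call completes in time with a 200 response. */
    static boolean isReachable(HttpClient client, HttpRequest request) {
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return response.statusCode() == 200;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (Exception e) {
            // Connection refused, unknown host and timeouts all end up here.
            return false;
        }
    }
}
```

Short timeouts matter here because the health check itself is called by the readiness probe: a dependency that hangs should fail the check quickly rather than hold the probe request open.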

Now create a new package: com.bullyrooks.cloud_application.config. Create a class called MessageGeneratorHealthIndicator and add this content

package com.bullyrooks.cloud_application.config;

import com.bullyrooks.cloud_application.message_generator.client.MessageGeneratorHealthClient;
import com.bullyrooks.cloud_application.message_generator.client.dto.HealthCheckDTO;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.http.ResponseEntity;
import org.springframework.stereotype.Component;

@Component
@Slf4j
public class MessageGeneratorHealthIndicator implements HealthIndicator {

    @Autowired
    MessageGeneratorHealthClient messageClient;

    @Override
    public Health health() {
        // Assume the dependency is healthy until the check proves otherwise.
        Health.Builder status = Health.up();
        try {
            ResponseEntity<HealthCheckDTO> healthCheckDTO = messageClient.getHealth();
            log.info("health check response: {}\n{}", healthCheckDTO.getStatusCode(), healthCheckDTO.getBody());
            HealthCheckDTO body = healthCheckDTO.getBody();
            // Anything other than an explicit UP (including a missing body) counts as unavailable.
            if (body == null || !"UP".equals(body.getStatus())) {
                status = Health.outOfService();
            }
        } catch (Exception e) {
            // Timeouts and connection failures from the feign client land here.
            log.error("error trying to get message generator health: {}", e.getMessage(), e);
            status = Health.outOfService();
        }
        return status.build();
    }
}

HealthIndicator is a Spring Boot Actuator interface. Spring automatically detects this class as a bean and adds its health() check to the health endpoint results. In this class we use our new client to make sure that the message-generator health endpoint returns UP. If it doesn’t, or if the call fails, we return OUT_OF_SERVICE for this check. Additionally, the name used for this check in application.yaml (messageGenerator) is the bean name minus the HealthIndicator suffix.
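
The naming convention can be sketched in a few lines of plain Java. This is an approximation of the rule, not Spring's actual code: ContributorNameSketch and its methods are hypothetical, and the bean-name step is simplified to lowercasing the first letter of the class name.

```java
import java.util.Locale;

/** Approximation of how a health indicator bean gets its name in the health output. */
final class ContributorNameSketch {

    // Suffixes that are dropped from the bean name (compared case-insensitively).
    private static final String[] SUFFIXES = {"healthindicator", "healthcontributor"};

    /** Simplified default bean name: the simple class name with a lowercase first letter. */
    static String beanName(String simpleClassName) {
        return Character.toLowerCase(simpleClassName.charAt(0)) + simpleClassName.substring(1);
    }

    /** Strips a known suffix from the bean name, leaving the name used in the health output. */
    static String componentName(String beanName) {
        String lower = beanName.toLowerCase(Locale.ENGLISH);
        for (String suffix : SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return beanName.substring(0, beanName.length() - suffix.length());
            }
        }
        return beanName;
    }

    public static void main(String[] args) {
        String bean = beanName("MessageGeneratorHealthIndicator");
        System.out.println(componentName(bean)); // messageGenerator
    }
}
```

This is why the readiness group in application.yaml refers to messageGenerator rather than to the full class name.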

Testing with Message Generator Up

Message generator should already be running. Hit the health endpoint of your cloud-application (/actuator/health). You should see something like this:

    "status": "UP",
...
        "messageGenerator": {
            "status": "UP"
        },
...
        "readinessState": {
            "status": "UP"
        },

and the readiness endpoint (actuator/health/readiness) should look like this

{
    "status": "UP",
    "components": {
        "messageGenerator": {
            "status": "UP"
        },
        "mongo": {
            "status": "UP",
            "details": {
                "version": "4.4.12"
            }
        },
        "readinessState": {
            "status": "UP"
        }
    }
}

Testing with Message Generator Down

Log into Okteto and destroy the message-generator deployment.

Hit your readiness endpoint (/actuator/health/readiness), you should see something like this:

{
    "status": "OUT_OF_SERVICE",
    "components": {
        "messageGenerator": {
            "status": "OUT_OF_SERVICE"
        },
        "mongo": {
            "status": "UP",
            "details": {
                "version": "4.4.12"
            }
        },
        "readinessState": {
            "status": "UP"
        }
    }
}
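
The group status in this response follows from how component statuses are combined: the most severe status wins. Below is a minimal plain-Java sketch of that rule, assuming the default severity order (DOWN, OUT_OF_SERVICE, UP, UNKNOWN) and the default HTTP mapping (503 for DOWN and OUT_OF_SERVICE, 200 otherwise). StatusAggregationSketch is a hypothetical class written for illustration, not the actuator's own implementation.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of how a health group's overall status is derived from its components. */
final class StatusAggregationSketch {

    // Assumed default severity order, most severe first.
    private static final List<String> ORDER =
            Arrays.asList("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    /** Returns the most severe status among the components, or UNKNOWN if none is recognized. */
    static String aggregate(List<String> statuses) {
        String worst = "UNKNOWN";
        for (String status : statuses) {
            int index = ORDER.indexOf(status);
            if (index >= 0 && index < ORDER.indexOf(worst)) {
                worst = status;
            }
        }
        return worst;
    }

    /** HTTP status code the health endpoint answers with for a given overall status. */
    static int httpStatus(String status) {
        return ("DOWN".equals(status) || "OUT_OF_SERVICE".equals(status)) ? 503 : 200;
    }

    public static void main(String[] args) {
        // messageGenerator, mongo, readinessState
        String overall = aggregate(Arrays.asList("OUT_OF_SERVICE", "UP", "UP"));
        System.out.println(overall);             // OUT_OF_SERVICE
        System.out.println(httpStatus(overall)); // 503
    }
}
```

A 503 from /actuator/health/readiness is what causes the Kubernetes readiness probe to fail, even though mongo and readinessState are individually UP.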

You can also see the change in the /actuator/health endpoint:

    "status": "OUT_OF_SERVICE",
...
        "messageGenerator": {
            "status": "OUT_OF_SERVICE"
        },
...
        "readinessState": {
            "status": "UP"
        },

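I’m not sure why readinessState here still looks UP; most likely it reports only the application’s own internal availability state, while the overall status aggregates every included indicator. Either way, you must be very careful with the readiness and liveness probes, because Kubernetes can take your service out of availability if they are failing. The docs are very specific about this: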

As for the “readiness” probe, the choice of checking external systems must be made carefully by the application developers. For this reason, Spring Boot does not include any additional health checks in the readiness probe. If the readiness state of an application instance is unready, Kubernetes does not route traffic to that instance. Some external systems might not be shared by application instances, in which case they could be included in a readiness probe. Other external systems might not be essential to the application (the application could have circuit breakers and fallbacks), in which case they definitely should not be included. Unfortunately, an external system that is shared by all application instances is common, and you have to make a judgement call: Include it in the readiness probe and expect that the application is taken out of service when the external service is down or leave it out and deal with failures higher up the stack, perhaps by using a circuit breaker in the caller.

I do not recommend implementing it this way for our use case. Cloud application is available when message generator is offline (you can provide a message and it should still work). I’m only using this as an example of how to hook up readiness probes for dependent systems.
