Constellation-Messaging Service

The Messaging service is an intermediary between information publishers and information subscribers: it accepts simple HTTP publish requests and forwards the information to thousands of WebSocket subscribers. The publishers are typically the Infinity case engine or third-party integration services, and the subscribers are UI components in browsers running Constellation UI.
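The fan-out role described above can be pictured with a small in-memory model. This is only a conceptual sketch, not the service's implementation: the `FanOut` class, the string `topic` key, and the callback standing in for a WebSocket send are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class FanOut:
    """Toy model: one publish call is forwarded to every subscriber of a topic."""

    def __init__(self) -> None:
        # topic -> list of delivery callbacks (hypothetical stand-ins for sockets)
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, deliver: Callable[[str], None]) -> None:
        self._subscribers[topic].append(deliver)

    def publish(self, topic: str, message: str) -> int:
        """Forward the message to each subscriber and return how many received it."""
        targets = self._subscribers.get(topic, [])
        for deliver in targets:
            deliver(message)
        return len(targets)
```

The return value of `publish` is the per-message delivery count, which is the quantity that drives CPU load as publish rates rise.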

Overview

The service is exposed to browsers and publishers through a network address. The URL for the service is passed to the browser during initial portal load, from a DSS in Infinity. The same address (the DSS value) is used by publishers for HTTP POST calls and by subscribers (e.g. browsers) for WSS connections.

The subscriber uses a WebSocket connection so that updates can be pushed when data changes. A WebSocket connection is a long-lived, full-duplex connection, unlike an HTTP connection, which is short-lived. Deployments must account for this difference, particularly in load balancing and timeout settings.

Scalability and reliability

The service is designed to handle many Infinity deployments. Typically:

The service can scale to tens of thousands of connections per pod or container; the limit is set by OS and network tuning. CPU load depends on the number of subscribers per published message and the publishing rate.

Pod autoscaling can be based on the number of connections.

Operational statistics (Prometheus) are exposed through the /metrics endpoint.

We run a single K8s pod on a lightly loaded t3.medium to support all use within Pega.
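As a rough sketch of how connection-based autoscaling could use the /metrics output: the code below reads one gauge from Prometheus text-format output and applies the standard Kubernetes Horizontal Pod Autoscaler rule, ceil(currentReplicas * currentValue / targetValue). The metric name `ws_connections` and the target of 8000 connections per pod are hypothetical; check the actual /metrics output for the real metric names.

```python
import math


def read_gauge(metrics_text: str, name: str) -> float:
    """Return the first sample value for `name` from Prometheus text-format output."""
    for line in metrics_text.splitlines():
        fields = line.split()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not fields or fields[0].startswith("#"):
            continue
        # Sample lines look like: name{labels} value [timestamp].
        # Simplification: assumes label values contain no spaces.
        if fields[0].split("{", 1)[0] == name:
            return float(fields[1])
    raise KeyError(name)


def desired_replicas(current_replicas: int, avg_connections: float,
                     target_per_pod: float) -> int:
    """Apply the HPA rule to an average per-pod connection count."""
    return max(1, math.ceil(current_replicas * avg_connections / target_per_pod))


# Hypothetical scrape from one pod; the metric name is an assumption.
sample = (
    "# HELP ws_connections Open websocket connections\n"
    "# TYPE ws_connections gauge\n"
    "ws_connections 12000\n"
)
connections = read_gauge(sample, "ws_connections")
replicas = desired_replicas(current_replicas=2, avg_connections=connections,
                            target_per_pod=8000)
```

In a real cluster the average would come from all pods rather than a single scrape, and the autoscaler would perform this calculation itself from a custom metric.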

Installation

The service can be installed from a Docker image, a K8s YAML file, or a Helm chart.

For anything other than a simple single-user desktop trial, the service should be exposed to the browsers and the publishers through a load balancer.

  1. Familiarity with Docker (images, containers, start, stop, background execution, logs, ports, repos) is a prerequisite for installation.
  2. For simple Docker installs, a valid HTTPS certificate is required.
  3. For K8s or Helm, experience with K8s, cluster administration, and network configuration is a prerequisite for installation.

Kubernetes: WebSockets are long-lived connections, and in a multi-pod deployment round-robin routing leads to uneven connection distribution. IPVS 'least-connection' routing (https://kubernetes.io/docs/concepts/services-networking/service/) should be used.
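The difference between the two routing policies shows up when a new pod joins a deployment whose existing pods already hold long-lived connections. The simulation below is a simplified illustration with made-up numbers, not a model of IPVS internals: two pods hold 500 connections each, a third pod starts empty, and 300 new connections arrive with none closing.

```python
from typing import List


def distribute(existing: List[int], new_connections: int, policy: str) -> List[int]:
    """Assign new long-lived connections to pods and return per-pod totals."""
    counts = list(existing)
    for i in range(new_connections):
        if policy == "round_robin":
            # Ignores current load: each pod simply takes its turn.
            pod = i % len(counts)
        elif policy == "least_connection":
            # Picks the pod that currently holds the fewest connections.
            pod = counts.index(min(counts))
        else:
            raise ValueError(f"unknown policy: {policy}")
        counts[pod] += 1
    return counts


round_robin = distribute([500, 500, 0], 300, "round_robin")            # [600, 600, 100]
least_connection = distribute([500, 500, 0], 300, "least_connection")  # [500, 500, 300]
```

With round-robin the new pod stays underused because existing connections never move; least-connection routing sends new connections to it until the pods even out.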

Integration with Infinity is completed by populating the Infinity DSS ConstellationMessagingSvcHost with the service public URL. The URL must use https.
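A value can be sanity-checked before it is entered in the DSS. The helper below is a minimal sketch: it rejects anything that is not an https URL and shows the matching wss address that a subscriber would use on the same host. The function name and the example host are hypothetical, and the way the browser component actually derives its WebSocket URL is an assumption here.

```python
from urllib.parse import urlsplit


def subscriber_url(dss_value: str) -> str:
    """Validate an https service URL and return the wss address on the same host."""
    parts = urlsplit(dss_value.strip())
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError(f"expected an https URL, got: {dss_value!r}")
    # Same host and port, WebSocket-over-TLS scheme; any path is dropped.
    return "wss://" + parts.netloc


example = subscriber_url("https://messaging.example.com")
```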

Operations

The service API is documented at the /swagger.html endpoint.

This service is very reliable, and little service maintenance or monitoring is required. Monitoring the network infrastructure is as important as monitoring the service itself. All unexpected operations, including dropped connections and internal errors, are logged by the service, and alerts can be set up on these log entries. Here is our monitoring dashboard:

Monitoring dashboard

Troubleshooting

All requests and unexpected operations are logged.

There is a basic 'are you connected' HTTP endpoint at /ping, which should return an HTTP 200 response. If there is no response, check the log to see whether the request reached the service, then work back through the network to find the problem. While installing the service, checking the /ping endpoint over plain HTTP can help isolate HTTPS issues.

A more detailed check, reporting the number of WSS connections and the size of the subscribers table, is available at the /v001/healthcheck endpoint.
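The checks above can be scripted. The sketch below uses only the Python standard library and reports whether an endpoint answered with HTTP 200; the function name and the example host are hypothetical. It deliberately does not interpret the /v001/healthcheck body, because the response format is not described here.

```python
import urllib.error
import urllib.request
from typing import Tuple


def check_endpoint(base_url: str, path: str = "/ping", timeout: float = 5.0) -> Tuple[bool, str]:
    """Return (ok, detail) for a GET against base_url + path."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The service (or something in front of it) answered with an error status.
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, refused connections, timeouts, and TLS failures.
        return False, f"no response: {exc}"


# Example calls (host name is a placeholder):
#   check_endpoint("https://messaging.example.com")
#   check_endpoint("https://messaging.example.com", path="/v001/healthcheck")
```

A "no response" result points at the network path or the certificate rather than the service, which is the cue to check the service log and work back through the network.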

HTTPS certificates

The DxV2API on the case engine is HTTPS only. As browsers will not allow mixed HTTPS and HTTP content, WSS must be used for the messaging service subscriber endpoint. The certificate must match the domain and be valid.

The publisher endpoint is also HTTPS, with OAuth authorisation. Java HTTP libraries are unforgiving about HTTPS certificates: publishing a message to an endpoint with a bad certificate produces obscure error messages in the Infinity log. The domain must match and the certificate must be valid.

One easy path to solving both of these issues is to put a good certificate on the load balancer.
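Before pointing Infinity or browsers at the service, the certificate can be checked from any machine with Python. This is a minimal sketch using the standard `ssl` module: `fetch_certificate` fails with an `ssl.SSLCertVerificationError` if the chain is untrusted, the certificate has expired, or the host name does not match, and `days_until_expiry` reports how long a valid certificate has left. The function names and the example host are hypothetical.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_certificate(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect with default verification and return the peer certificate."""
    context = ssl.create_default_context()  # verifies trust chain and host name
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def days_until_expiry(cert: dict, now: Optional[float] = None) -> float:
    """Days remaining before the certificate's notAfter date."""
    expires = ssl.cert_time_to_seconds(cert["notAfter"])
    current = time.time() if now is None else now
    return (expires - current) / 86400


# Example (host name is a placeholder):
#   cert = fetch_certificate("messaging.example.com")
#   print(days_until_expiry(cert))
```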

Network timeouts

WebSockets are long-lived, full-duplex connections. This is quite different from an HTTP connection, which is intended to be recreated on each call. The browser component (Constellation-CoreJS messagingservice) reconnects automatically when the connection is terminated unexpectedly. However, typical HTTP timeouts (< 10 min) will result in messages in the browser log and some degradation in UI performance. Within Pega, we use a Network Load Balancer (OSI layer 4) instead of an Application Load Balancer (OSI layer 7), to give better control over connection timeout settings.
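The cost of a short timeout can be estimated with simple arithmetic. Assuming, as a worst case, that every connection is idle and is therefore dropped once per timeout period, the steady-state reconnect rate is the connection count divided by the timeout. The figures below are illustrative, not measurements.

```python
def reconnects_per_second(connections: int, idle_timeout_seconds: float) -> float:
    """Worst-case reconnect rate when every idle connection is dropped at the timeout."""
    if idle_timeout_seconds <= 0:
        raise ValueError("idle_timeout_seconds must be positive")
    return connections / idle_timeout_seconds


short_timeout = reconnects_per_second(20000, 60)    # about 333 reconnects per second
long_timeout = reconnects_per_second(20000, 3600)   # about 5.6 reconnects per second
```

Each reconnect also means a new TLS handshake and a log entry in the browser, which is why a longer idle timeout on the load balancer is preferable.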