AWS announced API Gateway websockets way back in December 2018, but I had not taken much of an interest in them until very recently. I proposed a little fun Friday project at work: use websockets to push inventory updates out to clients of a very bare-bones, small-scale e-commerce site we recently launched. We had built in some basic inventory management, but our stock levels were kept intentionally low, so a race condition where multiple customers check out with the last of some stock is not unrealistic.
Defining a lambda deployment with Serverless is not going to be covered in this post; there is good documentation here. Defining a lambda trigger sourced from a websocket is pretty much the same as for any other kind of event (http/sns/sqs etc.), with a few websocket-specific pieces of config:
Websockets are considered just another kind of event like http or sns when defining your lambda functions.
After looking at this gist you might have questions similar to:
- What is a route?
- What is a routeResponseSelectionExpression?
- What does $connect/$disconnect/$default represent?
All very valid questions but not really to do with Serverless (we will answer those next). That gist really does represent all you need to deploy the possible variants of lambdas that can be triggered by API Gateway websockets.
What is a route?
AWS docs here. Routes provide a mechanism through which messages can be directed to different integrations. In the gist above they are used to trigger different lambdas, but they can just as easily be used to trigger other API Gateway integrations. Aside from the AWS-provided $connect/$disconnect/$default routes, you can consider route names arbitrary, with the route for each message chosen at runtime. The route is specified in the action key of the body of a websocket request.
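To make the action key a little more concrete, here is a minimal sketch of building a message body and reading the route back out of it. It assumes the WebSocketRequestBody layout described later in this post (an action plus a data payload); the helper names, the updateInventory route and the payload are made up for illustration and are not taken from the repo.

```go
package main

import (
	"encoding/json"
)

// WebSocketRequestBody mirrors the message envelope described in this post.
// The exact field layout is an assumption.
type WebSocketRequestBody struct {
	Action string          `json:"action"`
	Data   json.RawMessage `json:"data,omitempty"`
}

// NewMessage (hypothetical helper) builds the JSON body a client would send
// over the socket to reach the route named by action.
func NewMessage(action string, data interface{}) (string, error) {
	raw, err := json.Marshal(data)
	if err != nil {
		return "", err
	}
	out, err := json.Marshal(WebSocketRequestBody{Action: action, Data: raw})
	if err != nil {
		return "", err
	}
	return string(out), nil
}

// RouteOf (hypothetical helper) returns the route a body asks for.
// It only models the simplest fallback: a body that cannot be parsed, or
// that has no action, is treated as belonging to the $default route.
func RouteOf(body string) string {
	var msg WebSocketRequestBody
	if err := json.Unmarshal([]byte(body), &msg); err != nil || msg.Action == "" {
		return "$default"
	}
	return msg.Action
}
```

Calling NewMessage("updateInventory", map[string]int{"sku-123": 4}) produces a body whose action is updateInventory, and RouteOf on that body returns the same route name.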
What do $connect/$disconnect/$default represent?
- API Gateway will route any messages without a target integration to the $default route. This allows you to handle the $default route as a catch-all, potentially as an exception handler if all routes in your service should be explicitly defined.
- The $connect route is triggered when a new socket is opened.
- The $disconnect route is triggered (not always reliably) when a socket is closed by a client.
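The three built-in routes can be sketched as a simple switch on the route key. In this post each route triggers its own lambda, so treat this single dispatcher purely as an illustration of what each route means. The Request type is a trimmed local stand-in; in a real handler the route key and connection ID come from the RequestContext of events.APIGatewayWebsocketProxyRequest in github.com/aws/aws-lambda-go. The returned strings are made up.

```go
package main

import "fmt"

// Request is a local stand-in for the websocket event a lambda receives.
// It is an assumption for this sketch, not the real aws-lambda-go type.
type Request struct {
	RouteKey     string
	ConnectionID string
	Body         string
}

// Dispatch (hypothetical) describes what would happen for each route key.
func Dispatch(req Request) string {
	switch req.RouteKey {
	case "$connect":
		// A new socket was opened: a good place to record the connection ID.
		return fmt.Sprintf("opened %s", req.ConnectionID)
	case "$disconnect":
		// The socket was closed. This is not guaranteed to fire.
		return fmt.Sprintf("closed %s", req.ConnectionID)
	case "$default":
		// Catch-all for messages that matched no other route.
		return fmt.Sprintf("unrouted message from %s", req.ConnectionID)
	default:
		// Any other route key is one of your own custom routes.
		return fmt.Sprintf("custom route %s from %s", req.RouteKey, req.ConnectionID)
	}
}
```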
What is a routeResponseSelectionExpression?
AWS docs here. The routeResponseSelectionExpression in the serverless.yml configures the API Gateway/Lambda integration as a two-way communication channel (as you might expect from a http request). By default, communication is only one way when a message is received from a client websocket connection. Only one route response value is currently accepted, $default (similar to action routes), but it is designed to support custom ones at some point in the future. What this means is that the body of the response from your lambda will be returned to the client over the websocket.
A little bit of pain
Websockets are inherently asynchronous and AWS doesn’t do much to help you manage their state across invocations. If you want to send a message to Client A at an arbitrary point in time, you will have to keep a record of their ConnectionId from when the socket was opened. It is also entirely possible that the socket to Client A has been closed, either by the client or by API Gateway, and whilst you might expect the $disconnect route to have been triggered, this is not always going to happen. In this scenario you will find out it is closed when you try to send them a message and the send fails. So, again, you must maintain a record of this state yourself. That isn’t ideal but it is what it is. So, how do we do this? And how can we do this in Golang?
Handling $connect and $disconnect
We know that we need to maintain our own store of state for our websockets, otherwise we will be limited in what we can do with them. The repo attached to this post contains a concept DynamoDB table for this purpose with a table structure as follows:
The table above is designed for use in a simplistic scenario where we are limiting socket life to a specific time frame (for example 20 minutes). It supports a single access pattern:
- Get all connectionIDs from the last N[time unit]
The Time To Live on the table is used to expire rows after 20 minutes. Because rows are not removed immediately when their TTL passes, the access pattern above allows us to also bake the time limit into our query. The TTL does allow us to forget about removing stale connections; they will expire and get removed. There is no reason that our TTL and the condition on our RANGE key at access time need to be the same.
On $connect, store the connectionId in the table above and use the timestamp of now for the sk value. In this demo, on $disconnect, we don’t really have to do anything, as we have auto-expiring rows and we can gracefully handle message send failures later if the socket has been closed. Ordinarily, and with a different table design, deleting the connection record would be prudent but, as previously mentioned, do not depend on $disconnect being invoked reliably.
Doing the Golang
One way communication
The gist below demonstrates one way communication via websocket in Golang. The lambda handler function signature is just as if it were being triggered by a http request, except that it does not require a response return parameter. The WebSocketRequestBody struct is the format of all incoming websocket request bodies. If action is not set and there is no $default route to catch the message, API Gateway will reject the request, and if there is no data set then there is no message. Use the connectionID to identify the client or correlate messages.
Two way communication
The gist below demonstrates two way communication via websocket in Golang. The lambda handler function signature is just as if it were being triggered by a http request, including a response return parameter. API Gateway will return the body of the response to the client over the socket as the message only if a 2xx status code is returned; otherwise it fails. This process is almost identical to managing a http request. Use the connectionID to identify the client or correlate messages.
The gist below demonstrates async communication via websocket in Golang. For demonstration purposes this sends a message to a socket client off the back of an SNS message, but this could be happening anytime, anywhere. A message can also be sent using the CLI or the HTTP API. In the context of a lambda, the message is most easily sent using the APIGatewayManagementAPI package from the AWS SDK, with the socket endpoint for the gateway the client is connected to. Correlating clients to connectionIDs in this scenario could be built into the data store we saw earlier or included in any messages as a correlation ID. When using APIGatewayManagementAPI in this context, it may return a 410 error, which signals that the socket has been closed. The unreliability of the $disconnect route noted above means that stale or killed connections must be robustly handled here.
The GitHub repo with a sample implementation of all of this is here. It goes far beyond the gists above, so please do take a deeper dive for a better idea of what has been described. It is intended to be used as a rough template for a Golang/Serverless implementation that requires the use of API Gateway websockets.
Get in contact
If you have comments, questions or better ways to do anything that I have discussed in this post then please get in contact.