How To Build AWS-Compatible APIs: AWS Sigv4
Nearly all requests to AWS services, such as S3 and DynamoDB, are authenticated with AWS Sigv4.
Sigv4 is the process AWS clients use to sign a request, combining request metadata with the Access Key ID and Secret Access Key. It's a similar idea to signing with RSA keys or other PKI (public key infrastructure), except that the secret is shared: the server recomputes the signature with the same Secret Access Key instead of checking it against a public key.
By implementing this signature verification process as a middleware in our APIs, we can emulate AWS APIs and build AWS-compatible services.
For example, ScyllaDB has the Alternator API, which is DynamoDB compatible. In the past, I've used this to make a DynamoDB-compatible API for CockroachDB, and an S3 proxy for IceDB.
In this post, we'll implement AWS Sigv4 as a middleware in Go so that we can verify the signatures of incoming AWS client requests against known keys.
Setting up the middleware
For this demo, we'll be using the echo framework (it's my current favorite).
First I'll show you the middleware function, and then we'll dive into how it works. You can find the code on GitHub. For the extra curious, you can follow the spec from AWS here.
func verifyAWSRequest(next echo.HandlerFunc) echo.HandlerFunc {
	return func(c echo.Context) error {
		// parse out the auth header
		parsedHeader := parseAuthHeader(c.Request().Header.Get("Authorization"))
		// build the canonical request
		canonicalRequest := getCanonicalRequest(c)
		// get the string to sign
		stringToSign := getStringToSign(c, canonicalRequest)
		// build the signing key
		signingKey := getSigningKey(c, someSecretKey)
		// calculate the signature (in hex)
		signature := fmt.Sprintf("%x", getHMAC(signingKey, []byte(stringToSign)))
		if signature != parsedHeader.Signature {
			return ErrInvalidSignature
		}
		cc, _ := c.(*CustomContext)
		cc.AWSCredentials = parsedHeader.Credential
		return next(c)
	}
}
Parse the Authorization header
The first thing we want to do is parse the Authorization header of the incoming request:
type AWSAuthHeader struct {
	Credential    AWSAuthHeaderCredential
	SignedHeaders []string
	Signature     string
}

type AWSAuthHeaderCredential struct {
	KeyID   string
	Date    string
	Region  string
	Service string
	Request string
}

func parseAuthHeader(header string) AWSAuthHeader {
	var authHeader AWSAuthHeader
	parts := strings.Split(header, " ")
	for _, part := range parts {
		// Remove the trailing `,` (TrimSuffix is also safe when the part is empty)
		part = strings.TrimSuffix(part, ",")
		keyValue := strings.SplitN(part, "=", 2)
		if len(keyValue) != 2 {
			continue
		}
		key, value := keyValue[0], keyValue[1]
		switch key {
		case "Credential":
			credentialParts := strings.Split(value, "/")
			if len(credentialParts) != 5 {
				// Malformed credential: skip it instead of indexing out of range
				continue
			}
			authHeader.Credential = AWSAuthHeaderCredential{
				KeyID:   credentialParts[0],
				Date:    credentialParts[1],
				Region:  credentialParts[2],
				Service: credentialParts[3],
				Request: credentialParts[4],
			}
		case "SignedHeaders":
			authHeader.SignedHeaders = strings.Split(value, ";")
		case "Signature":
			authHeader.Signature = value
		default:
			continue
		}
	}
	return authHeader
}
This pulls out all the info we need from the header. Nothing too interesting going on here, just some string parsing to reverse engineer an ugly header 🤷
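To make the shape of that header concrete, here is a small self-contained sketch that splits a sample value into its fields. The header is a made-up sample (placeholder key ID, dummy all-zero signature), and `splitAuthHeader` is a hypothetical helper for illustration, not the parser used above.

```go
package main

import (
	"fmt"
	"strings"
)

// splitAuthHeader is a hypothetical helper: it returns the algorithm and a
// map of the Key=Value fields found in a Sigv4 Authorization header.
func splitAuthHeader(header string) (string, map[string]string) {
	fields := map[string]string{}
	// The algorithm is everything before the first space.
	algorithm, rest, _ := strings.Cut(header, " ")
	for _, part := range strings.Split(rest, ",") {
		key, value, ok := strings.Cut(strings.TrimSpace(part), "=")
		if !ok {
			continue // not a Key=Value pair
		}
		fields[key] = value
	}
	return algorithm, fields
}

// sampleHeader is a dummy value for illustration, not a real signature.
var sampleHeader = "AWS4-HMAC-SHA256 " +
	"Credential=AKIAIOSFODNN7EXAMPLE/20130524/us-east-1/s3/aws4_request, " +
	"SignedHeaders=host;x-amz-content-sha256;x-amz-date, " +
	"Signature=" + strings.Repeat("0", 64)

func main() {
	algorithm, fields := splitAuthHeader(sampleHeader)
	fmt.Println("algorithm:     ", algorithm)
	fmt.Println("credential:    ", strings.Split(fields["Credential"], "/"))
	fmt.Println("signed headers:", strings.Split(fields["SignedHeaders"], ";"))
	fmt.Println("signature:     ", fields["Signature"])
}
```

Splitting on `,` and trimming whitespace means this sketch accepts fields separated by either `, ` or a bare `,`, which may be worth tolerating if you expect clients that format the header differently.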
Build the canonical request
Now we have to build something called a canonical request. What goes into it is largely determined by the values of the Authorization header, and it takes the form of:
<HTTPMethod>\n
<CanonicalURI>\n
<CanonicalQueryString>\n
<CanonicalHeaders>\n
<SignedHeaders>\n
<HashedPayload>
The code for calculating this looks like:
func getCanonicalRequest(c echo.Context) string {
	s := ""
	s += c.Request().Method + "\n"
	s += c.Request().URL.EscapedPath() + "\n"
	// Encode() sorts by key but writes spaces as "+", while Sigv4 expects "%20"
	s += strings.ReplaceAll(c.Request().URL.Query().Encode(), "+", "%20") + "\n"
	signedHeadersList, _ := lo.Find(strings.Split(c.Request().Header.Get("Authorization"), ", "), func(item string) bool {
		return strings.HasPrefix(item, "SignedHeaders")
	})
	signedHeaders := strings.Split(strings.ReplaceAll(strings.ReplaceAll(signedHeadersList, "SignedHeaders=", ""), ",", ""), ";")
	sort.Strings(signedHeaders) // must be sorted alphabetically
	for _, header := range signedHeaders {
		if header == "host" {
			// For some reason the host header was blank (thanks go)
			s += strings.ToLower(header) + ":" + strings.TrimSpace(c.Request().Host) + "\n"
			continue
		}
		s += strings.ToLower(header) + ":" + strings.TrimSpace(c.Request().Header.Get(header)) + "\n"
	}
	s += "\n" // examples have this JESUS WHY DOCS FFS
	s += strings.Join(signedHeaders, ";") + "\n"
	shaHeader := c.Request().Header.Get("x-amz-content-sha256")
	s += lo.Ternary(shaHeader == "", "UNSIGNED-PAYLOAD", shaHeader)
	return s
}
More boring string manipulation... You can see some frustration in the comments. The first issue I found is that Go's http.Request clears out the host header and puts it in http.Request.Host. To be completely candid, this blocked me from achieving this for over a year... I will never forgive this heresy.
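The host behavior is easy to reproduce with only the standard library. The sketch below parses a raw request the way a Go server would; the request text and the `parseRawRequest` helper name are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

// parseRawRequest parses a raw HTTP/1.1 request as an incoming (server-side) request.
func parseRawRequest(raw string) (*http.Request, error) {
	return http.ReadRequest(bufio.NewReader(strings.NewReader(raw)))
}

// rawRequest is a made-up request used only for this demonstration.
const rawRequest = "GET /test.txt HTTP/1.1\r\n" +
	"Host: examplebucket.s3.amazonaws.com\r\n" +
	"X-Amz-Date: 20130524T000000Z\r\n" +
	"\r\n"

func main() {
	req, err := parseRawRequest(rawRequest)
	if err != nil {
		panic(err)
	}
	// For incoming requests, net/http promotes Host to req.Host and removes
	// it from the header map, so the first lookup comes back empty.
	fmt.Printf("Header.Get(\"Host\") = %q\n", req.Header.Get("Host"))
	fmt.Printf("req.Host           = %q\n", req.Host)
}
```

The header lookup comes back empty while `req.Host` holds the hostname, which is why the canonical request code special-cases `host`.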
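The next issue is that an extra newline has to be inserted after the canonical headers, which I didn't pick up from the docs. It falls out of the format: every canonical header entry ends with its own \n, and the template then adds one more \n before the signed headers list.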
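Here is a framework-free sketch of the same layout that makes the blank line visible. The `canonicalRequest` function and its parameters are hypothetical. It assumes the header map already uses lowercase names and that the path and query string are already encoded.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// canonicalRequest is a hypothetical sketch of the canonical request layout.
// headers must map lowercase header names to their values.
func canonicalRequest(method, path, query string, headers map[string]string, payloadHash string) string {
	names := make([]string, 0, len(headers))
	for name := range headers {
		names = append(names, name)
	}
	sort.Strings(names) // signed headers are listed in sorted order

	var b strings.Builder
	b.WriteString(method + "\n")
	b.WriteString(path + "\n")
	b.WriteString(query + "\n")
	for _, name := range names {
		// Each canonical header entry carries its own trailing newline...
		b.WriteString(name + ":" + strings.TrimSpace(headers[name]) + "\n")
	}
	// ...and the layout then adds one more newline after the whole block.
	b.WriteString("\n")
	b.WriteString(strings.Join(names, ";") + "\n")
	b.WriteString(payloadHash)
	return b.String()
}

func main() {
	headers := map[string]string{
		"host":       "examplebucket.s3.amazonaws.com",
		"x-amz-date": "20130524T000000Z",
	}
	// %q prints the newlines as \n so the doubled newline is easy to spot.
	fmt.Printf("%q\n", canonicalRequest("GET", "/test.txt", "", headers, "UNSIGNED-PAYLOAD"))
}
```

Printed with `%q`, the result shows `\n\n` twice: once for the empty query string and once between the last header entry and the signed headers list.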
Hash EVERYTHING
It wouldn't be a signature if we didn't hash a bunch of times, would it?
Well we're going to do it 6 times.
First, we need to get the "string to sign":
func getStringToSign(c echo.Context, canonicalRequest string) string {
	amzDate := c.Request().Header.Get("X-Amz-Date")
	// the scope only uses the YYYYMMDD part; guard the slice so a missing or short header can't panic
	date := amzDate
	if len(date) > 8 {
		date = date[:8]
	}
	s := "AWS4-HMAC-SHA256" + "\n"
	s += amzDate + "\n"
	scope := date + "/" + utils.Env_Region + "/" + utils.Env_AWSService + "/aws4_request"
	s += scope + "\n"
	s += fmt.Sprintf("%x", getSHA256([]byte(canonicalRequest)))
	return s
}
This is effectively the final string that we will sign with the secret access key to compare against the incoming request.
Notice utils.Env_AWSService and utils.Env_Region: the service and region we are requesting are actually part of the string to sign. The service is typically the name in all lower case, like s3 or dynamodb. These aren't documented anywhere that I could find, but thankfully both also appear in the Credential we parsed from the auth header.
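As a variation, the scope can be built from the credential the client sent rather than from server-side configuration. This is a minimal sketch: `credentialScope` and `stringToSign` are hypothetical names, and the struct stands in for the parsed Credential fields.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// credentialScope is a hypothetical stand-in for the parsed Credential field.
type credentialScope struct {
	Date, Region, Service string
}

// stringToSign builds the four-line string to sign, taking the scope from the
// client's credential instead of from environment configuration.
func stringToSign(amzDate string, scope credentialScope, canonicalRequest string) string {
	hash := sha256.Sum256([]byte(canonicalRequest)) // hash 1 of 6
	return strings.Join([]string{
		"AWS4-HMAC-SHA256",
		amzDate,
		scope.Date + "/" + scope.Region + "/" + scope.Service + "/aws4_request",
		hex.EncodeToString(hash[:]),
	}, "\n")
}

func main() {
	scope := credentialScope{Date: "20130524", Region: "us-east-1", Service: "s3"}
	// The canonical request here is a placeholder string, not a real request.
	fmt.Println(stringToSign("20130524T000000Z", scope, "placeholder canonical request"))
}
```

Since the client controls the header, you would probably still want to check the region and service against what your API expects before trusting them.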
That's 1/6 hashes.
Next, we are going to construct the signing key. This is not simply the secret access key, but that combined with 4 more hashes:
func getSigningKey(c echo.Context, password string) []byte {
	// only the YYYYMMDD part of X-Amz-Date is used; guard the slice so a missing or short header can't panic
	date := c.Request().Header.Get("X-Amz-Date")
	if len(date) > 8 {
		date = date[:8]
	}
	dateKey := getHMAC([]byte("AWS4"+password), []byte(date))
	dateRegionKey := getHMAC(dateKey, []byte(utils.Env_Region))
	dateRegionServiceKey := getHMAC(dateRegionKey, []byte(utils.Env_AWSService))
	signingKey := getHMAC(dateRegionServiceKey, []byte("aws4_request"))
	return signingKey
}
Notice that this depends on the region we are requesting, which is why the client packages force you to specify a region, even if you're interacting with a global service like Route53 (us-east-1 is the default region if not specified).
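The getHMAC helper isn't shown here, so the sketch below assumes it is a plain HMAC-SHA256 from the standard library. The names `hmacSHA256` and `deriveSigningKey` are hypothetical. It runs the same four-step chain without echo and shows that changing only the region yields a different key.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// hmacSHA256 is an assumed equivalent of the getHMAC helper used above.
func hmacSHA256(key, data []byte) []byte {
	mac := hmac.New(sha256.New, key)
	mac.Write(data)
	return mac.Sum(nil)
}

// deriveSigningKey chains four HMACs: date, region, service, "aws4_request".
func deriveSigningKey(secret, date, region, service string) []byte {
	dateKey := hmacSHA256([]byte("AWS4"+secret), []byte(date))
	dateRegionKey := hmacSHA256(dateKey, []byte(region))
	dateRegionServiceKey := hmacSHA256(dateRegionKey, []byte(service))
	return hmacSHA256(dateRegionServiceKey, []byte("aws4_request"))
}

func main() {
	// Placeholder secret, for illustration only.
	secret := "example-secret-access-key"
	east := deriveSigningKey(secret, "20130524", "us-east-1", "s3")
	west := deriveSigningKey(secret, "20130524", "us-west-2", "s3")
	fmt.Printf("us-east-1: %x\n", east)
	fmt.Printf("us-west-2: %x\n", west)
	fmt.Println("same key?", hmac.Equal(east, west))
}
```

A request signed for one region will therefore not verify against a key derived for another, even with the same secret and date.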
That's 5/6 hashes.
The final hash comes from using the signing key to calculate the HMAC of the string to sign:
stringToSign := getStringToSign(c, canonicalRequest)
signingKey := getSigningKey(c, someSecretKey)
signature := fmt.Sprintf("%x", getHMAC(signingKey, []byte(stringToSign)))
With that we're done! 6/6 hashes.
We can compare this signature against the one parsed from the Authorization header:
if signature != parsedHeader.Signature {
	return ErrInvalidSignature
}
cc, _ := c.(*CustomContext)
cc.AWSCredentials = parsedHeader.Credential
return next(c)
If it's valid, we'll attach a nice struct to our request so we can reference the auth information in subsequent request handlers. This way we can know what user has made this request.
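A plain string comparison works, but for anything exposed to untrusted clients a constant-time comparison is a common hardening step, since it avoids leaking how many leading characters matched. The sketch below uses `hmac.Equal` from the standard library; `signaturesMatch` is a hypothetical helper, not part of the middleware above.

```go
package main

import (
	"crypto/hmac"
	"encoding/hex"
	"fmt"
)

// signaturesMatch compares two hex-encoded signatures in constant time.
// Hypothetical helper: it decodes both values first, so hex case doesn't
// matter, and empty or malformed input is rejected.
func signaturesMatch(expectedHex, providedHex string) bool {
	expected, err := hex.DecodeString(expectedHex)
	if err != nil {
		return false
	}
	provided, err := hex.DecodeString(providedHex)
	if err != nil || len(provided) == 0 {
		return false
	}
	return hmac.Equal(expected, provided)
}

func main() {
	fmt.Println(signaturesMatch("deadbeef", "deadbeef")) // identical values
	fmt.Println(signaturesMatch("deadbeef", "DEADBEEF")) // same bytes, different case
	fmt.Println(signaturesMatch("deadbeef", "deadbeee")) // last byte differs
	fmt.Println(signaturesMatch("deadbeef", "not-hex"))  // malformed input
}
```

In the middleware, this would slot in where the computed signature is checked against the one parsed from the header.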
In your API, you'll need to look up the secret access key from the provided access key ID, so you'll probably want to attach that information to the request as well so you don't have to look it up again (unless you cache it on the host).
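As a rough illustration of that lookup, here is a minimal in-memory sketch. The `keyStore` type, `secretFor` method, and `ErrUnknownAccessKey` error are hypothetical names, and a real service would presumably back this with a database or a secrets store.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnknownAccessKey is returned when no secret is registered for a key ID.
var ErrUnknownAccessKey = errors.New("unknown access key ID")

// keyStore is a hypothetical in-memory map of access key ID to secret access key.
type keyStore map[string]string

// secretFor returns the secret access key for the given access key ID.
func (s keyStore) secretFor(accessKeyID string) (string, error) {
	secret, ok := s[accessKeyID]
	if !ok {
		return "", ErrUnknownAccessKey
	}
	return secret, nil
}

func main() {
	// Placeholder credentials, for illustration only.
	store := keyStore{"AKIAIOSFODNN7EXAMPLE": "example-secret-access-key"}

	if secret, err := store.secretFor("AKIAIOSFODNN7EXAMPLE"); err == nil {
		fmt.Println("found a secret of length", len(secret))
	}
	if _, err := store.secretFor("AKIAUNKNOWN"); err != nil {
		fmt.Println("lookup failed:", err)
	}
}
```

In the middleware, the key ID would come from the parsed Credential, and the returned secret would take the place of someSecretKey when building the signing key.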
Cloning AWS Services
Now that we have the signature figured out, we can dive into AWS service API references and clone them to build our own AWS-compatible services!
This is how I built the IceDB S3 Proxy, which tricks SQL queries into thinking they're querying an S3 bucket directly. In reality, I intercept S3 list calls and return my own custom object list, and I mutate object requests to change the paths before forwarding them to S3.
You could also take the challenge of building your own minimal interface for the DynamoDB functionality you use, and stick it in front of another database so you can move off AWS if needed.
Did someone say... startup idea? I might already be working on something 😄