Serverless Mono Repo State Machine using only AWS S3 and API requests
The most efficient approach
Background
You are looking to share code between your primary AWS Lambda and another Lambda that is triggered when a file of a particular sort is added to an S3 bucket.
The dilemma is how to do so in a way that is efficient and respectful of your current architecture.
Well, you're in luck. I went through the struggles so you didn't have to (you can skip to the end if you want the prime solution).
Solution 1 — Sharing via TypeScript
The first approach was using paths within TypeScript: creating a base tsconfig.json that set baseUrl to the repo root, then extending that tsconfig in every lambda. The difficulty with this approach is that our lambdas are built using the SAM CLI, which bundles each function independently of the base tsconfig. In practice, this approach required an extra layer of management.
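For reference, a minimal sketch of that setup (the @shared alias and folder layout here are illustrative):

// tsconfig.base.json at the repo root
{
  "compilerOptions": {
    "baseUrl": ".",
    "paths": {
      "@shared/*": ["shared/*"]
    }
  }
}

// lambdas/my-lambda/tsconfig.json — each lambda extends the base
{
  "extends": "../../tsconfig.base.json"
}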
Solution 2 — Lambda Layers
Due to our architecture, Lambda layers also added an extra layer of maintenance. Why? We build our lambdas half with Docker images and half with zip archives plus makefiles (for the more specialized lambdas). The out-of-the-box configuration for Lambda layers required an extra tsconfig to work with the Docker builds, and that made the architecture go haywire.
Solution 3 — Optimal Solution — Call the main lambda via API request from the step function
After experimenting, I found this to be the optimal solution. Most state machine executions triggered via S3 fall into one of two categories:
- Indexing (Elasticsearch, vector databases, etc.) that can be treated as an afterthought; it runs in a step function/state machine to offload work from the main application.
- Work off-loaded from the main lambda function to get past the 15-minute Lambda timeout.
You can simply call your lambda from the backend the same way you do from the frontend, for example by using the access token passed in via the backend, as in the sketch below.
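A minimal sketch, assuming the token is forwarded in the step function input and that the endpoint URL and query are placeholders:

// Minimal sketch: call the GraphQL API from backend code the way a frontend would.
async function callWithAccessToken(accessToken: string): Promise<unknown> {
  const response = await fetch('YOUR_APPSYNC_URL', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: accessToken, // access token forwarded in the step function input
    },
    body: JSON.stringify({
      query: 'query MyQuery($id: ID!) { getItem(id: $id) { id name } }',
      variables: { id: '123' },
    }),
  });
  return (await response.json()).data;
}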
Solution 3 continued — Moving beyond access tokens — AWS_IAM
The one caveat with this approach is that it will most likely be your first time using something beyond an API key or an access token.
Because there is a potential case where the access token expires while the step function is still running, access tokens are a non-ideal solution for this GraphQL API request.
AWS AppSync + AWS_IAM
For our particular use case we are using AppSync, which means adding the @aws_oidc and @aws_iam directives to our GraphQL schema.
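For example, a hypothetical schema entry (the type and field names are illustrative):

type Query {
  getItem(id: ID!): Item @aws_oidc @aws_iam
}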
We can then sign the request with the Lambda's IAM role credentials (SigV4) and call the API. A sketch using the AWS SDK v3 request signer and the built-in fetch:
// Dependencies (install with npm or yarn):
// @aws-sdk/signature-v4 @aws-sdk/protocol-http @aws-sdk/credential-provider-node @aws-crypto/sha256-js
import { SignatureV4 } from '@aws-sdk/signature-v4';
import { HttpRequest } from '@aws-sdk/protocol-http';
import { defaultProvider } from '@aws-sdk/credential-provider-node';
import { Sha256 } from '@aws-crypto/sha256-js';

// Replace with your AppSync endpoint URL and region
const appsyncUrl = 'YOUR_APPSYNC_URL'; // e.g. https://xxxx.appsync-api.us-east-1.amazonaws.com/graphql
const region = 'YOUR_REGION';

// Define your GraphQL query or mutation
const query = `
  query MyQuery($id: ID!) {
    getItem(id: $id) {
      id
      name
    }
  }
`;

// Define your variables (optional)
const variables = {
  id: '123',
};

async function callAppSync(query: string, variables?: object): Promise<unknown> {
  const url = new URL(appsyncUrl);

  // The signer picks up the Lambda's IAM role credentials automatically
  const signer = new SignatureV4({
    credentials: defaultProvider(),
    region,
    service: 'appsync',
    sha256: Sha256,
  });

  // Build and sign a plain HTTPS request carrying the GraphQL payload
  const signed = await signer.sign(
    new HttpRequest({
      method: 'POST',
      hostname: url.hostname,
      path: url.pathname,
      headers: {
        'Content-Type': 'application/json',
        host: url.hostname, // SigV4 includes the host header in the signature
      },
      body: JSON.stringify({ query, variables }),
    })
  );

  const response = await fetch(appsyncUrl, {
    method: signed.method,
    headers: signed.headers,
    body: signed.body,
  });

  const result = await response.json();
  if (result.errors) {
    console.error('AppSync request error:', result.errors);
    throw new Error(JSON.stringify(result.errors));
  }
  return result.data;
}

// Call your function with the query and optional variables
callAppSync(query, variables)
  .then((data) => console.log('AppSync response data:', data))
  .catch((error) => console.error('Error calling AppSync:', error));
There you have it. This infrastructure allows us to avoid changing a line of DevOps configuration or sharing code, and by following this architecture we have saved ourselves from duplicating 40 files. Good architecture for the win!