Decoding Serverless Orchestration: Understanding AWS Step Functions through the “Image Converter” Mini Project

Pamal Jayawickrama
Published in AWS Tip · 8 min read · Jan 2, 2024

In the era of serverless computing, AWS Step Functions emerges as a powerful tool for orchestrating complex workflows. In this article, we’ll walk through the implementation of an image converter project using AWS Step Functions, AWS Lambda, S3, and DynamoDB. This serverless architecture transforms, copies, resizes, and manages images, showcasing the flexibility and scalability of AWS Step Functions.

What are AWS Step Functions?

AWS Step Functions is a fully managed service that allows you to coordinate and sequence AWS services and microservices into serverless workflows. It provides a visual interface for designing, managing, and executing state machine-based workflows, making it easier to build scalable and resilient applications.

Why AWS Step Functions?

  1. Seamless Orchestration:
    AWS Step Functions simplifies the orchestration of multiple AWS services, allowing you to define workflows in a straightforward and visual manner. This abstraction reduces the complexity of managing intricate business processes.
  2. Error Handling:
    Built-in error-handling capabilities ensure that your workflows can gracefully handle unexpected issues. By defining error-catching mechanisms within the state machine, you can enhance the resilience of your applications.
  3. Parallel Execution:
    AWS Step Functions supports parallel execution of tasks, enabling efficient processing of multiple steps simultaneously. This capability is particularly beneficial for tasks that can be performed independently, leading to faster overall execution times.
  4. Stateful Coordination:
    Step Functions maintain state throughout the execution of a workflow, allowing you to build stateful applications without the need for complex custom solutions. This is crucial for managing long-running processes and workflows.
  5. Integrated with AWS Services:
    Seamlessly integrates with a variety of AWS services, such as AWS Lambda, Amazon S3, DynamoDB, and more. This integration streamlines the development of applications that leverage various AWS functionalities.

When to Use AWS Step Functions?

  1. Workflow Automation:
    Use AWS Step Functions when you need to automate and coordinate workflows that involve multiple AWS services. It’s ideal for managing complex business processes efficiently.
  2. Microservices Orchestration:
    When building applications using a microservices architecture, AWS Step Functions can orchestrate the execution of individual services, ensuring they work together cohesively.
  3. Asynchronous Task Processing:
    If your application involves asynchronous processing of tasks, AWS Step Functions can help coordinate and monitor these tasks, making it easier to manage complex asynchronous workflows.
  4. Stateful Applications:
    When your application requires stateful coordination between different steps, AWS Step Functions provides a convenient way to manage and track the state of your workflows.
  5. Error-Prone Workflows:
    Use AWS Step Functions for workflows that might encounter errors or failures. The service simplifies error handling, making it easier to manage and recover from unexpected issues.

Project Overview: Image Converter

This project is designed to convert JPEG images to PNG. It first checks whether the uploaded image in the S3 bucket is a JPEG. If so, it converts the image to PNG while simultaneously copying the original to another S3 location. It then writes the relevant metadata to DynamoDB and deletes the original JPEG. If the image is not a JPEG, it is simply deleted from the S3 bucket.

Project Structure

The project consists of several AWS Lambda functions, each handling a specific step in the image conversion process. Let’s briefly walk through the key components:

GetFileType Lambda Function:

Extracts the file type from an uploaded image in an S3 bucket.

module.exports.getFileType = async (event) => {
  const fileName = event.s3.object.key;
  const index = fileName.lastIndexOf('.');
  if (index > 0) {
    return fileName.substring(index + 1);
  }
  return null;
};
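Because the handler is a plain async function, its extension logic can be exercised locally with a minimal mock of the S3 event shape (a sketch for illustration; the key below is made up):

```javascript
// Same extension-extraction logic as the handler above, checked locally
// against a mock event that mirrors the S3 notification structure.
const getFileType = async (event) => {
  const fileName = event.s3.object.key;
  const index = fileName.lastIndexOf('.');
  return index > 0 ? fileName.substring(index + 1) : null;
};

getFileType({ s3: { object: { key: 'photos/1.jpg' } } })
  .then((type) => console.log(type)); // logs "jpg"
```

Note that a key without an extension (or a key whose only dot is the first character) yields null, which routes the execution to the delete branch of the state machine.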

CopyFile Lambda Function:

Copies the original image to a destination S3 bucket.

module.exports.copyFile = async (event) => {
  await S3.copy(event.s3.bucket.name, event.s3.object.key);
  return {
    region: 'us-east-1',
    bucket: process.env.DESTINATION_BUCKET,
    key: event.s3.object.key
  };
};
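The S3.copy helper used above is not shown in the article. A hypothetical sketch of such a wrapper, assuming the parameter names of AWS SDK v3's CopyObjectCommand (Bucket, CopySource, Key); the send function is injected so the call shape can be checked without AWS credentials:

```javascript
// Hypothetical S3.copy wrapper (not part of the project as published).
// Copies sourceBucket/key into the destination bucket under the same key.
const makeCopy = (send) => (sourceBucket, key) =>
  send({
    Bucket: process.env.DESTINATION_BUCKET, // copy lands in the destination bucket
    CopySource: `${sourceBucket}/${key}`,   // "bucket/key" source reference
    Key: key,
  });
```

In the real handler, send would wrap s3Client.send(new CopyObjectCommand(params)).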

ResizeImage Lambda Function:

Resizes the copied image from JPEG to PNG format, demonstrating the versatility of serverless image processing.

module.exports.resizeImage = async (event) => {
  const bucket = event.s3.bucket.name;
  const key = event.s3.object.key;

  const jpgBuffer = await S3.get(bucket, key);
  const pngBuffer = await imagemagick.convert({
    srcData: jpgBuffer,
    format: 'png'
  });

  const pngKey = key.replace('.jpg', '.png');
  const body = pngBuffer.toString('base64');
  // Write the PNG to the destination bucket under its new key
  await S3.send(process.env.DESTINATION_BUCKET, pngKey, body);

  return {
    region: 'us-east-1',
    bucket: process.env.DESTINATION_BUCKET,
    key: pngKey
  };
};
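One caveat: key.replace('.jpg', '.png') rewrites only the first literal occurrence of .jpg and misses uppercase or .jpeg suffixes. A slightly more defensive variant (a hypothetical helper, not part of the project) anchors the match to the end of the key:

```javascript
// Rename a JPEG key to its PNG counterpart, matching .jpg/.jpeg
// case-insensitively and only at the end of the key.
const toPngKey = (key) => key.replace(/\.jpe?g$/i, '.png');

console.log(toPngKey('photos/1.jpg'));  // photos/1.png
console.log(toPngKey('photos/2.JPEG')); // photos/2.png
```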

DeleteFile Lambda Function:

Deletes the image from the source S3 bucket.

module.exports.deleteFile = async (event) => {
  await S3.del(event.s3.bucket.name, event.s3.object.key);
  return {
    status: "Deleted from source bucket",
    sourceBucket: event.s3.bucket.name,
    destinationBucket: process.env.DESTINATION_BUCKET
  };
};

WriteToDynamoDB Lambda Function:

Writes relevant information about the image to DynamoDB for future reference.

module.exports.writeToDynamoDB = async (event) => {
  const Item = {
    imageName: event.results.images[0].original.key,
    images: event.results.images,
  };
  const done = await Db.put(Item);
  return {
    status: "Item saved successfully",
    details: done
  };
};
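The Parallel state's two branches land in results.images as a two-element array, which is why the handler reads images[0].original.key. Assembling the item locally (a sketch with made-up bucket names) makes its shape visible:

```javascript
// Build the DynamoDB item from the shape the Parallel state produces:
// branch 0 returns { original: ... }, branch 1 returns { resized: ... }.
const buildItem = (event) => ({
  imageName: event.results.images[0].original.key,
  images: event.results.images,
});

const sampleEvent = {
  results: {
    images: [
      { original: { bucket: 'dest-bucket', key: '1.jpg' } },
      { resized: { bucket: 'dest-bucket', key: '1.png' } },
    ],
  },
};
console.log(buildItem(sampleEvent).imageName); // 1.jpg
```

The original key doubles as the table's partition key (imageName), so re-uploading the same file overwrites the previous item.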

Configuring Serverless Orchestration: A Dive into the serverless.yml File

The serverless.yml configuration not only defines the AWS Lambda functions but also encapsulates the logic and flow of the serverless state machine.

Lambda Functions Configuration:

functions:
  upladToS3:
    handler: src/handler.upload
    events:
      - http:
          method: post
          path: upload

  trggerStepFunction:
    handler: src/trigger.triggerStepFunc
    role: ${env:IAM_ROLE}
    environment:
      STATE_MACHINE_ARN: ${env:STATE_MACHINE_NAME}
    events:
      - s3:
          bucket: ${env:BUCKET_NAME}
          event: s3:ObjectCreated:*
          existing: true

  stepFunc-getFileType:
    handler: src/stepFunction.getFileType
    role: ${env:IAM_ROLE}

  stepFunc-copyToDestination:
    handler: src/stepFunction.copyFile
    role: ${env:IAM_ROLE}

  stepFunc-resizeImage:
    handler: src/stepFunction.resizeImage
    role: ${env:IAM_ROLE}

  stepFunc-DeleteSourceFile:
    handler: src/stepFunction.deleteFile
    role: ${env:IAM_ROLE}

  stepFunc-WriteToDynamoDb:
    handler: src/stepFunction.writeToDynamoDB
    role: ${env:IAM_ROLE}

Here, we define the Lambda function that uploads to S3, the function that triggers the Step Functions state machine when an object lands in the bucket, and the five worker functions invoked by the state machine's states.
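The trigger handler itself (src/trigger.triggerStepFunc) is not shown in the article. A minimal sketch, assuming the ARN arrives via the STATE_MACHINE_ARN environment variable and the parameter names of AWS SDK v3's StartExecutionCommand; the startExecution call is injected so the logic can be exercised without AWS credentials:

```javascript
// Hypothetical sketch of the trigger handler (not shown in the article).
// S3 notifications deliver a Records array; the first record becomes the
// state machine's input document.
const makeTrigger = (startExecution) => async (event) => {
  const record = event.Records[0];
  return startExecution({
    stateMachineArn: process.env.STATE_MACHINE_ARN,
    input: JSON.stringify(record),
  });
};
```

In the real handler, startExecution would wrap sfnClient.send(new StartExecutionCommand(params)).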

Step Functions Plugin Integration:

plugins:
  - serverless-offline
  - serverless-dotenv-plugin
  - serverless-step-functions

The serverless-step-functions plugin lets us define and deploy the state machine directly from serverless.yml, integrating AWS Step Functions into our serverless architecture.

Step Functions Definition:

stepFunctions:
  stateMachines:
    imageConverter:
      name: sv-state-machine
      definition:
        Comment: States to Read
        StartAt: GetFileType
        States:
          GetFileType:
            Type: Task
            Resource: !GetAtt stepFunc-getFileType.Arn
            TimeoutSeconds: 3
            ResultPath: $.results.fileType
            Next: CheckFileType
            Catch:
              - ErrorEquals:
                  - States.ALL
                Next: QuitMain
          CheckFileType:
            Type: Choice
            Choices:
              - Variable: $.results.fileType
                StringEquals: jpg
                Next: ProcessFile
            Default: DeleteSourceFile
          ProcessFile:
            Type: Parallel
            ResultPath: $.results.images
            Branches:
              - StartAt: CopyToDestination
                States:
                  CopyToDestination:
                    Type: Task
                    Resource: !GetAtt stepFunc-copyToDestination.Arn
                    TimeoutSeconds: 3
                    ResultPath: $.image.original
                    OutputPath: $.image
                    End: true
                    Retry:
                      - ErrorEquals:
                          - States.TaskFailed
                          - States.Timeout
                        IntervalSeconds: 5
                        MaxAttempts: 2
                        BackoffRate: 2
                      - ErrorEquals:
                          - States.ALL
                        IntervalSeconds: 2
                        MaxAttempts: 2
                        BackoffRate: 2
                    Catch:
                      - ErrorEquals:
                          - States.ALL
                        Next: QuitCopy
                  QuitCopy:
                    Type: Fail
                    Error: CopyError
                    Cause: An Error Occurred While Executing The CopyToDestination Task
              - StartAt: ResizeImage
                States:
                  ResizeImage:
                    Type: Task
                    Resource: !GetAtt stepFunc-resizeImage.Arn
                    TimeoutSeconds: 3
                    ResultPath: $.image.resized
                    OutputPath: $.image
                    End: true
                    Retry:
                      - ErrorEquals:
                          - States.TaskFailed
                          - States.Timeout
                        IntervalSeconds: 5
                        MaxAttempts: 2
                        BackoffRate: 2
                      - ErrorEquals:
                          - States.ALL
                        IntervalSeconds: 2
                        MaxAttempts: 2
                        BackoffRate: 2
                    Catch:
                      - ErrorEquals:
                          - States.ALL
                        Next: QuitResize
                  QuitResize:
                    Type: Fail
                    Error: GenericError
                    Cause: An Error Occurred While Executing The State Machine
            Next: WriteToDynamoDb
          DeleteSourceFile:
            Type: Task
            Resource: !GetAtt stepFunc-DeleteSourceFile.Arn
            TimeoutSeconds: 3
            ResultPath: $.results.deletionStatus
            OutputPath: $.results
            Catch:
              - ErrorEquals:
                  - States.ALL
                Next: QuitMain
            End: true
          WriteToDynamoDb:
            Type: Task
            Resource: !GetAtt stepFunc-WriteToDynamoDb.Arn
            TimeoutSeconds: 3
            ResultPath: $.results.writeStatus
            Catch:
              - ErrorEquals:
                  - States.ALL
                Next: QuitMain
            Next: DeleteSourceFile
          QuitMain:
            Type: Fail
            Error: GenericError
            Cause: An Error Occurred While Executing The State Machine

The imageConverter state machine is the heart of the project, orchestrating the workflow from checking file types to resizing images and writing to DynamoDB. The state definitions highlight the logic and flow of each step.

Parallel Processing and Error Handling:

          ProcessFile:
            Type: Parallel
            ResultPath: $.results.images
            Branches:
              - StartAt: CopyToDestination
                States:
                  CopyToDestination:
                    Type: Task
                    Resource: !GetAtt stepFunc-copyToDestination.Arn
                    TimeoutSeconds: 3
                    ResultPath: $.image.original
                    OutputPath: $.image
                    End: true
                    Retry:
                      - ErrorEquals:
                          - States.TaskFailed
                          - States.Timeout
                        IntervalSeconds: 5
                        MaxAttempts: 2
                        BackoffRate: 2
                      - ErrorEquals:
                          - States.ALL
                        IntervalSeconds: 2
                        MaxAttempts: 2
                        BackoffRate: 2
                    Catch:
                      - ErrorEquals:
                          - States.ALL
                        Next: QuitCopy
                  QuitCopy:
                    Type: Fail
                    Error: CopyError
                    Cause: An Error Occurred While Executing The CopyToDestination Task
              # The second branch, ResizeImage, mirrors this structure with its
              # own Retry policy and a QuitResize Fail state (see the full
              # definition above).
            Next: WriteToDynamoDb

Parallel processing and error handling mechanisms ensure the robustness of the state machine. Multiple branches handle tasks concurrently, with defined catch blocks gracefully managing errors.
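As a concrete reading of the Retry block: Step Functions waits IntervalSeconds before the first retry and multiplies the wait by BackoffRate for each subsequent attempt, up to MaxAttempts retries. For the values above, the waits work out as follows:

```javascript
// Wait before retry attempt n (1-based): IntervalSeconds * BackoffRate^(n-1),
// for up to MaxAttempts retries.
const retryWaits = ({ IntervalSeconds, BackoffRate, MaxAttempts }) =>
  Array.from({ length: MaxAttempts }, (_, i) => IntervalSeconds * BackoffRate ** i);

console.log(retryWaits({ IntervalSeconds: 5, BackoffRate: 2, MaxAttempts: 2 })); // [ 5, 10 ]
console.log(retryWaits({ IntervalSeconds: 2, BackoffRate: 2, MaxAttempts: 2 })); // [ 2, 4 ]
```

So a TaskFailed or Timeout error is retried after 5 s and again after 10 s before the Catch block routes the execution to the corresponding Fail state.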

Resource Definitions:

resources:
  Resources:
    ImageBucket:
      Type: 'AWS::S3::Bucket'
      Properties:
        BucketName: ${env:BUCKET_NAME}

    destinationBucket:
      Type: 'AWS::S3::Bucket'
      Properties:
        BucketName: ${env:DESTINATION_BUCKET}

    stepFnTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: ${env:TABLE_NAME}
        BillingMode: PAY_PER_REQUEST
        AttributeDefinitions:
          - AttributeName: imageName
            AttributeType: S
        KeySchema:
          - AttributeName: imageName
            KeyType: HASH

The definition of S3 buckets and a DynamoDB table sets the groundwork for storing and managing images as they traverse through the state machine.

State Machine in Action: Analyzing Input and Output JSON Files

Let’s examine the input and output JSON files generated during an execution of the AWS Step Functions state machine. These files capture the journey of an image through the workflow, from its origin in an S3 bucket to its transformation and storage.

Input JSON: Image Upload Event

{
  "eventVersion": "2.1",
  "eventSource": "aws:s3",
  "awsRegion": "us-east-1",
  "eventTime": "2023-11-05T16:53:33.584Z",
  "eventName": "ObjectCreated:Put",
  "userIdentity": {
    "principalId": "AMTVDFT1HG342"
  },
  "requestParameters": {
    "sourceIPAddress": "112.135.64.182"
  },
  "responseElements": {
    "x-amz-request-id": "V0CMMK0S8TPFR2KY",
    "x-amz-id-2": "5y9wrObK708ivDJsiLTX78OFwH2qi4WUMwlkJJNUt6lmVws9XNYzdi+jkg5VkZ2kFzUQIIeEGHeBS9+LdSdGthZgKPvLn6e1nutBptfdnM8="
  },
  "s3": {
    "s3SchemaVersion": "1.0",
    "configurationId": "poc-2-t-service-serverless-dev-trggerStepFunction-68a737133cefb25bff959852b8f04754",
    "bucket": {
      "name": "poc-2-transform-init-bucket",
      "ownerIdentity": {
        "principalId": "AMTVDFT1HG342"
      },
      "arn": "arn:aws:s3:::poc-2-transform-init-bucket"
    },
    "object": {
      "key": "1.jpg",
      "size": 132067,
      "eTag": "d93c7bc95f0e90dff6fc89098ccfe36b",
      "sequencer": "006547C88D7FFD9FB5"
    }
  }
}

This input JSON captures the event fired when an image (1.jpg) is uploaded to the poc-2-transform-init-bucket S3 bucket. The event includes details such as the S3 bucket name, the object key, and the AWS region.

Output JSON: State Machine Results

{
  "fileType": "jpg",
  "images": [
    {
      "original": {
        "region": "us-east-1",
        "bucket": "poc-2-transform-destination-bucket",
        "key": "1.jpg"
      }
    },
    {
      "resized": {
        "region": "us-east-1",
        "bucket": "poc-2-transform-destination-bucket",
        "key": "1.png"
      }
    }
  ],
  "writeStatus": {
    "status": "Item saved successfully",
    "details": {
      "$metadata": {
        "httpStatusCode": 200,
        "requestId": "403KRHLP1D60TMV1ML08RIPS2VVV4KQNSO5AEMVJF66Q9ASUAAJG",
        "attempts": 1,
        "totalRetryDelay": 0
      }
    }
  },
  "deletionStatus": {
    "status": "Deleted from source bucket",
    "sourceBucket": "poc-2-transform-init-bucket",
    "destinationBucket": "poc-2-transform-destination-bucket"
  }
}

This output JSON signifies the successful orchestration of the state machine. It records the detected file type, the results of the copy and resize branches, the destination bucket, and the status of the DynamoDB write and source-file deletion.

In the Step Functions console, a successful execution shows all states in green, indicating that each state completed successfully.

When an error is encountered, the failing state turns orange, signifying that the state machine did not complete successfully.

By combining AWS Step Functions with other serverless services, we’ve successfully created an automated and scalable image conversion workflow. This project not only demonstrates the power of serverless orchestration but also provides a foundation for building more complex and efficient workflows in the cloud.

If you have any questions or doubts regarding the code implementation, feel free to explore the GitHub repository associated with this project.

Thank you, and happy coding!
