You can use AWS::Athena::NamedQuery resource to create an Amazon Athena query. However there is no resource that can be used to create a database and table(s). This is where we can make use of Lambda-backed custom resources in a CloudFormation (CF) template.
The CF template should include:
- A Lambda function that takes DDL scripts from a S3 bucket and executes them on Athena.
- An IAM execution role that has the required Athena and S3 permissions.
- A custom resource that invokes the lambda function.
CF Template Snippets
The following snippets explain relevant parts of the CF template to help you understand how to associate a Lambda function with a custom resource and how to use the function’s response. The CF template and code for the Lambda function is available here in a github repository.
Lambda Function
To create the Lambda function, you declare the AWS::Lambda::Function
resource, which requires the function’s source code, handler name, runtime environment, and execution role ARN.
AthenaSqlLambda: DependsOn: LambdaRolePolicy Type: AWS::Lambda::Function Properties: FunctionName: 'sql_deploy_custom_resource_lambda' Handler: sql_deploy_custom_resource_lambda.lambda_handler MemorySize: 128 Code: S3Bucket: !Ref 'repobucket' S3Key: sql_deploy_custom_resource_lambda.zip Runtime: python2.7 Role: !GetAtt [LambdaRole, Arn] Timeout: 300
The Code
property specifies the Amazon S3 location (bucket name and file name) where you uploaded the sample package. The sample template uses input parameters ("Ref": "repobucket"
) to set the bucket name so that you are able to specify the names when you create the stack. You can also replace the hardcoded S3Key with a parameter value.
Custom Resource
To associate a function with a custom resource, you specify the Amazon Resource Name (ARN) of the function for the ServiceToken property, using the Fn::GetAtt intrinsic function. AWS CloudFormation sends the additional properties that are included in the custom resource declaration, such as Region and Architecture, to the Lambda function as inputs.
AthenaSqlLambdaAttach: Type: Custom::AthenaSQLLambdaAttach Properties: CreateSqlS3KeyProperty: !Ref 'CreateSqlS3Key' DeleteSqlS3KeyProperty: !Ref 'DeleteSqlS3Key' DDLBucketProperty: !Ref 'DDLBucket' DataBucketProperty: 'test-upload-athena' AthenaSqlOutputUriProperty: !Join ['-', ['s3://aws-athena-query-results', !Ref 'AWS::AccountId', !Ref 'AWS::Region']] TargetDBJDBCUrlProperty: !Ref TargetDBJDBCUrl ServiceToken: !GetAtt [AthenaSqlLambda, Arn]
The Properties
property is used to pass environment variables to the Lambda function. All properties defined here are accessible from the event object as can be seen from the following excrept from the Lambda function code.
# set environment variables exec_environment = 'nonprod' ddl_bucket = event['ResourceProperties']['DDLBucketProperty'] ddl_create_key = event['ResourceProperties']['CreateSqlS3KeyProperty'] ddl_delete_key = event['ResourceProperties']['DeleteSqlS3KeyProperty'] test_output_location = event['ResourceProperties']['AthenaSqlOutputUriProperty']
You might have noticed that I am passing two different SQL DDL scripts, a Create and a Delete. The deletion script is used when CloudFormation stack is deleted or a rollback is triggered because of a failure during stack creation.
Lambda Function Code Snippets
The Lambda function used for creating database and table(s) is pretty simple. All it is doing is:
- Determine the stack event (Create, Update or Delete)
- Download the create or delete DDL scripts from S3
- Execute the scripts on Athena
The only potential difficult bit is understanding the Custom Reosurce Response Objects. You can read more about them here (AWS Lambda Function Code).
There is no code under https://github.com/Irtaza/cf_sql_code_deploy
🙁
Thanks for all the hard work, great explanation. looking forward for the full code 🙂
I have finally got around to adding the code to github. Better late than never.