data-engineering-mlops

USECASE-1 : Trigger Glue Job Using AWS Lambda

CODEFILES : https://github.com/curiouscurrent/data-engineering-mlops/tree/main/Trigger-GlueJob-Using-AWSLambda

  1. CSV data arrives in our S3 input bucket.
  2. We need to create an AWS Glue job (ETL) that transfers the data from the S3 input bucket to a separate S3 target bucket as JSON.
  3. We have to trigger this Glue job using AWS Lambda.
  4. When the Glue job is triggered, it fetches its script from the Glue job script bucket (the script location can be set in Glue under "Job Details - Advanced Properties").
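
The Lambda trigger described in item 3 can be sketched as a small handler. This is a minimal sketch under stated assumptions, not the repository's actual code: the job name `csv-to-json-job`, the `GLUE_JOB_NAME` environment variable, the `--input_bucket` and `--input_key` job arguments, and the optional `glue_client` parameter are hypothetical names chosen for illustration. It relies on the boto3 Glue client's `start_job_run` call and on the standard S3 event notification layout.

```python
import os
import urllib.parse

# Hypothetical job name; in practice it would match the Glue job you created.
GLUE_JOB_NAME = os.environ.get("GLUE_JOB_NAME", "csv-to-json-job")


def extract_csv_objects(event):
    """Return (bucket, key) pairs for every .csv object in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        if key.lower().endswith(".csv"):
            objects.append((bucket, key))
    return objects


def lambda_handler(event, context, glue_client=None):
    """Start one Glue job run per uploaded .csv file."""
    if glue_client is None:
        import boto3  # available by default in the AWS Lambda Python runtime
        glue_client = boto3.client("glue")

    run_ids = []
    for bucket, key in extract_csv_objects(event):
        response = glue_client.start_job_run(
            JobName=GLUE_JOB_NAME,
            # Assumed argument names; the Glue script would have to read them.
            Arguments={"--input_bucket": bucket, "--input_key": key},
        )
        run_ids.append(response["JobRunId"])
    return {"started_job_runs": run_ids}
```

The `.csv` check in the handler is a second guard next to any suffix filter configured on the S3 trigger. The optional `glue_client` parameter exists only so the handler can be exercised locally with a stand-in client; Lambda itself calls the function with `event` and `context` only.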

Steps :

  1. Create 3 buckets: one for input, one for the target, and one to store the Glue job script.

    image
  2. Now create an ETL job in AWS Glue.

    image
  3. Create an IAM role for the Glue job that allows Glue to call AWS services on your behalf, and assign it the following permissions:

    image
  4. Now create a Lambda function and add an S3 trigger on the input bucket, so that the function runs only when a user uploads a .csv file (allow only the "PUT" event).

    image
  5. Add an IAM role for the Lambda function and assign it the same permissions as the role attached to the AWS Glue job. Then deploy the code.

    image
  6. Now upload a .csv file to the input S3 bucket.

    image
  7. Check that the Glue job has run.

    image
  8. Check the target bucket; you should find a .json file.

    image
  9. Check the logs in CloudWatch.

    image
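
The final checks (job run status and the output in the target bucket) can also be scripted instead of being done in the console. The following is a rough sketch, assuming you have the job name and the run ID returned when the job was started. The helper names `wait_for_job_run` and `list_json_keys` are hypothetical; the underlying boto3 calls are Glue's `get_job_run` and S3's `list_objects_v2`.

```python
import time

# Job run states after which the run will not change any more.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def wait_for_job_run(glue_client, job_name, run_id,
                     poll_seconds=15, max_polls=40, sleep=time.sleep):
    """Poll a Glue job run until it reaches a terminal state and return that state."""
    for _ in range(max_polls):
        run = glue_client.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
    raise TimeoutError(f"Job run {run_id} did not finish after {max_polls} polls")


def list_json_keys(s3_client, bucket, prefix=""):
    """List .json object keys in a bucket (first page of results only)."""
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    # "Contents" is absent from the response when no objects match.
    return [obj["Key"] for obj in response.get("Contents", [])
            if obj["Key"].endswith(".json")]
```

With real clients this could be called as `wait_for_job_run(boto3.client("glue"), job_name, run_id)` followed by `list_json_keys(boto3.client("s3"), target_bucket)`, where the bucket and job names are your own. Note that `list_objects_v2` returns at most 1,000 keys per call, so a larger bucket would need pagination.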
