selmertsxの素振り日記

ひたすら日々の素振り内容を書き続けるだけの日記

Query and Visualize AWS Cost and Usage Data Using Amazon Athena and Amazon QuickSightのメモ

モチベーション

  • 弊社ではセキュリティ対策の一環からAWS Organizationとそれに関連するセキュリティ対策を導入した
  • セキュリティ対策で利用しているAWS Config Rule, GuardDuty, CloudTrailのサービスについては按分計算が必要
  • 按分をいい感じに計算したいので諸々調査してみる

この記事に書かれていること

aws.amazon.com

上記ページを読んで簡単にサマリーしてみる。

  • CloudFormationを使って、AWS Cost and Usage Reportsを Athenaで分析できるようにする
  • LambdaはAthenaが読み込めるように、適切なS3のパスでデータを格納させる
  • AWS Cost and Usage Reportsは、RedShiftで見れるように設定しとかなければならない
    • これはマニフェストが変更されるだけなので、実際にRedShiftにデータが入る訳ではない。
    • この設定をしても、実際に反映されるのは翌日

このCloudFormationがやってくれること

  • athenaで読み込める形のデータがS3に突っ込まれる
    • パス: year=${current_year}/month=${current_month}
    • ちゃんとmonth単位で分割することで、余計なデータを読み込みに行かない
  • aws_billing_report databaseがathenaに作られる
  • my_cur_reportというテーブルがathenaに作られる

CloudFormationの設定

  • CostnUsageRport: AWS Cost and Usage Reportの名前
  • S3BucketName: Athenaで分析するためのデータが格納される S3 bucketの名前
  • S3 CURBucket: AWS Cost and Usage Reportが現在格納されているS3 bucketの名前
  • CloudFormationを実行したら、書きのbucketがS3に生成される
    • year=current_year
    • aws-athena-query-results

Adding a Lambda trigger

  • CloudFormation Stackが実行完了したら、Lambda Triggerを設定する。
  • LambdaのTriggerは、AWS Cost and Usage Reportを出力しているS3 bucketが良い。
  • lambda console => aws-cost-n-usage-main-lambda-fn-A => Trigger S3
  • S3 bucket Nameは、Cost and Usage Reportsを出しているやつ。
  • Event TypeはObject Created.

Resources built, trigger created… now what?

これらのlambda functionが作成されていることを確認しといてね

  • aws-cost-n-usage-S3-lambda-fn-B
  • aws-cost-n-usage-main-lambda-fn-A
  • aws-cost-n-usage-S3-lambda-fn-B-2
  • aws-cost-n-usage-Athena-lambda-fn-C
  • aws-cost-n-usage-Athena-lambda-fn-C-2
  • Athena DatabaseやTableが作成されたあとQueryを実行できる
SELECT from_iso8601_timestamp(lineitem_usagestartdate) AS lineitem_usagestartdate,
        from_iso8601_timestamp(lineitem_usageenddate) AS lineitem_usageenddate,
        product_instancetype,
        count(*) AS count
FROM aws_billing_report.my_cur_report
WHERE lineitem_productcode='AmazonEC2'and (lineitem_operation LIKE '%RunInstances%'
        OR lineitem_usagetype LIKE '%BoxUsage%')
        AND lineitem_usagetype NOT LIKE 'SpotUsage%'
        AND lineitem_usagetype NOT LIKE '%Out-Bytes%'
        AND lineitem_usagetype NOT LIKE '%In-Bytes%'
        AND lineitem_usagetype NOT LIKE '%DataTransfer%'
        AND pricing_term='OnDemand'
GROUP BY  lineitem_usagestartdate,lineitem_usageenddate,product_instancetype,lineitem_usagetype
ORDER BY  lineitem_usagestartdate, product_instancetype;

LambdaのTest

感想

  • コードをGitHubで確認させて欲しい
  • 現在の要件なら、S3 Selectでも十分に対処できそう
  • AWS Lambda + S3 Selectで作ることにした

おまけ CloudFormation.yml

AWSTemplateFormatVersion: 2010-09-09
Description: >-
  This CloudFormation Template builds a Serverless Solution for Querying Amazon
  Cost & Usage Report!. It creates an S3 bucket that will be used to store your
  Cost & Usage Reports after they have been extracted and transformed. The
  extraction and transformation is done by an Amazon Lambda function that is
  also created by this template. This Lambda functions job also includes, the
  creation of an Amazon Athena database and table. After table creation, it adds
  the partition to the Athena (or Hive) meta-data store. Something to note, on
  the Athena table is crreated from the column title/header field of the Cost &
  Usage report. This means that the table schema will keep up with any changes
  made by Amazon billing to the Cost & Usage Report. A few other things created
  by this CloudFormation template are an IAM LambdaFullAccess Role and
  Environment Variables for Lambda.
Parameters:
  CostnUsageReport:
    Description: Please enter the the name of your Cost & Usage Report
    Type: String
    ConstraintDescription: This field should only hold the name of your Cost & Usage Report.
  s3CURBucket:
    Description: >-
      Please enter the name of the S3 Bucket in which the Amazon Billing System
      currently writes your Cost & Usage Reports. You must enter the name only,
      for eg. MYBUCKETNAME
    Type: String
  S3BucketName:
    Type: String
    Description: >-
      This solution creates one S3 bucket to store your transformed CUR. Please
      enter the name you would like this bucket to have. The bucket names must
      be a series of one or more labels. It can contain lowercase letters,
      numbers, and hyphens but bucket names must not be formatted as an IP
      address (e.g., 192.168.5.4). Each label must start and end with a
      lowercase letter or a number.
    MinLength: '3'
    MaxLength: '63'
    Default: S3 Bucket Name being created
Mappings:
  RegionMap:
    us-east-1:
      bucketname: a-athena-nvirginia
    us-east-2:
      bucketname: a-athena-ohio
    us-west-1:
      bucketname: a-athena-ncalifornia
    us-west-2:
      bucketname: a-athena-oregon
    ca-central-1:
      bucketname: a-athena-central-canada
    eu-west-1:
      bucketname: a-athena-ireland
    eu-west-2:
      bucketname: a-athena-london
    ap-southeast-1:
      bucketname: a-athena-singapore
    ap-northeast-1:
      bucketname: a-athena-tokyo
    ap-south-1:
      bucketname: a-athena-mumbai
    ap-northeast-2:
      bucketname: a-athena-seoul
    ap-southeast-2:
      bucketname: a-athena-sydney
    eu-central-1:
      bucketname: a-athena-frankfurt
    sa-east-1:
      bucketname: a-athena-sao-paulo
Resources:
  myS3Bucket:
    Type: 'AWS::S3::Bucket'
    Properties:
      BucketName: !Ref S3BucketName
  LambdaExecutionRole:
    Type: 'AWS::IAM::Role'
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - lambda.amazonaws.com
            Action:
              - 'sts:AssumeRole'
      Path: /
      Policies:
        - PolicyName: lambdaroot
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - 'cloudwatch:*'
                  - 'athena:*'
                  - 'glue:*'
                  - 'dynamodb:*'
                  - 'events:*'
                  - 'iam:ListAttachedRolePolicies'
                  - 'iam:ListRolePolicies'
                  - 'iam:ListRoles'
                  - 'iam:PassRole'
                  - 'lambda:*'
                  - 'logs:*'
                  - 's3:*'
                  - 'sns:ListSubscriptions'
                  - 'sns:ListSubscriptionsByTopic'
                  - 'sns:ListTopics'
                  - 'sns:Subscribe'
                  - 'sns:Unsubscribe'
                  - 'sns:Publish'
                  - 'sqs:ListQueues'
                  - 'sqs:SendMessage'
                  - 'tag:GetResources'
                  - 'kms:ListAliases'
                  - 'ec2:DescribeVpcs'
                  - 'ec2:DescribeSubnets'
                  - 'ec2:DescribeSecurityGroups'
                  - 'xray:PutTraceSegments'
                  - 'xray:PutTelemetryRecords'
                Resource: '*'
  CURLambdaFn:
    Type: 'AWS::Lambda::Function'
    Properties:
      FunctionName: aws-cost-n-usage-main-lambda-fn-A
      Handler: File-Reader-Main-Lmbda-Fn-A.handler
      Role: !GetAtt
        - LambdaExecutionRole
        - Arn
      Timeout: '300'
      Runtime: nodejs6.10
      MemorySize: 1536
      Code:
        S3Bucket: !FindInMap
          - RegionMap
          - !Ref 'AWS::Region'
          - bucketname
        S3Key: File-Reader-Main-Lmbda-Fn-A.zip
      Environment:
        Variables:
          S3_BUCKET: !Ref S3BucketName
          iCUR_NAME: !Ref CostnUsageReport
          CUR_S3_BUCKET: !Ref s3CURBucket
          REGIONX: !Ref 'AWS::Region'
  S3CURLambdaFn:
    Type: 'AWS::Lambda::Function'
    Properties:
      FunctionName: aws-cost-n-usage-S3-lambda-fn-B
      Handler: File-Reader-S3-CUR-Parser-Lmbda-B.handler
      Role: !GetAtt
        - LambdaExecutionRole
        - Arn
      Timeout: '300'
      Runtime: nodejs6.10
      MemorySize: 1536
      Code:
        S3Bucket: !FindInMap
          - RegionMap
          - !Ref 'AWS::Region'
          - bucketname
        S3Key: File-Reader-S3-CUR-Parser-Lmbda-B.zip
      Environment:
        Variables:
          S3_BUCKET: !Ref S3BucketName
          iCUR_NAME: !Ref CostnUsageReport
          CUR_S3_BUCKET: !Ref s3CURBucket
          REGIONX: !Ref 'AWS::Region'
  S3CURLambdaFnB:
    Type: 'AWS::Lambda::Function'
    Properties:
      FunctionName: aws-cost-n-usage-S3-lambda-fn-B-2
      Handler: File-Reader-S3-CUR-Parser-Lmbda-B-2.handler
      Role: !GetAtt
        - LambdaExecutionRole
        - Arn
      Timeout: '300'
      Runtime: nodejs6.10
      MemorySize: 1536
      Code:
        S3Bucket: !FindInMap
          - RegionMap
          - !Ref 'AWS::Region'
          - bucketname
        S3Key: File-Reader-S3-CUR-Parser-Lmbda-B-2.zip
      Environment:
        Variables:
          S3_BUCKET: !Ref S3BucketName
          iCUR_NAME: !Ref CostnUsageReport
          CUR_S3_BUCKET: !Ref s3CURBucket
          REGIONX: !Ref 'AWS::Region'
  AthenaCURLambdaFn:
    Type: 'AWS::Lambda::Function'
    Properties:
      FunctionName: aws-cost-n-usage-Athena-lambda-fn-C
      Handler: File-Reader-Athena-Lambda-Fn-C.handler
      Role: !GetAtt
        - LambdaExecutionRole
        - Arn
      Timeout: '300'
      Runtime: nodejs6.10
      MemorySize: 1536
      Code:
        S3Bucket: !FindInMap
          - RegionMap
          - !Ref 'AWS::Region'
          - bucketname
        S3Key: File-Reader-Athena-Lambda-Fn-C.zip
      Environment:
        Variables:
          S3_BUCKET: !Ref S3BucketName
          CUR_S3_BUCKET: !Ref s3CURBucket
          REGIONX: !Ref 'AWS::Region'
  AthenaCURLambdaFnC:
    Type: 'AWS::Lambda::Function'
    Properties:
      FunctionName: aws-cost-n-usage-Athena-lambda-fn-C-2
      Handler: File-Reader-Athena-Lambda-Fn-C-2.handler
      Role: !GetAtt
        - LambdaExecutionRole
        - Arn
      Timeout: '300'
      Runtime: nodejs6.10
      MemorySize: 1536
      Code:
        S3Bucket: !FindInMap
          - RegionMap
          - !Ref 'AWS::Region'
          - bucketname
        S3Key: File-Reader-Athena-Lambda-Fn-C-2.zip
      Environment:
        Variables:
          S3_BUCKET: !Ref S3BucketName
          CUR_S3_BUCKET: !Ref s3CURBucket
          REGIONX: !Ref 'AWS::Region'
  InitFunction:
    Type: 'AWS::Lambda::Function'
    Properties:
      Code:
        ZipFile: >
          const AWS = require("aws-sdk"); const response =
          require("cfn-response"); const docClient = new
          AWS.DynamoDB.DocumentClient(); exports.handler = function(event,
          context) {    var darray = [];
              darray.push("new-table-response");
              console.log(JSON.stringify(event,null,2));
              var params = {
                TableName: event.ResourceProperties.DynamoTableName,
                Item:{
                    "iKey": 10007770157,
                    "Flag": 0,
                    "TableDescription": darray
                }
            };
          docClient.put(params, function(err, data) { if (err) {
          response.send(event, context, "FAILED", {});

          } else {  response.send(event, context, "SUCCESS", {});

          } }); };
      Handler: index.handler
      Role: !GetAtt
        - LambdaExecutionRole
        - Arn
      Runtime: nodejs6.10
      Timeout: 60
  AthenaTable:
    Type: 'AWS::DynamoDB::Table'
    Properties:
      TableName: AthenaTable
      KeySchema:
        HashKeyElement:
          AttributeName: iKey
          AttributeType: 'N'
      ProvisionedThroughput:
        ReadCapacityUnits: '50'
        WriteCapacityUnits: '50'
  InitializeDynamoDB:
    Type: 'Custom::InitFunction'
    DependsOn: AthenaTable
    Properties:
      ServiceToken: !GetAtt
        - InitFunction
        - Arn
      DynamoTableName: !Ref AthenaTable