I want to create a script to export AWS security groups with rules in table structured format (CSV or TSV)
I require the following Values:
- SecurityGroups
- GroupName
- GroupId
- IpPermissions/IpPermissionsEgress
- FromPort
- ToPort
- IpProtocol
- IpRanges
- CidrIp
- Description
- UserIdGroupPairs
- GroupId
- Description
e.g.,
GroupName | GroupId | IpPermissions/IpPermissionsEgress | FromPort | ToPort | IpProtocol | CidrIp | Description | GroupId |
---|---|---|---|---|---|---|---|---|
default | 123456789 | IpPermissions/IpPermissionsEgress | 123 | 123 | xxx | xxx.xxx.xxx.xxx/x | xxxxxxxxxxxxxxxxx | sg-123456789 |
As far I have tried with
AWS CLI command:
aws ec2 describe-security-groups --query \
'SecurityGroups[*].{"01_GroupName":GroupName, "02_GroupId":GroupId,
"03_FromPort":IpPermissions[0].FromPort,
"04_ToPort":IpPermissions[0].ToPort,
"05_Protocol":IpPermissions[0].IpProtocol,
"06_RuleGroupId":IpPermissions[0].UserIdGroupPairs[0].GroupId,
"07_RuleCidr":IpPermissions[0].IpRanges[0].CidrIp,
"08_Justification":IpPermissions[0].UserIdGroupPairs[0].Description}' \
--output text > instances.tsv
The results of the AWS CLI: | default | sg-123456789 | 443 | 443 | tcp | None | 0.0.0.0/0 | xxxxxxxxxxxxxxxxx |
Problems with results of the AWS CLI:
- Most importantly, it only outputs the first security group and the first security rule (no recursive), Each security group has many security rules (one to many)
- How to put together IpPermissions and IpPermissionsEgress (Inbound/Outbound rules) in a single query?
- How to put the type of security rule in the column. E.g. is this IpPermissions(Inbound)/IpPermissionsEgress(Outbound rules)?
- Added labels in the query, even though it doesn't show as column labels.
Secondly, with Boto3 + Pandas
import boto3
from botocore.exceptions import ClientError
import pandas as pd
pd.set_option("display.max_columns", None)
ec2_client = boto3.client("ec2")
paginator = ec2_client.get_paginator("describe_security_groups")
result = []
try:
response_iterator = paginator.paginate()
for page in response_iterator:
each_page = page.get("SecurityGroups")
each_sg = pd.json_normalize(
each_page, "IpPermissions", ["GroupName", "GroupId"]
)
each_df = pd.json_normalize(
each_page, record_path=["IpPermissions", "IpRanges"], meta=["GroupId"]
)
# reorder columns
each_sg = each_sg[
[
"GroupId",
"FromPort",
"ToPort",
"IpProtocol",
"GroupName",
"UserIdGroupPairs",
]
]
each_sg.to_csv("multiplelevel_normalized_review_data-sg.csv", index=False)
each_df.to_csv("multiplelevel_normalized_review_data-df.csv", index=False)
result = pd.merge(each_sg, each_df)
result.to_csv("multiplelevel_normalized_review_data-result.csv", index=False)
except ClientError as e:
print(e)
The results of the Boto3 + Pandas:
GroupId | FromPort | ToPort | IpProtocol | GroupName | IpProtocol | UserIdGroupPairs | CidrIp | Description |
---|---|---|---|---|---|---|---|---|
sg-123456789 | 443 | 443 | tcp | default | 0.0.0.0/0 | [{'Description': 'xxx', 'GroupId': 'sg-123456789', 'UserId': '123456789'}, {'Description': 'xxx', 'GroupId': 'sg-123456789', 'UserId': '123456789'}}] | 0.0.0.0/0 | xxxxxxxxxxx |
Problems with results of the Boto3 + Pandas:
- The script is only working for IpPermissions (Inbound) rules.
- The numbers of records are different in each CSV file and a few security groups missing in the final - merge CSV file:
-- multiplelevel_normalized_review_data-sg.csv 122
-- multiplelevel_normalized_review_data-df.csv 88
-- multiplelevel_normalized_review_data-result.csv 237 - The script doesn't flatten the nested list UserIdGroupPairs .
- How to put the type of security rule in the column. E.g. is this IpPermissions(Inbound)/IpPermissionsEgress(Outbound rules)?