Day One
AWS offers solutions for IaaS, PaaS and SaaS. Some examples are:
On cloud hosting:
- VMs need a guest OS to run (slow boot times, and resource overhead)
- Containers need orchestration (non-trivial)
- Server-less -> π
AWS launched in 2006 with only 3 services: EC2, S3, and SQS
All services provide an API to perform any tasks.
Observability
- Monitoring, logging, tracing
- CloudWatch -> logs, events
- CloudTrail -> auditing tool, not real-time
- X-Ray -> service maps, identify performance bottlenecks
- Only three services support inline policies: S3, SQS, OpsWorks
- Use managed policies when possible
- With IAM, everything is deny by default
- Explicit deny
- Explicit allow
IAM Policy Simulator -> Troubleshooting of permission issues
IAM Role === Identity (cannot have two identities at once)
Trusted Advisor (support)
- Save money
- Security tips
- Identify security risks
Identities:
GetSessionToken
API call for MFA (multi-factor authentication)
STS AssumeRole operation -> no MFA
- 11 nines -> durability
- 4 nines -> availability
- S3 is an object store (not files)
- Bucket names must be globally unique, always lowercase
- 1024 characters max
- Must be DNS compatible
S3 versioning:
- Must delete all versions one by one
- Deleting a file inserts a βdelete markβ
S3 select:
- individual document basis
- query data inside documents (eg. JSON documents)
Athena -> cross-document queries with actual SQL statements
S3 pre-signed urls
- avoid leechers
- temporary urls that expire after a threshold
Day Two
Neptune -> Graph database
Aurora -> Serverless relational database (MySQL or Postgres)
- Partitions up to 10GB
- Primary Key -> Partition Key + Sort Key (optional)
- 400Kb per item max, including attribute names
- Throughput is divided among partitions
- 5 max Local Secondary Indexes (LSI) before table creation
- 20 max Global Secondary Indexes (GSI) -> new PK/SK, can be added at any time
- Sparse index: when GSI PK does not exist on an item in the table
DynamoDB Streams
- DynamoDB Streams -> S3 (JSON)
- DynamoDB Streams -> SNS
- DynamoDB Streams -> Lambda
Global tables -> multi-master arrangement
DynamoDB Transactions
Batch -> 16MB
DynamoDB Accelerator (DAX)
- write-through cache (latency)
- write around for heavy write applications
- key/value store
- rotate secrets
- Max duration: 15 min
- Lambda Service (managed)
- poll for events
- pull invokation
- Reserved concurrency
- limit number of concurrent invocations
- max concurrent invocations
- Provisioned concurrency
- warm lambda instances
- avoid cold starts
- runs continuously
- Canary deploys with aliases and versions
- Lambda destination for async invocations: success / failure
- Initial burst: 500-3000 cus
- +500/minute afterwards
- 1000 cus max/region by default
SALAD
Remember LAMP (Linux, Apache, MySQL, and PHP)?
SALAD is the new stack:
- S3
- API Gateway
- Lambda
- ?
- DynamoDB
- Sell API keys on AWS marketplace
- 10k requests/sec
- Throttle + daily quotes
Cloud Formation -> Infrastructure as code
- Idempotence -> write consumers for no issues if duplicate messages are received
- 14 days max time for messages to stay in queue
- SQS -> 1:1 only one consumer can get a message
- Dead Letter Queue (DQL) for failed messages
- SQS FIFO -> MessageDeduplicationId + SendMsg
- When using Lambda consumers, polling is handled by Lambda Service
- 1:N multiple subscribers per topic
- Retry policy
- SNS filters -> avoid invocations
- Order of messages is not guaranteed
- FIFO supported on SNS since Dec 2020 re:Invent
- No persistence
- 256Kb max per message
Use SNS to fanout:
βββββββββββββ βββββββββββββ
β β β β
βββββΆβ SQS ββββββββΆβConsumer A β
βββββββββββββ βββββββββββββ β β β β β
β β β β β βββββββββββββ βββββββββββββ
β Producer ββββΆβ SNS ββββββ€
β β β β β βββββββββββββ βββββββββββββ
βββββββββββββ βββββββββββββ β β β β β
βββββΆβ SQS ββββββββΆβConsumer B β
β β β β
βββββββββββββ βββββββββββββ
- Runs Apache MQ
- Managed service
- MQTT for IoT
- AWS IoT Broker message -> MQTT
- Allow for manual approval steps
- State saved for one year max
ElastiCache
Containers
- Docker: only one process per container
- Container = runtime engine + dependencies + code + configuration
- ECR
- Public images since Dec 2020
- Integrated with Docker CLI
- ECR ~= Docker Hub
Container Orchestration
Container -> Task (auto-scaling) -> ECS -> EC2 instance (auto-scaling)
- Forget about managing EC2 instances.
- No choice of instance type
- No ssh into EC2 instance
- Server-less provision of containers
- Elastic Kubernetes Cluster
Day Three
Web based editor
- store keys, passwords
- manage and rotate
- Assume role via broker to authenticate
- Assume role with SAML
- Assume role with JWT
- Hosted web-ui for login, signup, change password, etc
- User dierectory
- User profiles
- User groups
- MFA
- Trusted Identity Providers
- Login with user/pwd -> JWT
- Login with SAML -> JWT
- Login with OpenID -> JWT
- No IAM roles
- Identity federation (identity pool == federated identities)
- Support for anonymous login
- Storage for each identity (profile state)
- STS -> IAM credentials
- Social IDP (OpenID) -> JWT
- SAML -> JWT
- JWT (from user pool)
=> Use Cognito User Pool JWT auth token to authenticate against Cognito Identity Pool
User Pool -> Identity Pool -> IAM Assume Role
Deployments
- Black/Red: 0 -> 100% (all or nothing)
- Blue/Green: 20% green, 80% blue
- Canary deployments
- Linear increments
- Deploy code
- Provisions infrastructure for you
- Uses Cloud Formation
And that is it. As mentioned, not the best notes ever, but better than nothing :)
Update (Dec 30, 2020)
Iβve updated the notes above adding links to service documentation where relevant.
This article was written as an issue on my Blog repository on GitHub (see Issue #8)