1. Introduction to data analytics and AWS
- Understanding the importance of data analytics and its types (descriptive, diagnostic, predictive, prescriptive).
- Introduction to Big Data concepts and challenges.
- Overview of the AWS platform and its advantages for data analytics.
- Comparison of batch processing with real-time streaming data processing.
2. Data collection and storage
- Data Collection Services: Amazon Kinesis for real-time data ingestion and processing, including Kinesis Data Streams and Kinesis Data Firehose (now named Amazon Data Firehose).
- Data Migration Services: AWS Database Migration Service (DMS), AWS Snowball, and AWS Snowmobile for transferring data to AWS.
- Data Storage: Amazon S3 (Simple Storage Service) as a data lake, including storage classes, lifecycle management, and security.
- Data Warehousing: Amazon Redshift for large-scale data warehousing and analytics.
- NoSQL Databases: Amazon DynamoDB for NoSQL data storage and management.
- Data Lake Governance: AWS Lake Formation for setting up and managing secure data lakes.
3. Data processing and transformation
- ETL (Extract, Transform, Load): AWS Glue as a serverless ETL service for data preparation.
- Big Data Processing: Amazon EMR for running big data frameworks like Apache Hadoop and Apache Spark.
- Serverless Data Processing: AWS Lambda for event-driven processing and data transformation.
- Orchestrating Workflows: AWS Step Functions and AWS Data Pipeline for building and managing data workflows.
4. Data analysis and visualization
- Interactive Query Service: Amazon Athena for querying data in S3 using SQL.
- Business Intelligence: Amazon QuickSight for creating dashboards and visualizations.
- Log and Search Analytics: Amazon OpenSearch Service for real-time log analysis and search capabilities.
- Machine Learning (ML): Leveraging services like Amazon SageMaker for building, training, and deploying ML models.
5. Security, governance, and compliance
- Access Management: AWS IAM (Identity and Access Management) for securing access to AWS resources.
- Data Encryption: Implementing encryption at rest and in transit using services like AWS Key Management Service (KMS).
- Monitoring and Auditing: AWS CloudTrail and AWS Config for logging and auditing data access and changes.
- Data Governance: Understanding best practices for data governance, including data classification and access control.