Loading Data into Redshift
Typically, data from OLTP systems is loaded into Redshift for analytics and BI purposes.
- Data from OLTP systems is first exported to S3, then loaded from S3 into Redshift.
- Kinesis Firehose follows the same path: it delivers data to S3, which is then loaded into Redshift.
COPY command
- Loads data from files stored in S3 into Redshift
- Data is stored locally in the Redshift cluster (persistent storage = cost)
- DynamoDB table data and EMR data can also be loaded using the COPY command
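As a rough illustration of how the source changes the COPY statement, the sketch below builds the statement text for S3, DynamoDB, and EMR sources. The helper names, table names, and role ARN are hypothetical and used only for illustration. The code only produces SQL strings; it does not connect to a cluster.

```python
# Minimal sketch: build COPY statement text for the three source types.
# Function names, table names, and the role ARN are hypothetical placeholders.

SOURCE_SCHEMES = {
    "s3": "s3://",              # files in an S3 bucket
    "dynamodb": "dynamodb://",  # a DynamoDB table
    "emr": "emr://",            # output files on an EMR cluster
}


def copy_source_uri(kind: str, location: str) -> str:
    """Return the FROM value for the given source type."""
    if kind not in SOURCE_SCHEMES:
        raise ValueError(f"unsupported COPY source: {kind}")
    return SOURCE_SCHEMES[kind] + location.lstrip("/")


def copy_statement(table: str, kind: str, location: str, iam_role_arn: str) -> str:
    """Assemble a COPY statement that authenticates with an IAM role."""
    statement = (
        f"COPY {table} FROM '{copy_source_uri(kind, location)}' "
        f"IAM_ROLE '{iam_role_arn}'"
    )
    if kind == "dynamodb":
        # COPY from DynamoDB needs READRATIO: the percentage of the table's
        # provisioned read throughput the load may use (50 is an example value).
        statement += " READRATIO 50"
    return statement + ";"


if __name__ == "__main__":
    role = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"  # placeholder
    print(copy_statement("orders", "s3", "example-bucket/orders/", role))
    print(copy_statement("orders", "dynamodb", "Orders", role))
```

The resulting text would be run through whatever SQL client is already connected to the cluster.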
Loading data from S3 with COPY command
- Create an IAM role with read access to the S3 bucket
- Create your Redshift cluster
- Attach the IAM role to the cluster
- The cluster can then temporarily assume the IAM role on your behalf
- Load data from S3 using COPY command
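The steps above can be sketched in two pieces: the trust policy that lets the Redshift service assume the role, and the COPY statement that names that role. This is a minimal sketch under assumptions: the function names, bucket, table, and role ARN are hypothetical, and the CSV options are just one possible file format. Creating the role and attaching it to the cluster are not shown.

```python
import json

# Minimal sketch of the IAM role and COPY steps. All names and ARNs below are
# hypothetical placeholders; nothing here calls AWS.


def redshift_trust_policy() -> dict:
    """Trust policy allowing the Redshift service to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "redshift.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_copy_sql(table: str, s3_uri: str, iam_role_arn: str,
                ignore_header: bool = True) -> str:
    """Build a COPY statement that loads CSV files from S3 using the role."""
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with s3://")
    lines = [
        f"COPY {table}",
        f"FROM '{s3_uri}'",
        f"IAM_ROLE '{iam_role_arn}'",
        "FORMAT AS CSV",
    ]
    if ignore_header:
        # Skip the first line of each file (assumes the files have a header row).
        lines.append("IGNOREHEADER 1")
    return "\n".join(lines) + ";"


if __name__ == "__main__":
    print(json.dumps(redshift_trust_policy(), indent=2))
    print(s3_copy_sql(
        "sales",
        "s3://example-bucket/sales/",
        "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    ))
```

The role also needs a permissions policy granting read access to the bucket; the trust policy only controls who may assume the role.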
More ways to load data into Redshift
- Use AWS Glue – fully managed ETL service
- ETL = Extract, Transform, and Load
- Use ETL tools from APN partners
- Use Data Pipeline
- For migration from on-premises, use:
- AWS Import/Export service (AWS Snowball)
- AWS Direct Connect (private connection between your datacenter and AWS)