Stores data and performs queries and computations.
Local columnar storage.
Parallel/distributed execution of all queries, loads, backups, restores, resizes.
Up to 128 compute nodes.
Amazon Redshift Spectrum is a feature of Amazon Redshift that lets you run queries against exabytes of structured and semi-structured data stored in files in Amazon S3, with no loading or ETL required.
Security
You can load encrypted data from S3.
Supports SSL encryption in transit between client applications and the Amazon Redshift data warehouse cluster.
VPC for network isolation.
Encryption for data at rest (AES-256).
Audit logging and AWS CloudTrail integration.
Amazon Redshift takes care of key management, or you can manage your own keys through a hardware security module (HSM) or AWS KMS.
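The in-transit encryption point above can be illustrated from the client side. The following is a minimal sketch, assuming the client connects with a PostgreSQL-compatible driver that accepts a libpq-style URI; the function name build_redshift_dsn, the host name, and the credentials are hypothetical and used only for illustration. It refuses any sslmode that would allow an unencrypted connection.

```python
from urllib.parse import quote

# libpq sslmode values that guarantee the connection is encrypted.
ENCRYPTED_SSL_MODES = {"require", "verify-ca", "verify-full"}


def build_redshift_dsn(host, database, user, password,
                       port=5439, sslmode="verify-full"):
    """Build a libpq-style connection URI that enforces SSL in transit.

    Hypothetical helper: 5439 is the default Redshift port, everything
    else is supplied by the caller.
    """
    if sslmode not in ENCRYPTED_SSL_MODES:
        raise ValueError(f"sslmode {sslmode!r} does not enforce encryption")
    # Percent-encode user, password, and database so special characters
    # cannot break the URI.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}?sslmode={sslmode}"
    )


if __name__ == "__main__":
    # Placeholder endpoint and credentials, not a real cluster.
    print(build_redshift_dsn(
        "example-cluster.abc123.us-east-1.redshift.amazonaws.com",
        "dev", "analyst", "p@ss/word"))
```

The resulting string could then be passed to a driver such as psycopg2; whether verify-full or require is appropriate depends on how the client trusts the cluster certificate.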
Charges
Charged for compute node hours: 1 unit per node per hour. Only compute nodes are billed; the leader node is not.
Backup storage – storage on S3.
Data transfer – no charge for data transfer between Amazon Redshift and S3 within the same region; other data transfer scenarios may incur charges.
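The compute charge described above is simple arithmetic: node hours multiplied by an hourly rate, counting compute nodes only. The sketch below works through it; the function name estimate_compute_cost and the 0.25 per node-hour price are hypothetical placeholders, not actual AWS pricing, and backup storage and data transfer are left out.

```python
def estimate_compute_cost(compute_nodes, hours, price_per_node_hour):
    """Estimate the compute charge for a cluster.

    Only compute nodes are counted; the leader node is not billed.
    Returns the total node hours and the resulting cost.
    """
    if compute_nodes < 1:
        raise ValueError("a cluster needs at least one compute node")
    if hours < 0 or price_per_node_hour < 0:
        raise ValueError("hours and price must be non-negative")
    node_hours = compute_nodes * hours  # 1 unit per node per hour
    return {"node_hours": node_hours,
            "cost": node_hours * price_per_node_hour}


if __name__ == "__main__":
    # 4 compute nodes running for a 30-day month (720 hours)
    # at a hypothetical rate of 0.25 per node hour.
    print(estimate_compute_cost(4, 720, 0.25))
```

With these assumed numbers the cluster accrues 2,880 node hours, so the compute charge is 2,880 times the hourly rate.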
HDD and SSD storage options.
A cluster can start as a single node with 160 GB of storage and scale out to a petabyte or more.
Multi-node consists of:
Leader node:
Manages client connections and receives queries.
Simple SQL endpoint.
Stores metadata.
Optimizes query plan.
Coordinates query execution.
Compute nodes:
Stores data and performs queries and computations.
Local columnar storage.
Parallel/distributed execution of all queries, loads, backups, restores, resizes.
Up to 128 compute nodes.
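The division of labour between the leader node and the compute nodes can be pictured as a scatter-gather pattern. The following is a toy, in-memory sketch of that idea and not how Redshift is implemented: rows are spread across hypothetical compute nodes, each node keeps its slice column by column, and the leader combines the partial sums. All function names and the sample data are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def distribute_rows(rows, node_count):
    """Spread rows round-robin across nodes, storing each slice by column."""
    slices = [{"region": [], "amount": []} for _ in range(node_count)]
    for i, (region, amount) in enumerate(rows):
        node = slices[i % node_count]
        node["region"].append(region)
        node["amount"].append(amount)
    return slices


def compute_node_sum(node_slice):
    """Work done on one compute node: scan only the 'amount' column."""
    return sum(node_slice["amount"])


def leader_total(slices):
    """Leader node: fan the query out, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(compute_node_sum, slices))
    return sum(partials)


if __name__ == "__main__":
    rows = [("eu", 10), ("us", 20), ("eu", 5), ("us", 15), ("ap", 50)]
    slices = distribute_rows(rows, node_count=2)
    print(leader_total(slices))  # 100
```

Because each slice is stored by column, the sum touches only the amount values and never reads the region column, which is the main benefit of columnar storage for aggregate queries.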
Use Cases of Amazon Redshift
A data warehouse for enterprise operations: Many organizations work with data from multiple sources, such as advertising, customer relationship management, and customer support.
As a centralized repository, Redshift can be used to store data from multiple sources in a unified schema and structure. This can then feed enterprise-wide reporting and analytics.
In business intelligence and analytics, Redshift’s fast query execution against terabyte-scale data makes it an excellent choice. BI tools such as Tableau, which would otherwise struggle to run queries and joins over large datasets, often use Redshift as the underlying database.
Organizations may choose to monetize their data by exposing it to their customers through embedded analytics and analytics as a service. In these scenarios, Redshift’s data sharing, search, and aggregation capabilities make it ideal, as it allows customers to access only relevant subsets of data while keeping other databases, tables, or rows confidential.
As long as the cluster is adequately resourced, Redshift’s performance is consistent and predictable. It is therefore a popular choice for data-driven applications, such as those that serve reports and run calculations.
Database migration and change data capture: AWS Database Migration Service (DMS) can be used to replicate changes in an operational data store into Amazon Redshift. This is typically done to provide more flexibility in analysis, or when migrating from legacy data warehouses.
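Change data capture means replaying the inserts, updates, and deletes from the source onto the target in order. The sketch below illustrates that concept with plain dictionaries; it is not the AWS DMS API, and the event format (op, key, row) and the function name apply_changes are assumptions made for this example.

```python
def apply_changes(target, changes):
    """Apply ordered change events to a target table keyed by primary key.

    Each event is a dict with an 'op' of insert, update, or delete,
    a 'key', and (for insert/update) a 'row' of column values.
    """
    for change in changes:
        op, key = change["op"], change["key"]
        if op in ("insert", "update"):
            # Merge new column values over whatever the target already holds.
            target[key] = {**target.get(key, {}), **change["row"]}
        elif op == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


if __name__ == "__main__":
    warehouse = {1: {"name": "Ada", "plan": "free"}}
    events = [
        {"op": "update", "key": 1, "row": {"plan": "pro"}},
        {"op": "insert", "key": 2, "row": {"name": "Lin", "plan": "free"}},
        {"op": "delete", "key": 1},
    ]
    print(apply_changes(warehouse, events))
```

Applying the events in source order matters: here the update to key 1 is followed by its delete, so only key 2 remains in the target.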