Snowflake Architecture

 

Introduction

The Snowflake architecture is a hybrid of the shared-disk and shared-nothing database architectures, designed to combine the best of both.

  • Shared-disk: a common disk or storage device is shared by all computing nodes.
  • Shared-nothing: each computing node has its own private memory and storage space.

The architecture consists of three layers, described below.

1. Cloud Services (Cloud Services Layer)
It is the brain of Snowflake. Customers have no control over this layer; it is completely managed and maintained by Snowflake.
Its five main components are:
  • Metadata Manager
  • Security
  • Infrastructure
  • Optimizer
  • Authentication and Access Control
It provides the services used to administer and manage the Snowflake Data Cloud, such as access control, authentication, metadata management, infrastructure management, query parsing, and query optimization.

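In most cases Snowflake does not charge for cloud services separately. Cloud services usage is billed only for the portion that exceeds 10% of the same day's virtual warehouse (compute) usage.

Queries that are answered entirely by the cloud services layer do not need a running warehouse, and they stay free of cost as long as cloud services usage remains within that 10% threshold.

Examples: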
SELECT CURRENT_DATE;
SELECT CURRENT_WAREHOUSE(), CURRENT_DATABASE(), CURRENT_SCHEMA();
SELECT LAST_QUERY_ID();
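
The 10% threshold for cloud services can be made concrete with a little arithmetic. The following is a minimal Python sketch, not Snowflake's billing implementation. It assumes the comparison is made per day between cloud services credits and warehouse (compute) credits. The function name and the sample credit figures are hypothetical.

```python
def billed_cloud_services_credits(warehouse_credits: float,
                                  cloud_services_credits: float) -> float:
    """Credits charged for cloud services on a single day (illustrative only)."""
    # Assumption: up to 10% of the day's warehouse credits is not charged.
    allowance = 0.10 * warehouse_credits
    # Only the portion above the allowance is billed, and never less than zero.
    return max(0.0, cloud_services_credits - allowance)


if __name__ == "__main__":
    # Hypothetical day: 100 warehouse credits, 8 cloud services credits.
    print(billed_cloud_services_credits(100.0, 8.0))
    # Hypothetical day: 100 warehouse credits, 15 cloud services credits.
    print(billed_cloud_services_credits(100.0, 15.0))
```

In the first case the cloud services usage is below the allowance, so nothing extra is billed. In the second case only the credits above the allowance are billed.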

2. Virtual Warehouse (Query Processing Layer)
In the processing layer, queries are executed using virtual warehouses. Virtual warehouses are independent MPP (Massively Parallel Processing) compute clusters composed of multiple compute nodes that Snowflake allocates from the cloud provider. Because virtual warehouses do not share compute resources with one another, the performance of one warehouse is not affected by the others.

3. Storage Service (Database Storage Layer)
When data is loaded into Snowflake, this layer reorganizes it into an internal format that is columnar, compressed, and optimized. The optimized data is then stored in cloud storage.

   
Points to be noted:
  • Software as a service: hardware and software are managed by the Snowflake team.
  • Data warehouse hosted on public clouds: AWS, Azure and Google Cloud.
  • No hardware or software to buy or maintain: near-zero maintenance.
  • As a Snowflake customer you simply sign up, load data and start querying.
  • Cannot be hosted on a private cloud or on-premises.
  • Storage and compute layers are decoupled.
  • Data is stored in a columnar, compressed format in micro-partitions.
What is Columnar Storage?
Columns are stored separately, unlike traditional row-oriented databases such as Oracle, so retrieving a subset of columns is very fast.

When you select a set of columns, only the required columns are read into memory. Snowflake does not read or scan the columns that the query does not reference.
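
The difference between row-oriented and column-oriented storage can be illustrated with a small Python sketch. This is a simplified model, not how Snowflake stores data internally (real storage uses compressed micro-partitions). The table contents, the function names, and the `values_touched` counter, a rough stand-in for I/O, are all hypothetical.

```python
# Row-oriented layout: each record keeps all of its column values together.
rows = [
    {"id": 1, "name": "Asha", "city": "Pune", "amount": 120.0},
    {"id": 2, "name": "Ben", "city": "Oslo", "amount": 80.5},
    {"id": 3, "name": "Chen", "city": "Lima", "amount": 45.25},
]

# Column-oriented layout: the same data, with one list per column.
columns = {
    "id": [1, 2, 3],
    "name": ["Asha", "Ben", "Chen"],
    "city": ["Pune", "Oslo", "Lima"],
    "amount": [120.0, 80.5, 45.25],
}


def sum_amount_row_store(table_rows):
    """Sum the amount column when whole records must be read."""
    total = 0.0
    values_touched = 0
    for row in table_rows:
        values_touched += len(row)  # every column of the record is read
        total += row["amount"]
    return total, values_touched


def sum_amount_column_store(table_columns):
    """Sum the amount column when only that column is read."""
    amounts = table_columns["amount"]  # other columns are never touched
    return sum(amounts), len(amounts)


if __name__ == "__main__":
    print(sum_amount_row_store(rows))
    print(sum_amount_column_store(columns))
```

Both functions return the same total, but the column-oriented version touches only the values of the one column the query needs, while the row-oriented version touches every value in every record.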



