Month: November 2020
-
Dataproc Cost Saving Options
Use preemptible VMs for adding capacity, and configure them for graceful shutdown. If your apps are fault-tolerant and can withstand possible instance preemptions, then preemptible instances can reduce your Compute Engine costs significantly. For example, batch processing jobs can run on preemptible instances. If some of those instances terminate during processing, the job may slow…
-
BigQuery Cost Saving Options
Use the Preview option for data exploration (zero cost).
Use the dry run option from the CLI (zero cost).
Partitioning a table lets queries that filter on a particular dimension (e.g. a calendar date) avoid a full table scan, for example when retrieving last month’s data.
Using the LIMIT clause doesn’t help in saving cost.…
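
To make the partitioning and LIMIT points concrete, here is a minimal toy model in Python. It is only a sketch: it assumes on-demand billing by bytes scanned on a non-clustered, date-partitioned table, and the function names, partition sizes, and the price per TiB are all hypothetical placeholders rather than actual BigQuery APIs or prices.

```python
from datetime import date

TIB = 2 ** 40  # bytes in one tebibyte


def bytes_scanned(partition_sizes, start=None, end=None, limit=None):
    """Toy estimate of bytes scanned over a date-partitioned table.

    partition_sizes maps a partition date to its size in bytes (hypothetical data).
    A date filter prunes partitions outside [start, end].
    limit is accepted but deliberately ignored: in this model, as in the tip
    above, LIMIT only trims the result and does not reduce what is scanned.
    """
    total = 0
    for day, size in partition_sizes.items():
        if start is not None and day < start:
            continue  # partition pruned by the lower bound of the filter
        if end is not None and day > end:
            continue  # partition pruned by the upper bound of the filter
        total += size
    return total


def estimated_cost(num_bytes, price_per_tib):
    """Cost for num_bytes at an assumed on-demand price per TiB."""
    return num_bytes / TIB * price_per_tib


# Hypothetical table with three daily partitions (sizes in bytes).
table = {
    date(2020, 10, 1): 100,
    date(2020, 10, 2): 200,
    date(2020, 11, 1): 300,
}

full_scan = bytes_scanned(table)
with_limit = bytes_scanned(table, limit=10)
october_only = bytes_scanned(table, start=date(2020, 10, 1), end=date(2020, 10, 31))
```

In this model `full_scan` and `with_limit` are identical, while `october_only` covers only the two October partitions. For a real query, the dry run option reports the actual bytes that would be processed.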
-
Data Studio Caching
Data Studio reports may not refresh immediately, so you may be unable to see data updated within the last hour. This is because Data Studio caches data: it caches data for 1 hour if you are using BigQuery as your data source. This can be configured through the Data freshness setting. For more information,…
-
What are data streams?
Streaming Ingestion (SI): To use data, a system needs to be able to discover, integrate, and ingest all available data from the machines that produce it, as fast as it’s being produced, in any format, and at any quality. A streaming data ingestion framework doesn’t simply move data from source to destination like traditional ETL solutions.…