AWS
-
Learn AWS Cognito for Authentication
Whether you’re building a web app, mobile app, or API, understanding how to implement robust authentication and authorization is a critical skill. AWS Cognito is a powerful service that simplifies user management, authentication, and access control, making it an essential technology for developers and businesses alike. By mastering AWS Cognito, you can build secure, scalable applications while integrating seamlessly with other AWS services. If you’re looking to enhance your skills in this area, we have the perfect resource for you.…
-
Processing Cloud Data With DuckDB And AWS S3
DuckDB is a powerful in-memory database with parallel processing, which makes it a good choice for reading and transforming cloud storage data, in this case from AWS S3. I’ve had a lot of success using it, and I will walk you through the steps to implement it. I will also include some lessons learned and best practices. Using DuckDB, the httpfs extension, and pyarrow, we can efficiently process Parquet files stored in S3 buckets. Let’s dive in: Before starting…
-
A Guide to Automating AWS Infrastructure Deployment
When it comes to managing infrastructure in the cloud, AWS provides several powerful tools that help automate the creation and management of resources. One of the most effective ways to handle deployments is through AWS CloudFormation. It allows you to define your infrastructure in a declarative way, making it easy to automate the provisioning of AWS services, including Elastic Beanstalk, serverless applications, EC2 instances, security groups, load balancers, and more. In this guide, we’ll explore how to use AWS CloudFormation…
-
Mastering the Transition: From Amazon EMR to EMR on EKS
Amazon Elastic MapReduce (EMR) is a platform to process and analyze big data. Traditional EMR runs on a cluster of Amazon EC2 instances managed by AWS. This includes provisioning the infrastructure and handling tasks like scaling and monitoring. EMR on EKS integrates Amazon EMR with Amazon Elastic Kubernetes Service (EKS). It gives users the flexibility to run Spark workloads on a Kubernetes cluster. This brings a unified approach to managing and orchestrating both compute and storage resources. Key Differences Between…
-
When (Tech Service) Relationships Don’t Work Out
Think back to those days when you met the love of your life. The feeling was mutual. The world seemed like a better place, and you were on an exciting journey with your significant other. You were both “all-in” as you made plans for a life together. Life was amazing… until it wasn’t. When things don’t work out as planned, then you’ve got to do the hard work of unwinding the relationship. Communicating with each other and with others. Sorting…
-
Automating AWS Infrastructure Testing With Terratest
Organizations adopting Infrastructure as Code (IaC) on AWS often struggle with ensuring that their infrastructure is not only correctly provisioned but also functioning as intended once deployed. Even minor misconfigurations can lead to costly downtime, security vulnerabilities, or performance issues. Traditional testing methods — such as manually inspecting resources or relying solely on static code analysis — do not provide sufficient confidence for production environments. There is a pressing need for an automated, reliable way to validate AWS infrastructure changes…
-
AWS Cloud Security: Key Components, Common Vulnerabilities, and Best Practices
With organizations shifting to the cloud at a rapid pace, securing the infrastructure is of paramount importance. Even though AWS provides a varied set of tools and services related to security and compliance, there are various other factors to consider. Security is not just about tools but about strategy, vigilance, continuous improvement, and conformity to industry compliance standards for secure environments, including GDPR, HIPAA, and PCI DSS. In this article we will discuss AWS…
-
Iceberg Catalogs: A Guide for Data Engineers
Apache Iceberg has become a popular choice for managing large datasets with flexibility and scalability. Catalogs are central to Iceberg’s functionality and are vital to table organization, consistency, and metadata management. This article will explore what Iceberg catalogs are, their various implementations, use cases, and configurations, providing an understanding of the best-fit catalog solutions for different use cases. What Is an Iceberg Catalog? In Iceberg, a catalog is responsible for managing table paths, pointing to the current metadata files that…
-
Setting Up a ScyllaDB Cluster on AWS Using Terraform
In this article, I present an example of a simple and quick installation of ScyllaDB in the AWS cloud using Terraform. Initially, I intended to create a ScyllaDB AMI image using HashiCorp Packer. However, I later discovered that official images are available, allowing ScyllaDB to be easily configured during instance initialization via user data. In fact, user data can define all parameters supported in scylla.yaml. Additional options and examples can be found in the scylla-machine-image GitHub repository. What else should you…