Posts

Showing posts from August, 2025

Pandas vs Polars: Which One to Choose for Data Processing?

Introduction

If you've done any data work in Python, chances are you've used Pandas. It has been the go-to library for data analysis and data preparation for years. But as datasets keep getting bigger and performance demands rise, a new player has entered the scene: Polars. Think of it as Pandas' faster, more modern counterpart. Both are great at handling data, but they differ quite a bit in speed, scalability, and design. In this blog, we'll dive into the differences between Pandas and Polars and help you decide which one fits your use case.

Pandas vs Polars

Both Pandas and Polars can play an important role in data preparation and data analysis.

Pandas: Integrates easily with scikit-learn, Matplotlib, TensorFlow, and PyTorch. Built on top of NumPy and designed for in-memory datasets. Ideal for small to medium datasets.

Polars: Uses the Apache Arrow memory model for efficient storage. Designed to be multi-threaded and more memory...

Using ConnectorX and DuckDB in Python: Step by Step Guide

Introduction

When working with large datasets, execution time and efficiency come into play. Traditional methods of extracting data from relational databases into Python often involve loading everything into memory, which can be painful and very slow. That's where ConnectorX and DuckDB come in handy. Together, they make data extraction and analytics in Python very fast and memory-efficient.

What is ConnectorX?

ConnectorX is an open-source library built to load data from databases directly into pandas, Polars, or NumPy efficiently. Instead of fetching row by row via psycopg2 or SQLAlchemy, ConnectorX fetches chunks of data in parallel and streams them directly into Python. It supports many databases: MySQL, SQLite, PostgreSQL, SQL Server, BigQuery, Snowflake, and many more.

What is DuckDB?

DuckDB is an in-process SQL OLAP database. It can query CSV, Parquet, JSON, Arrow datasets, and even pandas/Polars DataFrames. It works directly inside Python and R. Data pro...

How to Manage Secrets Securely with AWS Secrets Manager and Lambda

Introduction

In modern cloud-native applications, securely managing sensitive information such as API keys, database credentials, and third-party service credentials is a top priority. Hardcoding secrets into application code, Lambda environment variables, or configuration files can lead to serious security vulnerabilities and operational risks. This is where AWS Secrets Manager comes in: a fully managed service that enables you to store, retrieve, and rotate secrets securely. When combined with AWS Lambda, Secrets Manager allows you to build powerful serverless applications that access secrets dynamically at runtime, without ever exposing them in your codebase. In this blog, we'll explore how to integrate AWS Secrets Manager with Lambda functions, ensuring your application remains secure, scalable, and maintainable. Whether you're accessing a database, calling a third-party service, or simply avoiding secret sprawl, this guide will wal...
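A minimal sketch of the runtime lookup described above might look like the following. It assumes the secret is stored as a JSON string with a `username` key and that its name is passed in a hypothetical `DB_SECRET_ID` environment variable (the name of the secret, not the secret itself). The optional `client` argument is an addition for this example so the function can be exercised without AWS credentials.

```python
import json
import os

# Module-level cache: reused across warm invocations of the same container.
_cache = {}


def get_secret(secret_id, client=None):
    """Fetch a JSON secret from Secrets Manager, caching it in memory."""
    if secret_id not in _cache:
        if client is None:
            import boto3  # bundled with the AWS Lambda Python runtime

            client = boto3.client("secretsmanager")
        response = client.get_secret_value(SecretId=secret_id)
        _cache[secret_id] = json.loads(response["SecretString"])
    return _cache[secret_id]


def lambda_handler(event, context):
    # DB_SECRET_ID is a hypothetical variable holding only the secret's name.
    creds = get_secret(os.environ["DB_SECRET_ID"])
    return {"statusCode": 200, "body": f"loaded credentials for {creds['username']}"}
```

Caching avoids a Secrets Manager call on every invocation, but a cached value will not pick up a rotated secret until the container is recycled, so a real function may want an expiry on the cache. The Lambda's execution role also needs `secretsmanager:GetSecretValue` permission on the secret.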