Train a Model Faster with torch.compile and Gradient Accumulation
This article is divided into two parts; they are: • Using `torch.compile` • Gradient Accumulation