Topics

DeepSeek-R1 Explained: Reinforcement Learning, GRPO, and Emergent Reasoning

A college-level breakdown of how DeepSeek-R1 used reinforcement learning and GRPO to incentivize reasoning, why the approach mattered, and how later work on hierarchical reasoning helps explain what may be happening inside RL-trained reasoning models.

deepseek-r1, reinforcement-learning, grpo, reasoning-models, llms

Introducing SentinelMesh

Why I'm building a security monitoring mesh for AI agents — and what problem it solves.

sentinelmesh, agents, projects

AI Security in 5 Concepts

Five concepts from five years of enterprise security that show up in every serious AI incident.

ai-security, fundamentals

Welcome to pawanbk.io

My space for AI concepts, security projects, and personal perspectives on new developments in AI.

about me, ai-security