AI was HARD until I Learned these 10 Concepts
AI Agent Playbook (free guide from HubSpot): https://clickhubspot.com/a1052e
https://www.youtube.com/watch?v=5DtjQrROUzY?start=108
Transcript
Everyone is talking about AI. But in order to actually build with it, use it at work, become an AI engineer, or even just have an intelligent conversation about AI in an interview, you need to understand the core fundamentals of AI, not just the buzzwords. Today, I'm going to talk through the 10 AI concepts every software and AI engineer should know in 2026.

Hi friends, I'm Maddie. I'm a senior software engineer who previously worked at Google and at other big tech companies like Amazon, IBM, and Microsoft. I work with AI systems a lot, and I've spent hours breaking down the concepts that actually matter right now so you don't have to. In this video, I'm going to walk you through 10 AI concepts, from LLMs and tokens all the way to AI agents and context engineering. Whether you're building AI products, preparing for interviews, or just trying to keep up with the industry, this is the video to watch. Let's get into it. Thank you to HubSpot for sponsoring this video.

We're starting with the foundations: large language models, or LLMs. ChatGPT, Claude, Gemini: these are all powered by LLMs. At its core, an LLM is a neural network trained on massive amounts of text data to do one deceptively simple thing: predict the next word in a sequence. So you type "the capital of France is" and the model predicts "Paris." That may seem simple, but when you scale that idea up to billions of parameters trained on essentially the entire internet, it goes deeper than just autocomplete. The model at this point doesn't just predict words. It develops the ability to reason, write code, summarize documents, and hold genuinely useful conversations. LLMs are the engine behind almost every AI product you'll interact with in your career. So, if you're a software engineer in 2026 and you don't understand what an LLM is, it's like being a web developer in 2010 who doesn't know what an API is. Let's dive deeper into tokens and context windows as our second concept.
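Concept one in miniature: an LLM is trained to predict the next token. As a hypothetical stand-in for a neural network, a toy bigram counter (with made-up training text) shows the shape of that idea:

```python
from collections import Counter, defaultdict

# Made-up "training data"; a real LLM trains on much of the internet.
corpus = "the capital of france is paris . the capital of italy is rome .".split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the training data."""
    return following[word].most_common(1)[0][0]

print(predict_next("capital"))  # "of" -- the only word ever seen after "capital"
```

A real LLM does the same kind of next-token scoring, just with billions of learned parameters instead of raw counts, which is what lets the behavior go beyond autocomplete.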
So, LLMs don't actually read words the way we do. They break text into smaller units called tokens. A token might be a whole word like "hello," or it may be a piece of a word, like "un," "believ," and "able" being three separate tokens. Why this matters is because every LLM has a context window. That's the maximum number of tokens it can process at once. You can think of this as the model's short-term memory. Early GPT models had only a 4,000-token context window. Now we're seeing models with over a million. A larger context window means the model can see more of your conversation, your code, or your documents all at once, which directly impacts how useful it is. If you've ever had an AI forget something you told it earlier in a conversation, that is the context window running out. Understanding this concept helps you design better prompts and build smarter AI applications.

Now, speaking of AI concepts that actually translate into real-world impact: if you're wondering how people are actually using AI agents right now to supercharge their workflows, then this free resource from HubSpot, today's video sponsor, will help you do exactly that. HubSpot's new playbook, AI Agents Unleashed, is one of the most practical and comprehensive guides I've seen on this topic. It breaks down what AI agents can actually do today, where to start implementing them, and how to build an effective strategy for human-AI collaboration. It talks about how AI agents work in practice: the memory systems, the tool integrations, the multi-step reasoning that separates real agents from basic chatbots. One of my favorite parts is their framework for figuring out which tasks to hand off to an agent versus which ones still need a human in the loop, because that's the question that everyone really needs to answer. It also covers the common pitfalls that people run into when deploying agents and how to avoid them, which honestly is the part that most guides skip entirely. The playbook is completely free.
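The context-window behavior described above can be sketched in a few lines. The tiny window size and the whitespace "tokenizer" are deliberately crude stand-ins; real models use subword tokenization and budgets in the thousands to millions:

```python
CONTEXT_WINDOW = 8  # assumed tiny budget, purely for illustration

def tokenize(text):
    # Stand-in for real subword tokenization (e.g. BPE).
    return text.split()

def fit_to_window(tokens, window=CONTEXT_WINDOW):
    """Keep only the most recent tokens; anything older is 'forgotten'."""
    return tokens[-window:]

tokens = tokenize("my name is Ada please remember that my favorite color is teal")
print(fit_to_window(tokens))  # the earliest tokens, including "Ada", have fallen out
```

This is exactly the "AI forgot what I said earlier" effect: the oldest tokens no longer fit, so the model never sees them.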
I'll put the link in the description below. Big thanks to HubSpot for creating this guide and sponsoring this video. And now, let's get back to the concepts.

Our third one is AI agents. An AI agent isn't just a chatbot that answers your questions. It's an AI system that can reason, plan, and take actions autonomously to achieve a goal. So, a chatbot says, "Here's how to book a flight." An agent will actually go and book the flight for you. It perceives the environment, reasons about the next best step, acts on that plan, and then observes the results, and it loops through that cycle until the task is done. So, for example, look at OpenClaw. If you haven't heard of it yet, OpenClaw is an open-source AI agent built by Peter Steinberger that went completely viral a few weeks ago. We're talking 60,000 GitHub stars in 72 hours. It runs locally on your machine and connects to the apps you already use, so WhatsApp, Slack, Discord, your calendar, your email, and it actually does things on your behalf. People have used it to triage their inbox, debug code while they sleep, automate their smart home, even build entire apps from their phone. OpenAI actually ended up hiring Steinberger to build their next generation of personal agents, which tells you how big this space is getting.

Concept four is MCP, the Model Context Protocol. If AI agents are the workers, then MCP is a universal adapter that lets them plug into your tools. Before MCP, if you wanted an LLM to connect to your database, your email, and your CRM, you needed to build three separate custom integrations. MCP standardizes that. It's an open protocol that defines how AI models connect to external data sources and services. Think of it like USB for AI. Before USB, every device had its own proprietary connector. MCP solves the same problem for AI tools. It creates a common interface so that any model can connect to any compatible service.
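The perceive, reason, act, observe loop and the common-interface idea can be sketched together. Everything here is hypothetical: the tool names are invented, and the stub planner stands in for an LLM call, which is where a real agent would do its reasoning (and where MCP would mediate the tool calls):

```python
def plan_next_action(state, tools):
    # Stub policy standing in for an LLM: call each tool once, then finish.
    for name in tools:
        if all(obs["tool"] != name for obs in state["observations"]):
            return name
    return "done"

def run_agent(goal, tools, max_steps=10):
    observations = []
    for _ in range(max_steps):
        state = {"goal": goal, "observations": observations}  # perceive
        action = plan_next_action(state, tools)               # reason
        if action == "done":
            break
        observations.append(tools[action]())                  # act + observe
    return observations

# Every tool exposes the same call shape, so the agent can use any of
# them interchangeably: the "common interface" idea in miniature.
tools = {
    "search_flights": lambda: {"tool": "search_flights", "result": "3 options found"},
    "book_flight": lambda: {"tool": "book_flight", "result": "booked seat 12A"},
}
print(run_agent("book a flight to Paris", tools))
```

Note this sketch is not the actual MCP wire protocol, which defines a richer client-server scheme; it only illustrates why a uniform tool interface lets one agent drive many services.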
If you're building AI applications, understanding MCP is going to be essential, because it's rapidly becoming the standard for how agents interact with the outside world. MCP is open source, and last year Anthropic donated it to the Agentic AI Foundation, which is managed by the Linux Foundation.

And now let's talk about concept five: RAG, retrieval-augmented generation. The problem that RAG solves is that LLMs are trained on data only up to a certain date. They don't know about your company's internal documents, your product specs, or last week's policy changes. So when you ask them a question about something specific to your organization, they either make something up, which is called hallucination, or they just say that they don't know. RAG fixes this by adding a retrieval step before the model generates its response. When a user asks a question, the system first searches a database, usually a vector database, for relevant documents. It then feeds those documents into the LLM's prompt as additional context so the model can give an accurate, grounded answer. I want to say a quick note on vector databases, since they're a key part of RAG. Instead of storing data as traditional rows and columns, a vector database stores data as mathematical representations called embeddings. These embeddings capture the meaning of text, so you can search by similarity rather than exact keywords. So, for example, you can ask "what's the refund policy," and the system finds relevant documents even if they use words like "return" or "money back" instead of "refund." That semantic understanding is what makes RAG so powerful and so adaptable to many use cases.

Concept six is fine-tuning, and this is one that people often confuse with RAG. They solve different problems. Fine-tuning is the process of taking a pre-trained model and training it further on a smaller, specialized data set so it behaves differently. You can think of it this way: the base model already knows how to speak and reason.
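Before going further with fine-tuning, here is the RAG flow from a moment ago as a sketch. Real systems use learned embeddings stored in a vector database; the bag-of-words vectors here are a stand-in, and unlike true embeddings they only match overlapping words, not synonyms like "refund" versus "money back." The documents are invented examples:

```python
import math
from collections import Counter

DOCS = [
    "Customers may return items within 30 days to get their money back.",
    "Shipping takes 3 to 5 business days inside the country.",
]

def embed(text):
    # Stand-in for a learned embedding: a bag-of-words count vector.
    return Counter(w.strip(".,?!").lower() for w in text.split())

def norm(v):
    return math.sqrt(sum(x * x for x in v.values()))

def cosine(a, b):
    dot = sum(count * b[word] for word, count in a.items())
    return dot / (norm(a) * norm(b))

def retrieve(query, docs, k=1):
    # The "R" in RAG: rank documents by similarity to the query.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, docs):
    # The "AG": feed the retrieved text to the model as grounding context.
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(retrieve("how do I get my money back", DOCS))
```

Swapping `embed` for a real embedding model is what turns this keyword matcher into semantic search.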
Fine-tuning trains it to speak in a specific way, maybe in medical terminology, or in your company's brand voice, or to format its outputs in a particular structure. A quick rule of thumb I'd recommend following: use RAG when you need a model to access specific facts it wasn't trained on, and use fine-tuning when you need the model to behave differently, like changing its tone, style, or output format. And you can definitely combine both.

Concept seven is context engineering. You've probably heard of prompt engineering: crafting the right instructions to get a good response from an AI. Context engineering goes way beyond that. It's about designing the entire information environment around the model: what documents you retrieve via RAG, what conversation history you include, what external tools are available via MCP, and how you summarize and prioritize all of that information within the model's context window. The reason this is so critical is that the quality of an LLM's output is directly tied to the quality of the context you give it. Two people can use the same model and get wildly different results based on how well they're engineering the context. If you take one thing from this video for your career, let it be this: the engineers who master context engineering are the ones companies really want to hire right now.

Concept eight is reasoning models. These are a newer breed of LLMs that have been specifically trained to think step by step before generating an answer. Regular LLMs respond immediately. They generate tokens one after the other without pausing to plan. Reasoning models, on the other hand, generate an internal chain of thought. They break problems down, consider different approaches, and work through the logic before producing a final answer. This is why you'll sometimes see an LLM say "thinking" before it responds. That is the reasoning model at work.
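The difference in shape between a direct response and a chain of thought can be sketched with plain arithmetic. The "model" here is ordinary Python, purely an analogy: what matters is that the reasoning version emits intermediate steps first and derives the answer from them.

```python
def direct_answer(items):
    # Regular-LLM analogue: produce the answer in one shot.
    return sum(price * qty for price, qty in items)

def reasoned_answer(items):
    # Reasoning-model analogue: write out the intermediate steps
    # (the chain of thought), then read the answer off the last step.
    steps = []
    total = 0
    for price, qty in items:
        subtotal = price * qty
        steps.append(f"{qty} x {price} = {subtotal}")
        total += subtotal
    steps.append(f"total = {total}")
    return steps, total

cart = [(3, 2), (5, 4)]  # (price, quantity) pairs, made up for the example
steps, total = reasoned_answer(cart)
print(steps)   # the visible "thinking"
print(total)   # the final answer
```

Because each step is explicit and checkable, it is also trainable: reinforcement learning can reward step sequences that end in verifiably correct answers, which is exactly the training recipe described next.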
These models are trained on problems with verifiably correct answers, like math problems or code that can be tested by a compiler. And through reinforcement learning, they learn to generate reasoning steps that lead to correct solutions. OpenAI's o series and DeepSeek's R1 are prominent examples. Reasoning models are especially important for agents, because complex multi-step tasks require planning, not just pattern matching.

Concept nine is multimodal AI. Early LLMs could only process text. Multimodal models can handle text, images, audio, video, and more, both as inputs and outputs. This is a big deal because the real world obviously isn't just text. You might want an AI to analyze a photo of a whiteboard from a meeting, transcribe a voice memo, generate an image from a description, or understand a video. Multimodal models can do all of this. What's fascinating from a technical standpoint is that models trained on multiple data types actually develop a deeper understanding. A model that has seen both images of cats and text about cats understands the concept more richly than a text-only model. For engineers, this opens up enormous possibilities, from accessibility tools to medical systems that analyze scans alongside patient notes.

And finally, last but not least, number 10 is Mixture of Experts, or MoE. This one's more under the hood, but I would argue it's one of the most underappreciated breakthroughs in LLM architecture. The idea has actually been around since 1991, but it's never been more relevant than now. Instead of having one massive neural network where every parameter activates for every input, MoE divides the model into specialized sub-networks, the experts. A routing mechanism then decides which experts to activate for which input. So you might have a model with hundreds of billions of total parameters, but for any given query, it only uses a fraction of them, whichever experts are most relevant.
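The routing idea can be sketched as follows. The expert functions and the router scores are made up, standing in for learned sub-networks and a learned gating network; the point is that only the top-k experts run for a given input:

```python
def route(scores, k=2):
    """Pick the indices of the k highest-scoring experts."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

def moe_layer(x, experts, scores, k=2):
    chosen = route(scores, k)
    total = sum(scores[i] for i in chosen)
    # Weighted sum of only the chosen experts' outputs; the other
    # experts' parameters are never touched for this input.
    return sum(scores[i] / total * experts[i](x) for i in chosen)

experts = [
    lambda x: x + 1,   # "expert 0"
    lambda x: x * 2,   # "expert 1"
    lambda x: x - 3,   # "expert 2"
    lambda x: x * 10,  # "expert 3"
]
scores = [0.1, 0.6, 0.1, 0.2]  # assumed router output for this input
print(moe_layer(5.0, experts, scores))  # only experts 1 and 3 actually run
```

With four experts and k=2, half the "parameters" sit idle on every call; scale that to hundreds of experts and you get big-model capacity at small-model compute cost, which is the trade-off described next.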
The result is the intelligence of a huge model with the speed and cost efficiency of a much smaller one. This is how companies are scaling AI without costs spiraling out of control. And it's a big reason why AI models have gotten so much better so fast.

To sum up, the 10 AI concepts you need to know are: LLMs, tokens and context windows, AI agents, MCP, RAG, fine-tuning, context engineering, reasoning models, multimodal AI, and Mixture of Experts. These are the fundamental building blocks of virtually every AI product shipping right now. And that's all I have for you in this video. If this helped you understand AI better, hit that like button, share the video, and subscribe. I make videos every week on software engineering, tech careers, and AI. Thanks for watching, and I'll see you in the next one.