AI Agents for PMs in 69 Minutes — Masterclass with IBM VP

Armand Ruiz, VP of AI Platform at IBM, reveals why most enterprise AI implementations fail and what Fortune 500 companies are actually building that works. He breaks down the difference between chatbo

https://www.youtube.com/watch?v=g-Yb7CFWItk?start=6

Video Transcript (If Available):

Can you break down for us what an AI agent is? We’ve all experienced ChatGPT, but what makes an agent so special?
Well, for me, agents are really delivering on the promise of AI. We got into this chatbot era, but agents really deliver the world of automation.
You have this tremendous handwritten drawing demonstrating what agents are. So, can you walk us through the steps?
Yeah, four simple steps. The first one is thinking, the second step is planning, and the third component is action. The fourth step is reflection.
Armand Ruiz is the VP of AI Platform at IBM, and he has amassed over 200,000 LinkedIn followers in just two years for his takes on AI.
So you’ve been in AI for 16 years. What have been the biggest open-source releases over time?
LLMs. I think Mistral did massive things when they came into the market. They were also, I think, the first to provide a mixture of experts. There are hundreds of thousands of open-source models.
So let’s get to IBM. How is IBM going to make big waves in the AI space?
One of the things I’m very bullish about is providing customers flexibility to deploy AI anywhere and to tap into any AI engine they want.
It’s about jumping in and using the tools. What’s a good roadmap for somebody going from zero to one to ramp up on all these tools? Which tools should they try first, and in what order?
I think first... [Music]
Really quickly, I think a crazy stat is that more than 50% of you listening are not subscribed. If you can subscribe on YouTube, or follow on Apple or Spotify podcasts, my commitment to you is that we’ll continue to make this content better and better. And now on to today’s episode.
Armand Ruiz is the VP of AI Platform at IBM and has amassed over 200,000 followers on LinkedIn in less than two years for his takes on AI. In today’s episode, we’re going to break down everything you need to know about AI agents and open-source AI.
We also cover his path from intern to VP in less than 14 years and his takes on the future of the product management role.
Armand, welcome to the podcast.
So happy to be here. Finally, we made it.
Yeah, I think we were both saying off air that we have mutually been reading each other’s work on LinkedIn, so it’s really exciting to chat. It has this cool effect, which I can see as a reader: I almost feel like I understand how you think.
Same here. I’ve been following your journey, your newsletter, listening to your podcast. So yeah, very impressive work.
Thank you. Likewise. Daily LinkedIn posting for you for two years. And I think the thread you’ve been on, probably for almost the last year, so a lot earlier than other people, is AI agents. So can you break down for us what an AI agent is? We’ve all experienced ChatGPT, but what makes an agent so special?
Well, for me, agents are really delivering on the promise of AI. I’ve been working in AI for 14 years now, and at first it was just predictive analytics: doing predictions, giving rough numbers for forecasting, things like that. Then we got into this chatbot era, but agents really deliver the world of automation that is going to unlock people and businesses to generate way more output. That’s why I’m extremely bullish and very excited about it. I’ve been talking about agents from the very beginning, and there were already some early projects showing the potential, like BabyAGI or AutoGPT. Those were amazing, and now the whole world is prioritizing agents.
Yeah, those are, I think, almost 2023 news. It took two years for the world to really catch up.
Yeah, absolutely. So part of my job right now at IBM is to lead AI platforms.
So really building the building blocks for enterprises to securely build AI agents and embed them into different business functions. I just came back last night from meeting a CIO, and I met another CIO last week. So I’m meeting some of the biggest customers from the biggest brands. They all have AI as the number one priority on their agenda, and agents are one of the core components. There are a lot of different factors in how they empower employees to experiment with the technology and, at the same time, how they can take it into production in a very safe and secure way. And there is everything in between, at different levels of risk and innovation appetite.
So, as someone who’s educating people about AI agents, when I saw your handwritten drawing I thought, that is a piece of art, something that really helps people understand AI agents. So can you walk us through this? What are the building blocks of an AI agent?
Yeah. So the first phase is thinking. We’ve been in this world of LLMs, and at first they were just spitting out text. Now we see that step of thinking. Thinking is step number one. That’s why we hear about LLMs being very good at reasoning. We also hear Jensen saying, “Hey, with agentic systems and new LLMs, we’re going to use more tokens, more inference,” because that reasoning step takes extra compute. But the kind of chain-of-thought process that we used to do manually is now built into the LLM. And then we go to step two, which is planning.
So you ask for a task, and the LLM is going to break that task down into subtasks. For each of them it will go and execute, and in some cases it will challenge the output from the previous step, but it’s going to be able to create multiple subtasks and goals. Then it goes into act, which is step number three. Act is maybe one of the most fascinating steps because it allows you to tap into the execution of actions. So if you need to input something into a CRM, or send an email, all the way to sending it and not just writing the copy, or whatever action it may be. For example, in a system like Workday, it can go and interact with the information for a specific employee. There are so many different things you can do in the act phase, and that is being opened up by protocols like MCP. And then reflection, that’s step number four. Reflection is really what is going to make agents good, because maybe at first they’re a little bit raw, but with human input they will iterate and become better and better over time. That reflect step comes with some technical implementation that you need to do, but you will be able to tap into all the past history of interactions, learn from it, and feed it back into the agent so that the next time it executes, it does it better.
So there are so many different terms and frameworks out there that people have heard about for AI agents. What are the frameworks they should understand, and how do they fit into these steps?
Frameworks, you mean development frameworks? Yeah, I would classify them into two categories.
There are the coding frameworks. These coding frameworks are becoming simpler and simpler, but in most cases you still need to know Python. You have frameworks like LangGraph, CrewAI, LlamaIndex, or AutoGen. Those are excellent frameworks: open source, widely popular, and with a lot of information, documentation, and courses online, so you can go and try them out. Then on the flip side you have low-code and no-code tools. One we have at IBM that is very popular is called Langflow. I saw the new announcement from Lindy. There is n8n...
Yeah.
...which is also getting very popular. Stack AI, from a fellow Spaniard here in San Francisco, is also taking off. Flowise. There are a lot of tools that help you build these agents in a very simple way. You still need to understand the concepts, but they help a lot with development.
Okay. So if you’re not a very technical person, you can use some of these no-code tools: Lindy, n8n, Make.com, Zapier, you name it. They’re all becoming huge. Or if you’re trying to develop a more robust internal system, you’re going to work with developers to build on top of CrewAI or LangGraph. Is that right?
That’s right. Those programming frameworks give you the control and flexibility that you still need for very complex agentic implementations, and they are evolving quickly. The exciting piece is that those are open-source projects. You can go to the GitHub repos. Something I do sometimes is look at the PRs, the issues, and the conversation in the repo itself to see what people are asking. Anyone can contribute to those frameworks and make them even better. So there is a lot of fast innovation happening in that space because of the power of the community and the ecosystem.
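As a rough illustration of the four steps described earlier (think, plan, act, reflect), here is a minimal Python sketch. It is not how LangGraph, CrewAI, or any framework named above works: the `Agent` class, its methods, and the `email` and `crm` tools are hypothetical names invented for this example, and plain string handling stands in for the LLM calls a real agent would make.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Agent:
    # Hypothetical tool registry: tool name -> function taking one string.
    tools: Dict[str, Callable[[str], str]]
    # Notes written in the reflect step and reused by later runs.
    memory: List[str] = field(default_factory=list)

    def think(self, task: str) -> str:
        # Stand-in for an LLM reasoning call: restate the goal, recall notes.
        notes = "; ".join(self.memory) or "none"
        return f"Goal: {task}. Past notes: {notes}."

    def plan(self, task: str) -> List[Tuple[str, str]]:
        # Stand-in for LLM task decomposition: one subtask per "and" clause,
        # routed to the first tool whose name appears in that clause.
        steps = []
        for clause in task.split(" and "):
            for name in self.tools:
                if name in clause:
                    steps.append((name, clause.strip()))
                    break
        return steps

    def act(self, steps: List[Tuple[str, str]]) -> List[str]:
        # Execute each planned subtask with its tool.
        return [self.tools[name](arg) for name, arg in steps]

    def reflect(self, task: str, results: List[str], feedback: str = "") -> None:
        # Store what happened, plus optional human feedback, for next time.
        note = f"{task!r} ran {len(results)} step(s)"
        if feedback:
            note += f"; human feedback: {feedback}"
        self.memory.append(note)

    def run(self, task: str, feedback: str = "") -> List[str]:
        thought = self.think(task)             # 1. think
        steps = self.plan(task)                # 2. plan
        results = self.act(steps)              # 3. act
        self.reflect(task, results, feedback)  # 4. reflect
        return [thought] + results


# Example tools that only format a string instead of calling real systems.
demo_agent = Agent(tools={
    "email": lambda text: f"email sent: {text}",
    "crm": lambda text: f"crm updated: {text}",
})
```

Calling `demo_agent.run("update the crm record and send an email", feedback="shorter subject line")` walks through all four steps once; a second call will see the stored feedback in its think step, which is the simplest form of the reflection loop described in the conversation.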
You just mentioned building things, and that’s when I really figured out what RAG is versus what fine-tuning is versus the other elements of context engineering. But for someone who hasn’t gone through that, what is RAG useful for?
So RAG is useful for giving additional context to an LLM. If we step back for a second, LLMs are trained with data at a point in time, and for most use cases you need to inject new, updated data in order to get the output you’re looking for. The most popular technique for doing that is called RAG. You can use fine-tuning, but fine-tuning is not really meant for injecting new, updated data that is changing all the time. Actually, one of my most viral posts is a guide on when you should be doing fine-tuning versus RAG. RAG is basically great for connecting directly to a knowledge base or a database, and it is a space that is evolving extremely quickly. In my first year after the release of ChatGPT, 90% of the use cases we were doing for enterprises were RAG use cases, because it’s one of the most powerful methodologies. You can tap into all this structured data, but also unstructured data, and feed it directly into an LLM. So I think it’s a gold mine for most traditional companies that are sitting on a lot of valuable data.
Wow, 90%. So tell me a little bit more. How are enterprises using RAG systems? What are employees using RAG for? How does that help them build a better agent?
Yeah. Think of RAG as one of the core components of an agentic system. You can use RAG simply on a chatbot.
But if you are taking it into an agentic application, RAG is basically the component that... let’s say, just stepping back, we have thinking, we have planning, and most likely in the planning phase you will have a step of fetching some data from somewhere.
Yep.
So that’s a RAG pipeline right there. My customers are looking at ways to tap into massive amounts of data instantly. We’ve been very big on enterprise search, but most of that enterprise search has always been at the metadata level, and this is taking it to a whole new level. This allows us to tap directly into the information in those documents and structured data. So you can ask, “Give me the top use cases for my top 10% of customers,” and then extract that directly from whatever reports or documents you have and get it to, for example, a product manager. Then they can start making some assumptions about which feature they should be developing, for example, to accelerate development in certain areas. Some say we’re in the age of ideas. So with these new tools and new access to intelligence tapping into enterprise data, and my area is really the enterprise side, that’s where we see an explosion of new use cases.
Are there particular technical frameworks or things people should know about when it comes to RAG, like different options to implement it?
Many, many different options, and many different building blocks and ways you can do that. At the end of the day, we are building these pipelines that do a lot of different things. It seems like all the hype is on the LLMs, but you also need good embedding models. Embedding models are the ones that convert text into tokens, and those need to be really good. They can be good in different languages. They can be faster.
They can be slower. It depends on your application. You need vector databases. You need ways to do filtering, search, ranking. We have folks killing it out there with data engineering education, like Zach for example, because a lot of the problems in AI are data engineering problems: connecting the LLMs to all these very complex data systems. Doing that at scale is very, very complex, and for that you have a lot of different technologies. If you are more on the AI application layer, we have frameworks like LangChain, which we mentioned, and also LlamaIndex. And if you’re more on the heavy data side, then you have things like Spark or Airflow.
Okay. So that’s where all those terms fit in. And then there’s this concept of vision RAG. What does that help you do?
Yeah. So as we mentioned, RAG extracts information and adds context into the LLM, right? And a lot of that information is in unstructured documents: very rich PDFs with very complex tables or charts holding a lot of valuable information. Vision RAG takes the classic RAG that is based just on text and opens it up to more multimodal scenarios. There are some LLMs that are great at multimodality nowadays, but you also have open-source projects. One from my colleagues at IBM, called Docling, is available on GitHub. It’s a free framework that you can go grab, and it’s really good at getting information from Word documents, PDFs, PowerPoint, a set of different file formats. It extracts all that information visually, and then you can feed it into a RAG pipeline.
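The text-only RAG building blocks mentioned above (an embedding step, a vector store, metadata filtering, search, and ranking) can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not the API of LangChain, LlamaIndex, Docling, or any vector database: `embed`, `VectorStore`, and `build_prompt` are hypothetical names, and the "embedding" is a simple word-count vector standing in for a real embedding model, which would return a dense numeric vector.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Optional, Tuple


def embed(text: str) -> Counter:
    # Toy embedding: a sparse word-count vector. A real pipeline would call
    # an embedding model here instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, Counter, Dict[str, str]]] = []

    def add(self, text: str, metadata: Dict[str, str]) -> None:
        self.rows.append((text, embed(text), metadata))

    def search(self, query: str, k: int = 2,
               where: Optional[Dict[str, str]] = None) -> List[str]:
        query_vec = embed(query)
        where = where or {}
        # Filter on metadata first, then score the remaining rows.
        scored = [
            (cosine(query_vec, vec), text)
            for text, vec, meta in self.rows
            if all(meta.get(key) == value for key, value in where.items())
        ]
        # Rank by similarity and keep the top k rows with any overlap.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for score, text in scored[:k] if score > 0]


def build_prompt(question: str, store: VectorStore, k: int = 2) -> str:
    # Retrieved chunks become the extra context handed to the LLM.
    context = "\n".join(f"- {chunk}" for chunk in store.search(question, k=k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The string returned by `build_prompt` is what would be sent to an LLM; the model call itself is left out on purpose. Swapping the toy `embed` for a real embedding model and `VectorStore` for a real database changes the quality of retrieval, not the shape of the pipeline.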
So that’s what we call vision RAG. It is also very popular and is opening up new use cases.
And I think vision RAG is really important for charts, right? Because there’s so much rich data that lives in charts.
Yeah. If you are in some industries, for example healthcare, where you need to read charts that come from very advanced equipment, or in finance, where you have a lot of charts for the markets. Or tables as well. Tables are often exported from a spreadsheet and put together in a nice report in a PDF. There is so much you need to do in order to understand what the chart is saying, what the table is saying, and what the conclusions are. So that’s why vision RAG is becoming super critical, and there are a lot of different ways to build it. I think the right combination here is, again, to build the right pipeline with the right components, like Docling, which I mentioned, and then have very good multimodal models that are able to understand images as input for the prompt.
Today’s episode is brought to you by the experimentation platform Kameleoon. Nine out of 10 companies that see themselves as industry leaders and expect to grow this year say experimentation is critical to their business. But most companies still fail at it. Why? Because most experiments require too much developer involvement. Kameleoon handles experimentation differently. It enables product and growth teams to create and test prototypes in minutes with prompt-based experimentation. You describe what you want. Kameleoon builds a variation of your web page, lets you target a cohort of users, choose KPIs, and runs the experiment for you. Prompt-based experimentation makes what used to take days of developer time turn into minutes. Try prompt-based experimentation on your own web apps.
Visit kameleoon.com/prompt to join the waitlist. That’s k-a-m-e-l-e-o-o-n.com/prompt.
AI evals are one of the most important skills for PMs. And I know you know they matter. The question is, are you doing them right? Most teams are winging it with basic metrics and hoping for the best. Meanwhile, the teams that actually ship reliable AI have cracked the code on systematic evaluation. Today’s episode is brought to you by the AI Evals for Engineers and PMs course by Hamel Husain and Shreya Shankar. This live Maven course will teach you the battle-tested frameworks from Hamel and Shreya, who are the engineers behind GitHub Copilot’s evaluation system and 25-plus production AI implementations. Four weeks of live instruction. Next cohort starts July 21st. Start shipping AI that actually works. Enroll at maven.com with my code ag-roduct-growth for over $800 off. That’s ag-pr-ggrt.
What do most teams get wrong implementing RAG systems?
I think at the end of the day, a lot of the conversations that I have with customers are frustrations on accuracy. In the consumer space, a little bit of a lack of accuracy is acceptable. You can keep iterating, and it’s not like a big system is going to go down and affect millions of customers. But when we are talking about, for example, putting up a customer service chatbot that needs to connect to RAG, it needs to be very, very accurate, like 70 [...] $20,000 in compute to get a benefit of like 100 bucks, you know. So there is this false illusion of “I’m using AI, I’m being more productive,” but what’s happening underneath is that you’re spending a lot on execution, either on the pipelines or on the AI compute.
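The compute-versus-benefit point can be made concrete with a small calculation. The sketch below uses the two figures quoted in the conversation ($20,000 of compute against roughly $100 of benefit); the `roi` and `worth_running` helpers, the `pipeline_usd` parameter, and the zero threshold are assumptions made for illustration, not a method described by the speaker.

```python
def roi(benefit_usd: float, compute_usd: float, pipeline_usd: float = 0.0) -> float:
    """Return (benefit - cost) / cost, where cost is compute plus pipeline spend."""
    cost = compute_usd + pipeline_usd
    if cost <= 0:
        raise ValueError("total cost must be positive")
    return (benefit_usd - cost) / cost


def worth_running(benefit_usd: float, compute_usd: float,
                  pipeline_usd: float = 0.0, min_roi: float = 0.0) -> bool:
    # Hypothetical go/no-go rule: keep a project only if its ROI clears a bar.
    return roi(benefit_usd, compute_usd, pipeline_usd) >= min_roi


# The example from the conversation: $20,000 of compute for ~$100 of benefit.
quoted_case = roi(benefit_usd=100, compute_usd=20_000)  # (100 - 20000) / 20000
```

With those inputs the ratio is (100 - 20,000) / 20,000 = -0.995, meaning almost the entire spend is lost. That is the kind of per-project check an "AI hub" or "AI office" tracking function could run before scaling a use case.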
So there need to be frameworks. Some are talking about AI hubs or an AI office that tracks all those projects. You want to encourage innovation and use case creation, but at the same time you need to have a way to assess those projects.
Yeah. So you said most employees should be getting to the stage where they’re managing 10 to 20 agents. How do you think through which agents you should be building?
It depends on the business practice. Let’s say, for example, you are sitting in marketing. If you look at all the things that I’m sure you want to do, you have bottlenecks in different areas, right? Maybe you have a bottleneck where you cannot iterate fast enough on the copy because you are relying on some third-party agency and the turnaround is maybe one week. Using agents, what takes a week and a set of meetings you can maybe do in one or two days, or even faster. Then maybe you also have challenges creating advanced creative material: videos, images, charts that are impactful. Again, you can use AI for that. Or if you want to experiment with A/B testing on the website, you can have AI create different iterations of your website or of your copy. So at the end of the day, I think everyone, depending on their function, will have different specialized agents that can give either recommendations or full automation on the execution of certain tasks, and then you will be able to generate more output.
Mhm. So it’s almost like managing a team under you, and you need to figure out the right stages for human in the loop or human approval, so that it doesn’t just go out and misrepresent our brand or something in marketing, but at the same time it reduces the work we have to do.
Yeah, there is this concept of orchestration that everyone has been talking a lot about this year, which is the need to orchestrate agents, and a lot of our job is going to be about judging the output of some of those agents. Again, I think we are far from that reality, 3 to 5 years, and not because of the state of the technology. We are here in Silicon Valley and we see these AI-first, AI-native companies that are built from day zero with this agentic mindset. But when you talk to small traditional companies, there is a long journey to get there. But yeah, orchestrating agents, being able to quickly iterate on them, and checking that the outputs they are generating are good, because ultimately it’s going to be the responsibility of the humans. That’s a new skill we all need to get used to and learn.
How do people get better at orchestration?
Honestly, I don’t have a very clear answer. A lot of companies are working on AI literacy, so learning. I think getting hands-on with the technology is really important. In the early days of the internet you were on a console, and then we moved into easier and easier interfaces, right? The technology is already being democratized, so it’s going to be accessible for everyone. But at the end of the day, good education content like the one you are creating, and good courses targeted at specific functions, are going to be very critical.
So if you’re a product manager, what AI agents should you build first?
Well, first of all, product management is also one of those functions that is changing. I lead a team of product managers, and the standard ratio that I’ve seen in the industry, which we don’t always follow, is one product manager for six to 10 developers.
So you tend to have product managers that are really focused on one specific area of your product. For example, in my AI platform, I have a PM that is really focused on tuning techniques and another PM that is really focused on inference and serving models. With AI agents, I think we can get to a different ratio. Instead of 1 to 6 to 10, maybe we can get to one for every 20 or 30 developers, because these PMs might be able to cover multiple areas all at once. They will have agents. You can have an agent that is doing competitive analysis. The AI space is a crazy market: every big vendor, small startup, YC, there is so much action you cannot keep up, you know. So you can have one agent that is doing research and another one that is building reports for competitive analysis. Those competitive analyses then need to be polished, and the salespeople need to be equipped to have a good conversation in front of a customer, defending your product versus the competition. Then you need to prioritize all the user feedback. With very powerful AI agents, you can have agents that match your usage data from SaaS metrics directly with user feedback that comes from social media and from other systems you have for collecting feedback, like NPS systems, and then you can start gathering more inputs to better prioritize your roadmap.
And then when you prioritize the roadmap and you come up with a new feature, you need to write the PRD. AI can do 80 to 90% of the work. And before you even validate and prioritize the feature, and that’s where we can get into more details, you can even prototype it...
Yeah.
...and then work with a select set of users to get some feedback. So that’s where I think we are going with product management.
Completely. And I think the final step too, right?
Today’s episode is brought to you by Vanta. As a founder, you’re moving fast toward product-market fit, your next round, or your first big enterprise deal. But with AI accelerating how quickly startups build and ship, security expectations are higher earlier than ever. Getting security and compliance right can unlock growth or stall it if you wait too long. With deep integrations and automated workflows built for fast-moving teams, Vanta gets you audit ready fast and keeps you secure with continuous monitoring as your models, infra, and customers evolve. Fast-growing startups like LangChain, Writer, and Cursor trust Vanta to build a scalable foundation from the start. So go to vanta.com/acosash. That’s v a nta.com/ a kas to save 500 off, use my code a AA25 and head to maven.com/rouct-faculty. That’s mavn.com/pect-fac.
Once you release it into production, it can monitor if all of a sudden users are getting some corner case you didn’t realize. And then two or three weeks later it can tell you, “Hey, here’s what the statistically significant results were.” So it’s across every step of the PM life cycle.
Absolutely. It’s one of the most exciting functions because, as I mentioned before, we are in this world of ideas, and I think PMs have, as part of the job description, to bring some of these ideas to life.
Yeah.
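For readers wondering what "statistically significant results" would mean for an experiment like the ones discussed here, one common check is a two-proportion z-test on conversion counts. The sketch below is an assumption about how such a check could be done, not a description of any tool mentioned in the episode; the function name and the traffic numbers are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z_test(conv_a: int, n_a: int,
                          conv_b: int, n_b: int) -> Tuple[float, float]:
    """Compare conversion rates of variant A and variant B.

    Returns (z, p_value) for a two-sided test with a pooled standard error.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # No variation at all (e.g. zero conversions everywhere).
        return 0.0, 1.0
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 5,000 users per arm, 200 vs 260 conversions.
z_score, p_value = two_proportion_z_test(200, 5000, 260, 5000)
significant = p_value < 0.05  # conventional 5% threshold, an assumption
```

The 5% threshold is a convention rather than a rule, and a monitoring agent would also need enough traffic and a fixed test duration before a result like this can be trusted.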
You know, if you work in product management, you know it has always been kind of a frustration to turn ideas into features, or sometimes even just to validate the idea, right? You need to work with design, maybe get a mockup, get user feedback, and then work with engineering, and engineering is overwhelmed with production support tickets, things that need to be fixed, and new feature development. That’s also why all the PMs that I usually hire are really good technically, and now with AI they will be able to take it three or four steps further by themselves.
So you recently commented on this idea of writing-first versus prototype-first cultures. Talk to me a little bit more about what the future of the role looks like, the future of the PRD in this world.
Yeah, I’ll tell you a personal story about that, one that I think built a lot of the success in my career. That’s more than 10 years ago. I’m from Spain. I moved to the US, and my English was very, very rough. We had this big meeting with a lot of senior executives about how to reimagine what the next machine learning platform was going to be. I still remember that I had the meeting in two days and I was really struggling to articulate all the ideas that I had. So I said to myself, I’m going to build a prototype. To my surprise, in that meeting everyone was just talking and showing slides, and I was the only one showing the product. You could touch it. You could see it. It was nothing close to production, and it was all kind of fake, but it conveyed the ideas and the art of the possible. And guess what? I got to lead the project, and it became the default and the path forward. So I think that’s what is happening right now. Back then I had to get hands-on, start coding, and do a lot of the work.
I could have done that now in maybe three or four hours, you know. So now, with all the tools, I think every single PM should have access to vibe coding tools, and there are different options out there in the market, to skip ahead and turn some of those ideas directly into working prototypes. That also helps a lot with communication. The teams I work with are worldwide, so there are also language barriers. A lot of the work in big corporations is all about communication, and that communication is either in meetings, written, or in GitHub repos, you know. So a lot of things are lost in translation. I’m a big fan of showing and skipping a lot of that. Even if you write the most beautiful, detailed PRD, a lot of information is still lost in translation, and there is nothing that speaks better than a working prototype.
There is some worry out there that we’re going to start to get into a more feature-factory, solutions-focused world, where we’re not going to heavily investigate the problem space if we just jump into AI prototypes. What are the right steps, the right life cycle, to make sure that you are investigating the problem space but also taking advantage of this new prototyping technology?
Yeah, that’s a very valid concern. I spend a lot of time with customers, and again, that’s also part of how I built my career. My first two years I was traveling every single week to visit customers. I had no wife, no kids, so I was free to meet a lot of customers all over the world. It was unbelievable, not only for the network but for going deep into what they were doing and trying to figure out the problems. I didn’t have LLMs, but I started framing my own ideas and hypotheses and checking what was going on in the market to build solutions.
So I think you always need to start customer first. PMs, in my opinion, need to spend a lot of time talking to customers and going deep, not at a high level, but really trying to understand what their challenges are, and then figure out how your product can solve them.
All right, so let’s zoom back out of product management for a second and just talk about general tech workers. You’ve encouraged people to learn Python and get technical. You’ve even asked leadership to get more technical in the AI era. Why is that so important?
Yeah. I think everyone should have technical literacy in this day and age. Otherwise, I think you are completely going to miss out on the opportunities of AI. The one who articulates this very well is Aaron Levie, the Box founder and CEO. He says you have two ways to approach AI: either as a cost-savings tool, and that’s completely fine, or you can go and do way more with AI. I think that’s the right approach. I think that’s how companies will grow and expand, and work is going to be more fun because we’re going to be able to accomplish new use cases and new work. In order to do that, you need to understand the art of the possible with AI, and there is no document, white paper, LinkedIn post, or video that is going to teach you the art of the possible unless you actually try the technology. If you learn how to code in Python, or do the basics in Python, that’s completely cool. Luckily the technology is getting democratized, so you can still touch the technology and not code at all, but you need to understand the concepts. A lot of the leaders in the space have a lot of ideas about things that they should be doing and can do, and they need to understand how the technology bridges that gap. So yeah, I’m spending a lot of time learning myself.
Many people ask me how I’m so up to date, or how I write about this content so often. I mean, number one is I’m obsessed with it, so it comes to me naturally. Every time there is something new, I just jump in and try it out, usually in the evenings. And then I start to form my own opinions based on my professional experience.
Mhm. So it’s about jumping in and using the tools. What’s a good road map if you had to give somebody one, if they’re going from zero to one to ramp up on all these tools? Which tools should they try first, and in what order?
Yeah, I think first just understanding the concepts, and then tools. I think everyone should just develop one AI agent. And there are a lot of different tools you can do that with, like no-code flow builders. There are many out there. I was actually trying the new Lindy AI yesterday. Very, very impressive. At IBM we have a tool called Langflow as well, which is a low-code flow builder experience. At the end of the day, if you look at each of those tools, they still require you to understand the concepts of AI. So I always recommend starting with the concepts: understand what an LLM is, understand what reasoning is, understand what RAG is, and some of those things. And then use any of those tools that give you the building blocks, think about one use case that you have in your own job or personal life, and try to solve it, you know. And then if you want to go deeper and deeper, I think it depends on the role. If you are in leadership, I think there is a lot of education about how you inject AI into an organization, change management, and so on. If you are more of a practitioner in different domains, I think you will have a lot of agentic solutions that can help you speed up your work.
So there are a lot of different options.
Okay. So build with a no-code tool. Then what’s the next step? Do you go into something like Cursor? Do you learn to program? Where should people go after?
Yeah, like programming. I think vibe coding is also a big one, because I think you need to understand the different levels. I don’t expect everyone right now to just create something and put it into production in an enterprise setup. I think that needs to be somewhat controlled, depending on data access and tool access. But yeah, start with developing some agents, try vibe coding, and if you are curious, try different things that are more advanced using Python, depending on your level of expertise there. I mentioned DeepLearning.AI earlier: amazing short courses. And it’s interesting, because if, for example, you think, “Hey, I heard about RAG, I see it on my timeline every single time I log into LinkedIn or X,” just do a quick course on RAG. It takes like three or four hours, and you will really understand it. And then you can understand, “Hey, basically my entire organization can access all the information if we build these RAG pipelines really well, so maybe it’s something we should invest in.” You can do it in house, you can do it through a vendor, you can have a third party help you build those. But you need to understand those concepts, because again, people in the business have the ideas and the use cases in their heads, you know.
Let’s shift focus to open-source AI. IBM, and you in particular, have had a lot of focus on open-source AI. Can open source really win? It feels like it’s always a cycle behind closed source.
Yeah, I think we need to understand it in the enterprise context that I’m coming from, and I think in that enterprise context open source, I would say, always wins.
Oh yeah.
Let’s take the latest OpenAI model, right? Why is everyone so excited, especially in the enterprise? First, the license. It’s unbelievable: it’s an Apache 2 license, which is really good. Then it’s a very good reasoning model. I think we’ve been lacking some very good reasoning models in the open-source space. And then you can just take them and deploy them anywhere. So you don’t have to rely on a third-party API call where, for most of my customers, they cannot just send a lot of confidential information, or they cannot connect it with their own tools. So right now you can take that model, make it your own, and deploy it anywhere in your own infrastructure. A lot of customers are still running on their own infrastructure. They are creating their own AI factories with different AI accelerator providers like Nvidia and AMD, so that you can deploy it on your own machines. So open source provides a lot of control for enterprises, which is a great thing. Then I think the pace of innovation of the community, even though sometimes it’s a little bit slow at the beginning, shows up in the long term. And I think we’ve been through a bit of a cycle. Last year we could see open-source models getting closer and closer to closed source, and then this year it’s been a little bit different. Google with Gemini and OpenAI with GPT-5 showed that they are still ahead, but you will see the open-source community rallying again and pushing that forward. And lastly, there are the developer ecosystems, what we were talking about at the beginning: all these different frameworks that actually enable companies to develop applications. They are built on open source as well, and they are deployed on open-source systems like Kubernetes and vLLM. vLLM is basically the engine to run models, you know.
So yeah, it’s not just the LLM itself, it’s the entire AI ecosystem. PyTorch is another great example. Everyone is building on PyTorch, which is also open source. So I’m very passionate about open source, and I think in the long term it always wins.
What’s PyTorch, for people who don’t know?
PyTorch is basically the framework that allows you to create very complex deep learning algorithms and run them.
Yeah. And just about all of the closed- and open-source foundation model companies are building with PyTorch, right?
Pretty much all of them. And that’s a project that came out of Meta. It has open governance, everyone is contributing to it, and it’s been used by every single major AI lab in the market.
So you’ve been in AI for 16 years. What have been the biggest open-source releases over time?
It’s been a wild journey, because when we talk about open source, a lot of the conversation right now is about LLMs. So if we just focus on open source for a moment, I think Mistral did massive things when they came into the market. They were also, I think, the first to provide a mixture-of-experts open-source model. They also did it in a very funny way, with a torrent link, so you had to find your way a little bit to go download it. And then we had Llama building a fantastic ecosystem around open-source models. At IBM we open-source our models, I can speak to that. And then now OpenAI and others. So there is innovation in the model space. If you go check Hugging Face, which is kind of the repository of all these open-source models that Clem and the team are building, there are hundreds of thousands of open-source models. Then there is data as well, also available on Hugging Face. There are a lot of data sets for many different things: for pre-training, for post-training, for alignment.
So these are also major components. PyTorch is massive. TensorFlow looked promising, and finally PyTorch kind of took over. At the beginning you never know which one is going to win, and then you let the community and the ecosystem move that forward. In most cases it comes down to very critical technical decisions, or to simplicity for users. And then a lot of the conversation is also happening around potential alternatives to CUDA, which gives Nvidia a lot of control. So it’s a very exciting ecosystem, and it’s not stopping. And it’s operating at every layer, really. In every layer of the AI stack you have three or four projects and new incumbents coming with new alternatives, and I think the beautiful thing about open source is: let the best win, you know.
You briefly mentioned pre-training, post-training, and alignment. For people who don’t know, those are the steps in model building. Going one level deeper, what’s happening in each of those?
Yeah. And 99.999% of people don’t really have to touch that. This is really done by the frontier AI labs that train the models. Basically, pre-training is when you gather all that clean data set that you use to train your model. And then you have the post-training phase and the alignment, to make sure it performs properly. So these are different data sets that you use for that. Obviously we’re running out of data, so then you have new methods to create high-quality synthetic data. Synthetic data is data generated by AI algorithms and then supervised by humans to make sure that it is high quality. And yeah, all that process is what is used by all these frontier labs to train all these magic LLMs that we’re using.
So let’s get to IBM.
How is IBM going to make big waves in the AI space?
Yeah, so I joined IBM when we announced IBM Watson, you know, and it’s been a wild decade, with a lot of lessons learned, and also things that we got right and things that we got wrong. Right now, one of the things I’m very bullish about is providing customers the flexibility to deploy AI anywhere and to tap into any AI engine they want. So what does that mean? That means our customers sit on a lot of data, as I mentioned before. It’s a gold mine of data. They need to execute that AI close to that data. Cost per token is extremely important, you know. So I think we are one of the only providers that offer all this flexibility to deploy the AI in the infrastructure they want, very close to the data, whether it is a hyperscaler, on-prem, a private cloud setup, or a combination of all of those. So that’s number one. Then we’ve also been developing our own AI models, a family of models called Granite. But our customers want everything. I mean, the customer I was with yesterday, and it’s a trend I see with every single customer: they have all the options, you know. So, okay, how do you provide and govern access to all these AI engines, no matter where they run? You make sure you have some way to understand the overall cost, the access control, and things like that. And then we provide a lot of tooling on top to make sure we give productivity to developers. And at the end of the day, these systems are built for scale, massive scale.
So we are working on different projects to help scale inference across multiple clusters in different environments. And then an area that I’ve been responsible for as well is the governance piece. The governance piece is one that many people think about after the fact, and it should be thought about before, especially in an enterprise setup. I’m sure you’ve heard about a lot of the AI regulation that is coming to the market. That regulation keeps being updated, and it is sometimes different by industry, by state, by country. So you need to have an inventory of use cases at different stages, and those that are in production need to be compliant with certain regulations. So we have very good tools to do that at scale.
Why is the Granite model important?
I think the Granite models have two, I would say, maybe the research team will disagree with me and say there are more, but I think we have two major components. One is cost per token. These are very small models. One of the most popular ones is a two-billion-parameter model that performs really, really well. If you look at what OpenAI released last week, the smallest one is 20 billion. Okay, these are different kinds of models. That OpenAI model is very good at reasoning. But for certain use cases, what we see is that cost per token is extremely critical, and for some use cases you don’t need generic models that know how to do everything. In enterprise setups you need models that do one thing and do it really, really well. So these very small models are extremely good. They are very cheap, and they run on hardware that in some cases is even commodity hardware. And then the second thing, and I will add two more. The second thing is that they are easy to customize. So we’re talking about tuning or RAG or things like that. The larger the model, the more complex every single thing you’re trying to do is.
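To make the size comparison above concrete, here is a rough back-of-the-envelope sketch. It is an illustration, not a figure from the conversation: it assumes weight memory is simply parameter count times bytes per parameter, ignores activations, KV cache, and runtime overhead, and the precision table is a generic assumption rather than a statement about how any specific model is shipped.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: ignores activations, KV cache, and runtime overhead,
# so real memory use will be higher than these numbers.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Approximate memory in GB (1 GB = 1e9 bytes) needed to hold the weights."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # Hypothetical labels; only the parameter counts come from the discussion.
    for name, params in [("2B model", 2e9), ("20B model", 20e9)]:
        for precision in ("fp16", "int4"):
            gb = weight_memory_gb(params, precision)
            print(f"{name} @ {precision}: ~{gb:.0f} GB of weights")
```

Under these assumptions a 2-billion-parameter model needs about 4 GB for 16-bit weights, while a 20-billion-parameter model needs about 40 GB, which is one way to see why the smaller model is the one that fits on commodity hardware.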
So if it’s a very small model, it’s easier to tweak it, to tune it, to change the weights, and to embed it into different customization setups. And then the last one is something that was talked about a lot at the beginning of this AI craziness that we have going on: the copyright of the data. For most of the models that we use today, we have no idea which data was used to train them. All our data is actually disclosed in the white paper. Our legal teams went through it, it has the proper copyrights, and so on. And we provide that information very transparently to our customers, so that builds a level of comfort for some specific use cases in certain industries. That has been really good.
So the AI talent wars have gotten insane. People have heard about 800 off. That’s ag-pr-g grt.
What do most teams get wrong implementing RAG systems?
I think at the end of the day, a lot of the conversations that I have with customers are about frustrations with accuracy. In the consumer space, a little bit of lack of accuracy is acceptable. You can keep iterating, and it’s not like a big system is going to go down and affect millions of customers. But when we are talking about, for example, putting out a customer service chatbot that needs to connect to RAG, it needs to be very, very accurate. 70% accuracy is not acceptable. Or we have use cases where the human is not interacting with the system; it’s machine to machine. And then you need to have the right humans in the middle. So you need to build very trustworthy systems, and what many people are getting frustrated about is the accuracy of the RAG, because they are just applying some vanilla, off-the-shelf templates and implementations.
So you really need to build a strong practice to properly evaluate the outputs, and at the end of the day it’s really a data problem. So yeah, you need to build that practice to properly evaluate and understand what an acceptable business accuracy is for the use case, and then just keep iterating on the architecture and the different configurations in your pipeline in order to get that accuracy. So that’s one. I think they are also underestimating the power of RAG. Google was unbelievable at providing access to information for everyone, but at the end of the day they are providing a set of links, and then you need to go and find the information. RAG is giving that superpower to every single company, to build that at scale and tap into all the company’s information for every single employee, you know. There is so much that can be done in that space, but in order to do it right, it needs very heavy engineering at this point.
Mhm. So if you’re thinking about evals, that’s usually an area that people talk about just in the context of the final output, but it sounds like you’re saying evals are really important in the RAG system itself.
Yeah. I mean, the RAG space and the evals space keep evolving super fast, and there are new techniques and new companies innovating in that space. I think evals in agentic workflows should be put at almost every single step if you’re really serious about developing a critical system, you know. And then you need to evaluate at different points before you put something into production. At the end of the day, evals are basically adding that, you know, human expertise to validate the output of what the AI system is giving you.
So if you have a system that has multiple steps and you are only checking the output at the end, I think you are missing something. In classic software development you would have evaluation at different points, so it doesn’t change that much on that front. It’s just more about the methodology of how you do it.
Okay, and how do you do good evals for a RAG system?
Again, this is an area where there are a lot of papers talking about good techniques, and it’s pretty cool that the frameworks and the open-source community are coming up with projects to help customers or users do that. At IBM, we have something called the Eval Studio that basically allows either the developer or the business user to do proper evaluation of the outputs. And there are different ways. There are ways that mix synthetic data with human check-ins, or with having a data set that has the ground truth. There are different tools, and we’ve been pushing one called Evaluation Studio, which is GUI-based, because we also understand that in a lot of the use cases the knowledge and the expertise is with the SMEs and the business users. They know very well what is good and bad, and they need to be able to assess the outputs of an AI system. And this is not a one-off, where you just do it once and move on to the next agent. These systems need to be continuously checked and improved. And that’s really the power.
Yeah, it’s like any internal tool you’re going to build. There’s going to be ongoing maintenance, and with an AI agent, a lot of it is in the eval phase. I think that’s a really interesting insight, that you want to equip your SMEs to be able to help you with those evals. You don’t want to just be doing those in some engineering silo.
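The idea of evaluating a RAG system at more than one step, against a data set with ground truth, can be sketched in a few lines. This is a toy illustration under loud assumptions, not IBM’s tooling or any particular framework: the documents, the keyword-overlap `retrieve` function, the `generate` stand-in (which just returns the retrieved passage instead of calling an LLM), and the 0.9 acceptance bar are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Example:
    question: str
    relevant_doc_id: str   # ground truth for the retrieval step
    expected_answer: str   # ground truth for the final answer


# Hypothetical mini knowledge base; a real system would use a vector store.
DOCS = {
    "refunds": "Refunds are issued within 14 days of purchase.",
    "shipping": "Standard shipping takes 5 business days.",
    "support": "Support is available by email on weekdays.",
}

# Hypothetical ground-truth set, e.g. written by subject-matter experts.
EXAMPLES = [
    Example("How long does shipping take?", "shipping", "5 business days"),
    Example("When are refunds issued?", "refunds", "14 days"),
    Example("Can I get help on weekends?", "support", "weekdays"),
    Example("What is the return window?", "refunds", "14 days"),
]

ACCEPTABLE_ACCURACY = 0.9  # assumed business threshold, set per use case


def tokenize(text: str) -> set:
    return set(text.lower().replace("?", "").replace(".", "").split())


def retrieve(question: str, k: int = 1) -> list:
    """Toy retriever: rank documents by keyword overlap with the question."""
    q = tokenize(question)
    ranked = sorted(DOCS, key=lambda d: len(q & tokenize(DOCS[d])), reverse=True)
    return ranked[:k]


def generate(question: str, doc_ids: list) -> str:
    """Stand-in for an LLM call: returns the top retrieved passage."""
    return DOCS[doc_ids[0]]


def evaluate(examples: list, k: int = 1) -> dict:
    """Score the retrieval step and the final answer separately."""
    retrieval_hits = answer_hits = 0
    for ex in examples:
        doc_ids = retrieve(ex.question, k)
        retrieval_hits += ex.relevant_doc_id in doc_ids
        answer = generate(ex.question, doc_ids)
        answer_hits += ex.expected_answer.lower() in answer.lower()
    n = len(examples)
    return {"retrieval_hit_rate": retrieval_hits / n,
            "answer_accuracy": answer_hits / n}


if __name__ == "__main__":
    report = evaluate(EXAMPLES)
    print(report)
    print("meets bar:", report["answer_accuracy"] >= ACCEPTABLE_ACCURACY)
```

Scoring retrieval and the final answer separately shows where a failure comes from: in this toy set the “return window” question fails at the retrieval step, which a check on the final output alone would not reveal. Rerunning the same ground-truth set after every pipeline change is one simple way to make the checking continuous rather than a one-off.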
Yeah. That said, there needs to be some framework for how you do that. When we’re talking about companies like my set of customers, which have hundreds of thousands of employees, you need to put in place some best practices and frameworks for how you do that at scale in a company. So I think that’s a little bit the challenge in a lot of these companies. Everyone has a lot of ideas, and it’s about how they can experiment in a safe way. And then also, a customer told me recently they don’t want to spend like 1,000
and join over 10,000 ambitious companies already scaling with Vanta.
Today’s episode is brought to you by Amplitude. Replays of mobile user engagement are critical to building better products and experiences, but many session replay tools don’t capture the full picture. Some tools take screenshots every second, leading to choppy replays and high storage costs from enormous capture sizes. Others use wireframes, but key moments go missing, creating gaps in your understanding. Neither approach gives you a truly mobile experience. Amplitude does things differently. Their mobile replays capture the full experience: every tap, every scroll, and every gesture, with no lag and no performance hit. It’s the most accurate way to understand mobile behavior. See the full story with Amplitude.
Today’s episode is brought to you by the AI PM Certification on Maven, run by Miqdad Jaffer, who is a product leader at OpenAI. This is not your typical course. It’s 8 weeks of live, cohort-based learning with a leader at one of the top companies in tech. OpenAI just doesn’t stop shipping, and this is your chance to learn how. Run along with Product Faculty and Moe Ali, the course has a 4.9 rating with 133 reviews. Former students come from companies like OpenAI, Shopify, Stripe, Google, and Meta. The best part: your company can probably cover the cost.
So, if you want to get
1 billion for four years at Meta for some of the highest-paid AI researchers, yet you’re still saying that AI talent may still be underpaid. What’s your take on this? What’s going on with these AI talent wars? Why is AI talent underpaid?
I got in trouble for that post, but I think there are two things here. One is kind of the ethical piece. It seems completely unethical that someone is making that absurd amount of money. But then you also need to put things into context, right? We’re in a capital allocation market and we’re talking about talent that is very unique. I would say maybe there are like 200 of those folks worldwide.
Yeah.
And those folks are the ones actually using all this capex being spent by the biggest and strongest companies in the world. So you have one of those companies spending billions on AI clusters, and they are even talking about building nuclear plants to power those AI clusters. And those clusters are for training new models, tuning new models, serving new models. So you need the right talent to leverage that, and literally one architecture decision can use capacity on those clusters for weeks and months. So if you put the numbers into context, that’s massive. So that’s number one: those folks are really capital allocators right now, not just employees.
Yeah.
And the second one is, I mean, you need to see the state of the market. A lot of these people are either founders or first employees of very well-funded companies. So they have very sweet equity at extremely high valuations, right? So if you look at what their opportunity is, maybe they have 200 or 300 million in equity in some company.
So you need to put together a very sweet package. It’s the capital allocation and the state of the market, where these technical folks are founders of some of the most promising companies in the world.
Well, I just love to see the nerds get paid like athletes.
I mean, I heard there are AI agents, but not agents in the AI sense, agents like the NBA agents for players, who are helping negotiate those contracts and things. It’s wild. The nerds are taking over.
Revenge of the nerds. Amazing. So, you have an amazing career story. You rose from intern to VP of AI. A lot of people would want to follow your trajectory. Of course, you know, you did good work, you made good connections, but are there things about the way you work or the way you manage your career that really helped propel you so quickly?
Yeah, I think this is one of those “be careful what you dream, because it might come true.” I was a kid and I was just obsessed with being here in Silicon Valley, watching all the keynotes, the action happening with Apple and Oracle and all these companies, and I really, really wanted to be here. So every single thing I did, either intentionally or unintentionally, took me here. And for some things I was very, very aggressive trying to get there. When I was in Europe, I wanted to be in a US company to get the visas, to get here and get sponsored, and so on. So that was one factor. I think the other factor is that there is a component of luck, you know. I joined IBM when IBM Watson was announced, and then I’ve been very consistent on that AI path. Even in these AI winters that we had in between, I’ve always been working in AI and machine learning. And yeah, we are now in this stage where AI is the biggest thing in the world.
So there is that luck factor, but I had a good intuition, right? I saw the promise of the technology and I was very connected with all the developments that were happening, for example with Nvidia and AlexNet, and the promise of all this technology. But we didn’t expect this to happen so quickly. And then it’s also about managing the corporate ladder. Something I always recommend: network is very important. Network, add value, be humble, and solve problems. So yeah, I’ve always been kind of lucky to be in these very hot projects. And then part of it is my nature. I’m very impatient. I want to build things very quickly, and that fits the narrative in corporate America, where people want to see results quickly and they want to see innovation. So yeah, I was always in this mindset of building, and showing, not telling. And I’ve been lucky to surround myself with unbelievable colleagues who made things happen very fast. So that’s been kind of my story. I’m from Spain. I started working at IBM in Belgium and France, I moved to Chicago, and then I’ve been here in the Bay Area for 10 years.
Wow. A very intentional journey to get to Silicon Valley, and then seize the opportunity and stay grinding. A lot of people come to the Valley, they see the gray weather, they leave after 3 years. You stuck it out, and that’s really an incredible story. Another incredible story you have: I believe June 2023, you shared, was when you started your content posting journey. We’re talking in August 2025. You have nearly 200,000 followers. I think it’s at like 194,000 today, right? By the time this episode is published, it’ll be 200,000. And you’ve written about this a little bit: you use AI in your content creation process. How can other people use AI in their content creation process to grow like you did?
Yeah.
First, why I started doing it: I think I always wanted to be a better communicator, and if you want to get good at something, you just need to flex that muscle.
Yeah.
And because of the work I was doing, every time I was posting something, at the time either on blogs, on Medium, or sometimes on LinkedIn, it was resonating. I have posts from like 5 years ago that were getting thousands of views, which is rare. So then I was like, “Okay, I’m going to try to put in consistency and a framework.” It’s something I should have been doing before. And then I put a system in place just to collect ideas and write every single day. I have a post every day on LinkedIn at 7:00 a.m. Pacific. No, 4:00 a.m. Pacific, 7:00 a.m. Eastern.
Oh.
So when the US is waking up and it’s still the working day in Europe, you get an update from me. I will say, though, that I was using AI a lot more before. I barely use it these days.
Oh.
And I think that’s also part of the differentiation in good content. I have posts where, like two years ago, I had agents built with BabyAGI and things like that. At that time I was always concerned that I didn’t have enough ideas. So I had agents do research, and they would scroll YouTube, X, or different sources and give me the content that was getting more engagement, and then my content was really optimized to go viral. Which was good, that helped me grow a lot. Right now my content is more targeted to the people I want to talk to and the audience I’m going after, and that’s helping as well with my connections and professional experience. So when I sit in front of a customer, like 90% of the time they already follow me, they know my content, they have questions about it.
And that has also inspired other people at IBM to help promote IBM technology. So I was very heavy on AI. I use it a little bit less these days, just because I’m trying to spend more time thinking, and that’s a way to differentiate. Otherwise, content is also getting democratized these days.
Yeah, it is, right? I also had a phase like two years ago when I was using AI and I felt like it was giving me an edge. Now I don’t use it at all, because it ends up leading you down the path of creating content like everybody else, and that’s the content that doesn’t work.
Yeah. And I think maybe where I use it more is in the ideation.
Okay.
So many people ask me, “How do you have time to write so much?” And I think if I didn’t write, my thoughts wouldn’t be structured and in order, and I need that for my job. I was flying back yesterday from a customer visit. It was like a 24-hour visit to a customer. On the flight back, I had so much information that I was structuring it and writing down my thoughts, kind of old school, with pen and paper in a notebook. And I will use AI to expand on those ideas and help me structure them, but then I write something myself. I usually have my own routine: every day after the kids go to sleep, I just spend some time writing. And the more you do it, the faster you do it as well.
Yeah.
At first maybe it will take you like 45 minutes to write something. Now it takes me maybe five to seven minutes, you know.
Oh wow.
Because I don’t have this paralysis when I have to write. I already know more or less what I’m going to be talking about, and I’ve learned about formatting content and so on. So yeah, it’s one of those things that you just need to do, and do regularly, and then also track metrics if you really care about growth.
Yeah. What works and what doesn’t.
Oh yeah.
I actually look back at the days I got more followers, the posts that got more likes.
Yeah.
But I’m also not super obsessed with that, because a lot of the content that gets a lot of likes and stuff is just reporting the news.
Yeah.
And I think I can bring additional value on top of just reporting what’s happening in the market, and there are many people who are doing that very well. So it’s more about, okay, I’m in this spot where I’m developing enterprise AI software and I talk about AI implementation in enterprises. So that’s where I’m trying to spend more time. I’d rather get maybe 200 or 300 likes where the people who are engaging are executives in some of the most important companies in the world, or aspiring executives, than just report on the latest model in the market and get like 3,000 likes.
Yeah, that’s more valuable. What an ending to the podcast. Armand, thank you. This was, I think, your first deep long-form podcast. Cannot wait to share this with the world. Really enjoyed it.
Thank you for inviting me, and looking forward to more.
Thank you.
All right. So, if you want to learn more about how to shift to this way of working, check out our full conversation on Apple or Spotify podcasts. And if you want the actual documents that we showed, the tools and frameworks and public links, be sure to check out my newsletter post with all of the details. Finally, thank you so much for watching. It would really mean a lot if you could make sure you are subscribed on YouTube, following on Apple or Spotify podcasts, and leave us a review on those platforms. That really helps grow the podcast and support our work so that we can do bigger and better productions. I’ll see you in the next one.

Brian Wong
