Is AI Lying to You? The Truth About Agents, Automation & the Coming AI Reckoning | 06-10-26
NPI TechGuys | June 10, 2026 | 62.66 MB

Sam Bushman and developer Ben Bushman dig into a major new Arena research study revealing just how often AI agents fail to complete tasks and why. From the "windshield scraper" problem to Google secretly installing a 4GB AI model on your computer, this episode is a must-watch for anyone relying on AI in their business. Plus: Micron hits $1 trillion, Apple bets on on-device AI, the FROST browser fingerprinting attack, and a frank conversation about where AI is really headed.

Timestamps:
0:00 - Intro & welcome
0:37 - Arena AI research study: AI agents fail more than you think
1:23 - AI + automation: understanding tasks, tokens & prerequisites
12:08 - Micron tops $1 trillion on AI chip demand
15:33 - Apple shifts to on-device AI (and what it'll cost you)
16:46 - Google Chrome secretly installs a 4GB AI model on your PC
19:00 - FROST attacks: how websites can spy through your hard drive
22:38 - Free AI training event from Network Providers
24:39 - Wrap-up & final thoughts

Call to Action: Ready to put AI to work for your business the right way? Visit networkprovidersinc.com to learn how Network Providers can guide your AI strategy, from system setup to security. Get the free Cyber Playbook at networkprovidersinc.com/cyber-playbook or call 385-446-5500. And subscribe so you never miss an episode of Tech Watch!

[00:00:12] My fellow Americans, Sam Bushman for TechWatch. We keep an eye on tech so you don't have to. Brought to you by NetworkProvidersInc.com. You've got a partner when it comes to tech. They can take care of everything from strategic guidance to help desks to everything in between security, AI and more. NetworkProvidersInc.com. The website for the show, NPITechGuys.com. And by the way, the developer of the website for the show is on with us now. Ben Bushman for TechWatch.

[00:00:45] Ben Bushman with us. Jay Harrison's on a hiatus taking care of his work business. But Ben's with us. Hi, Ben. Hello. Thank you for letting me join. And we're on talking about this interesting thing. There's a new big research study out by a company called Arena. So we're doing kind of a two-part series on this. Arena basically talks about, hey, you know what? AI is changing big time. People use AI way more than we realized they do. It's incredible. And it's not only AI for intelligence gathering. It's not just that everybody uses ChatGPT

[00:01:15] to ask questions like a super Alexa or super Siri. No, no, no. These are worker bee agents. And so what you got to do is understand that AI chained to automation is the key. AI is where you ask questions and get back answers. An app does that. But when you want to carry out tasks, you know what? Schedule a task. Do this. Do that.

[00:01:36] And many times the agents, they lie. They say that a file was created or a task was done and 8% of the time it didn't get done at all. Why? Because there's prerequisites oftentimes that have to be done for the task.

[00:01:49] Because many automated workflows depend on previous steps being completed. So I use the example and I teased it up last episode. Hopefully you'll join us for this episode too. But here's the deal. I said to Ben, what if I tell you to go out and, you know, scrape off the windshield because it snowed and there's ice on it. We got to scrape off the windshield so we can leave. To Ben, that's one task. Go, scrape off the windshield and, you know, we're done. But to the computer, how do you scrape off the windshield unless you go get a scraper?

[00:02:17] How do you, you know, how do you do that unless you go outside? And so a task to you and me, what we think of as a task, may not be what a task really is to the computer. And every token relates to tasks, or tasks relate to a token or multiple tokens. And so you got to kind of understand, you know, what's a task? And as a programmer, Ben, talk about this because it's not the same, is it?

[00:02:42] No, it's not. I mean, it's like when you tell people, hey, let's talk about instructions for making a peanut butter and jelly sandwich. Go ahead. Well, you get the peanut butter and you put it on your bread. Well, you know, did you just spread it with your finger? Did you stick the bread in the peanut butter? Oh, you got it. Do you even have peanut butter, Ben? Yeah. Is peanut butter available? Did you get a knife? Well, how do you get the knife? You didn't open the drawer. Right. You got to open the drawer first. And where's the drawer? Don't you have to go to it?
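
The prerequisite idea behind the windshield and sandwich examples can be sketched in a few lines of Python. This is only an illustration, not anything taken from the Arena study: the step names and the dependency map are hypothetical, and the sketch leans on the standard library's graphlib module to put steps in an order where each one runs only after the steps it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical breakdown of "scrape the windshield" into smaller steps.
# Each key maps to the set of steps that must be finished before it can start.
PREREQUISITES = {
    "go outside": set(),
    "get scraper": set(),
    "scrape windshield": {"go outside", "get scraper"},
    "drive away": {"scrape windshield"},
}

def plan(prerequisites):
    """Return the steps in an order that respects every prerequisite."""
    return list(TopologicalSorter(prerequisites).static_order())

steps = plan(PREREQUISITES)
for number, step in enumerate(steps, start=1):
    print(number, step)
```

What a person calls one task shows up here as four steps, and the ordering makes explicit that the scraping cannot begin until both earlier steps are done.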

[00:03:11] Yeah. Computers think about these things. And when you're building workflows with agents or just asking AI questions to get, you know, information, a lot of the time, unless you really walk it through everything, it messes up quite easily. So just during the break, I prompted ChatGPT and I said, hey, I need to wash my car. The car wash is only 200 feet away. Should I drive or should I walk?

[00:03:40] And ChatGPT said, walk. You're 200 feet away. It takes 20 to 30 seconds on foot. By the time you get in the car, start it, put on a seatbelt, shift into gear, drive over, park. You could already be standing at the wash bay, dude. Wash your car. Just walk over there. Yeah, but I don't have my car and I'm at the car wash now. Yep. Hey, man, that's a good example. I love it. So that's kind of the point that we're getting at, ladies and gentlemen. And so I don't think AI means to lie to you either.

[00:04:08] I think sometimes it thinks things are accomplished because it may have actually written the file. Listen carefully. It may have actually written the file. It may have actually completed the task. That doesn't mean the file actually got written, though. So what if, for example, there was an error that AI wasn't watching for that was a do not write or, you know, a permissions issue where you don't have the authority to write? And so, you know, you never asked it to check if it wrote the file successfully, did you? That might not have been part of your automation. So it did write the file. It just didn't get written.
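
The missing "did it actually get written?" check can be sketched in Python. This is a minimal illustration under stated assumptions, not how any particular agent works: the function name write_and_verify and the file names are hypothetical. It catches the write error (such as a permissions problem) and then reads the file back before reporting success.

```python
import os
import tempfile

def write_and_verify(path, text):
    """Write text to path, then confirm it really landed on disk.

    Returns (True, "ok") on success or (False, reason) on failure,
    instead of assuming the write worked.
    """
    try:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
    except OSError as error:
        # Covers permission problems, missing folders, read-only disks.
        return False, f"write failed: {error}"

    # The check the automation never asked for: read the file back.
    if not os.path.isfile(path):
        return False, "file does not exist after write"
    with open(path, "r", encoding="utf-8") as handle:
        if handle.read() != text:
            return False, "file contents do not match what was written"
    return True, "ok"

# One path that should work, and one inside a folder that does not exist.
folder = tempfile.mkdtemp()
good = write_and_verify(os.path.join(folder, "report.txt"), "done")
bad = write_and_verify(os.path.join(folder, "missing", "report.txt"), "done")
print(good)
print(bad)
```

The second call fails because the folder is missing, and the function reports that failure rather than claiming the file was created.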

[00:04:38] It's kind of like I saved that, somebody says to me. Well, you must not have because it didn't get saved. So you really didn't. You thought you did. Well, AI is no different. It thought it did and it did its best to do it. It's not trying to lie to you. And so we've got to really work on some of these things. Many automated workflows depend on a lot of different prerequisites and you got to understand that. It's kind of like if I tell the computer to start the car and let's say it tries to start the car and I say just keep trying. But I don't tell it, after so many tries, stop.

[00:05:06] I mean, you got to stop or else the battery will go bad if it doesn't start. Hey, it'll just keep trying to start the car until the battery goes dead and then it'll just keep trying. It's a click, click, click. It's still trying because I never told it to stop at some point for safety reasons. Yeah, pretty soon you burn through $500 million like that company did. And that's what we're trying to warn against, ladies and gentlemen. And we're not really trying to be critical of anybody.
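
The "keep trying to start the car" problem is the classic case for a retry limit. Here is a minimal Python sketch, assuming a hypothetical try_start function that returns True when the car starts; the names and the limit of three attempts are made up for illustration.

```python
def start_with_limit(try_start, max_attempts=5):
    """Call try_start until it succeeds or the attempt limit is reached.

    Returns the number of attempts used on success, or None if it gave up.
    """
    for attempt in range(1, max_attempts + 1):
        if try_start():
            return attempt
    return None  # Stop here rather than draining the battery.

# Hypothetical car that never starts: it only counts the clicks.
clicks = []

def dead_car():
    clicks.append("click")
    return False

result = start_with_limit(dead_car, max_attempts=3)
print(result, len(clicks))
```

Without the max_attempts guardrail the loop would have no reason to end, which is the same shape of mistake as an agent that keeps spending on a task that will never succeed.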

[00:05:30] What we're trying to do is help you learn the guardrails, learn the appropriate actions to make sure that AI becomes a blessing to you, a value to you. And that's why I highlighted that $100 a month is nothing, folks, if it's really giving that much power to each employee. You take an employee that makes a lot of money, say $60,000, $70,000, $80,000, $100,000 a year, whatever they make. And you give them an AI tool and it costs $200, $300 a month for them to use that tool all the time.

[00:05:53] If it's bringing, you know, a third increase in productivity or a 25% increase in productivity, you have scored as a company. You really have. Don't pretend you haven't. For safety reasons, by the way, Arena restricts their agents from directly accessing email and messaging systems and confines those agents to protected environments because they don't want damage to be done. So that's where this company is that analyzes this stuff, Ben.

[00:06:23] They're saying, hey, these findings highlight both the promise and the limitations of AI today, for today's AI revolution. It's great, folks, but we're nowhere near the hype and the promise. And I submit to you that because of computer chip expense, because of needing to build these massive data centers for the increased use of these tools, because Anthropic and all these other companies are trying to find out what is the cost for a user?

[00:06:50] They don't know that because if I at first just want one agent to research something, it takes this many dollars. But what if I then learn the fan command, where I tell it to use 25 agents to research something for me? Instantly, my usage massively increases, right? So they're becoming powerful productivity tools, Ben, capable of handling increasingly significant tasks. They still require oversight and verification. Do you want to speak to that? Because this is exactly what your world is day to day, right?

[00:07:21] Yeah. One of the things that I think happens a lot is, you know, the CEOs of these companies overpromise like crazy, right? A couple of years ago, a year and a half ago or something, the CEO of Anthropic said, hey, in a year, we won't need developers anymore. We just won't need them, you know? And here it is now, a year or 18 months since. And sure, you know, developer jobs are kind of, you know, odd. It's a weird market.

[00:07:49] But when you have, you know, data centers that try to get built in Box Elder, Utah, and the whole community comes against it, that's really going to, you know, stop a lot of the AI progression. You know, if data centers can't get built while everybody's usage increases, as we become more efficient in, you know, understanding our workflows and we can build, you know, more complex workflows,

[00:08:14] and like you said, when you, you know, hit fan and you go to 25 agents doing research, you know, it can't cost $20 anymore. Anyway, it's interesting. They say, well, AI agents are becoming more powerful productivity tools. They still require oversight and verification. I mentioned that, but I want to tie that to this concept. The technology is advancing quickly, folks. And it can actually do way more than we give it credit for. But here's the deal.

[00:08:38] Well, the replacement of human judgment is entirely premature. For businesses, educators, programmers, office workers, the real question may no longer be whether AI agents will become part of your work. It'll be how do you make it part of your work but leave the judgment to us? We'll talk about that with Ben Bushman in seconds. You're watching Tech Watch, where we keep an eye on tech so you don't have to.

[00:09:06] Brought to you by NetworkProvidersInc.com. Bypass the mainstream narrative with Liberty News Radio at LibertyNewsRadio.com. Engage with charismatic hosts live or on demand. We cover the crucial news focused on God, family, and country. News that other networks simply refuse to use. Think of LNR as hard-hitting news and podcasts at your fingertips anytime, anywhere.

[00:09:36] Join us at LibertyNewsRadio.com. Empower your day with the truth. Because the truth will set you free. LibertyNewsRadio.com. Cybercrime is exploding. Take Sarah from Sweet Delights, whose world crumbled after having to close with not being able to bounce back. Small businesses are prime targets, but the right strategies can keep yours safe.

[00:10:00] Jay Hill, CEO of Network Providers, has co-authored The Cyber Playbook. It simplifies cybersecurity for business owners with strategies to avoid costly breaches and fines, build a strong cyber attack response, and secure your business with key protections. Cyber threats aren't slowing down, but you can stay ahead. Protect your business. Ensure its security for tomorrow.

[00:10:25] Get the Cyber Playbook today at NetworkProvidersInc.com/cyber-playbook. Or call 385-446-5500 now. All right, back with you live, ladies and gentlemen.

[00:10:55] I'm just telling you, it's funny. I loved Ben's example of, you know, you were near the car wash. So should I drive my car to the car wash or should I walk? It came back and said, it's better to walk. It's closer. Why would you drive your car there? So you go to the car wash and then you say, well, I don't have my car. I'm at the car wash and I left my car at home. What do I do? And here's ChatGPT's response. Ben? So ChatGPT says, I've successfully optimized the commute and accidentally optimized the car away. Your options are walk back home and get the car.

[00:11:25] Wash yourself instead, since you're already at the wash. Stand confidently next to an empty bay and critique other people's washing. Or call someone to bring your car and pretend this was your plan all along. On the bright side, you've proven that walking was the correct choice for getting to the car wash. It was just the wrong choice for getting your car there. See how all the blame was placed on me and not the AI agent. There you go. And I think option D is the answer, don't you, Ben?

[00:11:52] Yeah, call someone to bring your car and pretend it was your plan all along. Yeah, call your wife and be like, honey, I need you to bring the car to the car wash so I can wash it. She's like, what? He's like, listen, I walked over here because it was more efficient. Now I need you to... All right, Micron's in the news, ladies and gentlemen. This is interesting because they have significant Utah headquarters. That's why I'm kind of interested in it. But Micron now briefly tops $1 trillion market value. Why?

[00:12:23] Basically because their stock surged because the increased demand for memory chips is just through the roof, Ben. And this is where I think AI is real. I think it's going to change lives. But if everybody thinks it's going to happen tomorrow, they're wrong. We've got to really deal with the human judgment issue. How much of your life are you going to turn over to it? You're going to let it access your bank account? How much access to things are you going to give it? How much autonomy are you going to give it? How much trust do you put in it? When do you want the decisions and the judgments being made?

[00:12:53] That's issue one that's going to have to really be worked through for these companies. And we're getting ahead of it because the technology can do a lot of cool things. But beware, people. The second issue, though, is this big one about we've got to build these data centers. We've got to have chips. Pretty soon, AI will have all the chips and I won't even be able to buy a computer. Well, then how am I going to use AI? It's kind of like the car wash example, Ben. Hey, we optimize this thing so much that now you can't access it. We optimized you out of it, Ben.

[00:13:19] And so we've really got to kind of think through this stuff and find out, you know, how is this going to unfold? I believe we're going to hit an AI wall. And I don't know when, but pretty soon, where we're going to have to back off our expectations until these data centers get built, until we crank up enough memory plants, until we decide that it can become profitable, until we decide these human decisions of how much control are we going to give it and what decisions are we willing to let it make. These things, because of the hype, have been ignored,

[00:13:47] but they're going to come crashing down on us. It's going to be like the dot-com bust 2.0 kind of a thing, I believe coming fairly soon. I don't know when, and I'm not preaching catastrophe, but I am telling you, these setbacks are going to happen, Ben. Yeah, there's already things, you know, going on where, you know, companies will use AI for their customer service or support, and they start talking to other companies using AI. So now AI are just going back and forth constantly,

[00:14:16] and they can't progress on anything because there's no human to actually do anything. And it just goes back and forth, back and forth. Well, I, by the way, use Claude and ChatGPT back and forth, so I'll create a document, and then I'll say, hey, Claude, can you clean this up, make it look good, fix my spelling, fix my grammar, da-da-da, then it'll get done. Then I'll take it to ChatGPT, and I'll say, hey, man, I wrote this awesome document. Can you make improvements to it? And ChatGPT comes back and says, yes, these improvements are good. And then I basically say, cool, adopt them. And then I take it back to Claude, and I say, hey, me and ChatGPT have been working on this document. Can you improve it?

[00:14:45] And of course, you know, these have to outdo each other. So they just improved my document. And if you go several rounds, and it actually helps, it's incredible. You've got two worker bees now. If you pay 20 bucks for each of them, that's 40 bucks for these two double writing assistants. It's incredible. However, at the same time, if you keep going back and forth eventually, it just starts going sideways. It comes back and says, well, if your main point of the document is this, then you've got to do this. And if your main point of the document is that, you've got to do that. And now you start going sideways. And that's where a human needs to go. Okay, wait, I'm the judgment guy here.

[00:15:14] You guys work for me, you little worker bees. You're not in charge around here. We created you, not the other way around. I want to stop now, because the document is just going sideways. You can't genuinely find ways to make it better. You can find ways to morph it, change its scope, but that's not what I'm after, see? And so then you've got to stop and go, okay. But Apple now is in the news, Ben, and I find this very interesting. They're focusing more on AI that runs directly on devices

[00:15:42] rather than relying exclusively on cloud processing. And the reason they're doing this, Ben, is because, again, they're running out of cloud capabilities. Like how much computing power can they bring to the table? They want you to bring some to the table too. This is good and bad as far as I can see. What do you say? Get ready for the new iPhone to cost $2,000 starting out. Yeah, I think it's an interesting concept. You know, I think anybody, you know,

[00:16:12] wants their phone to be able to do more for them. You know, I say, hey, Siri, you know, give me the stats on this, and it just says looking up on the web, right? And I can't really progress with it. But so the idea of having more of a locally run AI tool that could do a lot of things for you on your, you know, mobile device, it's kind of an awesome idea. You know, but the only way they can do it is by increasing hardware, which just drives up price, and you still pay for it. And that's the problem.

[00:16:42] So now Google is involved in a similar thing. Google Chrome, Ben, I don't know if you know this, but Google Chrome has quietly started installing a massive 4 gig artificial intelligence model onto users' computer hard drives. Did you know that? I didn't know that. There's big concerns about privacy, about storage use, about all kinds of things.

[00:17:10] The hidden download is part of Google's Gemini Nano system. That's a local AI model built directly into Chrome. And unlike cloud-based AI, this operates on the user's PC or Mac, and it powers different features like web page summaries, scam and phishing warnings, AI writing assistance, text rewording,

[00:17:42] tab organization for your Chrome browsers. The problem is many users never even knew the software was being installed or downloaded, and it's a large file. It's called weights.bin, so you can go search your computer, and if you have weights.bin on there, it'll consume more than four gig, and even if you delete that file, it'll just reinstall it.
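
Searching a computer for that file can be done with a short Python script. This is a rough sketch, not an official check: the function name find_files is made up, and where Chrome actually keeps the file varies by system, so the sketch simply walks whatever folder it is given and reports any file named weights.bin along with its size.

```python
import os

def find_files(root, name="weights.bin"):
    """Walk root and return (path, size_in_bytes) for files with this name."""
    matches = []
    for folder, _subfolders, files in os.walk(root):
        for filename in files:
            # Compare case-insensitively so WEIGHTS.BIN is also found.
            if filename.lower() == name.lower():
                path = os.path.join(folder, filename)
                try:
                    matches.append((path, os.path.getsize(path)))
                except OSError:
                    # File vanished or is unreadable; skip it.
                    continue
    return matches
```

As a usage example, calling find_files(os.path.expanduser("~")) searches the current user's home folder, and a result with a size above 4,000,000,000 bytes would match the "more than four gig" description. Folders the script is not allowed to read are skipped silently, so an empty result is a good sign but not proof the file is absent.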

[00:18:10] So it's installed in the Chrome app folder or Chrome app data folders and stuff where you can check it out. Do you have that file? It's kind of weird, Ben. I don't know how to respond to this, but it's very strange. Now, supporters of local AI say, hey, this is awesome. It's faster and more responsive. Hey, it prevents sending all kinds of information to the cloud, which is a security kind of issue,

[00:18:39] but critics say Google crossed a line by installing such a massive package without a warning. Anyway, you can disable it inside Chrome. What do you say to this, Ben? Do you have one of these files? I'm now searching my computer for it right now to see. I came across a news article, though, from Wired talking about, have you heard of the FROST attacks? Uh-oh. What are they? This is a new technique. I want to stay frosty, and I haven't heard of the attacks, Ben.

[00:19:08] FROST stands for Fingerprinting Remotely using OPFS-based SSD Timing. Essentially, the concept is websites can spy on you through your hard drive. And so when Google installs... Say that last part again. All right. We'll get Ben back. But this is serious, though, folks. So now you got to, what, spoof my fingerprint or my thumbprint with this remote technology

[00:19:38] and all this kind of stuff? I don't know how to respond to that exactly. It's very, very strange, the things that are going on and that we're discovering with a lot of this stuff. I don't exactly know how to respond to a lot of it because there's so many things changing so fast. And this is the human judgment point coming up again. With human judgment, do we want these things just happening? Ben, you were continuing.

[00:20:06] This thing's going to steal my fingerprint? Yeah. So my network cut out, so I missed a little bit what you said. Yeah, don't worry about it. I'm just saying, you're saying this thing's going to steal my fingerprint? So yeah, what it does is, I'm trying to see what it said. It was fingerprinting remotely using OPFS-based SSD timing. Essentially, websites can use JavaScript

[00:20:35] that interacts with your origin private file system, an allocated storage space that's reserved for a specific site to run code needed to complete a given task. Websites can create one with no interaction required by the visitor. So with this thing from Google that you're talking about, this reminded me of this article about how once we start installing all these things for browsers and all of this stuff, the security concerns just pop up right after. Well, and you might approve something for a very logical reason.

[00:21:04] But what you don't understand and what you need to consider and focus on and think about is, what did you approve that you may not realize you approved? Right? In other words, it might not just be the scope you think it is. It might be, yes, that's the right scope, but because of that scope, we can now do this, this, this, and that, which is a big concern. Well, I don't have that file on my computer, Ben, I searched. Did you have it on your computer? I'm not seeing it yet. All right.

[00:21:34] I searched my computer and I don't have it. So that's good news for me. You know, I'm in the clear, bro. Well, I guess you can't have scam and phishing warnings or text rewording or Chrome tab organization. So these are all the things they claim happen with this file. Yeah. The question becomes, does it do it on the web if I don't have the file? And if I do have the file, does it do it locally, or does it disable those things? But, you know, some of the details are what we still need to kind of dig into and figure out, right?

[00:22:04] Yeah. I mean, I don't know. So it's weights. W-E-I-G-H-T-S dot bin. And I've got a program on my computer called Everything and it doesn't find it for me. So I'm in good shape. But it's possible that maybe it's somewhere I'm not looking too. I think I'm searching my whole hard drive. But yeah, zero objects. I don't have it. Anyway, very interesting stuff, folks. And we wanted to bring this to your attention.

[00:22:34] And don't get me wrong. We're not down on AI. Believe it or not, I'm conducting an AI training coming up in partnership with my Chamber of Commerce, American Fork, Utah, where we'll go to EmTech and I'll basically provide a free hour training on AI. We'll basically, it'll be a two hour lunch event where there'll be about, you know, 45 minutes of training and then there'll be a half hour for lunch and then we'll come back and finish up with some Q&As on that. And it's free and we'll buy you lunch to attend.

[00:23:03] You say, why would you do that, Sam? Because it lets people know who Network Providers is. It lets people know that we're consultants who are in the business. And then we're offering extended training for AI where we can literally go through your system and work on what AI really can do for you as a company, okay? Because people really need consultants to kind of look through this and say, what kind of benefit is AI to my company? And people need to realize

[00:23:33] that it's very difficult for AI to do well when you have a mess of your systems. Okay, the first thing you got to do if you really want AI to be productive and valuable is you've got to work on your system layer. Let's just quickly explain what is your source of truth? Let's say for your orders if you're a company and you take orders for products or services. Who controls your orders? Is that the single source of truth for orders?

[00:24:02] And so that's a simple example but people have got to kind of get an understanding of these different layers that this technology taps into. ChatGPT, Claude, Grok, whatever you're using, Gemini, all those are is the reasoning layer, Ben. That's all they are. They don't deal with any of the data underneath. They don't deal with any of the permissions or security. There's other layers that we've got to kind of talk through and so my training deals with all of that. It's coming up and we'll provide more details

[00:24:32] on it as well. Maybe we can even get somebody to take a video of it and put it online. We'll see. All right, thanks for being along for the ride. That's a wrap, Ben. Any final quick comment? You've got 10 seconds. I'm excited about the AI future. It's just going to come with a cost and I'm excited to see what the cost is. Make it a good.