I’m excited about genAI products like ChatGPT, Gemini, Meta AI, and Siri becoming personal assistants in the not-too-distant future because of one particular use case. I like genAI’s ability to answer all sorts of questions, even though I know it can hallucinate information. I’m also a fan of what OpenAI has achieved with Advanced Voice Mode. But my end goal is to be able to control all my devices with my voice.
I want powerful, capable AI that understands my intentions when using my Mac and iPhone. I want it to understand complex commands, such as instructing it via voice to open certain apps, browse certain sites, perform searches in different tabs, take notes, download or play content, and even make purchases on my behalf.
I can’t wait to start using AI like that. But there’s no way I’m going to use an AI agent like Google’s leaked Jarvis AI anytime soon, if ever.
Why I want AI agents like Jarvis
Adding features like ChatGPT Advanced Voice Mode to my devices will change how I use computers once AI agents like Jarvis become available. I’ll be able to tell the AI to perform specific tasks in the background while I focus on other things.
This isn’t about buying plane tickets via an AI that can browse the web for you, although that sounds awesome. It’s about performing research and browsing the web. That’s what I do most of the time on my Mac and iPhone. I look for information that I need for my day job, which happens to be writing a part of the internet.
Having AI browse the web for me would increase my productivity significantly while speeding up my workflows.
I could tell AI agents like Jarvis to open as many tabs as I want in a browser and perform the required searches while I focus on something else. They might then summarize that information or point me to specific sections on a page by taking over my keyboard and mouse when I instruct them to.
For example, I might want to find more information about Google’s work on AI agents for this very story. I would want to see whether OpenAI has anything waiting in the wings, or if Claude’s AI agents that can browse the web might be a better alternative. Right now, that means performing the research manually, even if I use ChatGPT in its current form to help me.
But imagine using something like Advanced Voice Mode on my Mac to start several queries at once. The AI agent would browse the web for me and open relevant links. If possible, I could instruct it to summarize the information in a separate app.
Since we’re in the early days, I know there’s not a lot about these AI agents to tell at this time. We saw Claude do a task on the web only to then decide on its own to look at pictures. And Jarvis AI has leaked, but it’s not available to end users yet. As for ChatGPT, similar features must be coming.
I’d also browse the web by voice with AI for personal matters. Since I run half-marathon and marathon races, I would tell the AI to find plane tickets and hotel options for future trips. But I’d have the final say in the matter and make the bookings myself.
What happens to my data?
I’m sure (or hope?) that Google’s Jarvis AI will be able to handle some of my browsing needs. But I absolutely won’t use it, not at first. The reasons are twofold. First, I don’t trust Google with my privacy. I would want a tool like Jarvis AI to run primarily on-device. If that’s impossible, I’d want something like Apple’s Private Cloud Compute to handle my requests.
The last thing I want is for Google to use a different AI model to keep track of my internet browsing, and create a detailed profile for targeting me with ads across platforms. Also, I would not want my interactions with Jarvis to help train the model. Importantly, I definitely don’t want Google to keep a history of my browsing with an AI agent.
Second, I’d want an AI agent to browse the web using the browser of my choice, which is not Chrome. Also, I would want it to use a search engine that isn’t Google Search, which I bailed on a long time ago. I think Jarvis AI will be deeply entrenched in Google’s ecosystem, at least initially. That’s the whole point of creating such products. Gemini is already built into plenty of Google apps and devices.
My thinking here is that Jarvis will be available for free to some degree. A premium version might offer the kind of features I’d want.
I am speculating here, and I could be very wrong. Google could surprise me and attempt to match the privacy assurances of Apple Intelligence with such products.
Whether or not I use it, I hope Jarvis AI works great and becomes available soon. The faster these AI agents arrive, the more of them we’ll see. If Google is first with an AI agent like Jarvis, it will pressure OpenAI and everyone else. Microsoft, Anthropic, Meta, Apple, and Samsung would want to offer similar AI capabilities.
Eventually, we’ll probably have all sorts of AI agents. Some might control apps, and others might handle web browsing. These AI agents would probably be baked into operating systems alongside the chatbots that we currently use. And that’s how we get to highly advanced personal assistants that will let us use voice for most of our computing needs.