AI, in its current form, is alright. Like I mentioned yesterday, it certainly has the potential, albeit untapped, to transform a lot of what we do. But what does that mean for the apps we currently use? And is there some world where AI takes over the notion of an “app” on your phone, or even a website that you go to?
I certainly think so. With enough time, I think AI will (or should) fully replace the way we interact with technology.
Exploring the history of apps
To explain what I’m envisioning, let’s look at how we use technology today. Put simply, technology is a means to access/manipulate data. Looking something up on Google is accessing granular data, social media is a medium to connect with people by accessing information they upload, and enterprise websites/apps provide a medium for completing work. All of these start with an objective of the user, and ultimately end with a piece of material being consumed or manipulated.
This has more or less been true for as long as people have been around. We’ve always thrived on communication as a species, and productivity has been essential to our success. As technology has evolved, more and more things have become possible to do or access. Just 20 years ago, immediately getting access to all of the world’s news and everyone’s thoughts would have been insane. Now, it’s just on your home screen, one tap away.
Apps have evolved to be our way of getting something done, or quickly accessing a specific piece of information. Anything that you need nowadays usually has an app for it.
But, when you think about it a bit, how much easier does the app really make the user experience? Let’s take a task such as submitting a document for work. Without apps/technology, the steps would probably be:
- Receive the specifications for the document by hand
- Gather information via library etc.
- Hand write a document/type it
- Submit it to the appropriate person
With technology, we get:
- Access the specifications for the document via email/message
- Gather information via search/AI/online library
- Type a document in Word
- Email to the appropriate person
Let’s go through the steps one by one and see how technology improves each:
- Receiving the specifications: I would argue that technology makes it only slightly easier to access information. With nearly all information now arriving over technology, the amount thrown at each of us over message/email is far more than if we were using physical documents. So while you don’t need to physically retrieve a document, it’s not significantly easier to locate an email than to ask someone for something in person.
- Gathering information: This is the step I think technology has improved for us drastically. Accessing information has never been easier, and you can do so from the comfort of your own home. Although I will say, I think there’s definitely still friction when looking things up.
- Writing the document: Surely handwriting is much worse than typing, but unless we’re going farther back than the invention of the typewriter, I think we’re good to call this even.
- Submitting the document: Technology also saves us time on this step. You don’t need to physically go to someone to hand over information, and communication has never been easier.
So, technology seems to save us a significant amount of time on some of the steps. Notably, though, there are still the same number of steps. Repeat this process with virtually any procedure you can imagine: technology makes accessing information and communicating much easier, but does not remove any of the actual steps needed to perform tasks. You must still do all of the steps, there’s no way around that… right?
Agentic AI
Agentic AI is a concept that allows you to hand work off to an autonomous agent. This agent would know the steps that you need to do, and be able to do them given the proper permissions and access. For a repetitive task, this is extremely useful.
Agentic AI is the first step people have taken to actually make AI do something. Instead of having a job where you manually do the same thing 500 times, you build an AI agent to do it and run it 500 times. The labor is drastically cut down, freeing employees to do work that isn’t the same thing every time.
Something of note here is that this only applies to things on a computer – we haven’t really got to the point yet with robots where we replace the more blue collar jobs, and I think we’re very far from that!
So sure, agentic AI seems to be good for looping the same thing over and over again. Even for micro tasks at work (e.g. you’re always checking emails from the same person, or always submitting to the exact same document), agentic AI helps a lot. It saves you time in those moments so you can spend it on real work.
What happens when you change the task slightly? This is, in my opinion, where the idea of agentic AI crumbles. Slight variations in a task or methodology still require human intervention. If you’re constantly tweaking and rebuilding an agentic AI for an automated process, oftentimes it might be easier to do the process yourself! This is the classic engineer conundrum where an engineer will spend five hours to automate a manual task that can be done in one. You can spend all day making agents to do tasks, but managing these agents will become just as difficult as managing the tasks yourself!
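To put rough numbers on that conundrum, here is a minimal Python sketch of the break-even point for automating a task. The function name, its parameters, and the tweak figures are all assumptions made up for illustration; only the five-hours-versus-one-hour comparison comes from the paragraph above.

```python
def break_even_runs(build_hours: float,
                    manual_hours_per_run: float,
                    agent_hours_per_run: float = 0.0,
                    rework_hours_per_change: float = 0.0,
                    change_rate: float = 0.0) -> float:
    """Runs needed before automating beats doing the task by hand.

    All inputs are hypothetical. change_rate is the fraction of runs
    where the task varies enough that the agent has to be tweaked.
    """
    # Time saved per run, minus the average time spent re-tweaking the agent.
    saved_per_run = (manual_hours_per_run
                     - agent_hours_per_run
                     - change_rate * rework_hours_per_change)
    if saved_per_run <= 0:
        return float("inf")  # the automation never pays for itself
    return build_hours / saved_per_run


# Five hours to automate a one-hour task that never changes.
print(break_even_runs(5, 1))  # 5.0
# Same task, but a quarter of the runs need a two-hour tweak.
print(break_even_runs(5, 1, rework_hours_per_change=2, change_rate=0.25))  # 10.0
# Half of the runs need a two-hour tweak.
print(break_even_runs(5, 1, rework_hours_per_change=2, change_rate=0.5))  # inf
```

Under these made-up numbers, a stable task pays back the five hours after five runs, occasional tweaks double that, and frequent tweaks mean the agent never catches up with just doing the work yourself.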
So, while agentic AI is an improvement over having no agents at all, I think we can do better. You’re basically just making automations that take out the small things. Automations have existed for years, and employees still exist! People still hire other people despite a lot of work “technically” being doable via automations/computing, just because the overhead of building and maintaining these things is oftentimes more than the cost of hiring someone to do the work.
MCPs
MCP (Model Context Protocol) is a concept that has just been brought to life, and one which I think has the potential to change everything. Basically, it’s a standard way to tell AI how to interact with a certain app or framework. I haven’t quite explored the technical implementation myself so my understanding might be a bit fuzzy, but it sounds like you can layer these “AI extensions” on each other: each app exposes its capabilities through its own MCP server, and one AI application can connect to many of them at once. MCP is the shared protocol that lets the AI talk to all of these “extensions” in the same way.
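As a rough mental model of that layering, here is a toy sketch in Python. It is not the real MCP SDK or wire protocol: `Tool`, `ToolRegistry`, and the two example tools are hypothetical names invented for illustration. The only idea it tries to show is that every “extension” describes itself and gets called in the same way, so an AI can choose between them without custom glue for each app.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    """One hypothetical "extension": a name, a description, and a callable."""
    name: str
    description: str  # what the AI would read to decide when to use the tool
    run: Callable[[str], str]


class ToolRegistry:
    """Toy stand-in for the shared layer that all extensions plug into."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def describe(self) -> str:
        # A single listing the AI could be shown, whatever app each tool wraps.
        return "\n".join(f"{t.name}: {t.description}" for t in self._tools.values())

    def call(self, name: str, argument: str) -> str:
        # Every tool is invoked the same way, regardless of the app behind it.
        return self._tools[name].run(argument)


registry = ToolRegistry()
registry.register(Tool("email.search", "Find emails matching a query",
                       lambda query: f"emails about {query}"))
registry.register(Tool("docs.create", "Create a document with a title",
                       lambda title: f"created document '{title}'"))

print(registry.describe())
print(registry.call("email.search", "Q3 report"))
```

Adding a third app here means registering one more tool; nothing else in the registry changes, which is the property the “layering” idea depends on.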
Not only does this make building agentic experiences easier, but eventually I think this could wipe out apps as we know them. Let’s go back to our apps exploration – we talked about how apps are a means to accessing and manipulating information. With AI + MCPs, we can essentially create an AI that is capable of doing the steps for us.
Let’s go back to our previous example of submitting a document, which was 4 steps both with and without technology. Let’s first revisit this with an AI agent that was built to handle the capabilities we’d like, and was configured with maximum efficiency. The steps would be:
- The agent would send you the requirements
- You’d still need to do research for the document
- You’d still write the document in Word
- Clicking one button would automatically submit the document to the correct person in the correct format
It’s still 4 steps, but why? Even though the agentic experience reduces some of our steps to literally just a click, part of this flow is unavoidable: we as humans still need to access the web for research ourselves, and we still need to read and interpret the requirements ourselves.
Let’s look at a potential integration of agentic AI built on MCP: an agent that is not only automated to perform a series of tasks, but also understands the intent behind them and shares that intent with every part of itself. MCP allows the requirements-fetching, research, and submission parts of the agent to all work from the same context easily. None of this is to say MCP is required, but I’ll get to why this would be nearly impossible to scale without MCP or a similar technology. In this new agentic experience, the steps might look something like:
- The agent would send you the requirements, handle the appropriate research while staying aware of your intent and any other relevant communications, and draft a document with generative AI
- You’d manually approve the generation or suggest improvements and submit the document
Finally, a decrease in our number of steps! Notably, this takes the experience down from constant involvement on our part to the agent handling pretty much everything, with us just needing to confirm the result.
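The two-step flow above could be sketched roughly like this in Python. Everything here is a hypothetical stand-in: the `Context` class, the three step functions, and the `approve` callback are assumptions for illustration, and a real agent would call actual email, search, and model services where this sketch just builds strings. The point is only the shape: every step reads and writes one shared context, and the human appears once, at the approval gate.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Context:
    """Shared state that every part of the hypothetical agent can see."""
    intent: str
    notes: List[str] = field(default_factory=list)
    draft: str = ""
    submitted: bool = False


def fetch_requirements(ctx: Context) -> None:
    ctx.notes.append(f"requirements for: {ctx.intent}")


def research(ctx: Context) -> None:
    ctx.notes.append(f"research on: {ctx.intent}")


def write_draft(ctx: Context) -> None:
    # The draft is built from everything the earlier steps left in the context.
    ctx.draft = "Draft based on " + "; ".join(ctx.notes)


def run_workflow(intent: str, approve: Callable[[str], bool]) -> Context:
    ctx = Context(intent=intent)
    # Step 1: the agent handles requirements, research, and drafting.
    for step in (fetch_requirements, research, write_draft):
        step(ctx)
    # Step 2: a human approves the draft (or not) before anything is submitted.
    if approve(ctx.draft):
        ctx.submitted = True
    return ctx


result = run_workflow("Q3 report", approve=lambda draft: True)
print(result.draft)
print(result.submitted)
```

If the `approve` callback returns `False`, nothing is submitted and the draft stays in the context, which is where a “suggest improvements” loop would hook in.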
Going back to my previous point that “MCP or a similar technology is needed for this”: the reason it needs to be modular is that we can’t predict every single use case/automation we’d need to build. Everyone has a completely different use case/need, so I don’t think agents alone solve our problem. We saw before that, even though we had an agent, if it wasn’t contextually aware of everything, you’d still have to do the same number of steps. And even if it was, and you were able to implement something like the MCP example on your own, this does not scale. Spending tons of work on one specific workflow falls right into the trap of “overengineering” we talked about earlier.
The other important part about modularity and MCPs is that AI models are evolving so rapidly. The idea of what an AI “can” and “can’t” do is changing literally by the day. So, getting stuck in one way of developing or integrating something would be shooting yourself in the foot in this extremely fast-paced environment.
How the future can happen
I believe MCPs are a great start toward enabling this modular, contextually aware future of AI. Eventually, you should just be able to talk to your phone and have it perform any task/chain of tasks perfectly. There’s no reason that this can’t happen with our current technology.
The reason this might not happen is competition. Companies love to compete, and they love to make frameworks that others can’t use. However, I’m at least a little optimistic about this, considering OpenAI has already adopted MCP, which was initially proposed and launched by Anthropic, a direct competitor. So, hopefully this continues and people will be nice to each other for the sake of benefiting technology.
I’m going to try my best to understand MCPs and try to make sense of all this – tomorrow OpenAI might release another model that makes this entire article outdated. But I think the idea will always remain – modularity will win in the massively complex and wide field of AI and LLMs. As long as we work together, I think we can completely change how we see technology forever.