foo for thought

AI is interesting. It can make an entire mobile app from a single sentence, but the moment you ask it to fix a bug, it spins in circles for hours. This seems to be the core problem with using AI for development: it can get 70% of the way there, but crossing the finish line is like pulling teeth.

“Anyone can code” actually means “anyone can code a buggy MVP of what they’re trying to make”. At least, that’s what it means now. That’s not to say the same will hold true next week, next month, or even next year. The technology is developing fast, so I wouldn’t be surprised if it did change. Regardless, this seems to be a common failure mode for AI: it can’t cross the finish line on tasks that demand sustained reasoning.

Perhaps the most common use case of AI is as a developer tool. After all, developers are the ones making these things, so why wouldn’t they want to make their own lives a bit easier? Except, in practice, it often doesn’t.

I’ve spent the past couple of months using AI as part of my development process. I’ve used lots of different IDEs, full-stack builders, and chatbots such as Grok to assist with reasoning. My findings are rather interesting. I’ve learned a lot about what AI fundamentally can and cannot do, and today I’d like to share that with you.

0-50%

When starting a project, especially in development, there tends to be a massive amount of boilerplate. Choosing a framework and figuring out how to organize your files are just a couple of the aches of setting up a project. Fortunately, AI is very good at kick-starting a new project. We see this capability getting better every day with tools such as v0 and Lovable. With just a prompt, you can spin up an entire end-to-end application. It’s truly remarkable what you can do.
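To give a sense of what I mean (the app idea here is my own made-up example, not something from either tool’s docs): a single prompt like “Build a recipe-sharing web app with user accounts, a feed page, and the ability to save favorites” is roughly the level of detail these tools need to hand back a working, clickable project with the pages, components, and data wiring already stubbed out for you.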

This is very cool for a number of reasons. Personally, I hate all of this boilerplate because it eats up so much of my day. Now, if I know exactly what I want, AI can lay the foundations of the project for me. I think it’s only going to get better with time, and it’s already pretty amazing.

Consensus on 0-50% effectiveness: Extremely effective

50-70%

This is the area that starts to get messy. This is the part in development where you’ve already organized your files, already gotten a nice working MVP, and now it’s about refining it to fit your specific use case. I find that AI can be either really good or really bad at this, depending on how good your prompt engineering skills are.

When I say it’s highly dependent, I mean that not knowing how to use it properly can turn AI from a third arm into a paperweight. If you prompt AI with specific details and a clear direction, it can be really, really good. Notably, the problem solving still has to come from you. That’s where I feel a lot of people get tripped up – they ask AI to do literally everything for them with absolutely no direction. While it can certainly make something out of that, it’s not going to be good, or anything like what you intended. AI’s reasoning capabilities are still not great, especially with little to no sample data.
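To make that concrete, here’s a hypothetical before-and-after (the feature and file name are invented for illustration): “make login better” gives the model almost nothing to work with, while “in auth/login.ts, add rate limiting to the login endpoint – max 5 attempts per IP per 15 minutes, return a 429 when exceeded, and reuse the existing Redis client for the counters” has already done the problem solving for it. The second prompt works precisely because the design decisions came from me, not the model.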

However, if you give AI the tools and resources it needs, combined with effective prompting, the results can be astonishingly good. Proper planning and design beforehand turns entering prompts into literal sorcery. I feel like a wizard when I can one-shot a prompt into Cursor. It’s the coolest thing ever. Keep in mind, though, that you still have to know what you’re doing. You can’t make an effective prompt for code if you don’t know how to code.

With that said, even if you know what you’re doing and manage to pull off that wizardry a few times, I can’t say it’s as good here as it is at starting a project. There are so many variables and so many things that can go wrong when engineering something; the complexity keeps scaling the deeper you dive into your project, and I’m not sure any AI will ever get it 100% right. But I think models in the near future will be able to drive this section’s effectiveness up quite a bit, so I’m looking forward to seeing what OpenAI and friends bring us.

Consensus on 50-70% effectiveness: Terrible to mildly effective, depending on model and prompt engineering

70-100%

This is where the faults of AI really start to show. AI is terrible at taking an idea and correctly engineering it to completion. Absolutely terrible. No matter what IDE I’ve used, no matter how I shape the prompt, it can’t seem to cross the finish line. I think this is because engineering past a certain point gets exponentially more complex. The more you want something to be able to do, the more you have to engineer it to support those capabilities.

It’s really funny how bad an all-knowing LLM is at engineering, and it goes to show that it’s not about how much you know when trying to code something – it’s about how well you can apply it. It doesn’t matter that the LLM has access to millions of lines of code people have written; it doesn’t know how to connect the dots. And even though models are getting better and better at connecting those dots, engineering is infinitely complex. You’ll always have harder problems, you’ll always need more complex solutions to those problems, and you’ll always want to scale further, which demands more reliability and more careful engineering.

Consensus on 70-100% effectiveness: Terrible.

Conclusion

Ever since I was a kid, I’ve thought that the day robots learn how to code is the same day they’ll code more robots and kill everyone on Earth. I don’t completely disagree with that even now; the fact that it hasn’t happened just goes to show that engineering is very hard!

Not only that, but the “takeover” of AI has already been pretty underwhelming. When you look at AI’s capabilities and try to answer “why”, it becomes pretty obvious: AI can’t really reason like us humans can. It can’t go from point A to point B (it can get 70% of the way there really well, but past that it trips over itself too much).

Any number of ChatGPT breakthroughs could make this entire article irrelevant, but given where LLMs stand today, I feel confident in this assessment. I think there’s a lot of complexity to a lot of jobs that we just don’t notice until we throw them at an AI bot to do. Humans are a lot more capable than we give ourselves credit for.

Anyways, we’re safe from the AI takeover for now, and I think it’ll be a while until we actually have to worry. In the meantime, I think it’s best to learn how to use these super useful tools and how to be better at what you do with them, because there’s still a massive amount of potential here.
