AI Is Kind of Exciting, Actually

Tech is boring these days, but recent AI advances are catching my attention.

Honestly, I don't get legitimately impressed or excited by much in tech these days.  All the breathless takes on the hot new trends are exhausting after a while.  One can only hear "[x] will revolutionize [y] as we know it!" (where x is some new technology and y is work, the economy, or Technology™) so many times before everything seems like it's a big game of crying wolf for clicks.

To my brain, most of these things have either been obvious pump-and-dumps, overblown pipe dreams that won't catch on, or incremental bumps at best.  Apple's shrug emoji was modeled after my reaction to most of the drooling around cryptocurrency.  Federated social media puts me to sleep.  I yawned at VR.  "Web 3.0" seems like maybe someone bumped the major version way, way too soon.  But the current wave of AI interest centered on LLMs has actually caught my attention and gotten me a little excited about where it goes.  Let me explain.

It's Not Really "Just" Incremental

The current wave of foundational LLMs is useful at least as much because of how thoroughly these models are trained as because of any gigantic leap forward in AI implementation.  The architecture for this sort of model has been improving, yes, but the recent surge isn't due to someone unlocking some Pandora's box of new algorithms that suddenly made them so much better.  Rather, iterations on the technology, paired with the creation of thoroughly trained, generalized foundational models, have created an environment ripe for advancement in the actual application of these models to real products.

These factors represent a big leap forward in actual utility for most teams.  Compared to previous approaches that required a whole machine learning team or paying for insanely priced SaaS ML products with intense training requirements, a small team can now add some degree of AI to their product without spending millions of dollars or months of time.  Things like OpenAI's GPT-[x] and Meta's LLaMA provide accessible, straightforward-to-integrate models that understand enough to get a team from 0 to (at least close to) 1, and from there, they can refine and iterate on the implementation.
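
To make that concrete, here's a minimal sketch of what "dropping a model in" can look like using the OpenAI Python SDK.  The model name, the ticket-summarizing feature, and the prompts are all hypothetical placeholders I made up for illustration, not anything prescriptive:

```python
# A minimal sketch of calling a hosted model with the OpenAI Python SDK.
# The model name, prompt, and "summarize a support ticket" feature are all
# placeholders -- the point is how little scaffolding a first experiment needs.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def summarize_ticket(ticket_text: str) -> str:
    """Hypothetical feature: summarize a support ticket for a dashboard."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; pick whatever model fits your budget
        messages=[
            {"role": "system", "content": "Summarize support tickets in one sentence."},
            {"role": "user", "content": ticket_text},
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    print(summarize_ticket("Customer can't reset their password after the latest update."))
```

That's roughly the whole integration surface for a first experiment, and depending on your stack, a self-hosted model can often be swapped in later behind a similar, API-compatible client.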

Will it be as tailored as a from-scratch model or some other custom-trained machine learning solution?  No.  But the immediacy of the gains is, I would argue, more important.  These models let teams that otherwise would have no possibility of even experimenting with AI implement it in their product, test it, and validate whether it's worth further investment.

And that's the truly exciting thing to me.  This current wave of models is allowing smaller, scrappier teams to level up and go toe-to-toe with The Big Guys™.  You don't need the resources of IBM anymore to have solid AI-powered features in your product.  You can ramp up the development of a totally new AI-powered product without getting bogged down in months of training and setup.  Using AI won't require you to sell your house and kids, raise a Series A and a Series B, and liquidate your entire workforce's 401(k)s before development can start.  Teams can drop these models in, experiment, innovate, and see what happens for about $0.08, which is incredible.

But It's Going to Ruin Everything!

Nah.

No, really!

It will impact some jobs, for sure, but I think we're still in the "let's measure the impact" phase of things.  It's very obvious some people are sprinkling AI into things where it doesn't belong, and it's even more obvious that people are replacing things with AI that should never be replaced with AI.  We'll absolutely see a harsh pullback on some of these things.  The seemingly unstoppable momentum that folks are predicting will eventually slow.

At the same time, as I said, for teams that realize that these models are tools just like their web frameworks or their IDEs and use them accordingly, this will represent a huge positive leap forward.  I'm excited to see what comes out of this period of experimentation!

Sooo...Where Can I Learn More?

I was reluctant to really engage with this stuff at first, but the more I dug in, the more I saw some potential.  If you're an engineer (or just a curious person!) who wants to dip your toes in, here are a few links that I found helpful:

  • Brex's Prompt Engineering Guide.  It's not just about prompt engineering (i.e., learning how to structure prompts to get models to respond with what you want).  It's actually a good onramp to understanding the history of these models and how they work: https://github.com/brexhq/prompt-engineering/blob/main/README.md
  • Numbers every LLM Developer should know.  A nice cheat sheet of numbers, from pricing to token calculations.  This guide is especially useful if you're the one footing the bill for your experiments. 😬 (I've sketched a rough cost estimate in that spirit after this list.)  https://github.com/ray-project/llm-numbers
  • Let's build GPT: from scratch, in code, spelled out.  If you really want to dig in to how these models work, this video is an excellent guide as someone builds a GPT language model from scratch.  It is about 2 hours long, but worth it in my opinion: https://www.youtube.com/watch?v=kCc8FmEb1nY
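
Speaking of footing the bill: here's a rough, back-of-envelope cost sketch in the spirit of that numbers cheat sheet.  The per-token prices are placeholder assumptions I invented for illustration (check your provider's actual pricing page), and it uses the tiktoken library to count prompt tokens:

```python
# A back-of-envelope cost estimate for a single LLM call.
# The prices below are made-up placeholders, NOT real quotes -- check your
# provider's current pricing before trusting any number this prints.
import tiktoken

# Hypothetical prices in dollars per 1,000 tokens (assumptions for illustration).
PRICE_PER_1K_INPUT = 0.0005
PRICE_PER_1K_OUTPUT = 0.0015


def estimate_cost(prompt: str, expected_output_tokens: int = 200) -> float:
    """Estimate the cost of one call from the prompt and a guess at output length."""
    enc = tiktoken.get_encoding("cl100k_base")  # encoding used by recent OpenAI chat models
    input_tokens = len(enc.encode(prompt))
    input_cost = (input_tokens / 1000) * PRICE_PER_1K_INPUT
    output_cost = (expected_output_tokens / 1000) * PRICE_PER_1K_OUTPUT
    return input_cost + output_cost


if __name__ == "__main__":
    prompt = "Summarize this support ticket: the customer can't reset their password."
    print(f"Estimated cost per call: ${estimate_cost(prompt):.6f}")
```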