I am sitting here and listening to The Cure’s classic album Faith from 1981. I love this record. This particular copy of the album is a Record Store Day vinyl release from a few years back. So satisfying to spin it yet again after all these years. I have no idea how long ago I first heard Faith but it never gets old.

Today I finally managed to do something that I have been wanting to do for quite a while: I rebuilt my blog with something other than Wordpress. I have wanted to ditch Wordpress for many years and I’ve taken many runs at the problem of converting the blog to something else but I have never gotten to my goal until now. As of this post, this blog is no longer a Wordpress blog and is now a static web site generated by the hugo static site generator, written in Markdown, and managed via a private git repository using Gitea. The next step will be to move the site from Bluehost, where it has lived for many years, and deploy it on my self-hosted Linux server.
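For the curious, one of the things that makes hugo so pleasant is that the whole site hangs off a single configuration file in the project root. A minimal sketch might look like the following (the title, URL, and theme name here are placeholders, not my actual setup):

```toml
# hugo.toml -- minimal hugo site configuration (placeholder values)
baseURL = "https://example.com/"
languageCode = "en-us"
title = "My Blog"
theme = "some-theme"  # hypothetical theme name

[params]
  description = "A static blog generated by hugo"
```

From there, posts are just Markdown files with a bit of front matter, and `hugo` builds the whole static site into a `public/` directory that can be deployed to any host.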

A few years ago I decided that this was something I wanted to do, and in researching options I landed on hugo as my technology of choice, but after working with it for a while I lost focus and dropped the project. When I came back around to try again I did some more reading about various static site generators and decided that I really wanted to try out 11ty. So I did, and it was also pretty cool, but I still didn’t get the job done, because accomplishing something while also learning how to do it for the first time can be a challenge.

Long story long, I eventually decided to leave well enough alone and stick with Wordpress until I was more equipped to manage the feat. For entirely unrelated reasons I have been spending much of the last week learning the go programming language and that got me thinking about hugo again because it is written in go. You don’t need to actually know how to code in go in order to use hugo but since 11ty is based on NodeJS and I don’t particularly love working with that technology I decided to revisit the static site thing and see what I could do with hugo. (The fact(s) that I run Nuclear Gopher, that go programmers are referred to as “gophers” and that I live in a city called Hugo had nothing whatsoever to do with my thinking… LOL)

The important part here is that this time I actually saw the project through and here I am with a “new” site. It’s not done yet, I still have a punch list of things I want to improve, but at least it’s no longer a bloated Wordpress monstrosity, it’s fast as blazes, everything is plain old static content, it’s 100% under my control, and I can easily deploy it to any host I want. Poifect.

And now for something completely different…

A few weeks back I was in a meeting with my boss and he told me that he wanted me to read a book that he was also reading about “vibe coding”. I was, of course, familiar with vibe coding both in theory and in practice. I have been working in software engineering for longer than I care to admit and try to keep on top of trends. It’s hard to miss a trend this prevalent. If you are not aware of the term, “vibe coding” refers to the practice of creating or modifying software with the aid of “AI” coding assistants. If you’ve spoken to me at any time in the last, oh I don’t know, 5 years(?) and the topic of GPT or LLM or GenAI is broached you know how I feel about this stuff but in case you haven’t, I am not a fan. I hate the economic and ecological impacts, the slop fests, the laziness, the blandness, the boringness and stupidity of synthesized media, really almost everything about this tech.

Almost everything, but not EVERYTHING. I am, after all, a technology guy and this is, after all, a major sea change in the technology landscape. It is literally my job to evaluate this tech for potential usage within the company I work for and to be as dispassionate and level-headed about it as I can be. So, professional that I am, I told him that yes, I would read the book, and yes, I did. I will not include a book review here, but I will summarize the book as about 400 pages of anecdotes trying to convince the reader that vibe coding doesn’t suck and about seven pages of practical exercises that the reader can undertake to decide for themselves.

Sigh…

What the book really did was frame the questions floating around the use of LLMs in software engineering in a new way for me and give me a few starting places to test and challenge my own assumptions about how this tech can fit into my professional life. To date the only really useful “AI” technologies that I had adopted into my workflows were AI-assisted audio dynamics processing and AI-assisted video upscaling, both of which are very technical media management tasks that traditional tools struggle with and the new tools can properly assist with. I spent all of 2024 working for a startup as a full-time senior software engineer, used “AI” coding assistance the whole time, and was absolutely unimpressed. But after reading this book I decided to kick the tires on the current crop of tools and see how much had changed in a year. Perhaps we were starting to have some sort of actual value coming out of this madness.

I wanted to know if one of these tools could generate code that wasn’t completely broken and delusional, if it could actually allow me to be a more productive engineer, if the work could be enjoyable, and if there were ways of working with LLM-assisted reasoning models that were more or less effective than others. Last but not least, I wanted to know if there was a way to use this technology that was private, secure, and ethical. I didn’t care if I could get more code cranked out if it was bad code, if the process was annoying, or if a couple of acres of rainforest had to be burned down or a bunch of hard-working people had to be robbed blind of their lives’ work in order for me to write code (which is a thing I’ve known how to do since I was 8 years old… without a robot).

So, first things first. Private, secure, and ethical. These three requirements all point in the same direction: the ability to run large language models on one’s own computers without sending data to the cloud. And so far my observations on this front are mixed. There is a path, mostly, but it’s not one that most people are going to be able to follow at this point in time.

Large language models can ABSOLUTELY be run on your personal machine without any cloud service or subscription. I know, I’ve done it many times. But there are caveats. I have a MacBook Pro that is 4-5 years old at this point and has 16GB of RAM. On that machine I can run any of hundreds of freely downloadable open LLMs, including small models that feel like a pocket-sized ChatGPT. They are fast and surprisingly capable. For standard chat-type scenarios, I am impressed. I can have casual conversations about just about any topic in any language without being on WiFi. I can ask it to write code, or help me flesh out designs or requirements, whatever, and it just works.

But… that’s because the Mac is an M1, and the M1 has integrated neural and GPU cores all in one package, with all the processor cores sharing access to the same pool of memory. On my Windows laptop the story was completely different. Because my Windows laptop doesn’t have a dedicated GPU and VRAM, everything the LLM does happens on the main processor, and standard CPUs are just slow at LLM work. The performance comes from GPUs, not CPUs. What’s more, large language models are, um, LARGE, and they eat up a lot of memory. Specifically video memory, and even my PC that has a dedicated Nvidia graphics card still only has 6GB of VRAM. That’s not enough to load any LLM that is usable.

So, lucky for me, I have an M-series Mac with just enough RAM to work for smaller models, but of the multiple computers at my disposal it’s the only one that can do it. My desktop gaming PC, other laptops, they just can’t run LLMs. Truth is, the Mac is barely able to do it. It really needs a TON more RAM. Like, 128GB instead of 16GB. A machine like that is very, very expensive, however.
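The memory math behind all of this is simple enough to sketch. A rough rule of thumb (ignoring KV cache and runtime overhead, so treat it as a lower bound) is that a model’s weights take up the parameter count times the bytes per weight:

```python
# Rough memory needed just to hold an LLM's weights in RAM or VRAM.
# This ignores KV cache and runtime overhead, so real usage is higher.

def model_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight memory in gigabytes."""
    bytes_total = params_billion * 1e9 * (bits_per_weight / 8)
    return bytes_total / 1e9

# A 7B-parameter model in 16-bit precision needs ~14 GB -- already too big
# for a 6 GB GPU, and a tight squeeze on a 16 GB Mac once the OS takes its cut.
print(model_memory_gb(7, 16))  # 14.0

# Quantized down to 4 bits per weight it shrinks to ~3.5 GB, which is why
# small quantized models run fine on a 16 GB M1.
print(model_memory_gb(7, 4))   # 3.5

# A 70B model, even quantized, wants 128 GB-class unified memory.
print(model_memory_gb(70, 4))  # 35.0
```

The numbers make the hardware divide obvious: small quantized models fit on a well-equipped laptop, while the bigger, more capable models demand the kind of unified-memory machines discussed below.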

It takes a lot of cloud spending to offset the cost of a local LLM machine in today’s market, but that is changing. AMD is working hard to take the fight to Apple with new AI-focused processors that combine traditional CPUs, GPUs, and NPUs into a unified package with plentiful FAST unified RAM. I could spend less than $2500 today and nab either a second-hand Mac Studio M1 Ultra or a mini-PC based on one of these new AMD packages with 128GB of RAM that could be an LLM beast, if I wanted to. I don’t want to, not yet, but if it’s already possible it seems inevitable that these machines will drop in price and that running local LLMs on our own devices will become normal.
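The breakeven arithmetic is simple to sketch, with the caveat that every number here is an illustrative placeholder, not a real quote:

```python
# Back-of-the-envelope breakeven between a one-time local LLM machine and a
# recurring cloud subscription. All prices are illustrative assumptions.

def breakeven_months(machine_cost: float, monthly_cloud_cost: float) -> float:
    """Months of cloud spending it takes to equal the machine's price."""
    return machine_cost / monthly_cloud_cost

# e.g. a $2500 machine vs. a hypothetical $100/month in cloud AI spend:
print(breakeven_months(2500, 100))  # 25.0 months, a bit over two years
```

At hobbyist levels of cloud spending the local machine takes years to pay for itself, which is why falling hardware prices matter more than the raw capability.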

OK, so, it’s technically possible to run an LLM without paying subscription fees to an evil corporation or contributing to a data center that is destroying the environment. Hell, you could power your local LLM using renewable solar panel generated power. Since an LLM running in this way doesn’t expose your data to the internet it’s secure, private, and sustainable. Ethical though? There was an environmental cost to training the models and, let’s not forget, there are also the ethical questions surrounding the theft of intellectual property that was used in the training of the models. So maybe we get sustainable, but ethical is still a topic for some debate and discussion. I personally don’t think there are any ethically constructed or trained models out there and using any of them has some moral ambiguity to it. Maybe most people don’t care. Most people don’t worry about the morality of eating a hamburger or driving a gasoline powered vehicle or purchasing consumer goods that are made with underpaid overseas child labor either. It is literally harder to live a life without ethical quandaries at this point than it has been at any point in human history. Just see the character of Doug Forcett from The Good Place if you need an illustration of the problem.

The next question I had was whether or not there was actually any significant advantage to working with this tech. Can it literally help you do more of what you actually want to do in a way that lets you stay in control of the output? Unlike an image generator or chat bot, software code needs to be rigorous, secure, maintainable, well formed, scalable, and above all, modifiable. Software applications live for years. Source code is maintained by many different hands and eyes and brains over its lifespan. Badly written code quickly spirals into a nightmare. Mistakes can build upon mistakes. The idea of somebody who doesn’t know how to code using a pattern-generating chatbot to attempt to build functional software is, frankly, TERRIFYING to me. Not because I disrespect people who don’t know how to code but because I learned decades ago that writing code is the easy part of software engineering. Building software systems that work, that scale, and that you can live with is a totally different skill and it’s the real skill of software engineering, not the actual coding. Thinking about problems and how best to solve them with code. The coding part is, well, it’s typing. When I have used coding assistants in the past they have been marginally helpful at saving me some typing, but they have also been kind of idiots, and the more rein they were given the more they made messes I had to clean up. Now I was being told that this was changing and I needed to test whether or not that was the case. So, I ran some experiments. None of these were personal projects, this was all day job related, so I won’t be talking about the specifics but at a general level I’ll describe the problems, what I tried to accomplish, and how it went.

Experiment #1: There is a legacy codebase that I am responsible for and I routinely make small code changes and troubleshoot errors that occur. The system has literally no error handling. Things either work perfectly or they completely explode, and there are no error messages to say anything about what went wrong, just a generic runtime failure. It’s impossible to troubleshoot it. What’s worse, the users will snip the generic error part of the screen, avoiding the browser address bar (which might tell me at least what part of the app they were on) and say something like “I got this error.” I found myself craving three things. One: a way to automatically capture these generic errors. Two: a way to know who was doing what when it happened. Three: a way for the users themselves to give me more information from a bug report. Did I mention that this system is written in a technology that I barely have any experience with? Decades of software engineering experience but NONE of it is in this particular language. I could see exactly what I needed to do from an engineering perspective but it would take me days, maybe weeks, to figure out how to build an error handling/bug reporting and logging feature for this system. Weeks I don’t really have. So, I leaned on the coding assistant, asking for each brick I needed to build the wall, and checking each step of the process. I would love to say that I told it what I needed and it built it for me but that wasn’t what happened. It screwed up, over and over again. Wrote terrible code, then corrected its errors when I pointed to them. I had to step in repeatedly. And this was with a current generation commercial grade professional coding assistant, not my personal MacBook running a limited LLM. And yet, by carefully testing, guiding, thinking, designing, and iterating, I managed to get the feature done with everything I needed it to do in about half a day.

Experiment #2: I got more ambitious with the second experiment. There was a screen in the system that displayed some information but did not allow the user to edit it. The editing capability was in another system and involved a questionnaire and an answer calculator and a bunch of special rules about who was allowed to edit which things. A developer on the team was already working to try to replace the screen in the other system and had been working on rewriting the backend functionality for several days. He hadn’t even started in on the new front end screens. He and I paired up to tackle it and a few hours later, again working with the coding assistant, we were nearly done. A follow up session for another hour or two the next day and it was done and ready to ship off to testing.

In less than a week, part time, while doing other work, I had managed to knock off two high priority development tasks. I would say that multiple weeks were reduced to two days. Not bad. And this was despite the AI assist being, again, not a particularly talented programmer. The process felt like pair programming with a junior level engineer who had suffered a minor head injury. (I have pair programmed with a lot of junior engineers; all were better than the fancy LLM, and none had a head injury that I could identify, but that’s what this felt like.)

For the unfamiliar, pair programming is when two people sit at the same computer with one keyboard and mouse and screen and code together. It’s part of a software engineering practice called eXtreme Programming, or XP. The two people take turns being on the keyboard, or “driving”. When you are not the one driving, you are not just sitting there idle. You may not be typing the code, but you are engaged in the process, reading it as it’s written, thinking through the test cases, working the design angles, the usability angles, providing different perspectives. Study after study has shown that two people pairing up are more productive than two people working independently at their own computers. This might seem counterintuitive but the truth is that there is less rework, fewer bugs, fewer dead ends and bad decisions, and more direct focus on the tasks at hand when there is a duo in charge of the process instead of a lone coder. There is a conversation and collaboration that serve to produce something that is greater than the sum of its parts. I LOVE pair programming but very few companies allow the practice because non-engineers (MBAs, managers, the money guys) generally think that it stands to reason that two people at two computers will write more software than two people working at one machine. They don’t care about the studies that prove them wrong, and I do care about the several years of experience I had in XP labs, which proved me right.

So, when I decided to look at this tech not as a panacea or a threat but rather as a way to emulate the dynamics of pair programming, a way to turn development into a more fluid process, I started to think that maybe it could be used effectively. Especially if I were to use good development practices like test-driven development, the Pomodoro technique, well-written user stories that break work down into small, manageable, deliverable chunks, and the rest of modern agile practice. A coding agent may not be all that good at building big giant complex systems without making a horrible mess, but nobody should ask it to do so. You should follow solid methodology that would lead to good results whether a human or an LLM did the actual coding. Break down problems so you understand them well enough to solve them. A question properly asked is half answered. Etc., etc., …

I went back to the book I was supposed to read after these experiments and found that the authors had come to the same basic conclusion. They basically said that with “vibe coding” you could create a catastrophe or you could get a great result, and the major factor determining which one you wound up with was whether or not you followed good development practices. People who don’t know what those practices are will be able to ask LLMs to build them all sorts of slop but they won’t be able to engineer real production software because they don’t know how to ask for that. People who already know how to engineer production quality software will save themselves a lot of typing and, if they use the same good methodologies that they should be using anyhow and treat it like pair programming, they will be able to accelerate their ability to deliver quality software even using languages and platforms that are mostly unfamiliar to them.

So, my takeaway at the end of all this? “Vibe coding” is a stupid name but the current crop of coding assistants are legitimately helpful and even fun to work with. In the narrow domain of software engineering, in which the majority of the training data is based on open-source software (i.e., not stolen IP) and with models that can be run on local hardware (i.e., no cloud subs, less energy usage), I can envision a way to use this tech while experiencing less of an ethical dilemma (LESS, not none). I will be continuing to run experiments and trying to learn how to work with these new tools effectively because that’s my job. Interesting times…