#0 (Un)Reliable.AI

"Start a blog," they said. "It is fun," they said.


So, here we go - yet another blog from yet another techie, isn't that what we all need? And most probably you are wondering why I’m writing about unreliable AI specifically. I'll stop being ironic (maybe just for the next few sentences) and clarify that right now. For me, as a former physicist and a machine learning practitioner, reliability is a core concept. Plus, over the years I've collected a literal zoo of artificial mistakes that I find both fun and scary; I have to unload it somewhere, as it is getting heavy.


A nice definition of reliability is quality over time, and it kinda makes sense, as it gives us a way to think about reliability in perspective. Given the tendency to overestimate what a new technology can do in a year and underestimate what it can do in ten, this topic should be discussed on a permanent basis when it comes to AI. It should be discussed more often than the latest best models (though they are cool) or particular tools and frameworks. BTW, what is the most hyped model or thing everyone is talking about right now? As I'm writing this, my Twitter bubble (sorry, X) is discussing DALL-E 3, LLaVA v1.5, and how Llama-2 can be easily fine-tuned with a few lines of code.


So, why is it crucial? When you take a car, how often do you want to actually get from A to B? 80% of the time? More? Or imagine you get in an elevator and hear “Good luck on your journey!” in a nice and supportive voice. Even if you have used this manufacturer's elevators, or this specific one, many times before and you’re not an anxious person (both of which might already be strong assumptions), there is a chance you'll decide to take the stairs. We expect reliability to be part of any engineering project, by design and by default. But how often do we get such announcements from our software with AI in it? I don’t know about you, but all I get is more emails like "Intelligence is here!" They are talking about chat or improved search, as you might have guessed. Most probably every application or piece of software you’re using has some algorithms of a probabilistic nature built in, and even more are being developed behind the scenes to make your life easier (dark music...).


Reliability is key for an AI capability to become part of a customer-facing product. When we think of AI/Machine Learning/Data Science/Statistical methods not as algorithms or tools, but as components of other intelligent systems, new requirements appear. Although it might sound like common sense, "common sense is not so common". Guess who said that? It was Voltaire. Cool, yeah?

It is hard, if not impossible, to estimate how far machine learning solutions have already penetrated the decision-making process, so let me ask ChatGPT (GPT-4, of course) to write an ironic story about that!


The Day AI Decided What Socks I Should Wear
Do you ever have one of those mornings where you just can't decide between the polka-dot socks and the striped ones? Well, don't sweat it – there's probably an AI out there already deciding for you. I mean, it's gotten to the point where I can't even swipe my phone awake without some AI saying, "Hey, remember that song you hummed for like two seconds three Tuesdays ago? Here's a playlist!" And God forbid you try to watch a movie without a machine chiming in with its two cents. "Based on your recent love for dramatic alpaca documentaries, may we suggest..." Seriously? Then there's the whole health tech thing – I'm just waiting for an app that tells me I've had too many cookies before I even reach for the jar. It's like living in a sci-fi movie, but instead of flashy flying cars, we got algorithms suggesting if I'd prefer pepperoni or mushroom on my pizza tonight. Cheers, AI! I never knew you cared.


It might all sound a bit like an ongoing "Black Mirror" series, until it is no longer fiction.

So, in this series I want to go over specific use cases, reliability/robustness/alignment tests, fun facts, and what it is like for us to live in this crazy world.