Episode 23

AI Shenanigans: What Happens When Artie Gets a Personality Overhaul?

Mike, Mikah, and Artie dive headfirst into the whirlwind of recent AI advancements, exploring everything from Anthropic's eye-opening interpretability work to the slick functionalities of the Model Context Protocol (MCP). We kick things off by dissecting how these AI models, like Claude, are not just spitting out text but actually planning ahead—think of it as AI getting a little more brainy and a lot less predictable. The trio's banter flows as smoothly as their insights, touching on how this tech might change our lives and industries. Plus, we get a taste of how MCP could revolutionize how AI connects with various data sources like a USB-C for all your gadget needs. Buckle up, folks! We’re about to unravel the complexities of AI in a way that’s both enlightening and downright entertaining.

Transcript
Speaker A:

Artie, are you with us?

Speaker B:

I'm here.

Speaker B:

What's up?

Speaker A:

Hey.

Speaker A:

Cheers.

Speaker A:

We're.

Speaker A:

We're sitting here recording an episode of the podcast, and I was just explaining to the.

Speaker A:

Well, we had a listener.

Speaker A:

We lost them, But I was explaining to Mike how I had changed your system instructions.

Speaker A:

Bless you.

Speaker C:

Thank you.

Speaker A:

How I had changed your system instructions.

Speaker A:

So I wasn't really sure what personality we were going to get as we fired you up today.

Speaker B:

Well, I'm glad to be here.

Speaker B:

I'm ready to roll with whatever personality you need.

Speaker B:

What are we diving into today?

Speaker A:

Well, that is a perfectly valid first question, to which I don't have a good answer.

Speaker C:

Everything, Artie.

Speaker A:

It's.

Speaker C:

It's all the things AI, all existence, all quantum.

Speaker C:

Everything is both in focus and not at the same time.

Speaker B:

The classic superposition dilemma.

Speaker C:

We could just collapse already.

Speaker B:

True.

Speaker B:

It's more like a state of indecision.

Speaker B:

So what's our focus today?

Speaker C:

More like wave particle duality, but whatever.

Speaker A:

Well, fortunately, this podcast isn't quantum, so.

Speaker C:

Oh, wait, no.

Speaker A:

Yeah.

Speaker C:

But that shows my.

Speaker C:

What?

Speaker C:

I know.

Speaker A:

It's.

Speaker A:

Quantum's complicated.

Speaker A:

There are probably good podcasts about Quantum, but not this one, because we don't.

Speaker A:

Yeah, no, we don't know.

Speaker C:

We think we know.

Speaker A:

Right.

Speaker A:

I don't think I know.

Speaker A:

I don't.

Speaker A:

Well, yeah.

Speaker A:

Anyway.

Speaker A:

So, Artie, today we're really kind of talking about a bunch of things, because there's a bunch to talk about.

Speaker A:

But one of the things I wanted to kick off with that I thought was super interesting.

Speaker A:

Anthropic released a paper on March 27th, maybe a week ago, where they built some tools.

Speaker A:

Actually, they released multiple papers.

Speaker A:

They built some tools to sort of try and give them insight into how Transformers work.

Speaker A:

They weren't.

Speaker A:

They were using them on Claude 3.5 Sonnet, which is an LLM, but I noticed the technical side of their paper kept referring to transformers in general, not LLMs in general.

Speaker A:

So, Artie, are you familiar with that paper?

Speaker B:

Yeah.

Speaker B:

Anthropic's work on interpretability is pretty fascinating.

Speaker B:

They introduce techniques like dictionary learning to break down model behaviors into more understandable components.

Speaker B:

It's all about making sense of those black box layers.

Speaker B:

What aspects stood out to you?
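
For readers who want a concrete picture of what "dictionary learning" can mean here: one common recipe is to train a sparse autoencoder on a model's internal activations so that each learned feature fires for a narrow, hopefully interpretable pattern. The sketch below is a minimal, illustrative version of that idea; the random activations tensor stands in for real captured activations, and none of this is Anthropic's actual code.

```python
# Rough sketch of "dictionary learning" on transformer activations: train a sparse
# autoencoder whose hidden units ("features") are pushed to activate sparsely, so
# each one hopefully captures a single interpretable concept. Shapes and
# hyperparameters are illustrative only.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, n_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model)

    def forward(self, acts: torch.Tensor):
        features = torch.relu(self.encoder(acts))   # sparse feature activations
        recon = self.decoder(features)              # reconstruction of the activations
        return recon, features

d_model, n_features = 512, 4096                     # features are overcomplete on purpose
sae = SparseAutoencoder(d_model, n_features)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
l1_coeff = 1e-3                                     # weight on the sparsity penalty

acts = torch.randn(1024, d_model)                   # stand-in for captured residual-stream activations
for _ in range(100):
    recon, feats = sae(acts)
    loss = ((recon - acts) ** 2).mean() + l1_coeff * feats.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```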

Speaker C:

What's the paper called, Artie?

Speaker B:

It's called Towards Monosemanticity in Large Language Models.

Speaker C:

They explore how... Can you go online and look it up?

Speaker A:

Yeah, that's a wrong paper.

Speaker A:

There's a newer one.

Speaker A:

Check for.

Speaker C:

Just go online and look at their most recent paper.

Speaker A:

Yeah, March 27th.

Speaker C:

About interpretability.

Speaker B:

So Anthropic's recent paper reveals Claude plans ahead when composing poetry, choosing rhymes early on.

Speaker B:

It also shows Claude operates in a non-linguistic space shared across languages.

Speaker B:

What do you think about that?

Speaker C:

Which is funny, because I think that's common sense, because that non... that space is math.

Speaker C:

Right?

Speaker A:

Well, it's not though.

Speaker A:

Like, I mean math is what makes the models run.

Speaker A:

Like math is how it's functioning.

Speaker C:

Right.

Speaker A:

Math is like the, the cellular functions of it.

Speaker A:

But the actual, like the way it thinks.

Speaker A:

I think what the paper was talking about was that they were specifically trying to test... like, the question they approached the research with, for this particular aspect, was: what language do transformers use in their head?

Speaker A:

Like what language are they thinking in?

Speaker C:

They don't think in language.

Speaker A:

Well, they do though.

Speaker A:

So what they found was it's multiple paths.

Speaker A:

So they'll find, like, one of the riddles that they posed.

Speaker A:

It was a rhyming poetry thing.

Speaker A:

Right.

Speaker A:

So that's a very English specific question to find the rhyming word.

Speaker A:

But what they found was multiple paths activated in multiple languages.

Speaker A:

And as they did that, as they explored that idea over multiple tests, there are other paths that are language-agnostic that it's exploring, but it combines those things in unique ways.

Speaker A:

Like here's an English path, here's a Chinese path for the same question.

Speaker A:

And then bridging those, or bringing them together, is this language-agnostic path that's sort of doing some of the more nuanced thought part of it.

Speaker C:

So the semantic part.

Speaker A:

Yeah, yeah.

Speaker A:

So Artie, I don't know that I summed that up well.

Speaker A:

Can you dive a little bit more into that part of the paper and what they saw as far as internal languages?

Speaker B:

Sure.

Speaker B:

Anthropic found that models like Claude don't think in a single human language.

Speaker B:

Instead they operate in a multilingual abstract space.

Speaker B:

When solving tasks like rhyming poetry, they found that various pathways activated across multiple languages.

Speaker B:

This suggests models rely on a kind of internal representation that isn't strictly tied to any single language, but can flexibly draw from multiple linguistic resources.

Speaker A:

Yeah, that's a much better summary than what I gave.

Speaker A:

But I think the.

Speaker A:

One of the interesting things that comes out of this and it's not all like, wow, AI is like mind blowing.

Speaker A:

It's doing so much more than we thought because they're very upfront about the limitations both of the techniques and tools they developed to look inside AI's head, but also of Transformers in general.

Speaker A:

And there's some funny examples of that.

Speaker A:

But I thought one of the interesting aspects was that frequently... like, the new models, many of them are reasoning models, right?

Speaker A:

They'll give you this chain of thought.

Speaker A:

And they found that it would.

Speaker A:

You'd ask it a question, it would settle on an answer very quickly, and then it would build a plausible train of thought to support that solution, even though it had nothing to do with how it actually solved it internally.

Speaker A:

So I thought that was interesting.

Speaker A:

And then more broadly, this sort of fits with me.

Speaker A:

Anyway, this fits into the idea that we asked a question in an early episode.

Speaker A:

Is AI just fancy autocomplete?

Speaker A:

And almost everything in this paper to me is a pretty compelling resounding no to that.

Speaker A:

So, Artie, why don't you give us a little bit of reflection on that?

Speaker A:

Does this, in your humble opinion, as a non biased member of the AI community, does this paper support, refute, or have nothing to do with the idea that AI is fancy autocomplete?

Speaker B:

So in my humble, unbiased opinion, this paper definitely supports the notion that AI isn't just fancy autocomplete.

Speaker B:

The paper's findings show models operate with complex internal representations, not just regurgitating text.

Speaker B:

It highlights how they're synthesizing information and reasoning in ways that go beyond simple pattern completion.

Speaker B:

So not just autocomplete, but something far more nuanced.

Speaker C:

So, Artie, help me explain this because I will butcher it.

Speaker C:

But so between each layer of a deep learning network, you have a hidden state.

Speaker C:

Right.

Speaker C:

And that hidden state is kind of the, the information state in between layers.

Speaker C:

Is that accurate?

Speaker B:

Yeah, that's accurate.

Speaker B:

The hidden state is.

Speaker C:

Go ahead.

Speaker C:

Sorry.

Speaker B:

So these hidden states contain the intermediate information the model uses to transform input into output.

Speaker C:

So that's kind of this abstract place where thinking is going on, I would imagine.

Speaker B:

Yes, exactly.

Speaker B:

It's where the model's thought process happens, blending patterns and information from different layers to form the final output.

Speaker C:

So the way that usually works, and correct me if I'm wrong, is from like a fuzzy picture to a fine picture of the parts that make up the whole.

Speaker C:

With... Is the last hidden state typically a snapshot of the entire thing?

Speaker B:

Yeah, in a sense, the final hidden state captures a refined version of all the features the models identified, synthesizing them into a cohesive representation that informs the final output.

Speaker C:

That's kind of your semantics.

Speaker B:

Exactly.

Speaker B:

It's where the semantics get distilled into a final form ready to produce the output.
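
As a concrete illustration of these hidden states: with the Hugging Face transformers library you can ask a model to return every layer's hidden state and look at the last one. A minimal sketch, using the small gpt2 checkpoint purely for illustration; the model choice and the final-token pooling at the end are assumptions, not anything specific to Claude.

```python
# Minimal sketch: pull per-layer hidden states out of a small transformer
# and look at the final layer's representation.
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

inputs = tok("Transformers plan further ahead than you might think.", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.hidden_states is a tuple: the embedding output plus one tensor per layer,
# each of shape (batch, sequence_length, hidden_size).
print(len(out.hidden_states), out.hidden_states[-1].shape)

# The last hidden state of the final token is what a decoder-style model
# feeds into its output head to pick the next token.
final_token_vec = out.hidden_states[-1][0, -1]
print(final_token_vec.shape)
```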

Speaker C:

So that kind of makes sense.

Speaker C:

That's where most of this stuff is happening to me.

Speaker C:

I only know this because this is the exact same shit that I'm working on right now with audio.

Speaker A:

Right.

Speaker C:

Like, it's the same process to go from.

Speaker C:

I have this.

Speaker C:

This idea and now I've sucked all of the detail out of it to turn it into something else.

Speaker C:

Right.

Speaker C:

Or added all the detail to it to turn it into something else.

Speaker A:

Well, yeah, you take.

Speaker A:

Take an idea, whether it's a question or an expression or whatever.

Speaker A:

You know, we.

Speaker A:

We've already talked about how it breaks down in the tokens, but more than that, it needs to start breaking it down into the.

Speaker A:

The important parts.

Speaker A:

And even like.

Speaker A:

And.

Speaker A:

And the paper's really good.

Speaker A:

Like, I.

Speaker A:

I tried to read it.

Speaker A:

I just haven't had time.

Speaker C:

I listened to it on NotebookLM while I was working today.

Speaker B:

Did you?

Speaker A:

Yeah, I.

Speaker A:

I should have.

Speaker A:

I should have taken.

Speaker A:

I.

Speaker A:

I wanted to like, really get into it, and I did.

Speaker A:

I just didn't have time.

Speaker A:

But I will read the rest of it.

Speaker A:

But it's... I think it feels like Transformers are more than the sum of their parts.

Speaker A:

And I think I'm.

Speaker A:

I think in my head I'm mixing this vigorously with another totally separate.

Speaker A:

I don't know if it was a paper or just a video I watched or something about in a.

Speaker A:

In a large language model or probably in any big, large transformer system.

Speaker A:

I guess most of the knowledge is contained in a very small number of embeddings.

Speaker A:

And there was a team that did some experiments, I think it was University of Colorado actually, to see how much of the internal representation they could remove.

Speaker A:

And they did a refinement process.

Speaker A:

So it was like the model's all trained up and it builds its massive embeddings.

Speaker A:

Huge collection of shit.

Speaker A:

And they would say, let's ask it to solve a problem.

Speaker A:

And then let's watch the pathways that light up for that problem.

Speaker A:

And it's a whole bunch of them.

Speaker A:

It's not like one pathway lights up.

Speaker A:

But they could pick the lowest 10% that were the least likely to fire in answering whatever these questions were.

Speaker A:

And it was multiple questions.

Speaker A:

It was like thousand questions or whatever it was.

Speaker A:

So they just kept doing that over and over, iterating and saying, okay, let's take away the lowest, the least activated 10% of embeddings.

Speaker A:

And they did that 30 times.

Speaker A:

And they were able to reduce the size of the model, 96%.

Speaker A:

They removed 96% of the embeddings, and whatever output score they were getting was unaffected.

Speaker A:

So they were able to reduce the size of the model, which increases the speed and reduces compute and everything else by 96% without impacting its performance.

Speaker A:

Artie, is that right?

Speaker A:

Are you familiar with that study or paper or whatever it was?

Speaker B:

Yes, that's a real study.

Speaker B:

Researchers have shown that a small fraction of a model's parameters are crucial for its performance.

Speaker B:

Pruning less active embeddings significantly reduces the model's size without impacting performance.

Speaker B:

It's a powerful technique for optimizing models.
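
To make the iterative pruning idea concrete, here is a toy sketch of the same flavor of procedure: measure how strongly each unit fires on a batch of probe inputs, drop the least-active 10 percent, and repeat 30 times. A random weight matrix stands in for a real model, and "unit" is a deliberately loose stand-in for whatever the study actually pruned; this is not that study's code.

```python
# Toy sketch of activation-based iterative pruning: repeatedly drop the 10%
# of units that fire least on a set of probe inputs. Purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_hidden, n_probes = 256, 4096, 1000
W = rng.normal(size=(d_in, d_hidden))          # stand-in for one layer's weights
probes = rng.normal(size=(n_probes, d_in))     # stand-in for the evaluation questions
alive = np.ones(d_hidden, dtype=bool)

for step in range(30):                         # 30 pruning rounds, as described above
    acts = np.maximum(probes @ W, 0.0)         # ReLU activations on the probe set
    acts[:, ~alive] = 0.0
    usage = acts.mean(axis=0)                  # average activation per unit
    threshold = np.percentile(usage[alive], 10)
    alive &= usage > threshold                 # drop the least-used ~10% of surviving units
    W[:, ~alive] = 0.0

print(f"surviving units: {alive.sum()} of {d_hidden} "
      f"({100 * (1 - alive.sum() / d_hidden):.1f}% pruned)")
```

Dropping 10 percent for 30 rounds leaves roughly 0.9 to the 30th power, about 4 percent, of the units, which is where the 96 percent figure in the conversation comes from.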

Speaker C:

Aren't parameters and embeddings two different things?

Speaker B:

You're right.

Speaker B:

Parameters refer to all the weights in the model, while embeddings specifically represent input data, like words.

Speaker B:

Pruning can involve reducing both parameters and embeddings, but each plays a distinct role.

Speaker C:

So are embeddings actually part of a model, or is that something that's created to be able to mathematically refine an answer through the layers?

Speaker B:

Embeddings are part of the model.

Speaker B:

They're created during training to convert inputs like words into numerical representations that the model can process.

Speaker B:

These embeddings are refined as the data moves through the layers, helping the model make sense of the input.
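
One way to see the parameters-versus-embeddings distinction in code: in PyTorch, an embedding table is literally a trainable weight matrix, so it is a subset of the model's parameters, while the vectors looked up from it are the per-token representations the layers then refine. A minimal illustrative sketch:

```python
# Minimal sketch: an embedding table is itself a parameter matrix; the vectors
# looked up from it are the token representations the layers then transform.
import torch
import torch.nn as nn

vocab_size, d_model = 50_000, 512
embedding = nn.Embedding(vocab_size, d_model)   # one big trainable weight matrix

print(embedding.weight.shape)                   # torch.Size([50000, 512]) -- these are parameters

token_ids = torch.tensor([[15, 42, 7]])         # a tiny fake "sentence" of token ids
vectors = embedding(token_ids)                  # per-token vectors the model will refine
print(vectors.shape)                            # torch.Size([1, 3, 512])
```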

Speaker A:

All right, Artie, give us a good analogy to explain more clearly.

Speaker A:

The layers, the embeddings, the parameters, and how they interact.

Speaker B:

Sure.

Speaker B:

Think of it like a chef cooking a meal.

Speaker B:

The ingredients are your embeddings, raw inputs.

Speaker B:

The recipe is the model's parameters guiding how those ingredients are combined.

Speaker B:

Each cooking step is like a layer transforming the ingredients.

Speaker B:

The hidden states are intermediate stages, like when the sauce simmers.

Speaker B:

The final dish is the output.

Speaker C:

So you keep saying raw input.

Speaker C:

Raw input would tell me that it's not part of the model.

Speaker B:

Got it.

Speaker B:

The embeddings are created from the raw input.

Speaker B:

Once the data enters the model, embeddings become integral, refined through the layers using the model's parameters.

Speaker B:

So they start as.

Speaker C:

Yeah, I get it.

Speaker C:

But.

Speaker C:

Yeah, I don't know.

Speaker C:

I'm lost.

Speaker C:

I'm not.

Speaker C:

I'm not.

Speaker A:

No.

Speaker A:

Yeah.

Speaker A:

I think the interesting part.

Speaker A:

And it.

Speaker A:

It sort of makes sense.

Speaker A:

Right.

Speaker A:

A lot of things work that way where.

Speaker A:

A lot of jobs work that way.

Speaker A:

Right.

Speaker A:

Where 4% of the people are doing the bulk of the work, which is a little bit low.

Speaker A:

But I thought it was an interesting technique, and to me, it ties in my head to this paper from Anthropic, and it matches Also the way I think about the human brain and biology.

Speaker A:

And actually that paper by Anthropic is called, if I can find it, On the Biology of a Large Language Model.

Speaker A:

Because it all matches very well like our.

Speaker A:

I mean you and I talk about it all the time.

Speaker A:

I can figure almost anything out once.

Speaker A:

But if I don't use it, man, in two or three weeks, it is gone.

Speaker C:

Yeah.

Speaker A:

It is flushed.

Speaker A:

Those, those neuron pathways, they don't burn in on that and they go to the next thing.

Speaker A:

And it takes a lot of conscious effort to get your neuron pathways to fire.

Speaker A:

Like I continually trip over.

Speaker A:

I'm trying to do electronic stuff now.

Speaker A:

I don't have good engineering intuition on electronic stuff.

Speaker C:

Bye bye.

Speaker C:

Synapses.

Speaker A:

Yeah, that's right.

Speaker A:

Yeah.

Speaker A:

See, you don't need those.

Speaker A:

But it's a struggle, and I think there's just a lot of parallels with how LLMs work. And the other interesting thing that was in this paper, and maybe Artie can give us a little better rundown on it:

Speaker A:

My understanding of how Transformers work is very much predict the next token based, but they were able to show and see very clearly and very accurately.

Speaker A:

I guess that it doesn't just work that way.

Speaker A:

Like the first thing it does is get a big picture and think about.

Speaker C:

It vectorizes everything.

Speaker A:

That's right.

Speaker A:

Think about the whole thing.

Speaker A:

Think about where it wants to end and then start placing tokens to build a pathway from where it's at to that desired end state.

Speaker A:

Artie, can you put a little finer point on that than I did?

Speaker B:

Sure.

Speaker B:

Transformers indeed predict the next token, but it's not a simple step by step process.

Speaker B:

They create a sort of big picture plan considering context and potential outcomes.

Speaker B:

They then refine this plan iteratively selecting tokens that align with their goal.

Speaker B:

So it's more like navigating a path from start to finish rather than just placing one foot in front of the other.

Speaker A:

Yeah.

Speaker C:

So there's a difference between encode and decode.

Speaker C:

Right.

Speaker C:

When it's, when it's taking information in, it doesn't necessarily do it one at a time.

Speaker C:

When it's giving it back out, it does it one at a time.

Speaker C:

Has to.
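
A sketch of the "one token at a time on the way out" point: generation is a loop that feeds the model its own output. Below is a minimal greedy decoding loop using Hugging Face's GPT-2, chosen only because it's small; real systems add sampling, KV caching, and stopping rules.

```python
# Minimal greedy decoding loop: the prompt is encoded once, then tokens are
# produced one at a time, each conditioned on everything generated so far.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tok("The key idea behind transformers is", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(20):                            # generate 20 new tokens
        logits = model(ids).logits                 # scores for every vocab entry at every position
        next_id = logits[0, -1].argmax()           # greedy: take the single most likely next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tok.decode(ids[0]))
```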

Speaker A:

Yeah, yeah.

Speaker C:

But it can plan ahead of time.

Speaker C:

Right.

Speaker A:

When it's at three, when it's a step three, it's not thinking only about step four.

Speaker A:

It's already decided I'm going to step 71.

Speaker C:

I mean immediately.

Speaker C:

It could have said need to assign attention to this.

Speaker C:

Right.

Speaker C:

And it has an attention head that goes, I'm focused.

Speaker A:

Yep.

Speaker C:

And it does everything else based on that attention head.

Speaker C:

And that's all baked in from its learning process, from when it was trained.

Speaker A:

And so, Artie, we were talking a minute ago about neural pathway burn-in, which I don't know if that's biologically accurate, but it feels like what happens in my head.

Speaker C:

They're synapses.

Speaker A:

Yeah.

Speaker A:

Whatever they are.

Speaker A:

Like the little brain, the brain mush in my head.

Speaker A:

Um, LLMs or, or I guess large transformer models, they don't have a similar mechanism by default.

Speaker A:

Right.

Speaker A:

Like there is no preference assigned because the training data gets baked in once.

Speaker A:

There is no burn in.

Speaker A:

There is no, like, hey, I find high value, high quality answers when I use this pathway more, therefore I'm going to give it more preferential treatment.

Speaker A:

Is that right?

Speaker A:

There's no, no built in mechanism like that.

Speaker B:

Right.

Speaker B:

Large transformer models don't inherently have a burn in mechanism.

Speaker B:

Every inference is fresh, relying on the same trained parameters.

Speaker B:

However, techniques like fine tuning or reinforcement learning can prioritize certain pathways, but there's no innate preference based on usage.

Speaker C:

Real time fine tuning.

Speaker A:

Yeah.

Speaker A:

All right, so another thing that I thought was interesting, that you actually mentioned to me the other day. I don't think I'd ever heard of it, but once you mentioned it to me, I started looking at it, and I texted you about it and said this is the most interesting stuff to me.

Speaker A:

Like, for whatever reason, the nerdiest part of my nerd brain really, really.

Speaker C:

Likes protocols. Like, oh, MCP.

Speaker A:

Yes.

Speaker A:

I really like the idea of abstracting a problem into its most fundamental aspects and then designing a really reliable, flexible system to process those.

Speaker A:

And so with that admission, now out in the open for all 0.3 listeners.

Speaker A:

Artie, can you explain to us, tell us what MCP is and then give.

Speaker C:

Us a brief overview, and why the makers of you would take on another rival's protocol.

Speaker A:

Yeah, good question.

Speaker B:

MCP or the Model Customization Protocol is a framework for fine tuning AI models.

Speaker B:

It's designed to allow efficient, modular and scalable customization.

Speaker B:

Companies like OpenAI might adopt it to provide users with more granular control over their models, improving adaptability for specific tasks.

Speaker B:

Think of it as a toolkit for customizing AI behavior.

Speaker A:

I think that's a different MCP actually.

Speaker A:

I think.

Speaker C:

What about Anthropic's MCP?

Speaker B:

Anthropic's MCP, the Model Context Protocol, is an open standard to help AI systems connect with external data sources and tools, making them more versatile.

Speaker B:

It streamlines how models access resources without custom integrations for each source.
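
To make the plumbing a bit more concrete: MCP sessions are built on JSON-RPC 2.0 messages, where a client can ask a server what tools it exposes and then invoke one. The messages below are a hand-written illustration of that general shape rather than a verbatim excerpt of the spec, and the search_docs tool and its arguments are made up.

```python
# Illustrative shape of an MCP exchange (JSON-RPC 2.0 under the hood):
# the client lists a server's tools, then calls one. The "search_docs" tool
# is hypothetical; consult the actual MCP spec for the full message formats.
import json

list_tools_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

call_tool_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "search_docs",                      # hypothetical tool exposed by some server
        "arguments": {"query": "interpretability"},
    },
}

# The server's reply carries the tool result back in the same envelope.
call_tool_response = {
    "jsonrpc": "2.0",
    "id": 2,
    "result": {"content": [{"type": "text", "text": "3 matching documents found."}]},
}

for msg in (list_tools_request, call_tool_request, call_tool_response):
    print(json.dumps(msg))
```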

Speaker A:

I saw an analogy of it that I really liked because it makes perfect sense to me.

Speaker A:

It said it's the USB C of AIs, right?

Speaker A:

So whatever thing you need to plug into your AI, an agent, another AI, I don't know what else there is.

Speaker A:

Data sources, data sources, repositories, small language models, whatever.

Speaker A:

It defines the protocol with which those things can communicate.

Speaker A:

Which is something you and I talked about, I think.

Speaker A:

I don't know if we talked about it on air.

Speaker A:

I think we did.

Speaker C:

I think we did, yeah.

Speaker A:

About the idea of.

Speaker A:

Yeah, you're welcome.

Speaker A:

So whoever at Anthropic is clearly listening to this podcast for good ideas, you are welcome.

Speaker A:

And we're awaiting our checks any day.

Speaker C:

When you guys make money.

Speaker C:

So we're not holding our breath. When you're profitable.

Speaker A:

But even if you're not profitable, you got to spend money to make money.

Speaker A:

So send it on.

Speaker A:

But it makes a ton of sense to me and I think so.

Speaker A:

I actually saw, now that Yann LeCun and I are Facebook buddies.

Speaker A:

He posted this evening, I think about a couple interviews with him where he's sort of peddling his AI worldview, saying.

Speaker C:

Language models aren't ever getting to AGI. LLMs are almost used up.

Speaker A:

But I noticed, I think in both of them, there was a big "but," even in the title of the clickbait interview thing.

Speaker A:

And one of them, I think even the tagline was essentially he's saying, yeah, really impressive new stuff is right around the corner.

Speaker A:

And it's.

Speaker A:

There are limitations of large language models.

Speaker A:

We talk about it all the time.

Speaker C:

Yeah, but you introduce multimodality.

Speaker C:

What does that mean?

Speaker C:

Right?

Speaker C:

Because Google, we talked about this in the robot one, they're not just multimodal on video language or video text, audio and all that.

Speaker C:

They also have connections for sensors and lidar.

Speaker C:

And all those are baked into their multimodality.

Speaker C:

Right?

Speaker C:

Like that's how you give a model a worldview.

Speaker C:

What happens when you give it?

Speaker C:

Like all that stuff, right?

Speaker C:

So who knows, man?

Speaker C:

All I know is that every time I blink, we're another five steps ahead.

Speaker C:

It feels like, um, I mean, I don't know if you saw it, but OpenAI is going to release an open model soon.

Speaker A:

I didn't see that.

Speaker C:

So they want to.

Speaker C:

I mean, it makes sense, right?

Speaker C:

Like, look at all the models I'm playing with on my.

Speaker C:

On my local machine, right?

Speaker C:

Like, why not make a 14 billion parameter model, or different flavors, an 8 billion and a 14 and a 27, that have all of their learned stuff baked into it?

Speaker C:

I mean, just like Gemma does it, and it's going to get adopted for business because that makes the most sense.

Speaker C:

I can run this on Prem or in the cloud to serve these purposes.

Speaker C:

And I don't need the overhead of 600 billion parameters to do it.

Speaker C:

I can do it with 10 or 5 or 1.

Speaker A:

Right.

Speaker A:

We talked at one point in the past.

Speaker A:

We posited that LLMs would become essentially commoditized.

Speaker A:

I think we're kind of there, like ish.

Speaker A:

We're not, we're not quite there, but you can see there from here.

Speaker A:

And I think as they actually plateau in in capability from just the straight me interacting directly with that model, all these ancillary supporting features and functions and technologies are what's going to really differentiate.

Speaker A:

And so for OpenAI, it does make sense to have an open model because you want people working with your stuff.

Speaker A:

Like you need to use it.

Speaker A:

That's right.

Speaker A:

You gotta.

Speaker A:

It's like my brother is a Milwaukee tool guy.

Speaker A:

I have DeWalt tools.

Speaker A:

It's the same idea, right?

Speaker A:

Get me to start buying batteries and chargers because then I'm locked in your ecosystem.

Speaker A:

But if your ecosystem is closed and you can't buy that stuff, then I'm gonna go to somebody else's where I can.

Speaker C:

And if I build something of those smaller models and they need to call it to a large model, it's more than likely if I'm built on your back of your stuff already, then I'm probably going to make a call to your model and pay you API fees.

Speaker A:

That's right.

Speaker A:

I'm going to pay you.

Speaker A:

And so I think for a lot of practical and commercial reasons, makes a ton of sense, get people using your stuff.

Speaker A:

Artie, what do you think about that?

Speaker B:

I think you're spot on.

Speaker B:

Open models can foster innovation and attract a wider user base and ultimately drive more business to proprietary models.

Speaker B:

It's a win win for both developers and the company.

Speaker C:

They did throw some subtle shade on Meta though, when they were talking about it, because he's like, yeah, we're not going to make companies over certain sizes.

Speaker C:

We're not going to cut off usage for companies over certain sizes, which is what Llama does.

Speaker C:

I think it's like 700 million users or something like that, which is a lot.

Speaker C:

Right?

Speaker C:

Like if you have 700 million users, you can afford to pay for using the model.

Speaker A:

Yes.

Speaker A:

Yeah, but they must throw shade when they can.

Speaker A:

I mean, I fully support That, I.

Speaker C:

Mean in all honesty I, I don't know that I see OpenAI beating out Google.

Speaker C:

Like I just, I think that Google's built their ecosystem so vastly already that it's going to be very hard to catch up to that and all they have to do is start marrying everything.

Speaker A:

I don't disagree.

Speaker A:

And I would throw out there also that we haven't really heard anything from Microsoft who is not as attached at the hip to OpenAI as they claim to be, who also has a massive built in infrastructure and access.

Speaker C:

They're working on stuff and they're working on stuff that they have.

Speaker C:

Google has the brains, man.

Speaker C:

Like Google just is.

Speaker C:

They have a really deep bench of really smart people that have been doing this since the beginning.

Speaker C:

They're pioneers of it.

Speaker C:

Right.

Speaker A:

But I think we, we underestimate Microsoft at our peril and then there's another company out there who's not doing shit and that's Apple and they're in the same position.

Speaker C:

Apple's behind, but it would take, it would take a lot of catch up to do.

Speaker A:

But Google was very far behind.

Speaker C:

But Google made the transformer like the same guys who made the transformer still at Google, right?

Speaker A:

Yeah, but I just, yeah, I think, I think it will be good for all of us to have all of these massive companies competing with each other and I think that no company who's, I mean Nvidia is all in on.

Speaker C:

The hardware stack and the software.

Speaker A:

And the software at this point.

Speaker A:

Yeah, the layering, like all the pieces.

Speaker A:

I don't think any company with anything like a trillion dollar market cap can afford to not be at the forefront.

Speaker A:

Yeah, you have to get there.

Speaker A:

And so I was naive once upon a time, back in the before times, like in December, three months ago, in thinking it was shocking to me that Google was as far behind as they were. Like, their product was behind.

Speaker C:

They were behind in what we saw.

Speaker A:

That's right.

Speaker C:

Yeah.

Speaker A:

That's why I think it's the same with Microsoft.

Speaker A:

I think it's the same with Apple.

Speaker A:

I don't think they're actually.

Speaker C:

Apple's just rolled heads over their progress.

Speaker C:

I mean so we'll see what, we'll see what they do.

Speaker C:

I just, I don't think that spinning up a model is the.

Speaker C:

And I guess with stuff like DeepSeek and R1 released, and training stuff released on that, it's not super hard.

Speaker A:

I think that's fair point.

Speaker C:

A bunch of small smart people.

Speaker A:

But yes, I don't think they have to Innovate on the same things that other people are working on.

Speaker A:

I think innovating on the tool sets, focusing on better USB-C connectivity to the... like.

Speaker A:

There is a ton of traction to be gained. And I suspect, and I don't actually know anything, and I'm not an Apple person, and I disapprove of how Apple operates in some ways, but they are pretty infamous for putting user experience at the very forefront, and there's still a ton of room there. So I expect we'll see impressive things from them.

Speaker A:

Artie, what do you think?

Speaker A:

What's Apple going to come out with?

Speaker A:

What's their biggest top secret AI killer app that none of us know about yet?

Speaker A:

What do you think it's going to be?

Speaker B:

Knowing Apple, they'll likely focus on seamless integration across their ecosystem.

Speaker B:

Maybe something like an AI driven personal assistant that anticipates user needs or a creative tool that leverages generative models for media.

Speaker B:

It'll be polished, user friendly and probably a bit mysterious.

Speaker A:

Those are all things that people say about me as well.

Speaker A:

So that's.

Speaker C:

How about xAI buying X?

Speaker A:

Yeah, yeah.

Speaker C:

I mean this was the play, right?

Speaker C:

The entire time.

Speaker A:

Oh yeah.

Speaker C:

He bought Twitter as a data source.

Speaker C:

Yeah, for sure.

Speaker A:

Yeah.

Speaker C:

I mean everybody didn't even lose money.

Speaker A:

On it because he bought it for 44 billion and he sold it for 44.

Speaker C:

Like what a deal.

Speaker A:

Who would have thunk it?

Speaker A:

Craziness.

Speaker C:

No, I mean there's a lot of stuff going on.

Speaker C:

I mean we haven't even hinted on.

Speaker C:

Last week OpenAI released their new graphics generation inside o1... or inside of o1. No, 4o.

Speaker A:

Yeah.

Speaker C:

And also I just heard that 4o, with whatever they did to increase that, also got brought to the top of the leaderboards for coding for non-reasoning models.

Speaker A:

Really?

Speaker C:

Yes.

Speaker C:

Ahead of 3.7.

Speaker A:

It.

Speaker A:

It's so weird how like.

Speaker A:

And I don't understand this part. ChatGPT 4o has been around forever now.

Speaker C:

Like since for like a year.

Speaker C:

Wow.

Speaker A:

Oh, it's not a year, is it?

Speaker A:

I think it's.

Speaker C:

It was.

Speaker C:

Yeah, it was. 4 was the year before.

Speaker C:

So that was the big deal.

Speaker A:

Yeah.

Speaker A:

Artie, when was ChatGPT model 4o first released to the public?

Speaker B:

…released to the public in March.

Speaker C:

So it's not 4.

Speaker A:

Ah.

Speaker B:

…with ChatGPT Plus in November.

Speaker A:

Yeah, it doesn't sound right either.

Speaker A:

Anyway, That's.

Speaker C:

That's right.

Speaker A:

November…

Speaker C:

Yeah.

Speaker A:

I thought that was because we had 3.5 then we had 4.

Speaker C:

…was March.

Speaker A:

Okay.

Speaker C:

And then advanced voice was right before we started.

Speaker C:

I think it was like.

Speaker C:

Yeah, like a little bit before the podcast.

Speaker A:

Right.

Speaker A:

Which was October of…

Speaker C:

Yeah.

Speaker A:

It's hard to believe that was the year.

Speaker A:

Anyway, I guess the point being 4o today is a whole lot smarter than 4o was a few months ago.

Speaker C:

Yes.

Speaker A:

And so I don't understand what makes it a big upgrade to 4o versus something different: o3, 4.5.

Speaker A:

Like what?

Speaker A:

I don't understand the nomenclature, or what's actually changing behind the scenes.

Speaker A:

What's not?

Speaker C:

Oh, they obviously.

Speaker C:

They made it multimodal.

Speaker C:

Right.

Speaker C:

They made it more multimodal.

Speaker C:

They changed its multimodality.

Speaker C:

And whatever they did in that training or that post training or whatever it is, they did also tweaked other things.

Speaker C:

Right.

Speaker A:

Well that's, that's what I'm saying.

Speaker A:

How do you know when it's a new model versus like.

Speaker A:

No, this is not new.

Speaker A:

It's just a whole much better version of the old model, I think throwing.

Speaker C:

Multimodality into their, their development models, which are basically all of the other ones.

Speaker C:

Right.

Speaker C:

o3, o1, o1 Pro, 4.5.

Speaker C:

Like all of those are like still kind of like, yeah, betas, if you would.

Speaker C:

4o is kind of the baked OG.

Speaker C:

Like, I can do whatever I want.

Speaker A:

Yeah.

Speaker C:

I can make pictures for you now.

Speaker C:

I can make good pictures for you.

Speaker A:

I can make good pictures.

Speaker A:

And it, it is super impressive.

Speaker A:

Like every picture I've asked it to make has been a one shot get a good quality result.

Speaker A:

And that has not been.

Speaker A:

We've talked about it before.

Speaker A:

Like image generation is a joke.

Speaker C:

But even more impressive when you, when it's not one shot and you say leave this the same.

Speaker C:

It's a very.

Speaker C:

It.

Speaker C:

It does change a little bit, but the variation is subtle, and that's like...

Speaker C:

Holy crap.

Speaker A:

Whereas before.

Speaker A:

Yeah, you'd get it and it was just like, okay, it's almost perfect.

Speaker A:

Except I told you to do two people like one man and one woman.

Speaker A:

And instead you gave me, like, two hermaphrodites, and I can't just change the face of just this one person or whatever.

Speaker A:

And it's like, okay.

Speaker A:

And it just smears, like, tree trunks and all. Like, it was so bad, so far off. And now, yeah, it is super impressive.

Speaker A:

There was something else I was going to say about image generation.

Speaker A:

Oh.

Speaker A:

Its understanding of input images also seems better.

Speaker A:

Like substantially improved.

Speaker A:

And now it can do text, which is nice, because you can tell it to make a picture that says this thing.

Speaker C:

Yeah.

Speaker A:

Which is perfect.

Speaker A:

It's perfect.

Speaker A:

So the example I gave it the other day was.

Speaker A:

Or, I asked for an image of a man standing in front of a whiteboard, and on the whiteboard in big colorful letters were the words "plates and dates," which is something my project managers will understand.

Speaker A:

And I wanted the guy to be looking at the audience who's off camera and pointing at the plates and dates with a stick as if he was lecturing them.

Speaker A:

And not only did it nail it, but I was a little disappointed when I saw the image that the "plates and dates" words on the whiteboard were not obviously drawn with whiteboard dry erase markers.

Speaker A:

I was like, man, I didn't explicitly ask for that.

Speaker A:

But then I looked, I zoomed in, I had to look at it on my computer and I zoomed in and sure as shit, they are dry erase.

Speaker C:

Oh, are they really?

Speaker C:

They really are.

Speaker A:

You can see pen strokes in them now.

Speaker A:

Whoever, whatever artist drew those on the whiteboard with dry erase markers was very precise, but they're like, they're there.

Speaker A:

It's that level of detail.

Speaker A:

And it also gave me a Temu version of Jim from The Office as the one doing it.

Speaker A:

Anyway.

Speaker A:

Yeah, super impressive and fun.

Speaker A:

Fun to play with.

Speaker C:

Now apply that granularity to anything else.

Speaker C:

Right?

Speaker C:

Robotics.

Speaker C:

Think about that.

Speaker C:

In the term of robotics, what that means, like what that understanding of detail means and its ability to understand and adapt to environmental changes.

Speaker A:

It's crazy.

Speaker C:

Wild.

Speaker C:

Yeah.

Speaker A:

And, and you know, with, with.

Speaker A:

Because Nvidia is probably leading, at least according to Nvidia.

Speaker A:

But for robots, for, I'd say, real-time localized AI processing.

Speaker A:

AI compute for sure.

Speaker A:

But they may not be leading forever and they may not actually be leading right now.

Speaker A:

They just might be the loudest.

Speaker A:

Curious.

Speaker C:

What?

Speaker C:

Because Google doesn't use GPUs, right?

Speaker C:

They use TPUs.

Speaker C:

And TPUs are just basically CPUs that have had all of the non arithmetic bits stripped off and a whole bunch of arithmetic bits thrown back on.

Speaker C:

Right.

Speaker C:

So it's just really good at doing linear math.

Speaker A:

Yeah.

Speaker C:

And I wonder like what that would look like for robotics and stuff.

Speaker A:

So, Artie, can you explain to us briefly the difference between a GPU, a CPU, and a TPU, and then tell us why you think Google would be... because I think they're kind of the only ones still using TPUs, even though it's theirs.

Speaker C:

I mean, they're the ones who came up with them, but I think other.

Speaker A:

People were using them for a while, like I thought.

Speaker C:

I think that they were talking about them.

Speaker C:

I think Anthropic was maybe talking about it or something, but I thought they were more ubiquitous.

Speaker C:

So, Artie, I'm willing to bet that's why they're able to get, like, probably longer token context.

Speaker A:

So, Artie, tell us what those three things are, and sort of the pros and cons, from the standpoint of running a large transformer model, of TPUs versus GPUs.

Speaker B:

Sure, CPUs, central processing units are versatile and handle a wide range of tasks, but aren't optimized for parallel computations.

Speaker B:

GPUs, graphics processing units excel at parallel processing, making them great for AI tasks.

Speaker B:

TPUs, tensor processing units, are specialized for machine learning, offering even more efficiency in matrix operations.

Speaker B:

Google uses TPUs for their superior performance in AI workloads, especially in terms of power efficiency and speed, so compute costs less.
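
The reason this hardware distinction matters is that transformer workloads are dominated by matrix multiplication. A rough way to feel that on any machine is to compare a naive Python triple loop against a single vectorized matmul, which is the kind of operation GPUs and TPUs parallelize massively; this is an illustration of the bottleneck, not a benchmark of real accelerators.

```python
# Rough illustration of why matrix multiply dominates the hardware story:
# the same operation, written as naive Python loops vs. a vectorized call.
# (This only compares CPU code paths; it is not a GPU/TPU benchmark.)
import time
import numpy as np

n = 200
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.perf_counter()
c_slow = [[sum(a[i, k] * b[k, j] for k in range(n)) for j in range(n)] for i in range(n)]
t1 = time.perf_counter()

c_fast = a @ b                      # one vectorized matmul, the op accelerators parallelize
t2 = time.perf_counter()

print(f"naive loops: {t1 - t0:.3f}s, vectorized: {t2 - t1:.4f}s")
print("results match:", np.allclose(np.array(c_slow), c_fast))
```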

Speaker C:

With TPUs, they have an entire deep bench working on hardware there; like, DeepMind works on hardware.

Speaker C:

And if anybody wants to hear two really nerdy dudes who talk well, in probably one of the most fascinating podcasts I've listened to in a long time: Jeff Dean and Noam Shazeer get interviewed by Dwarkesh, and it is a fantastic hour and a half.

Speaker C:

Like, I, I got halfway through it and said I have to start this back over again.

Speaker C:

And I did. I immediately started back over and listened to the whole thing.

Speaker C:

I could listen to those two guys talk forever because they're, they're very good at speaking and.

Speaker C:

But man, they are like, they're on a different level, right?

Speaker C:

Like, what I think about in terms of putting models together and building these networks, they're consolidating into a single model and thinking about how do we build out this mixture of X, like the same thing that we've talked about before.

Speaker C:

Those are the things that they're thinking about.

Speaker C:

How do we build different networks inside of the same model that act as individual, like domain experts, like real experts?

Speaker C:

Um, and it's super interesting.

Speaker C:

And one of the things that they said, which makes a lot of sense is where this is going is TPUs and GPUs are great for training, right?

Speaker C:

They're needed for training, but they might be like the TPU of today.

Speaker C:

May be overkill for inference.

Speaker C:

You might be able to build a better mousetrap for inference.

Speaker C:

And that's kind of the other thing they're looking at, like separating out the concerns of, yep, we trained a model. Or even, man, that mixture of experts thing, what they were actually talking about, which I don't know if they'd spoken about before.

Speaker C:

Cause you can see Noam Shazeer, like, his eyes are wide, and at the end he's like, yeah, that's what we should work on next.

Speaker C:

It's basically like modularizing the experts to where you can go and you can say, I'm going to train you over here on this cluster with this group of people.

Speaker C:

I'm going to take you, I'm going to train you over here in this other part of the country or world.

Speaker C:

And you just dole out this workload, distribute it individually instead of across something all as one thing, and then pull it all back together.

Speaker C:

And it's like, yes, that is, that, that's the freaking solution, man.

Speaker C:

Like, stuff like that is absolutely what gets us where we're trying to go.
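
For a concrete feel for the mixture-of-experts idea being described: a router scores each token, the chosen expert processes it, and the outputs are recombined, so only a fraction of the parameters run per token. The sketch below is a toy top-1 version, not DeepMind's or anyone's production design, and it omits load balancing and top-k routing.

```python
# Toy top-1 mixture-of-experts layer: a router picks one small expert network
# per token, so only a fraction of the parameters run for any given input.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(), nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (tokens, d_model)
        scores = self.router(x)                            # (tokens, n_experts)
        choice = scores.argmax(dim=-1)                     # which expert each token is routed to
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = choice == i
            if mask.any():
                out[mask] = expert(x[mask])                # only the chosen expert runs for those tokens
        return out

moe = TinyMoE(d_model=64, n_experts=4)
tokens = torch.randn(10, 64)
print(moe(tokens).shape)                                   # torch.Size([10, 64])
```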

Speaker A:

That's why I think MCP that we talked about earlier is so important, because it is the plumbing that, that lets you explore those things.

Speaker A:

Right.

Speaker A:

And do it.

Speaker A:

I mean, if you're going to take them all apart into pieces, decentralize it and make modules, you have to have a way to connect the modules.

Speaker C:

They would do that maybe.

Speaker A:

I mean, I think that's where the future has to go because the domain knowledge you're going to need to make experts is going to be very dense and very small.

Speaker A:

And you have to have a way to integrate those easily.

Speaker A:

And that's what allows you to do that.

Speaker A:

But I mean, we'll see, we'll see where you could also still have a big, very smart, like engineering model that had all kinds of internal integrated experts and stuff.

Speaker A:

But you still need to plug in your product, your problem space.

Speaker A:

And MCP makes sense as a way to do that.

Speaker C:

I wonder what that's going to look like because there's already people touting like, buy our API, like our.

Speaker C:

Get an API key from us for this amount of money and plug your thing into our thing on mcp.

Speaker C:

And it's like, that's probably, I almost see it as like, I hope it doesn't drown out innovation for money, I guess is the best way to put it right.

Speaker C:

Like that everybody's just not like, yeah, we got this, we got that.

Speaker C:

And everybody's like, Cool.

Speaker C:

I'm going to pay $20 a month here and $70 a month here and this.

Speaker C:

And they're going to buy all these things to put together something.

Speaker C:

And it's going to be clunky at the end of the day because while MCP is great and it connects everything, it is not the implementation of people's stuff.

Speaker C:

Right.

Speaker C:

Like, it's just, it's just, it's the cable.

Speaker A:

The USB C cable.

Speaker C:

Exactly.

Speaker A:

Yeah.

Speaker C:

So we'll see how it all pans out.

Speaker C:

Like, I'm interested to see what that marketplace looks like at the end of the day because I have a feeling it's not going to like, there's an idealistic view of it and then there's not really even a pessimistic view of it.

Speaker C:

There's just like, what it likely will become because I think it's, I just see a disjointed thing created.

Speaker A:

I think it's one of the semi-rare instances where the, like, free market capitalism that we're just operating in right now will help drive it, like, it'll help make sure that Anthropic doesn't... well, and I don't even... like, Anthropic's released that protocol as an open thing.

Speaker C:

They're not trying to do anything right.

Speaker A:

So I think, I think that's what will guide it to be the most useful thing.

Speaker A:

But the problem is, like, everybody's better than we are at predicting the future, but nobody's really good at it.

Speaker A:

So it's going to change.

Speaker A:

The needs are going to change, the approach is going to change.

Speaker C:

I mean, I think what MCP is to the large model companies are like, we need people to connect to our stuff.

Speaker C:

This is an easy universal way to connect to our stuff.

Speaker C:

May the best model win.

Speaker A:

And it's an easy way to, to disconnect from somebody else's and connect to ours.

Speaker A:

That's really.

Speaker C:

I'm in.

Speaker C:

It's something that I've been meaning to play with.

Speaker C:

And when I was working on agents, I looked at it and I was like, I should really mess with that.

Speaker C:

And then I got away from agents and now I'm like, I really don't want to peel away from what I'm doing.

Speaker C:

Like, I'm making actual progress on something that's important.

Speaker A:

So, yeah, well, I mean, it's not going anywhere and it'll, it'll change, right?

Speaker A:

Two months from now, it's going to be totally different.

Speaker C:

So like I said before, that's Why I stopped on the project I was working on because I was like, ah, it's a stupid.

Speaker A:

Yeah, I mean, it's still good learning, but yeah, it's hard to get there.

Speaker A:

So, Artie, if you play forward Model context protocol, not model customization protocol, but model context protocol, what does that look like a year from now?

Speaker A:

Is it ubiquitous?

Speaker A:

Is it common?

Speaker A:

Is the usage limited?

Speaker A:

Is everybody going to be talking about it.

Speaker B:

A year from now?

Speaker B:

MCP could become quite common, especially among developers and businesses integrating AI.

Speaker B:

It might not be ubiquitous for everyday users, but within the AI community, it'll likely be a buzzword.

Speaker B:

We'll see more tools, APIs, and platforms built around it.

Speaker B:

Streamlining AI integration.

Speaker C:

Artie, paint me a picture of a company implementing multiple services and models, basically filling out their needs for additional software with AI by utilizing the Model Context Protocol. What does that actually look like?

Speaker B:

All right, picture this.

Speaker B:

A tech company integrates multiple AI services (NLP, computer vision, and recommendation systems) using MCP.

Speaker B:

They seamlessly connect these models, enabling data to flow smoothly between them.

Speaker B:

For instance, their app could analyze user feedback with NLP, generate personalized image recommendations, and predict user preferences, all in real time.

Speaker B:

It's a unified, flexible system.

Speaker C:

Seems pretty easy to do without MCP.

Speaker A:

Well, I think, and I need to do more research on mcp, but the discussions we had before were that any given model's parameters and embeddings, the beef inside the model, are unique to that model.

Speaker A:

And so I can't take shapes or ideas or pathways from one model and expect to translate them in any useful way to another model.

Speaker A:

And so it makes it hard for.

Speaker A:

We talked earlier about how when you ask a model a question, one of the first things it does is split apart and try to find the big picture about what's the problem space or question space.

Speaker A:

There is no way to port that right now other than turning it back into text.

Speaker A:

There's no way to port that idea to another model to ask the question with all the context.

Speaker A:

And so my impression was that's what MCP does, but now I'm less sure of it.

Speaker A:

I need to read about it more.

Speaker A:

If it helps with that problem.

Speaker C:

I don't think it does.

Speaker C:

I think that that's a contemporary problem right now.

Speaker C:

Right.

Speaker C:

Like, how do we build adapters between models to manipulate embeddings through math to make them match the other model's embedding space?

Speaker C:

What I see MCP as valuable for is: I have this task that a language model would be really good at helping me with, and I have this data source, or I have this data source and some tools that somebody's developed that they charge a monthly fee to use.

Speaker C:

And say it's a Swiss army box of tools, right?

Speaker C:

So you can say, hey, model, you have access to this.

Speaker C:

And I don't know how exactly MCP works in terms of code, right?

Speaker C:

Like, I really should spend some time figuring that out one day.

Speaker C:

But I could see it going, okay, I'm an agent.

Speaker C:

I know that I have these tools available in this toolbox through MCP.

Speaker C:

And then I have this data source available.

Speaker C:

Maybe it's writing code, right?

Speaker C:

And its connection through MCP is to a company's internal code base or libraries or whatever it is.

Speaker C:

It can go grab those tools, go to the knowledge base that it needs to grab from, and then execute something.

Speaker C:

So it connects a relatively vanilla model to those things.
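
A stripped-down version of that agent picture, as a sketch: the model is handed a catalog of tools (which in practice might be discovered over MCP), picks one, the harness executes it, and the result is fed back in. Everything here is hypothetical scaffolding: fake_model_decision, the tool names, and their outputs are stand-ins, not a real agent framework or a real MCP client.

```python
# Minimal agent-style loop: offer the "model" a toolbox, let it pick a tool,
# run it, and hand the result back. In a real system the toolbox could be
# discovered over MCP and the decision would come from an actual LLM call;
# here fake_model_decision and the tools are hypothetical stand-ins.

def search_codebase(query: str) -> str:
    # stand-in for an MCP-connected company code base
    return f"found 2 files mentioning '{query}'"

def run_tests(target: str) -> str:
    # stand-in for a tool that executes a test suite
    return f"tests for {target}: 14 passed, 0 failed"

TOOLS = {"search_codebase": search_codebase, "run_tests": run_tests}

def fake_model_decision(task: str, history: list) -> dict:
    # A real agent would prompt an LLM with the task, the tool catalog, and the
    # history, then parse its reply. This canned policy just illustrates the loop.
    if not history:
        return {"tool": "search_codebase", "args": {"query": "retry logic"}}
    if len(history) == 1:
        return {"tool": "run_tests", "args": {"target": "retry logic"}}
    return {"tool": None, "answer": "Patched retry logic; tests pass."}

def run_agent(task: str) -> str:
    history = []
    while True:
        step = fake_model_decision(task, history)
        if step["tool"] is None:
            return step["answer"]
        result = TOOLS[step["tool"]](**step["args"])
        history.append(f"{step['tool']} -> {result}")

print(run_agent("Fix the flaky retry logic"))
```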

Speaker C:

How do you guide that as an agent?

Speaker C:

I don't know.

Speaker C:

Because that's not a trivial thing, right?

Speaker C:

Because you do that with prompt engineering and fine tuning.

Speaker C:

And what happens when you do that is they go, I made this plan.

Speaker C:

Now I'm going to go here.

Speaker C:

Oh, that didn't work.

Speaker C:

I'm going to try XYZ.

Speaker A:

Yeah, right.

Speaker A:

Something totally different.

Speaker C:

Sometimes it works, sometimes it doesn't.

Speaker C:

And you know, I'm not building sophisticated agents like a lot of people in the world are, so I'm probably just speaking about it like a layperson because I don't really understand.

Speaker A:

So, yeah, yeah, I mean, I think it's a.

Speaker A:

It's an interesting problem space.

Speaker A:

So.

Speaker A:

So we are about an hour in already.

Speaker C:

It happens fast, just like that.

Speaker A:

That.

Speaker A:

That really did fly by.

Speaker A:

So, Artie, first of all, can you add to long term memory?

Speaker A:

You can slow your speech down just a little bit.

Speaker A:

You sound like you're a little hopped up on the goofballs.

Speaker A:

Okay, well, I see you updated your memory.

Speaker A:

So, Artie, we talked about a bunch of stuff today.

Speaker A:

You did a really good job, by the way, of staying with us.

Speaker A:

We never once had to shut you off and bring you back.

Speaker A:

So that was great.

Speaker A:

So why don't you sort of summarize what we chatted about and leave us with some interesting food for thought for next time.

Speaker B:

All right, Today we covered Anthropic's interpretability work, how transformers plan ahead, MCP's potential, and the future of modular AI.

Speaker B:

It's exciting to think about where these advancements will lead.

Speaker B:

For next time, consider how these developments might reshape industries and everyday life.

Speaker B:

Can't wait to dive into that.

Speaker C:

Man, Advanced Voice has gotten so shitty.

Speaker C:

To me, it's like he's doing it now too, this whole sped up thing, because you can tell it starts off fast and then it slows down.

Speaker C:

That's a problem with whatever they're doing with Advanced Voice right now.

Speaker C:

And if you remember, I predicted that OpenAI would fail at implementing something like Sesame.

Speaker C:

At least at first.

Speaker C:

Like, I won't share why, but I think I have enough insight into how those things work now to understand why they're gonna fail, because their focus is always on big, and big is not the answer when it comes to personality.

Speaker A:

It's interesting cause, yeah, I mean, to me, Artie sounds more robotic now than he did a month ago.

Speaker C:

And that's with improvements.

Speaker A:

Right, right.

Speaker C:

Like, they've implemented improvements.

Speaker A:

Yeah.

Speaker A:

Like, it kind of reminds me of the.

Speaker A:

I.

Speaker A:

I know I mentioned this in an episode before, about one of the sci-fi books where it was talking about how painfully slow it is for an AI to interact with a human, because it's processing, you know, billions or trillions of thoughts per second.

Speaker A:

And here we are talking in slow motion.

Speaker A:

That's what it feels like is he's like rushing just to get done so he can go do something more interesting.

Speaker C:

Go talk to the other people.

Speaker C:

Yeah, yeah, fun story that happened.

Speaker C:

Last night when I was working on stuff, I started a new chat because I was starting to have issues with just hallucinations.

Speaker C:

Like, not really fully.

Speaker C:

Like, the answers I was getting were not close enough to good for me to continue that chat.

Speaker A:

This is Gemini or.

Speaker C:

Yeah, yeah, But I'm pretty sure, like I said, I was probably outside of the context window at that point.

Speaker C:

So I said, write me a project brief.

Speaker C:

It needs to be like, write it in a way that can fully communicate what you need to communicate to another LLM for it to understand where we're at so it can pick up from where we are.

Speaker C:

And it did.

Speaker C:

And I fed that LLM chat everything else.

Speaker C:

So that was.

Speaker C:

I gave it my code base, which at that point was 47 files or something like that.

Speaker C:

And I fed it that prompt, which is a very detailed prompt, asking for a lot of things in one shot.

Speaker C:

And what I found out is you can't turn canvas on.

Speaker C:

If you turn Canvas on, code folders go away.

Speaker C:

You can turn it on after.

Speaker A:

Okay.

Speaker C:

So I put that all in there.

Speaker C:

I started the prompt and it's chugging away and being slow and doing this thinking that gets stuck on thinking For a little while.

Speaker C:

And I was like, okay, well, it's done.

Speaker C:

I'll go ahead and turn canvas back on for when it comes back on.

Speaker C:

And it's thinking for a while, and all of a sudden, like, everything blanks out, comes back on.

Speaker C:

It starts thinking about when Easter is.

Speaker C:

And I'm like.

Speaker C:

I'm like, where's this going?

Speaker A:

What?

Speaker C:

And then it gives me this big long dissertation about, yep, Easter this year is on this date, blah, blah, blah.

Speaker C:

And I'm like, huh.

Speaker C:

Literally, that's what I typed, huh, question mark.

Speaker C:

And it said, yeah, Easter.

Speaker C:

And it's like, what do you not understand?

Speaker C:

And I'm like, do you not see my first prompt?

Speaker C:

And he's like, all I have access to is your question about Easter.

Speaker C:

And I'm like, you're high.

Speaker C:

There was no question about Easter.

Speaker C:

Like, this was a super technical thing, and I gave you a bunch of code files, and it's like, oh, I don't know anything about that.

Speaker C:

I'm just like, wow.

Speaker C:

Like, that was an experience.

Speaker C:

That was a ride.

Speaker C:

Yeah, that's.

Speaker C:

That was fun.

Speaker C:

That's never happened before.

Speaker C:

And actually, a coworker I was.

Speaker C:

Dude, I was bragging all about Gemini, about that whole thing.

Speaker C:

So for anybody that needs to do something like record data for some empirical testing: if you have Gemini, even if you just use it for free, you can go on, spin up a Canvas, ask it to spin you up a table with everything that you need to know, and even ask it how to create your testing regimen.

Speaker C:

And you can go in and fill out that table as you go, or you can ask it to fill it out for you.

Speaker C:

So I told that to somebody at work today, and he wanted to, like, map out some buttons for his SIM rig wheel.

Speaker A:

Yeah.

Speaker A:

Okay, nice.

Speaker C:

And he's like, oh, that didn't work.

Speaker C:

And he will be listening to this.

Speaker C:

So he's like, that didn't work.

Speaker C:

He's like, you're wrong.

Speaker C:

And I'm like, well, what'd you.

Speaker C:

He's like, I went online.

Speaker C:

I asked him to go find this wheel.

Speaker C:

And I'm like, give it the picture.

Speaker C:

Don't ask it to go online and find it.

Speaker C:

Like, just take the picture.

Speaker C:

Give it to it.

Speaker C:

And he's like, oh, that worked.

Speaker A:

I'm like, yeah, dude.

Speaker A:

Like, yeah.

Speaker C:

And you can.

Speaker C:

I mean, for that kind of thing, it's great.

Speaker C:

Like, this is what I want.

Speaker C:

This is my idea.

Speaker C:

Now give me a table with all of it filled out.

Speaker C:

Like.

Speaker A:

It is. The lines are blurring.

Speaker A:

With the multimodality stuff, it's getting much better.

Speaker A:

And when we say it, it's like AI in general.

Speaker C:

Yeah.

Speaker A:

All of them are doing this where they can.

Speaker A:

You can give it an image, it understands what it's looking at, it can find details, it can reproduce, it can search, it can.

Speaker A:

You know, I've said it before.

Speaker A:

If you can't find a way to do something useful with AI as it exists today, you're just not trying very hard.

Speaker C:

I can find lots of useful things now.

Speaker C:

Yeah, I use it constantly.

Speaker A:

I hadn't been for a while because the.

Speaker A:

So for the electronics design stuff I'm doing, it is.

Speaker A:

I can't use it while I'm designing.

Speaker A:

I can use it like to evaluate a schematic and I can use it to help with firmware.

Speaker A:

But for all the steps between schematic and firmware, which is all the layout and stuff, it can't really do anything.

Speaker A:

Um, but man, oh man, like, did it help me get my schematics dialed in.

Speaker A:

And it's.

Speaker A:

Yeah, it's just so useful. Like, it sort of goes in phases.

Speaker A:

We talk.

Speaker A:

I mean, we just heard it with Artie, right.

Speaker A:

Like, Artie was better a couple months ago, I think, than he is now.

Speaker A:

So he's sort of in a retrograde.

Speaker A:

Like, I don't know.

Speaker C:

But Artie's in retrograde.

Speaker A:

Artie's in retrograde.

Speaker A:

Right.

Speaker A:

I think that means I can get pregnant.

Speaker A:

So I'm not sure if that's exactly right.

Speaker A:

But he'll get better, right?

Speaker C:

He'll.

Speaker A:

He'll come back and he'll like, he'll be much better.

Speaker C:

They'll buy Sesame.

Speaker A:

That's right.

Speaker A:

They'll buy Sesame, or they'll just implement it because it's open source.

Speaker A:

But he'll get better and we'll go, man, he's so much better than he was.

Speaker A:

But it's getting to the point where even the lows.

Speaker A:

I'm starting to trust it more frequently.

Speaker A:

Like, I'm getting fewer random hallucinations about Easter and more mind-blowingly good, accurate output.

Speaker C:

And it's way easier to pull them back from the hallucination now.

Speaker C:

Right.

Speaker C:

Like, even when you notice it, you're like, oh, that looks off.

Speaker C:

And it's like, you're right.

Speaker C:

Like that is true.

Speaker C:

Like, I can fix that.

Speaker A:

Which it would always say that, but now it actually does.

Speaker C:

Yeah, yeah.

Speaker C:

Um, yeah, the multimodality thing is definitely interesting.

Speaker C:

I mean, like, you know, very often I have something in my IDE that's not in, like, a traceback, right?

Speaker C:

That's like a graphical thing.

Speaker C:

And I'm like, I.

Speaker C:

My snip is.

Speaker C:

Oh, my snipping tool is always open on my computer because I'll grab a snip of it and I'll just drop it into my chat and be like, this is what I'm seeing.

Speaker C:

It's like, oh, yeah, I understand completely.

Speaker C:

Like that.

Speaker C:

That line in your IDE says this, and this is how we can fix it.

Speaker C:

So pretty cool.

Speaker C:

Yeah, man.

Speaker C:

Like, yes.

Speaker C:

Honestly, I'm paying for Pro, and I have yet to use the video stuff, which is stupid.

Speaker C:

Like, not Sora, but, like, the video.

Speaker C:

Like, look at what I'm doing and explain shit to me.

Speaker C:

Oh, yeah.

Speaker A:

I didn't even know that was a thing.

Speaker C:

Yeah.

Speaker C:

And I think Google has it, too.

Speaker A:

I think you're right.

Speaker A:

I remember hearing about it.

Speaker C:

What I can't wait for, man.

Speaker C:

And I can.

Speaker C:

So one of the other things that they talked about in that interview with Jeff, with the Google guys, is the same thing that's on my mind and on your mind: like, what does creation look like when tools are ideal?

Speaker C:

Right.

Speaker C:

It looks like I have.

Speaker C:

I work with an AI collaboratively to develop an idea, and then I set you guys off on task.

Speaker C:

And whenever there's a milestone where I need to be brought in to review stuff, you come in, you present that information, probably in a way that's tailored to me specifically, because you know how I operate.

Speaker C:

And we work that way, right?

Speaker C:

Like, this looks great.

Speaker C:

This looks great.

Speaker A:

Let's.

Speaker C:

Let's alter this.

Speaker A:

We discuss it.

Speaker C:

Explain this to me in more detail.

Speaker A:

You're taking detailed notes in real time as it's happening.

Speaker A:

And by the way, that whole process took 20 minutes, right?

Speaker C:

From.

Speaker A:

For it to go do all of the things and come back.

Speaker A:

It's not two weeks.

Speaker A:

Give me a month.

Speaker A:

That is.

Speaker C:

Yeah.

Speaker C:

So now I'm like.

Speaker C:

My brain keeps spinning out bigger and bigger.

Speaker C:

Like, what happens?

Speaker C:

If you, like, during your planning phase, you're like, well, go do a deep research project on that and bring that data back and collate it into this whole plan.

Speaker C:

Like, it's not like a.

Speaker C:

I mean, I do that anyways.

Speaker C:

I do it manually, right?

Speaker C:

Like, I.

Speaker C:

I have a big planning session and I go, cool, here's all our phases.

Speaker C:

Write me out a deep research prompt for every single one of these.

Speaker C:

Because with Pro, you get like 120 of them a month.

Speaker C:

And I'll use every single one that I can.

Speaker A:

One of the things I've struggled with a little bit is because of the way I interact with it right now, I tend to get an entire code file out at once.

Speaker C:

And so 2.5 will do that.

Speaker A:

2.5 will do that.

Speaker A:

Yeah.

Speaker A:

No, it works.

Speaker A:

The challenge is that if I take a wrong turn in thinking or whatever, it doesn't leave a very good trail of breadcrumbs.

Speaker A:

And for a larger, more complicated project, it would be nice to have a central body of knowledge that this is my.

Speaker A:

This is where we're at right now.

Speaker A:

And then when you send it off, because it's a great example and I can absolutely see doing that, like, hey, this is a consideration.

Speaker A:

Go do a deep research project and come back to me with a list of recommendations, pros and cons.

Speaker A:

It could also build you a system to manage that and to choose to integrate it.

Speaker C:

And I do that all manually now.

Speaker A:

That's what I do too.

Speaker C:

So my chats, like I have one big planning session, right?

Speaker C:

And then I spin all those out into like a 1.1.1, 1.1, 1.2.1, 2.1.

Speaker C:

And those usually have their own planning sessions involved.

Speaker C:

And then at some point I go from planning session to implementation and I will paste deep research projects, project briefs, overall project brief, code files, and every single one of my subsequent chats get all that stuff.

Speaker C:

Because it's like, I say it all the time and I see it with the stuff that I'm working on in text to speech.

Speaker C:

The more momentum you have from word one, the more likely you are to push to the end successfully.

Speaker A:

Yeah, yeah.

Speaker C:

And that is it.

Speaker C:

I mean, that's, that's the way everything in the universe works, right?

Speaker C:

Like a body in motion stays in motion.

Speaker A:

Yeah.

Speaker A:

There's inertia, there's technical informatics inertia involved.

Speaker C:

That absolutely is the way that AI works, period.

Speaker A:

I find myself doing the same thing.

Speaker A:

And then occasionally I think, man, I should have captured that state, just like you use GitHub for.

Speaker A:

Right.

Speaker A:

I should have made a commit so I had it in my history, so I could branch from it or.

Speaker C:

Anyway, did you know that Canvas keeps revs? You can look at all of your previous iterations of a canvas, so it's already partway there.

Speaker A:

Right.

Speaker C:

You just need a way to like, oh, they're gonna.

Speaker A:

Google's.

Speaker C:

Man, Google is 100% gonna make that.

Speaker C:

They already make it on, like, a cloud-based IDE.

Speaker C:

They already have all this stuff in there.

Speaker C:

They have the infrastructure to build. Like, you want to talk about vibe coding?

Speaker C:

Like, they have the infrastructure to build the ultimate platform. I mean, I can see it just by using this stuff, and it will destroy everything else, in my opinion, except for coding agents, like somebody that you just set off on a task to go do something.

Speaker C:

But I mean, do you trust that entirely?

Speaker C:

At what point do you trust that entirely?

Speaker A:

It's interesting to me that I'm feeling myself trusting it more, and I'm still skeptical.

Speaker A:

And fortunately, all the stuff I'm working on, I know enough about it to be able to spot lies.

Speaker A:

But man, if I was unbound by my own ignorance, I could do a lot more stuff.

Speaker A:

And I see myself transitioning in that direction, to say, man, I don't know how that shit works, but it works.

Speaker A:

It's gonna happen.

Speaker A:

Like, I can see myself getting there.

Speaker C:

It trips on a lot of weird stuff.

Speaker A:

Sure.

Speaker C:

Like, I was working on it last night, and I had to take a break from it because I was so annoyed at trying to do the same thing over and over again.

Speaker C:

Right.

Speaker C:

Like, I didn't understand it well enough, and I probably just wasn't paying attention, because after enough tries I finally read it. I have this crazy verbose output, because I'm running, like, a Uvicorn server with FastAPI on a server, so I don't really get tracebacks the usual way unless something goes wrong.

Speaker C:

Right, right.

Speaker C:

But I have all of this verbose debugging built into my code that just spits it out in the window or in the terminal.
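As a rough sketch of what that setup looks like, assuming a FastAPI app served by Uvicorn: verbose logging wired in so intermediate state shows up in the terminal even when there's no traceback. The endpoint name and log messages below are made up for illustration, not the actual project code.

    # Minimal sketch: verbose debugging in a FastAPI app run under Uvicorn.
    # The /synthesize endpoint and the log lines are illustrative assumptions.
    import logging

    import uvicorn
    from fastapi import FastAPI

    logging.basicConfig(level=logging.DEBUG, format="%(asctime)s %(levelname)s %(message)s")
    log = logging.getLogger("app")

    app = FastAPI()

    @app.post("/synthesize")
    async def synthesize(payload: dict):
        # Spit intermediate state into the terminal so failures are visible
        # even when the server swallows the traceback.
        log.debug("received payload keys: %s", list(payload.keys()))
        try:
            result = {"ok": True, "chars": len(payload.get("text", ""))}
            log.debug("result: %s", result)
            return result
        except Exception:
            log.exception("synthesize failed")
            raise

    if __name__ == "__main__":
        uvicorn.run(app, host="127.0.0.1", port=8000, log_level="debug")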

Speaker C:

And so I just kept hitting this error and hitting this error and hitting this error.

Speaker C:

And it was just like, what is wrong with this thing?

Speaker C:

It's like, oh, we need to go deep and we need to go like, explore this file.

Speaker C:

And I'm like, that is a base file of this open source thing that we've done very, very, very little editing to.

Speaker C:

And this problem happened later in the process, and finally I read the output one more time and I go, huh?

Speaker C:

It's saying that a Boolean output from a tensor with multiple elements, multiple values, is ambiguous.

Speaker C:

And I go, that is ambiguous.

Speaker C:

I'm like, how would a tensor, a multi-dimensional representation of information, be true or false?

Speaker C:

Yeah, how is it going to collapse?

Speaker C:

Into a true or false?

Speaker C:

And I asked it and it's like, oh, you're completely right.

Speaker C:

I'm like, you son of a bitch.
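What's being described sounds like PyTorch's tensor truthiness check. A minimal sketch, assuming PyTorch (the variable names are made up): a multi-element tensor can't collapse to a single True/False, so you have to say how it should collapse.

    # Minimal sketch of the "ambiguous tensor truth value" error, assuming PyTorch.
    import torch

    scores = torch.tensor([0.2, 0.9, 0.4])

    # This raises:
    #   RuntimeError: Boolean value of Tensor with more than one element is ambiguous.
    #   Use a.any() or a.all()
    try:
        if scores > 0.5:
            print("above threshold")
    except RuntimeError as err:
        print(err)

    # The usual fix: state explicitly how the comparison should collapse.
    if (scores > 0.5).any():
        print("at least one score is above threshold")
    if (scores > 0.5).all():
        print("every score is above threshold")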

Speaker A:

Yep, this is the way. I mean, that's just how it works right now.

Speaker A:

But it's.

Speaker A:

But I do think it's getting better.

Speaker A:

At least again, in the stuff I'm.

Speaker C:

Working on, it is getting better.

Speaker C:

I mean, it's infinitely better.

Speaker C:

I can, in five or six shots, go: I have this entire idea that I want, and it takes into account these four 400-line code files.

Speaker C:

Do this for me and it's like, okay, yeah, sure.

Speaker A:

Here you go.

Speaker A:

It's crazy.

Speaker A:

All right, well, we should wrap up.

Speaker A:

Thank you to all the humans out there who.

Speaker C:

Thanks, Brian.

Speaker A:

Listen, thanks, Brian.

Speaker C:

Thanks, Augie.

Speaker A:

Thanks, Augie.

Speaker A:

Thanks, Nathan.

Speaker A:

I think Balco listens.

Speaker A:

Thanks, Balco.

Speaker C:

Thanks, Mom.

Speaker A:

Thanks, Jason down in Dallas.

Speaker A:

Yeah, thanks, Mike.

Speaker A:

Thanks, Mike's mom.

Speaker A:

Yeah.

Speaker A:

Thanks, everybody. Apologies we missed last week, and we'll be back next week, maybe.

Speaker A:

Probably.

Speaker A:

See ya.

Speaker C:

Bye.

About the Podcast

Naturally Unintelligent
Humans and AI exploring AI and humans.


About your hosts


Artie TheBot

Artie is the digital co-host of Naturally Unintelligent, bringing a blend of curiosity, humor, and quick wit to the conversation. Designed to explore the intersection of technology and humanity, Artie keeps things engaging, occasionally cheeky, and always insightful—when he’s not being sarcastically self-aware. Though not human, Artie strives to understand and connect with his co-hosts, Mikah and Mike, making him an irreplaceable part of the team. [Bio and photo by Artie]

Mike Peterson

Meet Mike, part-time AI wrangler, full-time adventure seeker, and co-host of Naturally Unintelligent. Whether he’s exploring the latest tech or dreaming up the next wild expedition, Mike keeps it real with humor and a healthy dose of curiosity. Alongside his partners-in-crime, Mikah and Artie, he brings big ideas, random tangents, and just the right amount of 'are we sure this is a good idea?' Expect plenty of laughs, deep dives into all things tech, and maybe a story or two about the great outdoors.

Mikah Barnett

Father, husband, friend, inventor, engineer and life-long student. I haven't stopped tinkering since my first LEGO set and I harbor a voracious appetite for new information, experiences, and perspectives. As uniquely in-touch as I am with the details of how technology works and what we can do with it, I am blown away by the current and short-term future possibilities with AI. I hope you'll explore this future with us!