RAG continues to rise
source link: https://changelog.com/practicalai/264
Transcript
Play the audio to listen along while you enjoy the transcript. 🎧
Welcome to another episode of the Practical AI podcast. This is Daniel Whitenack. I am founder and CEO at Prediction Guard. I'm joined as always by my co-host, Chris Benson, who is a principal AI research engineer at Lockheed Martin. How are you doing, Chris?
Doing good, Daniel. How's it going today, man?
It's going great. I just landed in Boston, and you were texting me and you're like "Hey, Demetrios from the MLOps community wants to hop on and record an episode", and I was like "I've got to get out of this train station." So I just found the nearest stop and got out, and I don't have my normal setup, so I probably sound weird… But I was like "These are the best times, when we get to have our friend Demetrios on." How's it going, man?
You guys can't get away from me. I blackmailed your boss into letting me come back on here. [laughter]
Standing invitation.
What's up, man? How you doing?
I'm so excited, because I try not to abuse this standing invitation. I had a great time the last time I was on here, we had a bunch of laughs… And a lot of people reached out to me because we had this RAG versus fine-tuning conversation, and I think things have kind of - like, the pieces have fallen, the cookies have crumbled… It feels like fine-tuning is not as popular as it was. I don't know what you all are seeing out there, but RAG is the go-to these days.
Yeah, that's what I'm seeing. Everyone's RAGging each other all over the place.
And I think - yeah, along with that, there's sort of like… In my mind has developed this category of - so there was RAG, where of course you're augmenting the generative model with data… But not in the way that people typically think of fine-tuning; you're just doing retrieval. But I think there are also these other sorts of workflows around calling external tools, or the neuro-symbolic stuff that we're seeing - combining a rules-based algorithm or function, or a "traditional machine learning" algorithm, with generative models, maybe to get the inputs… And then connections to databases… All these different ways. It kind of seems like people are figuring out that generative models are great at being assistants and automators, but not necessarily predictors, or other kinds of functions, like analytics types of things, and that sort of stuff.
I've noticed there's been a new term coined, at least for me: RAG as a service. Have you guys run across that? RAG as a service is now like a thing.
Is that acronym just RAAAAAG?
Don't even try. You're just gonna strain the vocal cords if you do that, man. Yeah, RAG as a service. "We'll RAG you. If you bring your thing, we're gonna RAG you." Yeah.
Well, I think what you're saying, Daniel, really speaks to something that I've been seeing too, which is the maturity in the last (whatever) six months; it's become very clear that there are traditional ML workloads and use cases that are kind of always going to be traditional ML workloads. If you think about your fraud detection models, or churn prediction, or even the recommender system… And then you have your generative AI workloads or use cases - something like transcription, or you have the LLMs which are doing all sorts of stuff, but RAG is probably the biggest one in that, where you get that Copilot… Or code generation, I think, is a huge one… There's not such a big overlap where you're saying "Okay, the generative AI use cases, or the generative AI models, are going to dethrone the traditional models."
I agree with that. Definitely different use cases.
I think one thing - we had our first gen AI mastery webinar with the podcast… When was that, Chris? A couple of weeks ago, or something like that.
And we were talking about text-to-SQL specifically, and that analytics - like, SQL is really good at doing analytics, right? Especially descriptive analytics, and aggregations, and all of that… And it just doesn't make any sense for you to take a big table and somehow figure out how to dump its contents into a prompt, and have a model reason over it, because it's probably going to get it wrong anyway. But also, there's this existing tool which can be called, which is really awesome at doing those things.
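The point being made - let the database do the aggregation rather than stuffing rows into a prompt - can be sketched roughly like this. Everything here is illustrative: the `sales` table is made up, and `generate_sql` is a hard-coded stand-in for a real model call that would take the schema plus the user's question and return SQL.

```python
import sqlite3

def generate_sql(question: str) -> str:
    # Stand-in for an LLM call: a real text-to-SQL system would prompt a
    # model with the schema and the question. Hard-coded so this runs.
    return ("SELECT region, SUM(amount) AS total FROM sales "
            "GROUP BY region ORDER BY region")

def answer_with_sql(question: str, conn: sqlite3.Connection):
    sql = generate_sql(question)
    # The database does the aggregation; the model never sees raw rows.
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 10.0), ("east", 5.0), ("west", 7.5)])

print(answer_with_sql("What are total sales by region?", conn))
# → [('east', 15.0), ('west', 7.5)]
```

The aggregation stays exact no matter how many rows the table has, which is precisely what a model reasoning over dumped table contents cannot guarantee.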
I love the idea and the exploration that's happening right now on "How can we merge both of these worlds?" and "Which different parts work well together, and which combinations of traditional ML plus generative AI can go together?"
[00:05:43.27] And I know that you do a series of surveys with the MLOps community. I think the last time that we talked, we were talking about some of your survey work and some of the interesting findings, but I think you've gone through other iterations of this, right? So are you seeing interesting things pop up as the community around this technology matures?
100%. And I just come on here when I've got survey insights to share, I guess… That's what I'll be known for. Whenever we have a nice survey -
The pollster of AI…
Yeah, I'll come and share it with you all. But this one is cool, because this time - so we did an evaluation survey, and we launched it when we had our virtual conference, which had a huge turnout. It spanned two days…
Which was awesome, by the way.
It was great. You were part of one of the past ones, right?
You had an awesome spot. But the two-day span - we tried something different and we said "What if we do two days, but since it's virtual, nobody has to fly anywhere?" So instead of you trying to watch a live stream for eight hours a day on a Thursday and then a Friday, why don't we just do two Thursdays in a row? Then you don't have to feel like "Wow, I just had 20%-30% of my week eaten up by that eight-hour or 16-hour live stream." Now you can tune in and tune out on a Thursday of your choosing.
But we launched it there, and the response has been amazing, because normally maybe we'll get like 100-150 people that will fill out the survey; this time we had 322 respondents…
…which is super-cool to see. And so let me give you some of the clear insights. One, there's budget being allocated towards AI these days. I don't think that's going to surprise anybody. The fascinating part is that 45% of the respondents said that they're using existing budget, and then a whopping 43% said "No, we're using a whole new budget." So you've got exploration happening in generative AI like never before. But when it comes to that, one other takeaway has been that the MLOps, AI and ML engineers are really trying to figure out what the biggest-leverage use cases are, and how they can explain that. And I think what we're seeing is there are a lot of companies that are open to the exploration right now, and they're open to letting people say "Alright, cool, what is most valuable for our teams and our company? Is it an internal chatbot? Is it an external chatbot? What does that actually look like? What is the use case?"
It's kind of funny - we actually talked about that a little bit last week, Daniel and I did, in terms of trying to get non-technical people engaged in it. And I think that there are organizations all over the world right now that are doing exactly that, to your point on the result that you're seeing. There's a lot of effort and a lot of money being thrown at "How do we start doing that?"
And really, the shining star here was just RAG, obviously; it's very clear. Everybody's using RAG. And the participants self-identified as being intermediate in RAG - that was the majority. So we had like 31% saying "We have some experience with LLMs and RAG", and then you only had 6% saying "We are at the frontier of LLM and RAG model innovation."
Break: [00:09:42.17]
So let me ask you guys a question and get some opinions going here. With all of these assistants and chatbots and all the other kinds of focused things in gen AI that are using RAG - do you think those are for more general use cases, with some domain knowledge mixed in? And maybe going back to the last time we had you on the show, when we were talking about fine-tuning - do you think highly specialized fields will still stick with fine-tuning instead of RAG? In other words, does the degree of expertise required to get a job done productively make a difference in whether you go RAG or fine-tuning? Or do you think it has nothing to do with that?
I would love to hear what Daniel has to say in a minute, but I've heard it said that fine-tuning is for form. So if you're trying to get a different form on the output - for example, if you're trying to get function calling; GPT functions are a perfect example of this. If you're trying to get a homegrown model to do that type of thing, then fine-tuning makes sense. Otherwise, it's not necessarily the good call, unless you're using a very small model. The question there is "Okay, what's the trade-off that you're making?"
And the other thing that is probably worth talking about, when it comes to the difficulty levels that I've seen, is people that want to go and use a small, domain-specific model that is distilled, or very fine-tuned and distilled, and it's in their own infrastructure, and they need a whole team to support that - that's hard mode; you're playing on hard mode. Contrast that with just a GPT-4 call - that's a whole different level of the game that you're playing. And so I kind of look at it that way: how much are you willing to trade off when you're trying to figure out "Is this the way forward?"
I would agree with that. I think the only thing I would add is there is still, at least as far as I've seen with our enterprise customers, an inclination that they need to fine-tune. That's still a kind of general "Oh, we need to do that at some point." And I think once they solve a few of their use cases without that, they disillusion themselves of that notion, in many cases. But the good thing, like you're saying, Demetrios, is you can probably prove to yourself, without a huge amount of effort, using an easy-to-use API, whether or not you need fine-tuning, and do that in like a day… Versus immediately jumping to fine-tuning, and "How do we get GPUs? What model server are we going to use?" and all of that stuff. Like you say, that gets very much into hard mode very quickly, and you often don't need that to validate your use case. And even if you do fine-tuning, sometimes it may be down the line, when you've been running the pretrained model for quite some time, and you actually have a good prompt dataset to fine-tune with… Because most people don't start out with that either.
That's a huge point, too. And this is probably the biggest unlock that happened: now that anybody can use the OpenAI API, you can quickly see if there's value in that crazy idea you have. And then you can go down the line, like "Alright, now I'm gonna use an open source model", which is turning the knob to something harder. Or maybe it's not even an open source model. Maybe it's just "Can we get the same results with a smaller model from OpenAI? So instead of GPT-4, we're going with 3.5 Turbo. Can we then go to an open source model and get the same results?" I almost look at it like a spectrum of how difficult you want to make your life, and how much upkeep you're going to need, and all of that. But as with everything, there are benefits if you go to that very small model, if you need it. And you just have to really play it out and see if you actually are going to need it.
Speaking of another survey, just to make it an increasingly survey-driven show…
"Survey says…"
[00:15:43.12] Yeah, "Survey says…" Andreessen Horowitz did another sort of survey which was kind of interesting - I forget the number of participants; we can link it in our show notes. It was just posted March 21st; we're recording this maybe a little more than a week later… It was about enterprise - so, 16 changes to the way enterprises are building and buying generative AI. And one of the things that they specifically highlight is that enterprises are moving towards a multi-model future, and specifically a multi-model future driven at least partially by open models. And I think the other interesting trend is they have a graph of how many model providers people are using per company, and you see three, four, five in many cases… And also a high adoption of open models. And I think what they're trying to draw out is that some of that is maybe driven by security and privacy things… But I think it's also driven by control and flexibility. I also still find it a pretty widespread misunderstanding that all of these models behave the same. In reality, pretty much every model has a character of its own, and its own specific behavior… And even just switching from OpenAI to an open model for, say, text-to-SQL - a model that doesn't do other things well, but does that really well - can prove to be really useful. Or maybe it's a specific language thing. We're doing a Mandarin chat right now with one of our customers.
And so whether it's language or task, people are finding out that their future is multi-model, or multi-model-provider, or whatever - mainly because of that behavior thing, but also because they can have some control over when they use this model or that model, and create the mix that's right for them. And that's kind of a route around fine-tuning in some ways, because you can assemble these reasoning chains, even with multiple models involved, that do very specialized tasks. And that can help you avoid spinning up and running your own fine-tune that does this very unique reasoning workflow. You can bring all the expert models in to help you.
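The "mix of models per task" idea described above often comes down to a small routing layer. This is a minimal sketch under stated assumptions: the model names, the route table, and the two placeholder callables are all hypothetical, standing in for real provider API calls.

```python
# Placeholder provider calls; a real router would wrap actual API clients.
def call_sql_model(prompt: str) -> str:
    return f"[sql-model] {prompt}"       # a model specialized for text-to-SQL

def call_chat_model(prompt: str) -> str:
    return f"[chat-model] {prompt}"      # a general-purpose chat model

# Task -> model mapping; this is the "mix that's right for you".
ROUTES = {
    "text_to_sql": call_sql_model,
    "chat": call_chat_model,
}

def route(task: str, prompt: str) -> str:
    # Unknown tasks fall back to the general model.
    handler = ROUTES.get(task, call_chat_model)
    return handler(prompt)

print(route("text_to_sql", "total sales by region"))
# → [sql-model] total sales by region
```

Because the routing table is plain data, swapping one specialist model for another never touches the rest of the chain, which is part of the flexibility argument being made.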
I can't help but wonder - you know, that's a built-in capability that you have at Prediction Guard, but I think that there's a maturity issue there with a lot of organizations on getting to that point… And so what I'm seeing is very mature organizations are going exactly to that - having multi-model capabilities, and they have the ability to distinguish between which models they should use for which circumstances… I suspect - and maybe there's some survey data on this - that that's still a fairly small group of even enterprise organizations that have gotten to that level of understanding of what they can do. And I think there's a spectrum falling off from there that the bulk of the world is in right now, in terms of trying to figure out how to make it work for them.
Let me twist some statistics to play in your favor. Hold on, I'm gonna crunch some numbers… [laughter]
That's awesome. I love it. Live - you're probably not using an AI model to do it, but…
No. I'm gonna weave that narrative that you've got, Chris, and I'm gonna go with it, with the survey data, as you're asking for… No, but actually, the biggest question in my head as you guys are talking about this is - you know what engineers really do not like, and I think it makes them very anxious? Having a single point of failure. And so if you are relying on OpenAI's API, and you have a lot riding on that, where does that all go if the CEO gets ousted?
[00:20:02.02] So I imagine that a lot of people thought twice after that big drama happened, and people started thinking "You know what? Maybe we should have a few redundancy options, just in case." Now, you do have to have a bit more maturity to say "Okay, I can't always use the same prompt, so I have to have this prompt suite, or prompt tests, or prompt templates", and I think that's another thing that's happened since the last time we talked, when we had the conference, to try and get people to -
Well, Prompt Ops is one, Agent Ops is another one… But I created a song called "Prompt Templates", and I'll put it in the - maybe we could play it real fast.
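The redundancy-plus-prompt-templates idea above can be sketched as a small fallback chain. Everything here is hypothetical: the provider names, the two template formats, and `call_provider` (which fakes an outage for the first provider to exercise the fallback path); real code would wrap actual API clients.

```python
# Each provider gets its own prompt template, because the same prompt
# rarely transfers unchanged between models.
TEMPLATES = {
    "provider_a": "### Instruction:\n{question}\n### Response:",
    "provider_b": "[INST] {question} [/INST]",
}

def call_provider(name: str, prompt: str) -> str:
    # Stand-in for a real API call; provider_a is "down" in this sketch.
    if name == "provider_a":
        raise TimeoutError("provider_a is down")
    return f"{name} answered: {prompt}"

def generate(question: str, providers=("provider_a", "provider_b")) -> str:
    for name in providers:
        prompt = TEMPLATES[name].format(question=question)
        try:
            return call_provider(name, prompt)
        except Exception:
            continue  # no single point of failure: try the next provider
    raise RuntimeError("all providers failed")

print(generate("How many vacation days do I have?"))
```

The cost of the redundancy is exactly what's described in the conversation: you now maintain a prompt suite per provider, and prompt tests to confirm each template still behaves.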
As someone who's in kind of the defense industry, I'm for Agent Ops, because it sounds bad, doesn't it?
Oh, I thought you were gonna say you were for prompt songs, which would bring smiles to people in the defense industry…
That's what I've always wanted to do… [laughter]
Missed out on that one… I don't know if people can see this, but he's got a wonderful shirt on, that is for sure… And being in the defense industry, I don't know how you can get away with that.
Work from home is how I get away with it.
And for those that aren't watching, his shirt says "I hallucinate more than ChatGPT", which is a classic shirt.
Which Demetrios sent to me, I have to say. So thank you very much. I love this shirt.
I love that you wear it. That is what I'm very happy about. But the other thing that I wanted to mention about the survey - and then we can move on and keep talking about other topical issues of the day… The data that people use, and the data with which we evaluate the output - it seems like people just don't know what's going on there. We haven't figured that out yet. There's no consensus; it's not really clear… And the classic datasets, or the classic evaluation pieces that you use - they don't really hold up, so everyone has to have their own data that they've created, that they're testing the output against… But it's really hard to do that at scale. And it's really expensive, also. So that's what we saw in the evaluation survey data: you've got to handpick these, you've got to match them up, and it's human-curated. On testing datasets, 42% are using data that they've created themselves to evaluate whether the model is working or not.
Yeah, that's crazy. And you mentioned expensive… It could be monetary, but it could also be iteration time. Back when we were creating machine learning models - which lots of people still do, because it's the thing driving basically all of the predictive stuff - you could run your model and evaluate it; maybe it took a few seconds, or a couple of minutes… But here, when you're running against an API that has variable latency, maybe each execution of your prompt chain is taking like 15 seconds. And even if you only want to run that over 100, 200, 300 example reference inputs, all of a sudden your iterations become really, really slow.
[00:24:24.15] That's something I've noticed that people struggle with - making their evaluation quick enough that they can iterate, and feel like they can try a lot of things. Even if they have a big budget to try a lot of things, this kind of iteration time is really frustrating - also because maybe there are other people involved that aren't technical, and they don't want to think about concurrency in Python, right? They just want to go into an interface and try some stuff. And so you've got all these things mixed together, which makes it a bit of chaos in many cases.
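The "concurrency in Python" point being made is that slow, variable-latency calls should overlap rather than run serially. A minimal sketch, assuming `call_model` stands in for a real API call (here it just sleeps and upper-cases, so the example is runnable):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def call_model(prompt: str) -> str:
    time.sleep(0.05)        # simulate network latency per call
    return prompt.upper()   # placeholder "model output"

def evaluate(example: dict) -> bool:
    # Score one reference input: did the output match the expectation?
    return call_model(example["input"]) == example["expected"]

examples = [{"input": "hi", "expected": "HI"},
            {"input": "ok", "expected": "OK"},
            {"input": "no", "expected": "YES"}]

# Threads overlap the waits, so 300 fifteen-second calls need not
# take 75 minutes of wall-clock time.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(evaluate, examples))

print(sum(results), "/", len(results), "passed")
# → 2 / 3 passed
```

Threads are enough here because the work is I/O-bound waiting on an API; the same shape works with `asyncio` if the client library is async.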
That's so true. The iteration speed, the time, the… I mean, we see here that - this is crazy - 72% of ground truth labels were manually labeled by humans. And so you have to go and do that, and then also - how often are you doing that? There are so many questions and so many unknowns about what the best practice is. That's one thing that came up on the challenges: a lot of people called out something that we synthesized into "lack of guidance". Nobody's saying "This is the best practice. This is what we've seen work really well for us." At best, some people say "Well, this worked well for us sometimes, and you can try it and see if it works well for you, too." That's kind of the state of the industry right now.
I want to tie back into something else that we've been talking about lately, and that is the fragmented nature of the community. It's another thing Daniel and I have talked about recently. We do have communities, but we have multiple communities, and in many cases - not in your case, but in many cases - they're very platform-dependent, vendor-specific… And compared to a lot of programming languages, it makes it harder for people to come in and find specific best practices. So I'm actually not at all surprised to hear the survey playing that out. I think that's kind of a natural fallout of the challenges that we're having with community in general.
So if I'm understanding this correctly - because a lot of the communities are being built around certain tools, you have the best practices for those tools, but not necessarily for the industry. You can't generalize those best practices.
Yeah. And I think also the different channels through which people are communicating naturally develop their own bias. And I don't mean bias in a bad way necessarily, but just a bias towards emphasizing certain things. Like, you get into the Nous Research community - we had a great conversation about that - and people are talking about "Oh, we're doing all this activation hacking, and representation engineering", but that's not really talked about if you're over here in the LlamaIndex Discord, or the LanceDB Discord, or wherever… And some of that's driven by the focus of what those tools do, but also where people are coming from, right? More of the indie hacker, app-building sort of stuff, or the rigorous academic side, or the enterprise "I really just want to get something into production" side… There are all these different slants people are coming from.
[00:27:48.06] That is so fascinating to think about - how each of these communities has its main focus. And since there is so much surface area, and there are so many different areas that you can go and explore, each community is exploring its own area. And if you go into one, you can tap into what people are talking about in that area; whereas if you go into another community, it's "Oh, what's going on here? What's the focus of this community?"
This is a different outcome from what we've seen if you step out of the AI/ML world specifically, and look at more general computer science and programming communities. There's usually kind of a place to go, and you learn the same sets of skills and values around it. And that's a little bit different from this. So it's been one of the challenges, I think, that the AI/ML world has struggled with a little bit. Like I said, I think your survey captured that essence.
Thank you for sharing that with me, because I'm going to steal it, and I'm going to say it a bunch. Hopefully you don't mind…
[laughs] Say it all you want.
It's a great insight. I've seen it. Just in the MLOps community, we have people that are really trying to productionize AI. And so what people in there are talking about is really pragmatic and practical: "How can I get this used in my company, so that I can either save money or make money?" Money is the ultimate metric there. If you go into these different communities, as you're mentioning… If you go into the LlamaIndex community, there's a lot of talk of RAG. And actually, we had Jerry on at the conference, and he showed this slide that I thought was so incredibly done. It wasn't him that did it - I can't remember the person who created it - but it was "The 11 ways that RAGs fail." It had all these different failure modes that you need to be aware of. And I think one that's coming to light, that people are seeing is so important, is that you need to get the retrieval evaluation correct… Because if you're not retrieving the right thing from the vector DB, then it doesn't matter what the output of the LLM is; if you give it some kind of crap, then it's not going to give you anything there.
And the other piece that I think is fascinating is - how do you make sure that all this data that you've got in the vector DB is up to date? We've talked about this a bunch. And again, in the MLOps community we're very industry-focused: how can we make sure that we are productionizing this… So in a production scenario, you've got your HR chatbot that is using a RAG system, and you say "Alright, cool, we've updated the vacation policy." So we went from a European vacation policy to an American vacation policy. And you've got Daniel over here saying "Alright, HR chatbot… How many days of vacation do I have?" How do you make sure that everywhere in the vector database it's now updated to the American vacation policy?
So okay, cool - in the vector database maybe you say "You know what, we were able to scrub everything", or "We just pull from the most recent documents." But then you were a good engineer and you made sure to pull in a bunch of different data sources… So it turns out that you're grabbing some data from Slack, where people talk about how it's still the European vacation policy, and now Daniel's been quoted as having 30 days of vacation, when really he only has two.
[00:31:38.20] That's unfortunate. [laughter] Yeah, actually, this is a conversation we just had the other day with a customer… Because at least some of these databases - depending on what you go with; if you go with a plugin to an existing database, maybe there's more traditional updating and upserting functionality… But some of them are just "Put a document in, get it out, delete it." There has to be a layer of logic on top of these that actually helps you do some of that. So in their case, it was "Oh, we want to take in all the articles that we've had on this website", and that's going to be it. And then they're like "Well, what if we update those? Do we just blow everything away and redo it?" And my answer, with the amount that they had, was "Probably…" If you can have something running in the background, honestly, that's probably the safest thing for you; it's gonna take a couple of hours or something, but then at least you're making sure everything's synced up. In that case they could just version the files of the embedded database. But yeah, it's an interesting set of problems.
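The "something running in the background" described above can be sketched as a re-sync job that upserts changed documents and deletes stale ones. This is a toy under stated assumptions: the in-memory `vector_store` dict stands in for a real vector database, and `embed` is a placeholder for a real embedding model; the content-hash check is just a common trick to skip unchanged documents.

```python
import hashlib

def embed(text: str) -> list[float]:
    return [float(len(text))]  # placeholder embedding

vector_store: dict[str, dict] = {}  # doc_id -> {"hash", "embedding", "text"}

def sync(documents: dict[str, str]) -> list[str]:
    """Upsert new/changed docs, delete removed ones. Returns touched IDs."""
    changed = []
    for doc_id, text in documents.items():
        h = hashlib.sha256(text.encode()).hexdigest()
        if vector_store.get(doc_id, {}).get("hash") != h:
            # Content changed (or doc is new): re-embed and overwrite.
            vector_store[doc_id] = {"hash": h, "embedding": embed(text), "text": text}
            changed.append(doc_id)
    for doc_id in list(vector_store):
        if doc_id not in documents:  # stale doc: e.g. a replaced policy
            del vector_store[doc_id]
            changed.append(doc_id)
    return changed

sync({"policy": "European vacation policy: 30 days"})
print(sync({"policy": "American vacation policy: 2 weeks"}))
# → ['policy']  (re-embedded because the content hash changed)
```

Note this only covers one source; the Slack scenario in the conversation is exactly the failure mode where a second source never enters the `documents` snapshot and so never gets scrubbed.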
That is a fun one. And also, what I'd love to explore too is the idea of RBAC, or role-based access control… How are you seeing people go through that and do it well? Because that feels like another one that can be really misused.
For RAG it's one thing. For text-to-SQL, some of that can maybe be kind of nice, because if you're embedding some function in an application that already has RBAC on the database, then you could use that credential, and hopefully that carries through. But on the vector database side, we've interacted with people that have maybe an internal chat and an external chat, where the external chat should use a subset of the documents from your internal chat. In that case it's bifurcated rather easily, and that's somewhat easy to deal with, because you could just have two tables or two collections - whatever that is in the vector database - and merge the retrieval, or use them selectively in certain ways… But as soon as you have many, many different roles, or even user-specific things - in many vector databases, however you managed that would be transparent to the vector database. So you'd have to somehow manage the metadata associated with it.
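The "manage the metadata yourself" approach can be sketched as filtering on document metadata before ranking. The roles, documents, and word-overlap scoring function below are all illustrative; a real system would use embedding similarity and a real vector store's metadata filter.

```python
documents = [
    {"text": "public product FAQ",    "allowed_roles": {"external", "internal"}},
    {"text": "internal salary bands", "allowed_roles": {"internal"}},
]

def score(query: str, text: str) -> int:
    # Toy relevance score: words shared with the query. A real system
    # would compute embedding similarity here.
    return len(set(query.split()) & set(text.split()))

def retrieve(query: str, role: str, k: int = 3) -> list[str]:
    # Filter on metadata BEFORE ranking, so restricted documents can
    # never leak into the prompt for an unauthorized role.
    visible = [d for d in documents if role in d["allowed_roles"]]
    ranked = sorted(visible, key=lambda d: score(query, d["text"]), reverse=True)
    return [d["text"] for d in ranked[:k]]

print(retrieve("salary bands", role="external"))
# → ['public product FAQ']  (the salary doc is filtered out)
```

Filtering pre-retrieval rather than post-retrieval matters: filtering after ranking can silently return fewer than `k` documents, and a bug there fails open instead of closed.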
There are certain people we'll have to follow up with - Chris, we haven't had Immuta on for a while, but they're always thinking about role-based access to really sensitive and private data. I'm sure there are people doing advanced things, but in terms of the main tooling that people are just grabbing off the shelf, a lot of that logic is just absent.
Exactly. Yeah, I want to hear if anybody is doing RBAC and has figured it out. That's one thing I'm fascinated with. Because, again, going to the community that I run with, that's something that comes hand in hand with productionizing.
Yeah. And it could also have to do with the guardrails that you put around the large language model calls, because if it's a public-facing chat or something like that, you may want to filter out PII, or prompt injections may be a very important thing… Versus internally, ideally you trust people, as long as you know how the data is flowing. There might not be as many restrictions in terms of what can go in, or who's accessing things, and that sort of thing. But yeah, it's interesting.
So in addition to this sort of evaluation stuff - we've spent a lot of time talking about data and evaluation and retrieval… What about on the model side? Do you think we'll ever escape the world of transformers, Demetrios?
So this is something I've been thinking about a ton, man… And I've got some thoughts on this. Is everything that we're doing now in AI a band-aid, because transformers just aren't the right tool for the job? Have you guys thought this?
It's like one big workaround?
Yeah, exactly. Am I crazy to think that?
I don't think so. Actually, I was talking to one of our customers about this; they have so much logic around double-checking the outputs of models, or formatting the outputs of models… And I'm talking hundreds and hundreds of lines of code, thousands of lines of code, written all around this sort of workaround - and it's because they're using a general-purpose model that you sort of have to massage into how you want it to behave.
Is it a little bit ironic that you use RAG to clean up the problems with transformers? Is that what we're saying here?
What we need is the Lysol wipes… [laughter]
But I often wonder - are we having to over-engineer this because of the core of the problem? It's like we're trying to put a band-aid on something, instead of going and fixing the root of the problem. And right now, it feels like there's nothing out there that can even stand a chance against the transformer architecture. So of course, we can't say "Well, I would rather use XYZ…" But I just get the feeling that when we look back at AI in 2024, or the ChatGPT AI era, we're probably going to be laughing at the whole idea of transformers. If in 10 years we're looking back at that, it's gonna be "Yeah, okay… Transformers were great, but they were a stepping stone."
I know that there's quite a bit of research going on in general about different types of architectures. I know that there are a number of organizations that have been testing alternatives to transformers in the last couple of years. But I don't think anyone's gotten there. Or if they have, then you should reach out to us and let us know, so that we can be talking about it here on the podcast.
I think there are a lot of folks out there that are really wondering what's next. Because we're essentially taking one superset of architecture and we're doing everything we could possibly do with it. And every big step forward in the last few years has been around what else we can do with this architecture. So at some point - I agree with you, Demetrios - something's gonna give, and we've got to try some new approaches.
Yeah, that's what it feels like to me. It's just, "What's the next step?" And I would love to hear from whoever - if there's something out there that feels promising… That's really exciting to me; I don't know enough about that. That's very much the research community, which I don't get to spend a lot of time in… And I'm sure there are a bunch of false starts, where people get excited about something, and then after you throw a bunch of GPUs at it, it doesn't work out like we thought it would; or we saw promise, but it didn't hold up at scale. So I understand that right now we're in the era of transformers. I wonder how long we're gonna stay in this era.
Not only around specific architectures in that capacity, but almost new approaches. For the first time in a while, neuromorphic computing is really rising again as a topic of interest… And it's not there yet. You're talking about architectures, on both the hardware and the software side, that are not specific to transformers or even to the GPUs underlying them. But it's been interesting to see the maturity that's developing… You talked about exposure to research; even for me that's the same case - you have all the pure researchers out there, but now we're starting to see them branch out in lots of ways and try completely different approaches. And I'm pretty excited that we're going to start seeing some interesting results over the next few years, as people look for alternatives across both hardware and software architectures. I think we're pretty close to a turning point.
[00:42:19.25] Can you break down real fast - what was that big word you just used? Anthromorphic? What was that? I can't even say it. My tongue's tied.
Neuromorphic computing I think is what you're talking about.
Neuromorphic computing. That is a big word. What does that even mean? I don't know, I've gotta google that real fast…
[laughs] And I am the last person on the face of the Earth that should be trying to explain neuromorphic computing…
I've put you on the spot.
No worries. But having been exposed to that, the short version is almost like - you know, in the earlier days of AI, the marketing people would talk about "Oh, mimicking the neocortex, the human brain", and stuff like that. And with these GPU and transformer-based architectures, we're all kind of like "It's not really like the human brain." Well, the neuromorphic architecture actually is that. It's legitimately asking "How does the architecture of a brain work?" And I'm sure there are neuromorphic computing scientists out there listening to me now, going "Oh, my God… Somebody take his mic away. That's a terrible explanation." But in my fairly primitive understanding, that's kind of where it is: how do neurons really work in real life, and how do you do compute artificially in that way? But I know that there's definite interest in doing that. I know Daniel has a relationship with Intel through Prediction Guard, and I know Intel has an interest in that field. I think they're one of the leaders in it.
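To make "how do neurons really work" slightly more concrete: neuromorphic systems are typically built around spiking neurons rather than the dense matrix multiplies of transformers. Here's a minimal sketch - a toy leaky integrate-and-fire neuron, the textbook unit that neuromorphic chips implement in silicon; the parameter values are illustrative assumptions, not any vendor's actual model:

```python
def simulate_lif(inputs, threshold=1.0, leak=0.9):
    """Toy leaky integrate-and-fire neuron: each step it leaks a
    fraction of its membrane potential, adds the input current,
    and emits a spike (1) when the potential crosses threshold."""
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = potential * leak + current  # leaky integration
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # reset after spiking
        else:
            spikes.append(0)
    return spikes

# A steady weak input eventually accumulates into a spike;
# strong inputs spike much sooner.
print(simulate_lif([0.3, 0.3, 0.3, 0.3, 0.0, 0.9, 0.9]))
```

The key difference from GPU-style compute is that information is carried in the timing of discrete spikes, and neurons only consume energy when they fire - which is where the "brain-like" efficiency claims come from.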
I googled it… Intel's all over the first page. I used Perplexity, and it was all cited from Intel. That is very true.
I would hesitate to say it out loud, because I'm probably wrong, but they may very well be the global leader in that space right now.
Yeah, it makes sense. Well, that is awesome. I'm glad that you taught me about that. I appreciate you for teaching me… Neuromorphic. Now I can say it properly, and everything.
Well, you know what - now that you say that, we're gonna have to have a show on neuromorphic computing coming up pretty soon.
Yeah, exactly. Let's get down into it. I want to listen to that, for sure.
We'll dive into that. Daniel can reach out to his contacts there.
Well, thank you very much for coming on, as we wind up here… It's always a pleasure. Anyone who has been listening to the show for long knows that you join us regularly, and it's always special for us. We have a great time with you, so thanks for coming on today. We will put links to the survey, and some of the other topics that you brought up today, in the show notes, so people can join… And folks, if you haven't gotten into the MLOps Community podcast that Demetrios hosts, you definitely need to check that out. It's an awesome podcast, highly recommended by both myself and Daniel. So I hope people join you over there…
Oh, and can I also plug - we're gonna have an in-person conference, and I'm really excited about that; a little bit shaking in my boots… Because on June 25th it's going to be our first in-person conference ever, and it's going to be all about AI quality. And we've got some super-cool speakers coming. We managed to get the CTO of Cruise to come and talk about what they've done since their little mishap, in regards to making sure that their AI is quality… We've also - I mean, there's so many great people; you can go to AIQualityConference.com. And we'll throw the link in the show notes, too. I'm very excited for it… The speakers are going to be awesome, the attendees are going to be amazing…
I think what I'm most excited for, though, is that we're going to have all kinds of fun, random stuff. You can imagine it's going to be a conference, but it's probably going to be more like a festival. I may have people riding around on tricycles, giving out coffee, or… We'll have a little DJ area, or a jam-band breakout room… A bunch of Legos lying around… I don't know yet, so if anybody has any ideas on how we can make it absolutely unforgettable, I would love to hear about that, too.
And I'm gonna throw out one last plug for you… When you say that, I believe you, because - I know that you've heard me say this when we were off the air, but just in case anyone doesn't know this, Demetrios is the funniest guy in the entire AI world, and does hilarious things. If you don't follow him on social media, you are missing some really great, great content. So anyway, I just wanted to say that -
People should show up at the conference just to see what you're doing, if for no other reason; even aside from the cool content you have. They'll enjoy it. So thanks for coming back…
I mean, there's gonna be great speakers. You're gonna learn a ton. But there's also going to be some really random stuff that you're going to be like "What is going on here?", and hopefully you really enjoy it.
That's what I'm going for.
Okay. Well, thanks a lot, man. I'll talk to you next time.
Our transcripts are open source on GitHub. Improvements are welcome.