# Glauber Costa - Scaling Databases Every Possible Way
**Glauber:** [00:00:00] When we put Scylla out there, again as a result of the pivot, that was actually our marketing strategy. There were a couple of blogs from Netflix. Netflix was one of the heaviest users of Cassandra at the time, in which they were touting that they had managed to scale a Cassandra cluster to a million reads a second.
The whole cluster, the whole group of machines, could do a million reads a second. So you had a Cassandra cluster that could do one million queries a second, and we actually managed to get one single instance of Scylla doing a million reads a second.
**Lane:** I'm so excited that you reached out to me on Twitter. Obviously this is a new podcast, Backend Banter, so we talk about backend stuff, and the stuff you are doing on the edge with databases, your experience with Scylla... I just think we have a ton of really cool stuff to talk about.
**Glauber:** We have time. If not, we'll do another one.
**Lane:** Yeah, exactly. Okay, cool. Let's jump right into it. So maybe you could just introduce to our audience your [00:01:00] background, what you've done. I'm primarily interested in three things: ScyllaDB, contributions to the Linux kernel, and what you're doing now with Turso.
**Glauber:** That's it. And I've never been one to job hop too much. I've always been very passion oriented, very project oriented. So essentially I've done 10 years of the Linux kernel, 10 years of Scylla, and now hopefully 10 or even more years of Turso. So, the Linux kernel: as I said, that was around the time I started contributing to Linux.
That was around the time I was living in Brazil, so I was still there. That was in 2000-something. It wasn't really feasible back then (today, of course, it is) to open a company and get funded. Of course you could get funded from local funds, but you wouldn't get to the world.
Today, of course, things are much easier. Open source was my way out into the world. I was also very interested in open source at the time, and I started contributing to the Linux kernel on my own time, just for fun. And it was funny, because I was always a little bit of a contrarian, even back then. [00:02:00] In 1999 everybody was moving to other stuff, like web and mobile.
Mobile was not a thing yet, but it was becoming one. And I was like, I don't care about any of that shit. I just wanna do what I like: I like Linux, I like systems, and the kernel sounds interesting. I had a bunch of people telling me, you're never gonna find a job with that stuff, because who cares about that?
I said, I don't care. I just wanna do what I wanna do. So I started contributing to the kernel and got a remote job at Red Hat at the time. And I spent around 10 years doing this. It was mostly virtualization. So I worked a little bit, unfortunately, with the Xen hypervisor.
I say unfortunately because I truly don't like it. Although I made many friends in that community, and I hope they forgive me, it's a piece of shit. Then KVM came along, and I started contributing to KVM, which I liked a lot. And Red Hat eventually acquired the company behind KVM.
I became friends with their founders, and they went on to found another company. That was around 2012. [00:03:00] I was living in Russia at the time, just because...
**Lane:** You were living in Russia just because?
**Glauber:** Yeah, maybe a conversation for another day. But I spent around four and a half years in Russia.
And then Avi and Dor, the people behind KVM, started a new company, and I decided to join them. I was employee number three. And that's the company that became Scylla. The company did not start with Scylla; we started with a unikernel. If you're not familiar, that's a kernel designed to run a single application.
So we wrote our own kernel, because Linux is too boring by now; let's just write our own kernel in C++. And the idea of a unikernel is that it runs a single process, so you don't have to do memory management, that type of sharing, anything like that.
**Lane:** A single-purpose OS?
**Glauber:** Yeah, single purpose.
It's not single purpose, because it can run general-purpose applications, but it cannot run more than one application at a time. And keep in mind, containers were not a thing back then, so this sounds ridiculous now, right? People were doing virtualization with Linux on top of Linux: we essentially had the Linux [00:04:00] hypervisor in KVM with a full-blown Linux kernel on top.
It was a very heavy way to do virtualization. So the idea was: instead of putting Linux as a guest OS, we just put this library OS. And it didn't work. Funny enough, there are many companies trying to do this again today, and it might just work. One of the things that I think changed the landscape is WebAssembly, because the trick back then was that you had to change your programs to make it work.
It was POSIX compatible, but not really, because fork wouldn't work, exec wouldn't work. Yeah.
**Lane:** You had to change your code and stuff, right? Yeah.
**Glauber:** So WebAssembly might be the thing that makes this work, but at the time it didn't, and we eventually pivoted into Scylla. I spent around eight years doing that as well.
Then I had a COVID-driven stint at Datadog for a year, and eventually, now, Turso.
**Lane:** Amazing. Okay, cool. I want to recap really quick. So you got into open source with Linux. I talk to a lot of people that have gotten into open source, but I don't get the chance to talk to a lot of people that have [00:05:00] actually been big-time contributors to the Linux kernel. And you mentioned that, what was it, 20 years ago?
The advice was: why are you working on this open source thing, it's not going to make you money. My guess is that in hindsight it actually was one of the best things you could do?
**Glauber:** Oh, absolutely. Because open source is a thing today, especially in infrastructure. Nobody today would use a database that is not open source; you would not even consider it. But that wasn't the case back then.
You have to imagine that in 1999, the first time I touched Linux, open source was not proven. So the advice that I got from a lot of people, my university professors and so on... people still viewed it as: open source is free.
So that means no money, right? There's no money coming to you; you should be able to sell your work. And even if I would [00:06:00] say, hey, there are companies starting to do open source, Red Hat is starting to do open source now, et cetera. Red Hat would sell those CDs that you could get to install Linux.
And they were a company. I think Red Hat changed the game a lot, and I was so blessed to have worked at Red Hat at the time. But it was really this unproven thing. And on top of that, with open source it's like: you wanna be working on the bleeding edge of technology to make a living, in a sense, so why would you be working on an operating system? That's old news. There's Windows, the old Unixes, IBM has a bunch of operating systems. But I think what those people did not realize...
And I'm not saying those people were business savvy or anything; I didn't have access to anybody like that, it was mostly university folks. What they didn't realize is that innovation doesn't have to come just from technology. It can come from different ways of consuming technology. So when things change (and by the way, this is one of the things changing a lot with the edge), [00:07:00] you can go back 20 years and say: what were they doing 20 years ago that hasn't made sense in the 20 years since, but now makes sense again?
And this is a great way to do innovation, right? Just looking at the world and saying: did anything change in the world that can allow me to do old things in a different way? That's also innovation. So Linux was innovation in that sense, because it was a completely different way of doing business than what came before, and it allowed for so many other things. And obviously, for the folks in your audience today, who are 20, 25, something like that, open source is second nature. But I saw the birth of Git, because Git, as you probably know, was created by Linus Torvalds because of a fight we had in the Linux kernel.
And the quick story is that the Linux kernel used a proprietary versioning system at the time called BitKeeper. None of the open source versioning systems worked for a project like Linux that was so distributed. There were only a couple of projects here and there that had this need.
[00:08:00] And there was a company serving that, called BitKeeper. We could use it for free, but the agreement (I wasn't privy to the actual agreement) was that you can't reverse engineer it. But dude, Linux is not a company.
It's just a bunch of people doing whatever the fuck they
**Lane:** It's like a mailing list of people, yeah.
**Glauber:** So Linus did not reverse engineer it, but this dude did, a friend of mine actually. Well, he wasn't my friend back then; we became friends much later. And then the guy from BitKeeper said we can't use it anymore.
And for a couple of weeks, everybody thought: oh, what's gonna happen with Linux? Because Linux now doesn't have a versioning system. And then Linus just disappears for three weeks and comes back with Git, right? And today, what that allowed is this whole thing about writing distributed software; that's what became GitHub eventually. GitHub would not have been possible without that.
And again, it's a versioning system. You'd think: can innovation come from a versioning system? It can, because if you have an open source versioning system, [00:09:00] you can now enable a social innovation in the way people write code. Innovation is funny. It's tricky.
It doesn't have to be a technology that has never been tried before. So Linux, for me, was very pivotal in that sense, because it taught me all of that. It gave me the opportunity (Red Hat at the time was hiring a lot from the Linux community) to start in international companies, companies that were doing things for the whole world instead of the local market, which is what I wanted back then.
And from there on, I think everything that I've done so far in a sense derives from the fact that I did Linux when I was 17, even though everybody told me it was a terrible idea.
**Lane:** The way I look at it, contributing to open source is like a job on your resume that you don't have to apply for. In my experience, hiring managers look at significant contributions to open source [00:10:00] like they look at a job. If you're getting pull requests reviewed and merged, that's a strong signal to a hiring manager that you are doing good work.
**Glauber:** Yeah. Again, pull requests didn't exist back then; it was emails on a mailing list, just as you said. But absolutely. And another thing is, with open source it also depends on the project. And again, I didn't do this on purpose, I swear.
I started on Linux, and on the contrary, everybody told me it was a terrible idea. But there are lots of other open source projects that people follow their passions for. For example GIMP, the image manipulation program, et cetera. It's cool, and you can use that to showcase your skills. And I did a bunch of contributions; at the time I was very into open source.
I've done a lot of small contributions to other projects that didn't help me much, if at all. Because the thing is that Linux became this central thing in the economy, right? So now you're not only contributing to open source, you're contributing to a project that is becoming the backbone of the economy.
**Lane:** [00:11:00] Billions of dollars depend on this thing. Yeah.
**Glauber:** So my advice on that for folks in the audience, especially the folks that want to do this as a strategy (I'm just a lucky guy at the end of the day), is: not only contribute to open source, but try to find a project that is relevant to the stuff that you're doing.
Because the GIMP example that I used could very well be relevant if what you wanna do has something to do with image processing. Obviously it's a much smaller market than Linux, right? But you find an open source project that is very relevant to the things that you wanna do, and you start contributing to it.
And if you become a person with significant contributions, even better, because now you're a part of that economic engine.
**Lane:** Yeah. No, that makes a ton of sense. I love that advice. Cool. Okay, so after Linux, you went to work for, I can't remember the name of the company, but you said the product became Scylla. What was the name of the company?
**Glauber:** Oh, the company that became Scylla eventually, because we ended up renaming it. It was called Cloudius Systems, and we were working on the OSv kernel. Just [00:12:00] a...
**Lane:** Now, Scylla. The only thing I know about Scylla is that Discord switched to Scylla recently.
**Glauber:** Yeah. And it's amazing, because when I joined what became Scylla, for the first two years I really just did engineering, writing the database. But then later (I like talking so much, as you probably noticed) I started working with customers as soon as we started having customers.
When Discord first approached us, I started working with them; they had lots of problems back then. And it was years and years. So again, I left Scylla in 2020; the Discord thing probably started happening around 2018, 2019, when they first got in touch, and the blog was only published in 2023.
So just so you understand how, in this world of infrastructure, those cycles are very long. They were using it back then, and it just kept growing and growing. But the idea of Scylla is the following: Scylla is an eventually [00:13:00] consistent database that is API compatible with Apache Cassandra.
I'm gonna guess that some folks in your audience know Apache Cassandra, but for the ones who don't: it's a NoSQL database, so it doesn't have the same guarantees that a SQL database will give you. It is designed to ingest lots of data, so it was always about write-heavy workloads, and it's designed to scale horizontally.
By doing this, it's really hard to guarantee SQL levels of consistency, so you become an eventually consistent database. You can write your data there, but when you read, you may not read the stuff that you just wrote. You don't read garbage, but you may read data that is stale. And back then you had no transactional guarantees; today you have some.
If you're writing, for example, to two keys, you may see an update to one key, but not to the other. It sounds like it sucks, but the thing is, this thing is so fast and can scale so well that for some workloads, like [00:14:00] the stuff that Discord has, you're willing to make the trade.
You say: I'm gonna complicate my application, because I'm okay with dealing with eventual consistency, and because I need the level of scalability and speed that this thing is giving me. Cassandra was good from a distributed-systems point of view, but with our experience coming from the Linux kernel (the founders were also people from the kernel), we understood systems programming much better than those folks did.
So: if we just copy the distributed model, but re-implement the node architecture from the ground up, taking into account asynchronous I/O, doing kernel bypass whenever necessary, doing direct I/O, and a bunch of stuff that nobody else was doing, how much faster can we become?
And Scylla was around 10 times faster than Cassandra in a variety of workloads. So that's my background: performance nerd sniping, pretty much.
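The two-key anomaly Glauber describes (a reader may see one of your two writes but not the other, stale rather than garbage) can be sketched with a toy replicated store. This is a hypothetical illustration, not Scylla's or Cassandra's actual replication protocol; the class and method names are made up:

```python
class EventualStore:
    """Toy eventually consistent store: a write is acknowledged once one
    replica has it, and reaches the other replicas only when replicate()
    runs (standing in for background gossip / anti-entropy)."""

    def __init__(self, n_replicas=3):
        self.replicas = [{} for _ in range(n_replicas)]

    def write(self, key, value):
        # The coordinator acks after a single replica accepts the write.
        self.replicas[0][key] = value

    def read(self, key, replica=0):
        # A read routed to a lagging replica returns stale (or missing)
        # data: an older committed state, never garbage.
        return self.replicas[replica].get(key)

    def replicate(self):
        # Background propagation: bring every replica up to date.
        for r in self.replicas[1:]:
            r.update(self.replicas[0])

db = EventualStore()
db.write("user:1:name", "Ada")   # two keys written "together" ...
db.write("user:1:plan", "pro")

# ... but a reader on replica 1 may see neither update yet:
stale = db.read("user:1:plan", replica=1)   # None: write not there yet
db.replicate()
fresh = db.read("user:1:plan", replica=1)   # "pro" after propagation
```

The application-level "complication" Glauber mentions is exactly handling the window in which `stale` is still the old value.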
**Lane:** I love it. Okay, cool. I wanna break some of that down, because some of our listeners will have grokked a good portion of that, and to others it'll be like, [00:15:00] you lost me in the second sentence. Okay, let's break down a comparison between, say, Scylla and Postgres. For the edification of our listeners: Boot.dev at the moment runs on a monolithic Postgres database, right?
We have pretty small scale, we don't have crazy requirements, and it's a very basic CRUD application for the most part. You do an exercise, we log a row in our database to say you completed an exercise, et cetera. Why would someone use something like Scylla over Postgres?
What does the application need to require in order to make that kind of a shift?
**Glauber:** Yeah, first: don't do it. You specifically, because you don't need it. When interest rates were zero, everybody always went for the crazier architecture first. It was always like that: let's do a microservice-oriented architecture with a bunch of message passing. Do a monolith.
It works for almost everybody. But then you start seeing the requirements change. [00:16:00] You know, how much data do you have? I'm gonna guess here, I'm gonna go out on a limb: 10, 20, 30 gigabytes of data total about the stuff your students do. Did I get that right? So it's more or less ballpark.
**Lane:** Right order of magnitude, yeah.
**Glauber:** Yeah, that's right. Lots of our customers at Scylla had a petabyte of data.
And that's the first difference, because 30 gigabytes of data fits in a box today; a petabyte does not. You can buy commodity hardware from AWS, and when I stopped keeping track, the largest box on AWS in terms of storage had something like 30 terabytes.
Once you go past that point, it doesn't fit in a box anymore, so you already need the complication of a distributed database. So think about what kind of things would lead you to need,
**Lane:** and when we say
**Glauber:** a petabyte of data.
**Lane:** And when we say distributed, we're talking about: okay, I've got one physical machine with 15 terabytes, but now I have 20 terabytes of data, so I need a [00:17:00] second physical machine, and we're gonna keep them in sync somehow.
**Glauber:** That's about it, right? And if you do that yourself, you can shard your data statically and figure out, say, two Postgres instances: some students go to one, some students go to the other. But for companies that have a petabyte of data, it tends to be because their data keeps growing, right?
So you need a distributed database that will just do all of this for you. It will figure out which data goes on which node, and when you run your query, it will figure out: okay, where is that data? And there's a variety of algorithms to do that.
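The "which data goes on which node" decision can be sketched with the simplest of those algorithms, hash partitioning. This is an illustrative sketch, not what Scylla actually does (the node names and the choice of MD5 are arbitrary): Cassandra-family systems use consistent hashing on a token ring so that adding a node moves only a fraction of the keys, but the routing idea is the same.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]

def owner(key: str, nodes=NODES) -> str:
    """Hash the partition key and map the hash to a node. Every client
    (or coordinator) computes the same answer, so a query for a given
    key always lands on the node that holds it."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return nodes[h % len(nodes)]

# The same key always routes to the same node, and many different keys
# spread across the whole cluster:
node_for_student = owner("student:42")
spread = {owner(f"student:{i}") for i in range(200)}
```

The weakness of plain modulo hashing (and the reason for token rings) is that changing `len(nodes)` reshuffles almost every key.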
So that's the first category. The second category is something like: how many requests a second does this database of yours get in your project? And again, I'm gonna guess it's gonna be a thousand requests a second at most, when a lot of people are submitting stuff.
**Lane:** Way less, but yes. Yeah, we're not concerned about it at this point.
**Glauber:** Absolutely. And when we put Scylla out there, again as a result of the pivot, that was actually our marketing strategy. There were a couple of blogs from Netflix. Netflix was one [00:18:00] of the heaviest users of Cassandra at the time, in which they were touting that they had managed to scale a Cassandra cluster to a million reads a second.
The whole cluster, the whole group of machines, could do a million reads a second. So you had a Cassandra cluster that could do one million queries a second, and we actually managed to get one single instance of Scylla doing a million reads a second. So that was our marketing material, right?
Which backfired a little bit, because we presented that and everybody thought: oh, but the good thing is that Cassandra is a distributed system, and yours is not. A million requests a...
**Lane:** You only do it on one node. Oh, no.
**Glauber:** Then: but what if I need more? And I said, dude.
But of course, that was the first thing, and obviously later it became clear: okay, if we can do a million requests a second on a node, then I can do 10 million, a billion, whatever, by having more nodes. So those giant Cassandra clusters with a thousand nodes, right?
And we could cut that down to a much lower [00:19:00] number of nodes. So those are the things that will make you look into a database like that. Then you think about Discord, for example. Discord uses it, as per their blog, for their messaging cluster. So just imagine how many messages are being written and read on Discord all the time by all the people.
And that is a lot of messages, and growing, because you keep
**Lane:** Yeah. Adding more users. Yeah.
**Glauber:** adding more users and more messages, and a really big amount of traffic coming to this database.
**Lane:** Yeah. Okay, cool. So you've got this crazy distributed database. You can add n number of nodes, right? You can have tons of nodes in this distributed database. And you mentioned that Discord was using Apache Cassandra, which is a very similar type of NoSQL database. In fact, my understanding is that you
**Glauber:** Are API compatible, yeah. You use the same drivers. Yeah, exactly.
**Lane:** Yeah. You write the same application code, basically, to interact with these databases. The first thing that stuck out to me when I was doing some research on this was that Apache Cassandra is written in [00:20:00] Java, and I have a lot of experience with Elasticsearch, so I'm very familiar with this idea of stop-the-world garbage collection.
**Lane:** Is that one of the key optimizations of Scylla, that it's not using a garbage-collected language?
**Glauber:** We always said the choice of language was key, but the garbage collection stuff was not the most important thing. Using a language like C++ at the time allowed us to do a lot of optimizations by being closer to the metal.
And a lot of it, frankly, was just the fact that we knew how to do that kind of stuff, right? But it was a lot of things. The other thing was the asynchronous I/O, because Cassandra at the time was a database based on memory-mapping files. And everybody actually thought it was a great idea.
All of those databases: MongoDB was the same thing, and Cassandra was the same thing. They just had this amazing idea: let's keep it simple, memory-map the file, and the kernel takes care of it, et cetera. And we, in the kernel, were seeing that and said: what the fuck are you doing? [00:21:00] This makes no sense for a database.
**Lane:** Yeah. Yeah.
**Glauber:** And actually, one of my most-read articles is exactly about this. I'll send you the link later; you can publish it to your audience. Like I said, just don't do it. If you're writing a database, don't do it. It's stupid, don't do it.
And it made some sense, I guess, back then. But the thing is that the hardware was changing so much that with NVMes it really didn't make sense to do it anymore. And Scylla was coming up at around the same time NVMes were becoming dominant. So it was a lot of things.
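The difference between the two styles can be shown in miniature. This is a simplified illustration, not ScyllaDB code: both paths read the same bytes, but with mmap the kernel decides when I/O happens (a cache miss becomes a page fault that blocks the thread), while an explicit positioned read is a request the database issues deliberately, which is what makes asynchronous I/O and O_DIRECT (not shown here) possible.

```python
import mmap
import os
import tempfile

# Write some fake "database pages" to a file.
fd, path = tempfile.mkstemp()
os.write(fd, b"page0data" * 1000)
os.close(fd)

# Style 1: memory-map the file and let the kernel page data in on
# demand. Simple, but the database has no way to say "start this read
# and do other work meanwhile"; a miss stalls the thread.
with open(path, "rb") as fh:
    m = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
    via_mmap = m[0:9]
    m.close()

# Style 2: explicit positioned read. The database chooses when and what
# to read, can size and manage its own cache, and can turn this call
# into an asynchronous request (io_uring, AIO) under load.
f = os.open(path, os.O_RDONLY)
via_pread = os.pread(f, 9, 0)
os.close(f)
os.unlink(path)
```

Both variables end up holding the same nine bytes; the difference is who controls the timing of the I/O.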
It was the fact that we could utilize the CPU better. You would have a Cassandra node using 20% CPU, and it wouldn't do more than that, because you had a bunch of locks and a bunch of waiting. If you looked at a Scylla box under load, it was always at a hundred percent CPU, cranking traffic, et cetera.
So we knew how to utilize
**Lane:** That's how you know it's a good program: you've maxed out your CPU and your RAM and everything's just cranking. Yeah.
**Glauber:** And the other thing that's funny is that people started saying, [00:22:00] oh, but you are cheating because of this and that. And I'd say: no, I'm not cheating, you just don't know what you're talking about.
It takes education to get to that point. You live with those technologies and those concepts for 10 years, and in that case, the patterns of utilization with spinning disks and NVMe were so different that you had to throw your intuition in the trash when looking at an NVMe. An NVMe is not just a faster spinning disk; it's a different piece of hardware.
**Lane:** Speaking of what we're talking about, can you explain what an NVMe is versus a magnetic spinning disk?
**Glauber:** Yeah. A magnetic spinning disk is something that, hopefully, these days we'll only find in a museum. But lots of people still have this intuition today. You may think, for example, that reading sequential data is faster.
This is one of the things that our industry keeps repeating. So let's...
**Lane:** three years ago, I had a manager tell me, oh, we need to optimize this for sequential reads. And I was like, I don't think that's a [00:23:00] thing anymore.
**Glauber:** Not a thing anymore, exactly. You can optimize for reading larger buffers; that's a thing. But it also maxes out at a couple of megabytes, so it makes sense to put some data together on the megabyte boundary, so you can have larger buffers, because you're doing less work per buffer.
But it doesn't make sense to lay out a terabyte file sequentially anymore, because you don't need to do a sequential scan. That's bullshit people keep repeating, and I don't blame them, because it was the case for 30 years. When you read from a spinning disk, you have the seek time: you have to first find the position, and that's very expensive. That was 300 milliseconds at some point, right?
Then, after your spindle is in the right place, it's faster to scan. NVMes are just not like that at all. You can read from anywhere. You don't have initialization time, you don't have seek time, you don't have anything like that.
And again, on a spinning disk, because of the spindle, you can do one request at a time, and if you have a RAID array, you can do three, four, five, however many [00:24:00] requests at the same time. On an NVMe, you can do a thousand concurrent requests, right?
So that program of yours that had a semaphore to make sure that not too many requests go to the media at the same time: that semaphore that protected you from overload is now preventing you from using your CPU, because your CPU is generating data and your disk could take it. Obviously this is a simplistic example, because you could just make the semaphore bigger.
But when you go look at the architecture of these systems, you have stuff like that all over the place, right?
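Glauber's semaphore example can be made concrete. In this sketch (a hypothetical program, with `time.sleep` standing in for a 10 ms device request), a concurrency limit tuned for one spindle serializes 32 "reads", while a limit sized for an NVMe-style queue depth lets the same work overlap:

```python
import threading
import time

def run_with_limit(n_tasks: int, max_in_flight: int) -> float:
    """Run n_tasks fake 10 ms 'disk reads' with at most max_in_flight
    of them outstanding at once; return the wall-clock time taken."""
    sem = threading.Semaphore(max_in_flight)

    def io_task():
        with sem:              # the gatekeeper from the example
            time.sleep(0.01)   # stand-in for one device request

    threads = [threading.Thread(target=io_task) for _ in range(n_tasks)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start

# A spinning-disk-era limit of 1 serializes everything; a limit sized
# for a deeper device queue finishes the same work much sooner.
serialized = run_with_limit(32, max_in_flight=1)
overlapped = run_with_limit(32, max_in_flight=32)
```

The "fix" Glauber jokes about, just making the semaphore bigger, is literally the second call; his real point is that limits like this hide all over code written for the spinning-disk era.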
**Lane:** Yeah. Okay, cool. And just to be absolutely clear, when we say NVMe, is that the same thing as SSD? Interchangeable words?
**Glauber:** They're not interchangeable words, but from that point of view, pretty much. The transport technology is different, but from the point of view of parallelism and concurrency, the same concepts apply.
**Lane:** if I squint, it's like the same.
**Glauber:** Yeah. Yeah.
**Lane:** Cool. Okay. Awesome. Okay, [00:25:00] so you worked at Scylla 10 years?
**Glauber:** Around eight years. Yeah.
**Lane:** Yeah, decade-ish chunks of time. After Scylla, you are now the CEO and founder, or co-founder, is it?
**Glauber:** Unfortunately, co-founder, because I have a co-founder with whom I have a relationship of love and hate, which is 95% hate and 5% love. But the love is so strong. The love is so strong that I'm always joking about him, and I love that guy with all my heart. And I met Pekka when he was a maintainer of part of the memory management subsystem; it was so big that there were many maintainers for sub-parts.
In particular, he was maintaining one of the object allocators. An object allocator is essentially a memory allocator that allocates objects for you, not arbitrary memory sizes. And I met him in 2009 or 2010, something like that, at a conference we attended together.
We became instant friends. He joined Cloudius after me, [00:26:00] again to work on this kernel that we were doing, and we've been working together since then. So it's been 12 years of Pekka, my co-founder. We are co-founders together.
**Lane:** Awesome. Okay. So with Scylla: my listeners really only need to know about Scylla and how it works if they go to work at a very large company like Discord, or if they work at a very successful startup and start having crazy
**Glauber:** scaling problems. Yeah.
**Lane:** Now, with Turso.
**Glauber:** Oh, not necessarily, just to correct that. Not necessarily. Because if you're doing something like machine learning, then you may have those problems from day one. It really depends on the problem. If you're doing user tracking and stuff like that, you may not reach a petabyte of data, but you may still need that architecture, because it's more of a characteristic of your problem than of your scale.
The scale obviously matters to some extent, because at very small scales you can do anything anywhere. But the problem needs to be taken into account as well.
**Lane:** Yeah, that makes sense. So depending on your domain, [00:27:00] but at the end of the day, it's basically scale of data that we're concerned with. Okay. So Scylla is at one end of that spectrum; Turso, my understanding is, is at a different end, right?
**Glauber:** The absolute other end of the spectrum. Yeah.
**Lane:** So again, like we can relate this the problem that torso's solving again to what I'm doing on boot dev with Postgres. Like I have this monolithic Postgres database and it sits in some cloud center in Salt Lake City at the moment. And anyone who accesses like the boot dev application, their computer connects to my server, which is also in Salt Lake, and that server connects to the database, which is right next door to it, right?
And has really fast latency. My server communicates very quickly with my database, but my users can be quite far away from my server, right? Someone browsing Boot.dev in India is going to have to make requests all the way across the world in order to interact with my website.
So tell me how Turso deals with this issue.
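*A rough sketch of the latency arithmetic Lane is describing. All round-trip times below are invented for illustration, not measurements:*

```python
# Toy latency model. Every number is an assumption for illustration.
RTT_MS = {
    ("india", "salt_lake"): 250,    # user in India -> server in Salt Lake City
    ("salt_lake", "salt_lake"): 1,  # server -> database in the same data center
}

def page_load_ms(user_to_server, server_to_db, db_queries=1):
    """Total time: one user->server round trip plus db_queries server->db round trips."""
    return RTT_MS[user_to_server] + db_queries * RTT_MS[server_to_db]

# Monolith: the user's distance to the server dominates the total.
total = page_load_ms(("india", "salt_lake"), ("salt_lake", "salt_lake"))
print(total)  # 251
```

*The server-to-database hop is nearly free; the geography is the cost.*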
**Glauber:** Yeah, so first of all, Turso only makes sense in a world in which [00:28:00] people are already looking at the problem of, okay, how can I serve those users in India better? And in all fairness, people have been looking at this problem for a long time, and the usual solution is that the static parts of your application go to a CDN, right?
The user in your example is in India. Let's go further, man. Let's make it Australia or Japan, right? Really the other side of the world. So you have this user in Japan, and the user will go to your website, and hopefully your static assets will come
**Lane:** Yeah. Our front end is on a CDN. Exactly.
**Glauber:** Right. So this is already the way the world has worked for a while.
**Glauber:** Yeah, everything that doesn't change is there. What we've been seeing in the past couple of years, though, is that this is starting to fall short, because technology is always like that, right?
Technology doesn't exist in a vacuum. You solve one problem, and that problem [00:29:00] leads to the next problem, and so on and so forth. So Turso wouldn't make sense five years ago, because the use cases were not calling for this. But what started to happen is that now you want to add some personalization to those static assets that you just served.
You want to display something, you want to do some transformation. There's something that you might want to do to serve those users in Japan a little bit better.
**Lane:** I want your user profile to show up a little quicker.
**Glauber:** Let's not even go into the user profile. Really, just things that you can do with code and no data, right? You just wanna do some rendering, some specific rendering. There's
**Lane:** I wanna show the dollar value in like your local currency.
**Glauber:** Exactly, something like that. You could go with a static table, et cetera, but you want to do some transformation, some personalization based on that. So now it would be fantastic: anything that's server side. Think about something that is server side in your application but doesn't touch the database. You want to do that for that user before you load that page. So you would love to execute that code [00:30:00] as close as possible to that user in Japan. Now, it doesn't make sense for you to have a bunch of AWS servers in a bunch of AWS regions all over the world.
Again, that's why the problem was already there, but this wouldn't have happened five years ago, because that just doesn't make sense. What changed the game a little bit is that the CDNs themselves became programmable, with something like Cloudflare Workers. Now, okay, you don't have to worry about the server stuff.
You don't have to worry about having a complex, crazy architecture that doesn't make sense for your startup just to deploy those servers. Give me a function and I will execute that function as close as possible to my user, right? And a lot of the WebAssembly platforms are starting to look in this direction too, for things like Python and Go, et cetera.
So let me execute a serverless function close to your user, and it's super simple, because, again, I know where your user is coming from, because I have the CDN, or the edge network, as you may want to call it. Now that I've added code capabilities to my [00:31:00] edge network, I can execute that code, I can do some personalization, and voilà, your Japanese user is super happy.
**Glauber:** But when you wanna serve the user profile, the part that touches the database, you're still coming to Salt Lake City. So that's the thing, and that's what we're trying to solve with Turso. With Turso, we wanna make sure that your data is also everywhere, so that now you can really have your user profile, your personalization code, the results of your A/B testing, uh, your LLM models, uh, your embeddings, and all of this stuff.
We're not gonna be the database that cranks the training model, but we can be the database where you're storing embeddings. In Japan, and in Salt Lake City, and in India, and in Paris, and in Brazil, and in Argentina, all over. So again, not very good for writes, because distributing those writes all across the planet is expensive.
Unlike Scylla, for example. Scylla was all about millions of writes a second.[00:32:00] With this, you can do maybe 10,000 writes a second, right?
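*Storing embeddings in a SQL database, as Glauber mentions, can be as simple as packing vectors into blobs. This is a generic sketch using Python's standard library, not Turso's API; the table name and helpers are made up:*

```python
import sqlite3
import struct

def pack(vec):
    """Serialize a list of floats into a float32 blob."""
    return struct.pack(f"{len(vec)}f", *vec)

def unpack(blob):
    """Deserialize a float32 blob back into a list of floats."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE embeddings (doc_id TEXT PRIMARY KEY, vec BLOB)")
conn.execute("INSERT INTO embeddings VALUES (?, ?)", ("doc-1", pack([0.25, -1.0, 3.5])))

blob, = conn.execute("SELECT vec FROM embeddings WHERE doc_id = 'doc-1'").fetchone()
print(unpack(blob))  # [0.25, -1.0, 3.5]
```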
**Lane:** Your writes blow up, right? You go from one write to a hundred or
**Glauber:** Pretty much. But still, a thousand, 10,000 writes a second covers a lot of applications. And now you can read from all of those locations, and hopefully you can scale those locations independently, so you can serve more reads from here if you're having more traffic from here, and from there if you're having more traffic from there. You can essentially make sure that those things happen independently.
So that's the story of Turso, that's the value that we provide. You're absolutely right: it's a completely different problem than the problem that we had at Scylla, which called, of course, for a completely different solution. But we saw the edge becoming more and more important, and functions and geographical distribution, and we figured that we could bring a solution to that.
**Lane:** Which, I think this is one of the things that best differentiates a good engineer, or especially a good backend engineer, from a bad one or an inexperienced one: not [00:33:00] thinking that a piece of technology is just the best because it's new and it's hyped up. Like, "we need Scylla for this."
Yeah, go ahead.
**Glauber:** On Twitter the other day, because our database is based on SQLite, right? And I can talk more about why that's the case. Somebody said, oh, are you trying to tell me SQLite is better than MySQL? And I said exactly this. I said, we have one piece of technology in our industry that is objectively better than anything else, which is Vim.
**Lane:** Objectively better than Neovim?
**Glauber:** Oh, you have to understand that when I started using Vim, Vim did not exist. It was just vi, right, over SSH. So for me, they're all a family, right?
**Lane:** And that family is better than e
**Glauber:** That family is better than anything else.
Like VS Code? VS Code is essentially "you versus code." It's in the name.
**Lane:** You versus your code, you have to fight your
**Glauber:** Yeah, that's right. All the [00:34:00] rest, all the rest, right? There's no such thing as better. Well, there is better, but before you talk about better, you have to select which metric matters the most.
Once we're talking about the same thing, then I can talk about better or worse. Because people keep saying, oh, but this database is faster than that database. But no database is faster than every other database. They're faster at different things, usually. And then they cost different amounts, and they cost different amounts in different workloads.
So you can't really talk about better before you define an axis and say, okay, now let's compare on that. Right?
**Lane:** And they have different costs in different metrics as well. You can have a complexity cost versus a licensing cost. It's an explosion of complexity, which I love. That's why we're all paid to do what we do.
**Glauber:** I'm not paid to do what I do. I pay other people to do what they do. Like I'm a founder now, so like a life of a founder man, life of a founder is hard.
**Lane:** don't make money as, as
**Glauber:** We don't make money. We're hoping to make [00:35:00] money all at once in 10 years. But yeah, someday,
**Lane:** I'm right there with you. Okay. So I explored the world of serverless.
I think too early, and now I need to go re-explore it. This was back in 2017, 2018. I was playing around with Lambdas. They were fairly new on AWS, and they were exactly what you described.
Basically, I can give AWS some function and they'll execute it, and it scales infinitely, and all these really nice things. But immediately, of course, the problem that I ran into was that this is silly for a couple of reasons. First of all, cold starts. What's the point of scaling to zero if, once I scale to zero, my application sucks? It takes five seconds to start up.
All of a sudden my web application feels awful when I'm demoing it to my one customer. And then the other problem, of course, is the one that Turso is working on, which is that my database is monolithic anyway. I'm putting all these Lambda functions in front of a Postgres database that's sitting in Salt Lake City.
So there's really no point. Okay, so what has shifted? You mentioned there's been this [00:36:00] technological shift. What has made this a better idea now than it was, say, five or ten years ago?
**Glauber:** So people still complain a lot about the cold start problem, but I do think it's a lot better now than it was before. By the way, we have a bunch of ideas that we haven't implemented yet, just out of time, on how to, even at the database level, be able to scale to zero with unnoticeable cold starts, and we mostly keep that isolated from users, right?
So there are more technologies to handle these kinds of things today. And again, I don't wanna minimize the problem of cold starts. I do think it's still one of the biggest problems with serverless. But keep in mind that it's also a very similar problem to what people complain about with just-in-time compilation, right?
So lots of Java applications had just-in-time compilation, and if you benchmark them for five minutes, those applications suck. But after that, they don't, so that's fine. And it sucks if you're showing demos that don't last five minutes. But when you [00:37:00] have a long-running thing, it really doesn't matter. So the main advantage of serverless in that case is not that it can scale to zero, it's that it can scale from one to two to three to four independently.
Which is why I think it will still take a little while before you completely solve scale to zero, because the biggest value is still in what comes after that. And a couple of things that shifted: I just think that the technology got more mature, it got more available, it got deployed in more places, and there's been a little bit of market education as well.
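*The cold-start behavior being discussed can be mimicked with lazy initialization: the first request pays a one-time setup cost, warm requests reuse it. A toy model, with made-up names:*

```python
# Toy cold-start model: expensive setup happens once, on the first request.
_pool = None          # stand-in for DB connections, JIT warm-up, loaded code, etc.
init_count = 0        # how many times we paid the "cold start" cost

def handler(request):
    global _pool, init_count
    if _pool is None:            # cold start: initialize on first invocation
        init_count += 1
        _pool = {"ready": True}
    return f"handled {request}"  # warm path: reuse the initialized state

handler("req-1")   # cold
handler("req-2")   # warm
print(init_count)  # 1
```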
Lambda is not the only game in town. You have things like Cloudflare Workers, and Netlify, for the TypeScript folks, also has edge functions. And I also think the geographical thing is a factor, because again, you were using Lambda, and Lambda is not that transparent, as far as I understand.
I may be wrong on that, just a disclaimer: I'm not a heavy user of Lambda. I know of Lambda from reading and et cetera, and those things obviously change very fast. But my understanding of Lambda is that Lambda focuses a lot more on the function [00:38:00] execution side of things.
And a lot of those edge platforms today focus on the fact that your application will run across the planet transparently. So it might look like a minor difference, especially from a technology point of view, but it can be a big difference from the business point of view.
Cause now we can use that infrastructure to serve your users wherever they
**Lane:** You're saying it's less about the scaling and more about the geographic distribution.
**Glauber:** Yes. And again, one of the things that I love about the edge, and once more, you could have done this with AWS 10 years ago by having machines in all of those regions. It's also one of the things I love with Turso: you can spin up read replicas.
We keep hearing this, which for me is reminiscent of Scylla. When we started publishing the first benchmarks of Scylla, everybody was like, oh, you don't understand the problem. I said, come on, of course I do. "You're cheating because you're not showing this, and that." And one of the things, for example: Scylla scaled with the hardware size very well.
So people were running Cassandra on those small four-CPU machines. And then, [00:39:00] because Cassandra wouldn't scale vertically past that, if you ran it on an eight-core machine you would only get a 20% improvement. So why would you do it, right? But Scylla would still scale linearly with the size of the machine.
So we were running those, like, 40-core machines. And what we heard the most was: it's not fair, because of course you could cut the number of machines, since you increased the size of the machines. Not fair. People think that fairness in a benchmark is all about everything having to be exactly the same.
And it's not like that. Look, I would rather manage a fleet of three machines than a fleet of a thousand. So if I can scale both vertically and horizontally, amazing. At the end of the day, what matters is the business problem, what you wanna do, and then you just do it.
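*The fleet-size point can be made concrete with back-of-the-envelope math. The per-core throughput and linear vertical scaling below are assumptions for illustration, not Scylla benchmark numbers:*

```python
import math

READS_PER_CORE = 25_000  # assumed per-core throughput
TARGET = 1_000_000       # cluster-wide reads per second

def machines_needed(cores_per_machine):
    """Machines required to hit TARGET, assuming linear scaling with core count."""
    per_machine = READS_PER_CORE * cores_per_machine
    return math.ceil(TARGET / per_machine)

print(machines_needed(4))   # 10 small machines
print(machines_needed(40))  # 1 big machine
```

*Same total throughput either way; one of these fleets is much easier to operate.*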
And we're hearing a lot of similar things with Turso, which is great. For me it's a sign that we're on the right track. It's like: of course your latency numbers are great, because you have replicas everywhere. Yeah. Because we can
**Lane:** Replicas everywhere is not an easy problem necessarily. Yeah.
**Glauber:** So first of all, [00:40:00] if you take a database like Postgres and you want replicas everywhere, it's costly.
Number one, you're gonna be paying a lot. Our free tier with Turso allows you to have three locations. So for free, you go sign up for Turso, and you can put your database in three places. For 29 bucks a month, with our Scaler plan that's coming in a month or two, you can put your database in six locations, right?
So the first thing is the cost. And the second thing is that if you have those distributed machines, if you happen to be willing to pay the cost of putting those machines everywhere, now your application logic needs to figure out which database to hit. With Turso, you just hit our single endpoint and we find the right replica for you.
**Lane:** How does that work? Let's actually just use a super concrete example, because this is really interesting to me. Boot.dev as a platform, as we mentioned, is this monolithic application, and it's a B2C product. So I have consumers all over the world learning [00:41:00] backend development on Boot.dev.
So I actually would love, over the next 10 months or years, to be looking into how I can use the edge to optimize speed, basically, for my students. What needs to change in my application code? What re-architecting needs to happen to make Boot.dev, which you can again think of as this pretty simple transactional model.
You log into the site, you've got your user profile. When you complete exercises, we're marking rows in a database. What changes so that I could take advantage of Turso?
**Glauber:** Yeah. So I don't know where you're deploying your application today, but the first change that needs to happen is that you need to move your code, forget the data, you need to move your code to a platform that does this geographical distribution for you. I started talking about Turso as an example, but the question that you were asking was specifically about serverless and stuff.
And the thing is, again, if you need to execute a function but you still need to worry about where the function is executing, that's still not that convenient. [00:42:00] What things like Cloudflare Workers will do for you is that you don't have to worry about that. You push the function and the platform takes care of where, which is the thing that we are doing with Turso at the database level.
And now we have this perfect match, in which the platform that runs your code automatically figures out where to run the function, and the database figures out, okay, where am I going to fetch the data from? So now the data is a single logical database.
That's the thing. It's not a database and a bunch of replicas. It's a single logical database that finds the best location for you at any given time. So the first step is that you need to move your code to some serverless platform. Let's say Cloudflare
**Lane:** Does that need to happen at the same time? Because I imagine if I took my existing code and broke it up into serverless functions, or maybe if I could even run the whole app serverless, say I have an endpoint currently that has three or four round trips to the database, [00:43:00] and I move it to the edge.
But now the edge is far from Salt
**Glauber:** No, that doesn't have to happen at the same time, because now the edge is far from Salt Lake, but it's closer to your user. And at the end of the day, where before the user would take 200 milliseconds to get to your server, and the server would get to the database very fast,
now, if you move to an edge architecture for the code, the user can get to your server in 50 milliseconds.
**Lane:** But in my
**Glauber:** But it's still about the same. Am I making sense? Because the real trip is not from the server to the database. The real time is the total time between the user making a request and the user getting what they need.
And then, when you're migrating something, shit breaks. That's the rule of life. So there might be a use case that becomes pathological, sure. But I don't think you necessarily have to do those things at the same time. You can do them in increments. Look, first you move your code.
Again, if 100% of your code is always talking to the database, sure, move both, might as well. But if you have, for example, [00:44:00] some code that doesn't touch the database a lot and some code that does, you can benefit from moving the code first and then moving the data later. That's not me saying that you can't do both at the same time. But we're engineers, man. We know that shit breaks, et cetera. So you wanna move one thing first. Sometimes you can do that. It's perfectly possible. Turso is SQLite-ba... go ahead.
**Lane:** I was gonna say, this is why Turso is so interesting to me. Because in my experience, again, as a backend doing a lot of CRUD application type stuff, right, REST APIs: if it were the case that the user talks to the server, say it takes one second, and then the server talks to the database and takes 50 milliseconds, and then it goes back, you've got a total round trip of 1.05 seconds.
Then that's easy. We just take the function and we move it closer to the user first. The problem that I see is that with 90% of apps I've used, or that I've written and worked on, the server talks to the database, like,
**Glauber:** Lots of [00:45:00] times, yes.
**Lane:** Three, four times synchronously, right? First it does an authorization check, an authentication check, and then it does a couple of business logic round trips.
So moving the server closer would be the logical step. But I know that, at least with Boot.dev, for example, I'm checking if you got achievements and all this stuff, so it would
**Glauber:** Especially, yeah.
**Lane:** the like
**Glauber:** And there are some things in that benchmark that I do not like, and there are some things that I do like. One of the things that I did was make it so that you could simulate the benchmark with a single request or with five requests, because, again, that is a lot more realistic, right?
And some people say, again, it's unfair, because with everything replicated this case is [00:46:00] better for you. I said, no, it's not unfair, because this is how people use databases, right? In a benchmark, you always want to get closer to real life. And another thing that you can do with Turso is that those replicas are quite ephemeral, in a sense.
Ephemeral in the sense that, of course, they're long-lived, but it doesn't take an hour to replicate. And we're not dealing with the petabyte kind of scale that we dealt with at Scylla. For a scale like yours, it would maybe take 30 seconds to a minute to replicate.
So you can run a replica in Japan for a day, because you expect that on that day you're gonna have some event in Japan or something like that. And then, when you don't need it anymore, you say, okay, shut it down, and now you're not paying for that anymore, right? So, just, I agree with you, as I said.
So if you have a workload like this, you move your serverless functions global, and then you move to Turso to do that. Now you're covered: whenever the user hits the server, the server will hit the database four or five times, but your database is now [00:47:00] close by.
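*The trap Lane raised, and Glauber's resolution, in toy numbers. Everything below is an invented illustration: moving only the code can make an endpoint with several synchronous queries slower, until the data moves too:*

```python
# Invented round-trip times in milliseconds.
USER_TO_ORIGIN = 200   # e.g. Japan -> Salt Lake City
USER_TO_EDGE = 20      # Japan -> nearby edge node
EDGE_TO_FAR_DB = 180   # edge node -> database still in Salt Lake City
LOCAL_DB = 1           # database in the same region as the code
QUERIES = 4            # synchronous DB round trips per request

monolith         = USER_TO_ORIGIN + QUERIES * LOCAL_DB      # everything in Salt Lake
edge_code_only   = USER_TO_EDGE + QUERIES * EDGE_TO_FAR_DB  # code moved, data didn't
edge_code_and_db = USER_TO_EDGE + QUERIES * LOCAL_DB        # code and data both close

print(monolith, edge_code_only, edge_code_and_db)  # 204 740 24
```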
And to fully answer your question about what you need to change: the first thing is the code architecture, if you decide to move it at the same time. The second thing is just the database accesses. Turso is not Postgres, it's not based on Postgres, but it's based on another database that is SQL based.
So there are some differences, but SQL is by and large the same, right? If you're moving from one SQL database to the other, there are some differences that you need to get used to. Some of the stuff... SQLite, which is our underlying technology, is a lot simpler. Yeah,
**Lane:** Boot.dev.
**Glauber:** So your users, for example... and also, Turso also runs locally. The Turso drivers also run locally. So another thing that I love is that now you can use those local files to run your CI, to run your tests. Imagine testing databases before Turso: you spin up a container and do this, or you create a free account with your serverless Postgres provider. With Turso, you can run all of your tests.
Locally, you can do all of the development, because [00:48:00] the SQLite experience keeps working. It's one of the things that we have on the website: the SQLite developer experience, now on the edge. Because when you point your driver to a file, you talk to a file. Then you replace that URL, from sqlite://blah-blah-blah to https://something, and now you're talking to Turso over the network.
And it's the same code, it's the same driver, it's the same everything. So you port your application from Postgres to SQLite, make sure it works locally so you don't pay any server bills in the meantime, and then you deploy to Turso and there you go.
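*The local-file workflow Glauber describes, sketched with Python's built-in sqlite3 driver. The https:// swap is the pattern described in the conversation; the URL is a placeholder, and a libSQL client would be needed for the remote half:*

```python
import sqlite3

# Local development and CI: the database is just a file (in-memory here).
# In production, per the conversation, you'd swap this for an https:// URL
# using a libSQL driver; the same SQL keeps working.
DB_PATH = ":memory:"  # later: "https://your-db.example" (placeholder, not a real URL)

conn = sqlite3.connect(DB_PATH)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("lane",))

row = conn.execute("SELECT id, name FROM users").fetchone()
print(row)  # (1, 'lane')
```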
**Lane:** That's awesome. Okay, so I wanna just really quickly talk through, from a high level, what the application code looks like. So let's assume that Boot.dev is all distributed on serverless cloud functions, right? Edge distributed. Each one has a connection string. And this was really interesting to me.
The connection string to the database in every geographic region, you mentioned, will be the same.
**Glauber:** The same, yes.
**Lane:** How does that, from a high [00:49:00] level, how does that networking work, so that latency is avoided? I'm not going to one central location and then getting redirected?
**Glauber:** Yeah. So we essentially have a router. It's the same technique that happens between the client and the edge provider, right? When you're hitting the edge provider, you're hitting an anycast address that routes you to the node that's closest to you, right?
So you have your CDN, and we have parts of Turso in the CDN. Long story short, without going into lots of architectural detail, that's how it is. Parts of Turso, the entry points of Turso, are also at the edge level.
**Lane:** So your logic that runs on a CDN is now connecting to a database CDN.
**Glauber:** that's right.
Yeah. So the database, the data, is still distributed, but I will figure that out for you. You don't have to figure that out; you just contact me. By the way, some people still do this, some people don't, [00:50:00] but there are people doing client-side database access. That can happen sometimes.
Why not? Especially because we have read-only JWTs. All the access to Turso is done with JWTs, right? So you can have read-only JWTs, and you can essentially have a token that can only read this table, or this database.
**Lane:** Firebase style, right?
**Glauber:** Yeah. So it's not impossible to do. I'm not advocating for that, because every time, people say, oh, SQL injection and blah blah.
So I'm not advocating that you allow this on the client side. But you can do this on the client side, and it's the same thing. Your client will now connect to Turso directly, instead of going to the server and then to Turso. You get that single address, your client gets that address, and the CDN has already picked the address that's closest to you.
And now we'll see, of all of the locations where you have data, which location I wanna route you to, the one that's closest to you.
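*The routing Glauber sketches, reduced to its core: of the locations that hold your data, pick the one closest to the user. The latency table is invented for illustration:*

```python
# Invented user -> replica latencies in milliseconds.
REPLICA_LATENCY_MS = {
    "tokyo": {"tokyo": 5, "salt_lake": 130, "paris": 210},
    "delhi": {"tokyo": 90, "salt_lake": 240, "paris": 140},
}

def route(user_region):
    """Return the replica location with the lowest latency for this user."""
    latencies = REPLICA_LATENCY_MS[user_region]
    return min(latencies, key=latencies.get)

print(route("tokyo"))  # tokyo
print(route("delhi"))  # tokyo (closest of the three, even without a local replica)
```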
**Lane:** Oh, fantastic. And for everyone listening and not watching: right after Glauber said he didn't advocate for it, he winked, like, big. So.[00:51:00]
**Glauber:** no, I don't care, man. Let's just uh, yeah,
**Lane:** Oh, fantastic. Okay.
**Glauber:** I don't wanna pick that fight. I can pick other fights if you want, but that fight, server side versus client side, people have different opinions on that. I don't know, I'm a database vendor now, so you can access it from wherever. It's your problem and not mine.
**Lane:** Yeah. Yeah, that's a good point. It's throw it over the wall.
**Glauber:** We wanna support as many use cases as possible. But just
**Lane:** So one more question that I have about Turso, in regards to, I guess, what my experience as an application developer would be, or how it would change. I'm currently using Postgres, and I'm actually a big advocate of, when you're designing web apps, not using crazy database-specific features unless you really need them.
For exactly the reason we described earlier. If you use very simple SQL in your application, assuming you can get away with it and assuming it doesn't affect performance, then you're portable at that point, right? Very easy to switch vendors, very easy to switch technologies.
[00:52:00] But I've done a lot of SQLite. Like I said, the SQL course on Boot.dev is done in SQLite. And really the only complaint I have about SQLite is this loose typing situation. Has that been a problem with your clients? Are there any concerns there? Because obviously, while I don't love TypeScript, I do love static types.
**Glauber:** So first of all, I came from C and C++, later Rust, et cetera. I'm gonna offend some people now, I can feel that, but
the discussion looks crazy to me, because for me, it's like I would have to move languages not to use
**Lane:** you want to know what your data
**Glauber:** Yeah. And for them, it's: I can keep the same language, but I have the option of using types or not. So this sounds crazy, and why would you not do it, right?
So I fully support typing. But the thing about it is, at the database level, again, I wish... and maybe this is something that we can do. For example, Turso doesn't run on pure SQLite. We have a fork of SQLite called libSQL that is open contribution, by the way.
SQLite, as you might know, is not open contribution. They don't take contributions. They're open, but they're not open source, technically speaking; they're public domain, which is even more lax than open source. But they don't have a community of contributors, and it's very rare that they take individual contributions.
So we forked SQLite, and we have libSQL on top of that, so we could extend that at some point. But this is not a complaint that we've heard yet in a [00:54:00] massive way, and I think the reasons are twofold. The first reason is that, again, people who are reading our message and see Turso, they read "the SQLite developer experience on the edge."
And your programming language hopefully has types, and then you have a schema, and then you have ORMs and stuff like that. So your experience as a programmer is to deal with types. The fact that the database decided to store a date as text is not necessarily visible to you if you're going through an ORM, which lots of people are, because your code will show it to you as a date type.
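*The loose typing Lane asked about is easy to see with the standard library. A column declared INTEGER will happily accept text: SQLite applies type affinity, not enforcement (newer SQLite versions offer STRICT tables that reject this):*

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")             # declared INTEGER...
conn.execute("INSERT INTO t VALUES (42)")
conn.execute("INSERT INTO t VALUES ('not a number')")  # ...but text is accepted as-is
conn.execute("INSERT INTO t VALUES ('7')")             # numeric-looking text is coerced to 7

rows = [r[0] for r in conn.execute("SELECT n FROM t")]
print(rows)  # [42, 'not a number', 7]
```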
[00:55:00] Does that make sense?
**Lane:** Awesome. That's fantastic. That's all I wanted to hear. Now I need to go spend some time checking it out. Where can, yeah, where can people find, A, more information about serverless edge databases, and B, where can they find Turso and whatever you're working on?
**Glauber:** Yeah. So Turso, we are at turso.tech. That's, for those who are, again, not seeing this, T-U-R-S-O dot tech. So that's the website, you can find us there. And I don't know, information about serverless, that's pretty broad, man. It's like sometimes people ask me, do you have any books
you recommend? Because I'm learning A, B, and C. And I'm not a book reader, I mean, when it comes to tech, right? Because
**Lane:** They just need to go find all your podcast
**Glauber:** Yeah. It's just because I think the internet kind of replaced that for me, in a sense. I think the last book that I read on tech was in 2006. Uh, Pekka's not like that, by the way. He's my co-founder here. He reads a ton of books, but I don't. So I don't know of [00:56:00] any one place, right, in any format, that you can go to to read about serverless. There's just a ton. What that means is that there is a ton of resources out there.
We talk a lot about that stuff in our community as well. So we have our Discord community. Everybody's welcome to join. It's a merry band of people just excited about edge and serverless and databases in general.
**Lane:** Fantastic. Go check out turso.tech. You guys have a blog too, right? People can go read about
**Glauber:** We do. From the website, you can reach the blog. Our website is really the central place for a lot of this stuff. So we have a blog, and we have our documentation as well. Our documentation will hopefully also teach you a bunch of core concepts about serverless.
**Lane:** Amazing. Thank you so much for coming on the show. It really was amazing talking to you. I had such a good time
**Glauber:** Likewise, man. Likewise,
**Lane:** this again sometime.
**Glauber:** we will absolutely.