aakash.io

QuestDB - The fastest open source time series database (as All Schemes Considered)

Podcast Notes

Aakash and Xand discuss QuestDB [https://questdb.io/]. QuestDB's product focuses on querying large amounts of time series data quickly and easily. Xand takes Aakash on a journey that delves into why massive companies like Elastic and MongoDB give away their product for free. We learn that open source is the developer-product version of freemium, and that there's a huge benefit to being open source.

Send in a voice message: https://anchor.fm/all-schemes-considered/message

Support this podcast: https://anchor.fm/all-schemes-considered/support

Contact:

Email: allschemesconsidered@aakash.io

Twitter: @aakashdotio [https://www.twitter.com/aakashdotio]

Music credits:

Syn Cole - Gizmo [NCS Release] provided by NoCopyrightSounds

Autogenerated Transcript

Aakash Shah: [00:00:00] We're talking about QuestDB today. They're a member of the Y Combinator Summer 2020 cohort. And they've raised $2.5 million in total. 

QuestDB is the fastest open source time series database. 

 I don't know what that means. Xand can you tell me what that means? 

Xand Lourenco: [00:00:20] Yeah. So this is a very niche thing we're talking about. And I'll back up a bit just to explain the concept. So databases,  everyone probably has a bit of familiarity with, and it's just a thing where you store data for software companies. For any company, really, and you've probably heard names in that space. 

Aakash Shah: [00:00:35] A database is a place where companies put their data for software to access, or so that the data can be visualized and exposed to users like you or myself or anyone that's opening up something. 

Xand Lourenco: [00:00:49] Right. So traditionally, databases are row and column based. The easiest way to think about [00:01:00] this is Microsoft Excel. If you can visualize a spreadsheet in your mind, then you have an idea, at a very high level, of what the vast majority of software databases are: basically a billion-record Excel spreadsheet. Normally that's fine, right? With traditional database usage in most companies, you insert things very slowly. Maybe you have a couple of sales a day or a couple of calls a day at most. You're inserting two, three, 4,000 rows a day, something like that. Where this begins to break down, and where you see a lot of development over the last 10 years, is with the rise of big data. All of a sudden, instead of 4,000 rows a day, you're ingesting millions or billions of data points, and traditional databases simply weren't made with that use case in mind. For a lot of traditional databases, it's much more important for the data to be super consistent. So they do a lot of verifying, they do a lot of [00:02:00] checking, and they aren't optimized for a really high rate of row insertion. And that's where you start to hit problems.

Aakash Shah: [00:02:06] Right. So older databases, the stuff that has existed for 40 or 50 years, aren't built for the amount of data that is being created nowadays. The databases that companies use right now aren't really made for the sheer volume of information that they can collect and hold.

It's kind of like they're trying to write a novel in the Apple Notes app.

Xand Lourenco: [00:02:35] Right. That's exactly right. And you even see this in the evolution of the way that people conceive of databases. 15 years ago, it was already starting to shift, but the common wisdom for a database was: get the biggest machine you can possibly find, with the most memory and the most hard drive space. And that's what you got. That's your database. But now we're at the point where the data you have might need to be on 10, 15, 20, [00:03:00] 30, even 40 different machines, because it's just too much data for any one computer to handle. Does that make sense?

Aakash Shah: [00:03:06] Yeah, there's just, there's just too much, like there's more data than the processing power of a computer can handle. 

So ever since the rise of the internet and the rise of big data, we've seen databases struggle to handle the enormous amount of data that's being generated. Have there been any changes in the past two decades, since 2000, since the rise of Google and Amazon, that have helped companies and engineers manage enormous amounts of data?

Xand Lourenco: [00:03:32] Yeah, absolutely. And the way this happened is you start to see databases specialize, right? Think of it a bit like a car. You had the Model T. You didn't have trucks, you didn't have sports cars, you didn't have hatchbacks. But as the needs of the consumer changed, cars started to specialize. And that's what we're seeing with databases. Now you have your trucks: they're slow, but they carry a lot of data. You have your sports cars, these kinds of things. What we're talking about here specifically, time series databases, is kind of [00:04:00] a truck that's also a sports car. And this is something that, in the past, I'd say, five years, a lot of different companies have been trying to solve.

Aakash Shah: [00:04:07] So until now, people were able to get by with either a truck that could handle large amounts of information but not necessarily accelerate and move very quickly, or a sports car that could handle a smaller amount of information but analyze it very quickly. But since big data has gotten even bigger, and people are pushing the limits of what they can do, there's a need for something that can handle a large volume of data very quickly. So, like, a sports truck. 

A high performance truck, so to speak.

Xand Lourenco: [00:04:43] Yeah. Yeah, that's exactly right. What a lot of these companies are doing is they're going to developers, and it's like going to the dealership and being like, "Hey, I need a truck that also goes zero to 60 in three seconds," and it's like, "Oh boy. That's a tall order." 

Aakash Shah: [00:04:57] So, who does this provide challenges [00:05:00] for?

Xand Lourenco: [00:05:00] So this will provide challenges for anyone who wants to do any kind of ad hoc or realtime analysis on data. So what happens with traditional databases is you can, kind of scale them out to do this, but it takes forever. You want to run a query? It might take 10, 15, 20 minutes. What, what you really want is kind of the same feel you get from an Excel spreadsheet. Right? I want to know. What rows matched this filter and boom. I got the data that I want right away. And that's something that until recently, it was, it was pretty hard to do with any traditional database stack.

What you're seeing from developers is an attempt to solve that, an attempt to do what we call time series data analysis. And the reason for that is that a lot of this big data we're analyzing now consists of events that happen at a moment in time. So when I say time series, you could think of things like the price of a certain stock ticker at a certain moment in time. These are pieces of data that are anchored to a [00:06:00] specific point in time. And as you can imagine, stock prices fluctuate. 
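To make the idea of data "anchored to a specific point in time" concrete, here is a minimal Python sketch of the stock ticker example, along with the Excel-style "what rows matched this filter" question applied to a time window. The `Tick` record, the `ticks_in_window` helper, and the ticker symbols are hypothetical names chosen for illustration; none of this is QuestDB's API, and a real time series database would answer the same question over millions or billions of rows rather than a four-item list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Tick:
    """One time series data point: a price observed at a moment in time."""
    timestamp: datetime
    symbol: str
    price: float


def ticks_in_window(ticks, symbol, start, end):
    """Return ticks for `symbol` with start <= timestamp < end, in time order."""
    matching = (
        t for t in ticks
        if t.symbol == symbol and start <= t.timestamp < end
    )
    return sorted(matching, key=lambda t: t.timestamp)


# Hypothetical sample data: two made-up symbols, a few seconds apart.
base = datetime(2020, 9, 1, 9, 30)
ticks = [
    Tick(base + timedelta(seconds=0), "ABC", 100.0),
    Tick(base + timedelta(seconds=1), "XYZ", 50.0),
    Tick(base + timedelta(seconds=2), "ABC", 100.5),
    Tick(base + timedelta(seconds=65), "ABC", 101.0),
]

# "What rows matched this filter?": ABC prices in the first minute.
first_minute = ticks_in_window(ticks, "ABC", base, base + timedelta(minutes=1))
print([t.price for t in first_minute])  # [100.0, 100.5]
```

The scan here is linear in the number of ticks, which is fine for a toy list; the hard part the conversation describes is getting the same immediate answer when the list no longer fits on one machine.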

Aakash Shah: [00:06:04] So there can be a lot of data that's related to time, and analyzing a lot of data related to time can be very difficult. And people need this now. Financial companies need this now, enterprise analytics companies want this now, Internet of Things companies want this now.

Xand Lourenco: [00:06:22] Right.

Aakash Shah: [00:06:23] Thank you for that TED talk on databases. It was very informative. Let's bring this back to QuestDB. QuestDB explicitly claims to be the fastest time series database. Does that mean they're claiming to be the fastest database for this sort of specialized data, which is tied to time and comes in so frequently and in such large volumes?

Xand Lourenco: [00:06:53] So first off, everybody makes that claim, right? You look at Influx. You look [00:07:00] at Timescale. You look at Elastic. Everybody says, "Hey, we're the fast, scalable time series database." But fast isn't always enough. With a lot of these databases, what you find is you have to make some sort of compromise in some aspect to achieve that speed, right? InfluxDB doesn't handle textual analysis very well. For Elasticsearch, you kind of have to summarize or aggregate data historically to make it fast enough to get what you need. So there's a lot of legwork that happens when you select these databases, depending on your data set and what need you have, like what kind of data you have. And I think QuestDB's pitch is performance without compromise. It's, "Okay, you don't need to worry about this. You don't have to do any weird tricks. We just give you the data and you can query on it really fast."
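The "summarize or aggregate data historically" compromise can be sketched in a few lines of Python. This is an illustrative sketch only, not how Elasticsearch or any other product implements it: the `downsample_mean` helper and the sample points are hypothetical, and the technique shown is plain fixed-width time bucketing with a mean per bucket.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def downsample_mean(points, bucket):
    """Average (timestamp, value) points into fixed-width time buckets.

    Returns a dict mapping each bucket's start time to the mean value,
    ordered by bucket start.
    """
    epoch = datetime(1970, 1, 1)
    sums = defaultdict(lambda: [0.0, 0])  # bucket start -> [total, count]
    for ts, value in points:
        index = (ts - epoch) // bucket    # timedelta // timedelta gives an int
        start = epoch + index * bucket
        sums[start][0] += value
        sums[start][1] += 1
    return {start: total / count for start, (total, count) in sorted(sums.items())}


# Hypothetical raw points: three readings across two one-minute buckets.
base = datetime(2020, 9, 1, 9, 30)
points = [
    (base, 100.0),
    (base + timedelta(seconds=30), 102.0),
    (base + timedelta(seconds=70), 110.0),
]

per_minute = downsample_mean(points, timedelta(minutes=1))
print(list(per_minute.values()))  # [101.0, 110.0]
```

Three raw points become two summary rows. Queries over the summary are cheaper, but the individual readings (100.0 and 102.0 in the first minute) can no longer be recovered, which is the kind of trade-off being described.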

Aakash Shah: [00:07:53] Performance without compromise. That is a claim. That's a strong claim. [00:08:00]

Xand Lourenco: [00:08:00] Well, my words, not theirs, but if they, if they wanna use it, they could, they can let me know.

Aakash Shah: [00:08:03] Given your understanding of databases like this, do you think it's technically possible? Like, do you think something like this can be implemented? 

Xand Lourenco: [00:08:14] Their demo is certainly very impressive, but in these kinds of database things, I always bring a healthy skepticism. 

Aakash Shah: [00:08:19] And you almost have to bring skepticism when it comes to evaluating core software products that are business critical, like databases, because committing to something that doesn't deliver carries a huge cost when you're evaluating this sort of business-critical software. 

No one gets fired for buying from Amazon. You do get fired if you make a call on a new company and it doesn't pan out. It's a known known versus a known unknown. So QuestDB does have to deal with initial hesitation, but that's part of any enterprise software sale. And they'll build that [00:09:00] expertise, hopefully. 

Xand Lourenco: [00:09:01] Yeah. This actually explains the business model of this kind of open source sale. Typically in a business sale, it's top down from the business side. Someone like a product manager or an executive gets sold on a specific technology, and it's given to a team to implement. What you see here, and in a lot of these very specialized or niche software stacks, is you sell from the bottom up. So what you do is you're like, "Hey, I'm going to open source this, and open source means I'm providing this to you. You can edit it. You can modify it. You can use it for free, no problems."

And you give that to developers. The people implementing it kind of vet it without paying for it, and it comes from the bottom up. So you'll get these small teams or small startups who will take on this kind of high-risk, unproven software stack. They say, "Hey, this is great. We're using it. It's fantastic." And then they convince executives to buy consulting or [00:10:00] hosting or some other kind of ancillary service from the company that open sourced the product. It's a very confident gamble, right? What you're saying is, "My product is so good, I'm going to let you use it for free forever, because I know you'll come back and pay for my expertise afterwards."

Aakash Shah: [00:10:18] That is quite a gamble. It's worked out for a few companies that have gone public, though. Elastic is one and MongoDB is another. And those two companies, I would actually say, are predecessors to QuestDB. They address the older style of database pains.

Xand Lourenco: [00:10:38] Yeah. And Mongo is really, not the first, but certainly one of the first wave of databases that tried to address this, right.

Aakash Shah: [00:10:44] So being open source is a function of their distribution model and their go-to-market movement. By being open source, QuestDB can tap into the research and development efforts that engineers regularly undertake to experiment with new technologies that could [00:11:00] give their company an edge. It allows QuestDB to build the community and the documentation that every very technical software product inevitably needs, so that it's easy to work with and easy to use. And if a company needs to hire someone to manage QuestDB for them, they'll be able to do that. 

Open source is a way for QuestDB to get all of these things. 


Xand Lourenco: [00:11:27] That's not, that's what editing's for. 

Aakash Shah: [00:11:28] How does a go-to-market movement with an open source platform or database look, though?

Xand Lourenco: [00:11:34] It's pretty much how you just laid it out there. You use the open source aspect of it, the free aspect of it, to bootstrap a lot of things that you would otherwise need to go to market: community, documentation, QA. I mean, a lot of these databases start out as alpha or beta things that get shared around in the community. People iterate on them because they scratch an itch. And then you get a better product incrementally, so to speak.

I'm less interested in the [00:12:00] open source aspect of it, though, than in the selling-to-developers aspect. That's something I was initially very negative on. Even just a few months ago, I would see these different companies, Gatsby, Elastic, Mongo, and I would say, "Why are they getting millions of dollars in VC funding? What developer is buying these products?" But actually it's amazing if you think about it the opposite way. Developers are some of the hardest people to sell to. 

Historically, I've found that that segment can be a bit resistant to being sold to. But when the product really has a perfect fit for a problem, you see amazingly short sales cycles and amazing loyalty from these developers, because it really scratches an itch. A great example of this, they're not open source but they use this model, is Datadog. 

Aakash Shah: [00:12:46] Datadog is like Task Manager for a fleet of computers, instead of just your own.

Xand Lourenco: [00:12:50] Yeah. And, and the loyalty. And I don't know if they're public. 

Aakash Shah: [00:12:55] Datadog went public last year and they are doing very, very well. [00:13:00]

Xand Lourenco: [00:13:00] Yeah, and same for Elastic. The loyalty you gain from a developer by creating a product that specifically scratches their itch is immense. And it's kind of an untapped market of people who have typically been resistant to being sold to, or have given a cold shoulder to traditional sales, right? So I've come around, done a 180, I should say, on the idea that this kind of model isn't viable. In fact, it turns out it's pretty viable.

Aakash Shah: [00:13:24] It's viable as long as you can win hearts and minds of developers, if you can demonstrate that "our solution is the best solution for the problem you're having." 

Xand Lourenco: [00:13:37] Right. 

Aakash Shah: [00:13:37] Developers will use you. The next step is to make sure that the problem you're solving is large enough and that companies are willing to spend money on it. For QuestDB, it is large enough. They already help financial companies do stock market analytics. We know the market's large enough because there are enough [00:14:00] analytics databases out there that they could help improve, and there are companies that make money solely off of how good their analytics are. 

So we know that the problem space is big and we know that companies are willing to spend on solving this problem space. What sort of managed services could QuestDB have? Or what additional products could QuestDB build that would allow them to monetize their open source work?

Xand Lourenco: [00:14:28] So you've got two forms of monetization here, typically. You've got the consulting thing. Think of Red Hat or something like that, where they're a Linux shop and they provide a Linux distribution for free, but the way they make their money is consulting on how to use their product and how to deploy it. The other is, as you pointed out, managed hosting. You give this open source thing to a development team, and they're like, "Wow, this is amazing, but we're only a shop of three or four people. We don't have dedicated system administrators or DevOps. We really just want to write our application, [00:15:00] send the data, and be able to use this database problem-free." So what QuestDB or Mongo or one of these database companies will do is say, "Hey, you can deploy this and use it for free on your own stack if you have the people to work on it. If not, if you're a smaller shop, you just let us handle it. We'll manage it for you. We'll take backups. You just use the database, and we'll charge a little bit of a premium on top of that."

Aakash Shah: [00:15:23] And then if you want to build anything super fancy, we can help you do that at a consulting rate.

Xand Lourenco: [00:15:29] Right. That's exactly right.

Aakash Shah: [00:15:31] Okay, that makes sense. I believe in that.

And the fact that there are companies that have grown into hundreds of millions of dollars off this exact problem case and with this exact business model demonstrates that, yeah, QuestDB could have a good future in front of it. I think one thing that we forgot to mention during our open source conversation is that open source for developer tools is like freemium for [00:16:00] business tools. So QuestDB being open source is like how Slack is free to use. Eventually, the people that use QuestDB, or the people that use Slack, get frustrated at the limitations of having to do everything themselves, or having to deal with the free tier, and they start paying. And freemium has proven to be an incredible model for getting businesses to adopt a product from the bottom up. Though top-down also works, as we've seen with Microsoft Teams. 

So it goes both ways. 

Xand Lourenco: [00:16:30] Yeah. And I think that works because Teams and Slack are not heavily technical products, right? That's not something where you're saying, "Hey, implement this database." It's, "Hey, just use this slightly different software." I think the reason it's so important for QuestDB and these kinds of technical sales to come bottom up is that the lift to implement them and to rip out what you already have is Herculean, really. Is that how you pronounce that? @ me if you think I'm pronouncing it wrong, but,

Aakash Shah: [00:16:57] They can't, you're not on the public [00:17:00] digital space.

Xand Lourenco: [00:17:00] That's right. It's so free. Yeah, and I think with something like a DB, right, the difficulty is you've got to do not only what they're already doing better, but you have to do it better enough for them to consider switching, to overcome that inertia. And that's why I think the bottom-up thing works, because you get huge pressure from the developers being like, "Hey, we'll save so much money and time and stress and labor and maybe infrastructure, whatever the pain points are, if we take the time to migrate to this new infrastructure."

Aakash Shah: [00:17:34] Very cool. We've talked about the product, we've learned about the history of the industry, and we've talked about the sheer need for a product like this. We've even discussed their go-to-market. Let's talk about whether or not it will be a good investment, or what's required for QuestDB to be a good investment. They've raised $2.3 million in a seed round, $2.5 [00:18:00] million in total. For a 10x return for their investors, QuestDB needs to sell for $25 million. For QuestDB to sell for $25 million, they probably need to make around $5 million a year. 

I feel like if you're enabling financial teams and machine learning teams to move a lot faster and better, it's definitely possible to find five companies that will give you $1 million a year, or 50 companies that will give you $100,000 a year.
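The back-of-the-envelope numbers above can be checked with a short Python sketch. It only restates the figures from the conversation ($2.5 million raised, a 10x target, roughly $5 million a year in revenue); the variable names and the `customers_needed` helper are hypothetical, and the calculation deliberately ignores investor ownership stakes and dilution, as the conversation does.

```python
total_raised = 2_500_000          # total funding mentioned in the episode
target_multiple = 10              # the "10x return" target
exit_value = total_raised * target_multiple

annual_revenue = 5_000_000        # Aakash's rough revenue estimate
# Revenue multiple implied by the two figures above (derived, not stated).
implied_revenue_multiple = exit_value / annual_revenue


def customers_needed(revenue, contract_size):
    """Smallest whole number of customers paying `contract_size` to reach `revenue`."""
    return -(-revenue // contract_size)  # integer ceiling division


print(exit_value)                                      # 25000000
print(implied_revenue_multiple)                        # 5.0
print(customers_needed(annual_revenue, 1_000_000))     # 5
print(customers_needed(annual_revenue, 100_000))       # 50
```

Put differently, the estimate quietly assumes a sale at about five times annual revenue; a different multiple would change the revenue target proportionally.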

Xand Lourenco: [00:18:34] Yeah, and I think you definitely hit the nail on the head. Many, many companies are already spending two or three hundred thousand dollars a year on analytics database costs without breaking a sweat, because it's so important to their day-to-day operations. I think you could certainly get someone to pay you a little bit more than that, even, if it really does solve a problem elsewhere. I could very easily see some people paying, yeah, half a million dollars for, like, a [00:19:00] full rollout, consulting, and a managed QuestDB install.

Aakash Shah: [00:19:03] We're going to have to keep our eyes out for their next fundraising rounds. Cause it sounds like it's something we're going to want to get into. 

So, what all did we learn? We learned about databases. We learned about the pains of working with databases and where the product is evolving towards. And we learned how QuestDB can be the next evolution of this space. 

And we learned how QuestDB is using open source, and why open source is so important for a product like QuestDB. Is there anything you think we missed? 

Xand Lourenco: [00:19:31] The takeaway for me really is that, for the past year, I think I've been pretty down on this kind of business model and go-to-market. And it's not until recently that I saw why VCs would be moving into this space and why these companies are going public now. So it's definitely something that I was slow on the uptake on, in terms of seeing the potential value of this sort of business model and how you can monetize it. 

Aakash Shah: [00:19:54] Sounds good. Thank you for listening. We'll see you next time. Adios. 

Xand Lourenco: [00:19:59] Later.