aakash.io

Handl - the API to convert documents into structured data (as All Schemes Considered)

Podcast Notes

We talk about Handl. Handl is a tool to label and manage data for machine learning. On Handl, you will get your high-accuracy datasets with ease.

Autogenerated Transcript

Aakash Shah: [00:00:00] Hello, and welcome to All Schemes Considered. I'm your host Aakash Shah. And this is the podcast where we evaluate the ideas that venture capitalists have recently invested in. As usual I'm here today with my cohost Xand. Hello. Hello. This week we're talking about Handl -- H A N D L. Its website is handl.ai. Handl is a company which Y Combinator invested in in 2020, and what they do is provide an API to convert documents into structured data. So that is to say, you know, if you have an insurance claim or if you have a lease or if you have a will or any sort of paper documentation that you want to turn into structured data for a computer, Handl will handle, well, handle that in-between. [00:01:00] Excuse the pun, please. Oh God. What jumps out to me is incredible applications in business process management, especially business process management that interacts between real life and computers. So, you know, you have frontline bank employees that are having people fill out paperwork or you have insurance claims that are being filled out, or, you know, even those traffic tickets that you get from a cop that pulls you over -- eventually the government will take advantage of the internet too. So, yeah. What's your first take, Xand?

Xand Lourenco: [00:01:38] It's a really important problem that everybody has, right?

Like when you are converting unstructured or freeform data into a database. When you talk to like a machine learning engineer or a data scientist or data engineer, or anybody who kind of does like database stuff at scale, the vast majority of the work is not like [00:02:00] modeling it. You know, like, like most of these people can come up with a really good solid database model, or, you know, training model.

If you're doing ML. But the hard part is the data wrangling, like people input data wrong, you know, people, mislabel data, people don't put the data in, or they make it up like, like there's all sorts of, the, the really hard part about doing any of this sort of large scale data analysis is just making sure that you have the data in a structured format that's easily queryable. I mean --  so many companies are just people solving one aspect of that problem. And so this is certainly an important aspect of it. 

Aakash Shah: [00:02:39] So, you know, we hear about this rise of AI and the rise of machine learning and how computers can automate a lot of processes. But computers are still kind of dumb in that they don't know how to handle bad input or not so great input. And this is a way to get ahead of that or to, to make [00:03:00] sure that the input is correct and what the computer expects. 

Xand Lourenco: [00:03:04] Right. And just to be clear, Handl seems like a brand new company and they're doing something really important here.

They're augmenting. So when you're a company like Handl, when you're doing a lot of this kind of large-scale freeform data input, you need to train your models. And I think, just based on their front page and what they do, it feels much more like Handl's actual end game is getting a huge amount of data of real-world examples that have been labeled accurately.

So actually giving customers good data is, I could be wrong, but I think it's more of a side effect of the actual plan, which is collecting and training this huge model on a large amount of verified and labeled data. Right. Obviously as they get more data, they can train the model better and it'll get better and faster at doing this [00:04:00] sort of labeling.

But the really important step is that if they're first to market, you can create like a data moat, right? So let's say nobody else is doing this yet. Handl does this for a couple of years. They now have a hundred million trained labels and a model. That's a huge barrier to entry for anybody that comes after them because they've created a moat of like well-labeled accurate data, basically.

So there's kind of a first mover effect here, and I think what's really going to make or break them is the insight that even if they have to use manual labor to categorize a lot of this freeform data initially, that will pay off later, both for them in terms of having a data set and for customers who will slowly start to get more accurate data over time.

Aakash Shah: [00:04:46] So let's dig into that a little bit. You said there's this concept of a data moat, that simply having data, both the raw data and the structured data, is a very powerful thing to [00:05:00] have as a competitive advantage. And that competitive advantage comes from being able to train a very good AI model.

So one example that jumps out to me is how Google spent literally a decade having people fill out CAPTCHA forms and identify various objects on the street, like, you know, find the fire hydrant, find a bus, find the street sign. What does the street sign say? And they're now using that not just for, you know, labeling images on the web, but they use that data set to power a completely unrelated industry known as self-driving cars. So Google is actually creating that data moat, you know, taking it from one place and applying it to another place, which I think is what you just said about Handl right here.

Xand Lourenco: [00:05:53] Yeah. And I think the [00:06:00] early examples of this strategy proving successful are companies like Google, like Facebook, like Amazon. It would be very difficult to start a competitor to Facebook advertising right now, simply because you don't have the demographic data, you don't have the targeting data.

You don't have the audience data. Not no way, but it would be very difficult for a new company to credibly mount a challenge against Facebook advertising. You know, that moat is there. It's very real. And I think we're kind of seeing this move into more niche moats, you know, like, this is a very niche service.

If they can get that moat established, it's almost not worth it to go head to head with them. You know, their end game would be either they become the people in the space and they take home bagfuls of money, or they get acquired by somebody like Amazon or Google who, you know, wants to incorporate this into one of their services.

Aakash Shah: [00:06:58] You say it's niche. [00:07:00] I guess you can say the product is niche, like, you know, turning physical documents into structured data. So the product might be niche, but the applications are certainly not niche. 

Xand Lourenco: [00:07:10] No the applications are not, every company could use this. 

Aakash Shah: [00:07:13] They span every company. Every sales team wants this.

Every company, every finance company wants this. Every insurance company wants this, you know, every consulting firm wants this, you know? 

Xand Lourenco: [00:07:27] Right. And I mean, this is true even of new companies, right? Like Parse.ly is a pretty new company and we could still use this, you know. Even mostly digital or fully digital companies still have times where they need to ingest freeform data.

Aakash Shah: [00:07:44] Yeah. So definitely powerful. One thing that's interesting to me is, I think Handl explicitly says that they have a human in the loop. Yeah. And I think they say this [00:08:00] to kind of alleviate the fear that like, "Oh, what if the AI gets it wrong?" And we've kind of seen this before with Amazon MTurk, which was a way to kind of crowdsource this boring work. A real-world example of Amazon MTurk being used to augment AI optical character recognition would be Expensify.

Expensify is expensing software, very large, very widely used, and one of their killer features is you can take a picture of your receipt and it will extract all the information out and automatically create the expense report for you. Expense reporting is a pain in the butt, no matter how large or small your company is.

And so people love this about [00:09:00] Expensify, and Expensify had a great, I guess, ML model to extract this information, but when that model failed, they would actually crowdsource the interpretation or the extraction of the data from the receipt with MTurk. And I guess Handl is almost like this little specific part of Expensify pulled out and applied to every piece of paper in the world.
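Editor's note: the fallback described here (the model handles what it can, and a person picks up what it can't) can be sketched roughly as below. This is a minimal illustration, not Expensify's or Handl's actual implementation; the `Extraction` and `ReviewQueue` types, the `route` function, and the 0.9 confidence threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff; a real system would tune this, possibly per field.
CONFIDENCE_THRESHOLD = 0.9


@dataclass
class Extraction:
    """Fields a model pulled out of a document, plus its confidence (0.0 to 1.0)."""
    fields: dict
    confidence: float


@dataclass
class ReviewQueue:
    """Stand-in for a human review queue (for example, a crowdsourcing task list)."""
    pending: list = field(default_factory=list)

    def submit(self, doc_id: str, extraction: Extraction) -> None:
        self.pending.append((doc_id, extraction))


def route(doc_id: str, extraction: Extraction, queue: ReviewQueue,
          threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Accept confident extractions automatically; queue the rest for a person."""
    if extraction.confidence >= threshold:
        return "auto"
    queue.submit(doc_id, extraction)
    return "human"


queue = ReviewQueue()
clear = Extraction({"merchant": "Cafe", "total": "12.50"}, confidence=0.97)
blurry = Extraction({"merchant": "C?fe", "total": "1?.50"}, confidence=0.41)

print(route("receipt-1", clear, queue))   # accepted automatically
print(route("receipt-2", blurry, queue))  # sent to human review
print(len(queue.pending))                 # one document waiting for a person
```

The human-corrected results are also what would feed back into training, which is the data moat point made earlier in the episode.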

Xand Lourenco: [00:09:30] Yeah.

Okay. Yeah, I guess so, you're right that it's not niche. I should walk that back a little bit, but you're absolutely right. Yeah.

Aakash Shah: [00:09:38] I think the idea of using people in the loop is very interesting.

Xand Lourenco: [00:09:44] I mean it's necessary, right? To train your model, 

Aakash Shah: [00:09:47] I think it's good for training the model, but I also think it's interesting from a sales perspective, because, you know, let's think about who can pay for this, right?

Like how is [00:10:00] Handl going to take this to market? Are they going to be direct to consumer? Are they going to be B2C, selling to end-of-line consumers like you or me directly? Probably not. You know, it's not like Dropbox where we can really take great advantage of this. Sure, if we had our own personal receipt management or if we had some weird process of our own. But Handl isn't going to sell this to like 1 million people so they can make $10 million.

I think instead Handl would be very smart to go for massive companies, you know, like Geico, which has tons of insurance agents. And, you know, if they can close that one deal, boom, you have all that data. You're helping this one customer. And you know, that company can pay enough to keep Handl afloat, you know, while they continue growing their company.

So it's definitely an enterprise sale. If you think about how enterprise buyers [00:11:00] interact with technology, you are rewarded for being cautious in an enterprise environment. You know, latest and greatest is good, but only if it doesn't break down. So having the human in the loop, for me, actually demonstrates to Handl's customers that like, yeah, look, we're going to make it fast with the AI, but we're going to verify it with a human and you can trust that this human is smart and is as capable as your agents. And this is a good thing.

Xand Lourenco: [00:11:35] Right. Saying there's a human in the loop is kind of a soft signal that they're an enterprise company, that they're looking for enterprise customers.

Aakash Shah: [00:11:45] Yeah. I think it's a definite signal, but I think if you come at it from the perspective of an enterprise buyer -- it feels good to have a human in the loop. Even though Handl needs a human in the loop just as [00:12:00] error-correcting for their model, for the enterprise buyer that's actually a feature, not a bug.

Xand Lourenco: [00:12:07] Right. 

Aakash Shah: [00:12:08] whereas, you know, if you were maybe selling to an engineer and a purely an engineer, that's only focused on how good the technology is. that engineer would see a human in the loop as like, Oh, that's a bad model or they're not very good at their technology.

Xand Lourenco: [00:12:22] Right, exactly. 

Aakash Shah: [00:12:24] So that, I think, is interesting. But how do you sell to these gigantic customers?

Xand Lourenco: [00:12:31] Painfully. I don't know. I mean, the way you get in, right, is you price your entry-level model at just enough that some manager in the marketing department or the finance department can just put it on the corporate card, right?

You say like, "Oh, our starter tier here is 99 bucks a month" or whatever it is. And I think in tools like this, that seems to be how you kind of get a toehold in. Otherwise you have like a proper full enterprise [00:13:00] sales team. Which seems

Aakash Shah: [00:13:02] Do you think they could do like a credit card pricing to get onto, you know, marketing managers?

Xand Lourenco: [00:13:09] Yeah, actually they don't seem to be doing that, because if you look at their website, they just have a "contact us" button and there's no pricing, both of which are hallmarks of an enterprise sales team,

Aakash Shah: [00:13:20] you know, 

Yes. Yeah. So for me, I think they're almost forced to go that way. I think they sell to people that want to do digital transformations and people that are looking for huge solutions.

So I think their angle is that they go in and they sell cost savings at a huge scale for their customers. Maybe they can sell to the IT team. Maybe they can sell directly to the team that's focused on this work. So, you know, the front office team at a bank, whoever's managing [00:14:00] that, or, you know, the claims agents at an insurance company.

But then, if you think about how... let's use a neighborhood bank, for example. They're all closed right now, 'cause we're still in the pandemic. But, yeah, you know, let's say I walk in to my local Capital One bank, right. Who decides, you know, what form I fill out and how that form gets processed?

'Cause whoever decides that is who Handl has to sell to. And they have to discover that, because I certainly don't know.

Xand Lourenco: [00:14:41] Yeah. Yeah. That's really interesting. Yeah, I don't know how you go about doing that. Right. And it also seems - I can't see their backend, so I don't know - but it seems like it might be a low-code, but not no-code, platform. Given their whole kind of homepage blurb, [00:15:00] it seems like it's an API that you hit.

You know, it doesn't seem like you can paste a document and it automatically gets sent to Salesforce or whatever, unless they have some sort of interface for building those hooks themselves. But it seems like you very much need to involve an IT team or a tech team. Right. Because you need someone to actually make the hooks to Salesforce or whatever.
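Editor's note: to make "someone has to build the hooks" concrete, here is a rough sketch of the glue code a tech team might write between a document-to-data API and a CRM. Everything in it is an assumption for illustration: the response shape, the field names, and the CRM column names are hypothetical, and none of them come from Handl's or Salesforce's real APIs. No network calls are made.

```python
import json

# Hypothetical response from a document-extraction API. The episode does not
# describe the real output format, so this shape is invented for the example.
SAMPLE_RESPONSE = json.dumps({
    "document_type": "insurance_claim",
    "fields": {
        "claimant_name": "Jane Doe",
        "policy_number": "P-1001",
        "amount": "250.00",
    },
})

# Hypothetical mapping from extracted field names to CRM column names.
FIELD_MAP = {
    "claimant_name": "Name",
    "policy_number": "Policy_Number",
    "amount": "Amount",
}


def to_crm_record(response_text: str) -> dict:
    """Translate extracted fields into the record layout the CRM expects."""
    parsed = json.loads(response_text)
    fields = parsed["fields"]
    missing = [name for name in FIELD_MAP if name not in fields]
    if missing:
        # Fail loudly rather than pushing a half-empty record downstream.
        raise ValueError(f"missing fields: {missing}")
    return {crm_name: fields[src] for src, crm_name in FIELD_MAP.items()}


record = to_crm_record(SAMPLE_RESPONSE)
print(record)
```

This mapping layer is the part a Zapier-style interface would let a business user configure without writing code.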

Aakash Shah: [00:15:21] I think that's exactly what it is. I think you have to have a technology team involved because it's not something that people can do themselves. 

Xand Lourenco: [00:15:28] Right. 

Aakash Shah: [00:15:29] And, you know, the Handl team probably does a lot of handholding. I would expect for this company, they're only doing deals at, I wanna say six figures or higher, maybe even up to seven figures. Yeah. It depends on how good their founders are at selling. 'Cause that's really what it comes down to.

Xand Lourenco: [00:15:53] Yeah. I mean, you can look at their kind of "customers that use us" wheel. It's like Nvidia, Nestle, [00:16:00] UiPath, Mercedes Benz, big companies. Right,

Aakash Shah: [00:16:03] Exactly. But they don't have...

They say that they've processed about two thirds of the way through the month, 1.6 million documents. And that doesn't feel that high. I mean, the 

Xand Lourenco: [00:16:16] kind of tracks, right? Like, depending on what these companies are using it for, like Nestle, for example, probably doesn't have that many order forms that are handwritten anymore.

You know, a sales associate types those up. So potentially these big companies are using them for smaller things that are outside of a digital workflow, you know. Handl could potentially be used to augment an almost-digital sales workflow or something, as opposed to being the main entry point for it.

Aakash Shah: [00:16:46] I do think it's interesting in that, you know, there's been this whole kind of "digitize everything" idea in the sense that like, when everything is digitized, then you have the data and then everything can interact. So [00:17:00] like, you know, if I'm a company and I have all the information that I need, then I can analyze it.

And that's why I want everything digitized. And Starbucks was really into this for a long time. They built software for iPads, and area managers would drive around to every single Starbucks in their location. And then they would write things down on that iPad and everything was done through that iPad.

There were few, if any, paper forms. They were very much encouraged to, you know, do everything through the iPad and share everything through that. But that's a big ask for any company, like "build your own platform of digitization for yourself and train everyone on how to be digitized."

Whereas Handl kind of eliminates that whole in between step right there. Handl says, you can keep your paper and we'll turn it into the data that you need. And then all your other systems can trigger off [00:18:00] that data. 

Xand Lourenco: [00:18:00] Yeah, but there's still, there's still a little bit of building, right? Like the next logical step for them would be to have an interface like Zapier to make those hooks.

Yeah. 

Aakash Shah: [00:18:09] Oh for sure. But they're not, I mean, they're probably two or three years from that, 

Xand Lourenco: [00:18:14] right? 

Aakash Shah: [00:18:16] I think for sure they're two or three years from that. Yeah. Because if you think about it, their biggest selling point is like, "we make handling paper documents easier."

They want to go for people that have millions of paper documents per month. Right? So you have to go for gigantic companies that have, you know, either lots of physical locations or lots of clients, lots of paperwork that has to be handled.

Xand Lourenco: [00:18:47] And I mean the ideal for a company, 

Aakash Shah: [00:18:50] lots of paperwork that has to be "handled".

I get it. 

Xand Lourenco: [00:18:54] I mean, the ideal for a company like this, right, is you want to become so easy for the [00:19:00] business side to use that you become completely indispensable. You know, a thing that you see a lot with APIs when you're selling into an enterprise environment is you have a huge, huge mismatch between who wants the product, who's invested in it and wants to make it work, and the team that's implementing it.

Or the team that needs to buy in to make it work. So you'll see like, "Oh, we're all in. We want this, but our dev team doesn't want to do it." Or, you know, "we don't have iteration cycles," and then the deal falls apart. Right. So really, if Handl becomes like a Zapier, like click and drag, where the kind of person who feels the business pain point can configure the app start to finish.

I mean, they become one of those services where it's two years down the line, you're doing a whole database redesign, and it's the integral service, you know, like everything in your company runs off this pipeline, or like one division of your company, you know, sales or insurance review or whatever you're using it for.

And I think once you kind of have that no-code [00:20:00] solution, that's when you really become super sticky, you know, and you close your sales loop, like, immensely.

Aakash Shah: [00:20:08] That's true. Yeah. So being easy to use will be a huge help in their deal velocity and closing fast. Definitely a down-the-line thing that they should consider.

I think, you know, this company is very early. They're a year or two old, so it's probably a lot to ask of them right now.

Xand Lourenco: [00:20:28] Well, this is the way to start, right? Like the easiest thing to do is just have an API and then you build on top of that and it lets you kind of narrow down like the business use cases.

You can kind of see where it would be best to apply it if you do want to go down the no-code route: where to apply it, what the common use cases for it are, you know? And then you can always leave the API as like an advanced thing you can even upsell. Ironically, you can eventually make the original implementation a value-added feature.

Right? 

Aakash Shah: [00:20:53] That's true. So for me, I think I'm pretty bullish on this company. I think they're going to be around. I think we're probably going to see [00:21:00] them raise a few more rounds and either get acquired or IPO, if they can execute.

Xand Lourenco: [00:21:08] Yeah. I like the, the only real big thing there is, does it work?

You know, like that's really the only thing 

Aakash Shah: [00:21:16] I think they can make it work, just because I don't think they would have investment if they weren't able to make it work. It's more of a question for me of, like, can they sell it? Like, can they get into these companies? 'Cause one of the biggest things that could kill them is if they spend six months chasing after one deal that then gets killed

Xand Lourenco: [00:21:35] or have two big ones churn in the same quarter, kind of thing.

Aakash Shah: [00:21:37] Yeah. But outside of that, you know, when we do our followup episode, I'm very excited for that one.

Xand Lourenco: [00:21:44] Yeah, I think that'd be good.

Aakash Shah: [00:21:46] Cool. That's it for me? Any last thoughts, Xand? 

Xand Lourenco: [00:21:50] No, that's it. I think, I'm curious to see... Y Combinator, I think, is kind of doing a lot of this sort of automation.

[00:22:00] They're kind of investing heavily into it. And I'm curious to see if we've reached a point in the AI life cycle, or software, where it's mature enough that you can kind of base low-code, no-code startup ideas on it. I think we're going to very quickly enter into this kind of field where people start doing automation for stuff that we wouldn't really have expected to be able to automate before, like document recognition or even website generation, these kinds of things.

I'm curious to see. Given the kind of field that Y Combinator has fielded this year, I think that's kinda the direction they're headed as well.

Aakash Shah: [00:22:36] So cool. We'll talk about that on our next episode, too. Yeah. Okay. See you guys 

Xand Lourenco: [00:22:43] later.

Aakash Shah: I sincerely hope you enjoyed that episode. You can find me, Aakash, on Twitter @aakashdotio. So that's A A K A S H D O T I O, or you can find me at my website: aakash.io. This time "dot" is not spelled out. Aakash.io also has past and future episodes of this podcast, but whatever podcatcher you're listening on should have those episodes too.

Aakash Shah: [00:23:17] Catch you all next time.