My company currently does this using enterprise-grade software, and I'm deeply familiar with the problem area. The article quotes the guy as hearing the problem and saying:
--“My instant solution was, ok we have these big devices nowadays… why can’t we use them to capture these invoices, use technology to convert that image into text and extract information out of these invoices,” he adds. “It was pretty much a bottoms up approach.”--
Yeah... no. I'm afraid it is much, much more challenging than he realizes. Here's a quick, non-exhaustive summary:
1. Most paper invoices are printed and then either mailed or faxed. Sometimes the vendor's accounting staff manually annotate the printed paper (e.g. by hand-writing on it) with additional information that didn't exist in their accounting system (e.g. a purchase order number). By the time the invoice arrives at the AP department, the quality of the writing on it can be atrocious, which greatly affects the accuracy of OCR. So you need to run complex image processing on the invoice first, and then OCR it. You may also need ICR (intelligent character recognition) to read hand-written information.
2. A lot of vendors change their invoice formats on a regular basis (e.g. by changing to a different template on QuickBooks). This means that whatever machine-learning you use to process a vendor's invoices will be invalidated when the format changes.
3. Smaller vendors, such as mom-and-pop shops that restaurants sometimes deal with, are very undisciplined about their invoicing, and can forget to put key pieces of information on the invoice, or make typos.
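To make point 3 concrete, here is a minimal sketch of the kind of sanity check you end up running on whatever the OCR step hands back. Everything in it is an assumption for illustration: the `ExtractedInvoice` and `LineItem` types, the choice of required fields, and the `find_problems` helper are hypothetical, not taken from any real product.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass
class LineItem:
    description: str
    quantity: Decimal
    unit_price: Decimal


@dataclass
class ExtractedInvoice:
    # Hypothetical shape of the data coming out of OCR; any field may be missing.
    vendor: Optional[str]
    invoice_number: Optional[str]
    po_number: Optional[str]
    line_items: List[LineItem]
    stated_total: Optional[Decimal]


def find_problems(inv: ExtractedInvoice) -> List[str]:
    """Return human-readable problems; an empty list means nothing was flagged."""
    problems = []
    # Assumed set of required fields; a real AP process would define its own.
    for name in ("vendor", "invoice_number", "po_number", "stated_total"):
        if getattr(inv, name) in (None, ""):
            problems.append(f"missing {name}")
    # A typo or misread digit often shows up as a total that doesn't add up.
    computed = sum((li.quantity * li.unit_price for li in inv.line_items), Decimal("0"))
    if inv.stated_total is not None and computed != inv.stated_total:
        problems.append(
            f"total mismatch: lines sum to {computed}, invoice says {inv.stated_total}"
        )
    return problems


invoice = ExtractedInvoice(
    vendor="Corner Bakery Supply",
    invoice_number="1042",
    po_number=None,  # the vendor forgot to write it on the invoice
    line_items=[LineItem("flour, 25 lb", Decimal("2"), Decimal("4.50"))],
    stated_total=Decimal("9.50"),  # typo: the line items add up to 9.00
)
for problem in find_problems(invoice):
    print(problem)
```

Checks like these only tell you that something is wrong. Someone still has to chase the vendor for the right value, which is where most of the cost goes.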
Over the years, I've come to realize that the only way to efficiently deal with invoices is to make sure they never become paper-based. We have been pushing our customers and their vendors to use EDI instead, which is a LOT easier to control and standardize.
There are loads of companies that do that. I designed and implemented a system around 2009/2010 which did exactly the same thing for the energy market.
Certain subsets of the medical market have embedded players offering similar services.
To make real money from this you need market share; with that you can look for points ripe for arbitrage.
Why not just eliminate the paper to begin with? It's obviously a bit tougher as you need to tackle both sides of the market at once, but not having to mail invoices means the benefit of faster payments for suppliers (maybe not a benefit to the restaurant however) and could enable just-in-time ordering from the restaurant.
It might even flip the invoicing on its head. When the restaurant orders, it creates a transaction. The supplier delivers and the restaurant completes the transaction, generating a payment. Now the restaurant can order only what it needs, faster, and the supplier receives its payment faster.
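A rough sketch of that flow, just to show how little is needed once the order itself is the record. The `Transaction` class, its `Status` values, and the item structure are all made up for illustration, and a real system would need partial deliveries, disputes and so on.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum
from typing import Dict, Tuple


class Status(Enum):
    ORDERED = "ordered"
    DELIVERED = "delivered"
    PAID = "paid"


@dataclass
class Transaction:
    # Hypothetical record created when the restaurant places the order.
    supplier: str
    items: Dict[str, Tuple[Decimal, Decimal]]  # item name -> (quantity, unit price)
    status: Status = Status.ORDERED

    def amount_due(self) -> Decimal:
        return sum((qty * price for qty, price in self.items.values()), Decimal("0"))

    def mark_delivered(self) -> None:
        # The supplier delivers; only an open order can be delivered.
        if self.status is not Status.ORDERED:
            raise ValueError(f"cannot deliver a transaction that is {self.status.value}")
        self.status = Status.DELIVERED

    def complete(self) -> Decimal:
        # The restaurant confirms receipt, which triggers the payment amount.
        if self.status is not Status.DELIVERED:
            raise ValueError(f"cannot pay a transaction that is {self.status.value}")
        self.status = Status.PAID
        return self.amount_due()


tx = Transaction("Acme Produce", {"tomatoes": (Decimal("10"), Decimal("1.25"))})
tx.mark_delivered()
payment = tx.complete()
print(f"pay {tx.supplier}: {payment}")
```

No invoice is ever generated here: the payment amount comes straight from what was ordered and confirmed as delivered.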
This solution works best for industries where multiple raw materials are used and transformed into something completely different. Manufacturing, healthcare and hotels could all leverage a solution like this.
cool idea! i actually had this idea a few years ago and went around to a few restaurants asking if they would use it - most of the chefs i approached seemed lukewarm to the idea so i never went forward - glad to see they are finally coming around!