"I need to compare product specs, but the AI isn’t reading the data correctly..." To achieve the highest levels of accuracy with LLMs, we must look at what we’re feeding them. LLMs love plain text, but they do even better with Markdown. Markdown is like plain text, but it includes the ability to communicate basic formatting. With most consumer apps we have all kinds of issues with PDFs when they are loaded as attachments. Often they aren’t read in full, or there are formatting issues with data, tables, diagrams. We are in the final stages of testing before integrating an outstanding new model called “Mistral OCR” into OmniChat.uk. Mistral OCR was launched in the last fortnight and we are planning to include it as an add-on for Business level subscribers as a solution for long document markdown conversion, something I'm asked for often... This tool will massively improve the accuracy and performance that we’ll be able to get from PDF attachment inputs because they will be scanned in full and converted into Markdown BEFORE they get fed to the LLM. OCR stands for “Optical Character Recognition”, or the ability for the model to detect and extract info from all manner of graphical input material and turn them into a digestible format for any LLM. Mistral OCR performs at 94.89% accuracy and with the ability to process up to 2,000 pages per minute, it beats current offerings from Google and Microsoft, particularly for handling very complex documents that as professionals we need to be able to work with. TDLR, don't use standard attachments... there is a better way.