Now you can index slides from all the presentations you have on your drive!
- Turn each slide into an image
- Use GPT-4o to extract data from the images
- Index the results
- Query the data
The team from Pathway built an open-source app that does exactly that.
Here's the link: https://lnkd.in/eGAY_EYu
Check it out and give them a star!
Thanks Pathway for letting me play with the app and for supporting our community!
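To make the four steps from the list above concrete, here is a minimal offline sketch in Python. It is not the Pathway app's code: `describe_slide` is a hypothetical placeholder for the step where a slide image is sent to GPT-4o, the search is a simple keyword overlap rather than whatever retrieval the app actually uses, and the deck names and slide texts are made up for illustration.

```python
from dataclasses import dataclass
import re


@dataclass
class SlideRecord:
    deck: str
    page: int
    description: str


def describe_slide(deck: str, page: int, slide_text: str) -> str:
    """Hypothetical stand-in for the vision step.

    In the real app a rendered slide image goes to GPT-4o. Here the slide
    text is passed through unchanged so the sketch runs without an API key.
    """
    return slide_text


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(decks: dict[str, list[str]]) -> list[SlideRecord]:
    """Describe every slide of every deck and collect the results."""
    records = []
    for deck, slides in decks.items():
        for page, slide_text in enumerate(slides, start=1):
            description = describe_slide(deck, page, slide_text)
            records.append(SlideRecord(deck, page, description))
    return records


def query(index: list[SlideRecord], question: str, top_k: int = 3) -> list[SlideRecord]:
    """Rank slides by how many words they share with the question."""
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(r.description)), r) for r in index]
    scored = [(score, r) for score, r in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored[:top_k]]


# Made-up decks, only for illustration.
decks = {
    "sales.pdf": [
        "Good qualities of an account manager: empathy, follow-up",
        "Quarterly revenue",
    ],
    "course.pdf": [
        "Why learning in public helps you grow",
        "Where to download the NYC TLC dataset",
    ],
}

index = build_index(decks)
hits = query(index, "What are the good qualities of an account manager?")
print(hits[0].deck, hits[0].page)
```

Swapping the placeholder for a real vision-model call and the keyword overlap for a proper search engine keeps the same shape: describe each slide once, store the description, and match questions against the stored descriptions.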
We all have this problem. You have a ton of presentations and you want to find a particular slide in them. Now we can index all these slides: parse them with an LLM, use GPT-4o to describe the slides, save the results, and then use a search engine to find what we want.

The team from Pathway did exactly that. They built an open-source app, and this is the repo. I will share the link to this repo in the post.

Here they have a demo where they indexed all the slides from Google Drive. The app can also connect to SharePoint or a local file system, and then the slides are available for querying. For example, I can ask: what are the good qualities of an account manager? And it quickly retrieves some results. One of the presentations there talks about account managers, so we see the results.

What's even more fun is indexing our own slides and running this app locally. So I took this repo and cloned it. Then I took the .env example, saved it as a copy, and replaced the OpenAI key and the Pathway license key with my own. I put some documents, some presentations, in the data folder and then ran docker compose up.

Once you start it, you'll see that it starts decomposing your slide decks into images and then sending each image to OpenAI. You will see something like this in the logs: it turns the slide into an image, gives it more structure, and describes what exactly is happening on that slide.

Once it finishes indexing at least one document, you can start looking for the things you want. I have it here: it already finished indexing one of the slide decks, and the indexing of the other is still in progress. Now I can look for, let's say, my name, and what I get is a slide with me as an instructor. Or, for example, why should I do learning in public? Here what we see is really one of my favorite slides from all the presentations about the courses we do, where Michael shares his story: why he was doing learning in public, what kind of results he got, and also an example of a learning in public post. Or, for example, we can ask how do I get the NYC TLC dataset, and we get the relevant slides with instructions on where this dataset is.

This is awesome, right? And you can add more slides by simply putting more slide decks, PDF files, into the data folder. Once you do that, it sees the new files and starts indexing them on the fly. So try it out, and don't forget to give them a star.
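The "sees the new files and starts indexing them on the fly" behavior can be pictured with a small polling sketch in Python. This is only an illustration under my own assumptions, not how the Pathway app detects files: `find_new_files` is a hypothetical helper, and a temporary folder stands in for the data folder.

```python
from pathlib import Path
import tempfile


def find_new_files(folder: Path, seen: set[str], suffix: str = ".pdf") -> list[Path]:
    """Return files in `folder` that were not seen before, and remember them."""
    new_files = []
    for path in sorted(folder.iterdir()):
        is_match = path.is_file() and path.suffix.lower() == suffix
        if is_match and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files


# A temporary folder stands in for the data folder.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp)
    seen: set[str] = set()

    (data / "deck_a.pdf").write_bytes(b"")
    first = [p.name for p in find_new_files(data, seen)]

    # A second deck is dropped in later and picked up on the next check.
    (data / "deck_b.pdf").write_bytes(b"")
    second = [p.name for p in find_new_files(data, seen)]

    # Nothing changed, so nothing new is reported.
    third = [p.name for p in find_new_files(data, seen)]

print(first, second, third)
```

Each file returned by a check would then be handed to the indexing step, so decks that were already processed are not parsed again.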
This is amazing and I so needed this! I have a script that takes a manual index I keep, where I select which slides I want, and makes a new slide deck from it. The most tedious part? THE INDEXING
P.S. That's also one of my fav slides there Michael Shoemaker, MBA
Alexey Thanks for sharing this! I see that Pathway has been growing since I first presented it to the community as an open-source project last year!
Impressive tool development! Any tips on maximizing the potential of this innovative app? Alexey Grigorev