Reverse Dependencies of pdf2image
The following projects have a declared dependency on pdf2image:
- abstract-images — This module, part of the `abstract_essentials` package, provides a collection of utility functions for working with images and PDFs, including loading and saving images, extracting text from images, capturing screenshots, processing PDFs, and more.
- afipcaeqrdecode — Package to decode and extract invoice metadata from an AFIP CAE QR code link
- agl-ocr-reader — OCR API: This OCR API is an application for extracting text from images and PDF files. It is built using Flask, a Python web framework. It utilizes the pytesseract OCR library, pymupdf and the PIL library for image processing.
- ai-object-detection — AI Object Detection
- aideml — Autonomous AI for Data Science and Machine Learning
- airbyte-cdk — A framework for writing Airbyte Connectors.
- amazon-textract-textractor — A package to use AWS Textract services.
- analysis-engine — Analysis for the UK Department for Transport's major projects portfolio
- analysta-index — Extension of Langchain loaders, llms and retrievers for Analysta
- anthropic-cli — A command-line tool for interacting with the Anthropic API
- appjsonify — An academic paper PDF to JSON conversion toolkit.
- arh — I stand up for this street here. The boys give me everything, and I give everything to the boys. Those who know me are in the know.
- askdoc — Ask a personal doctor for your medical queries
- autogluon.multimodal — Fast and Accurate ML in 3 Lines of Code
- autoscab — apply for many of the same job
- awca — A toolkit for making ancient world citation analysis, text summarization, paraphrasing and OCR for PDF to CSV
- betty — Betty helps you visualize and publish your family history by building interactive genealogy websites out of your Gramps and GEDCOM family trees
- biocwl-dash — Viewer for Mount Sinai IIDSGT Precision Oncology reports.
- bisheng-unstructured — ETLs for LLMs
- bluedot-rest-framework — no summary
- bpm-ai-core — Core AI abstractions and helpers.
- butler-sdk — Butler Python SDK
- bwscan — bwscan
- camai-utils — Python utils for the Camai CHC COVID Datasystem.
- cli-pdf-viewer — PDF Viewer
- cliriculum — A Python CLI tool to rapidly create an HTML or PDF resume
- cloudinteractive-ai-insights — Collection of AI tools designed to assist with your assignments and projects.
- comod — Compartmental modelling Python package
- contexto — Library for text processing and analysis with Python
- cornellGrading — Routines for interacting with Cornell installations of Canvas and Qualtrics
- DataXtractor — DataXtractor is a versatile Python library designed to simplify the extraction of valuable data from a variety of sources, including images and PDF documents. Whether you need to extract text, tables, or structured content, DataXtractor provides powerful and intuitive tools to streamline the process.
- decimer-segmentation — DECIMER Segmentation - Extraction of chemical structure depictions from scientific literature
- demogpt — Autonomous AI Agent for Gen-AI App Generation
- dfelf — Data File Elf
- disclosure-extractor — A data extraction tool from judge financial disclosures.
- django-doma — Simple Document Management for Django
- doc-curation — A package for curating doc file collections, with ability to sync with youtube and archive.org doc items.
- doc-loader — Given a werkzeug.FileStorage, fastapi.UploadFile, or str file path as input, converts image files (.pdf, .jpg, .png, .tiff) into a list of PIL or numpy objects
- doc-ocr — Text extractor from document
- doc-ocr-yakul — Text extractor from document
- docai-py — Butler Doc AI
- docbarcodes — Docbarcodes extracts 1D and 2D barcodes from scanned PDF documents or images.
- docile-benchmark — Tools to work with the DocILE dataset and benchmark
- docint — Extracting information from DOCuments INTelligently.
- docketanalyzer — no summary
- docqa — DocQA: An easy way to extract information from documents
- docquery — DocQuery: An easy way to extract information from documents
- docquery-test — DocQuery: An easy way to extract information from documents
- documentdataextraction — Proteus data extractor File
- doms_databasen — Scraper and PDF text processor for domsdatabasen.dk
- dp-PDF-Crawler — A custom Flask package with PDF processing tools
- easylatex2image — Another LaTeX-to-pictures converter
- ebs-iot-linuxnode — no summary
- eDOCr — OCR for Engineering Mechanical Drawings
- efficient-ocr — Efficient OCR
- EmberFactory — Software to (re)produce burning ember diagrams of the style used in IPCC reports.
- embermaker — Software library to (re)produce burning ember and related diagrams of the style used in IPCC reports
- emo-market-base — Base package for scraping products from markets
- enb — Experiment NoteBook (enb): efficient and reproducible science.
- Extractable — Extract tables from PDFs
- faker-file — Generate files with fake data.
- farm-haystack — LLM framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data.
- fast-form — Automatically process scanned forms
- file-editor — A Python package to convert an image/PDF file to PDF/image file format
- filemac — Open source Python CLI toolkit for conversion, manipulation, and analysis
- fillpdf — A library to fill and flatten PDFs
- form-analyzer — Python package to analyze scanned questionnaires and forms with AWS Textract and convert the results to an XLSX.
- form-tools — no summary
- formfyxer — A tool for learning about and pre-processing pdf forms.
- funcchain — 🔖 write prompts as python functions
- Gaon — AI library for Sidedrawer
- GaonOCR — An OCR library for T4 documents
- gdprCrawlerTest19 — GDPR document crawler
- gdprCrawlerTest20 — GDPR document crawler
- generativepoetry — A library primarily for procedurally generating visual poems
- geniusrise-vision — Huggingface bolts for geniusrise
- google-drive-ocr — Perform OCR using Google's Drive API v3
- gpt-pdf-md — A Python package that utilizes GPT-4V and other tools to convert PDFs into Markdown files.
- GPT-PDF-Reader — A Python package that utilizes GPT-4V and other tools to extract and process information from PDF files
- graph2img — graph2img: convert a graph to a png file.
- graphanime — Create execution graphs in GIF/PDF with LaTeX
- gsif-pytools — A package with tools to aid Gator Student Investment Fund Portfolio Attribution Specialists.
- hammer-sh — A package containing useful methods for my master's thesis
- HoChiMinh — Ho Chi Minh is designed to extract textual information from tables presented in PDF, pictures, or other formats.
- hocr-utils — Package containing utility functions for hOCR and tesseract
- hte — Extracting content from specific address books
- hycli — Hypatos cli tool to batch extract documents through the API and to compare the results.
- igs-toolbox — A toolbox to check whether files follow a predefined schema.
- img-processor — Python package for taking an image and doing a thing
- indian-electoral-roll-processor — no summary
- ipypdf — Jupyter widget for applying nlp to pdf documents
- jsonshower — Json Viewer with additional multimedia and highlighting support
- lamatic-airbyte-cdk — A framework for writing Airbyte Connectors.
- langchain-googledrive — This is a temporary project while I wait for my langchain [pull-request](https://github.com/hwchase17/langchain/pull/5135) to be validated.
- langroid — Harness LLMs with Multi-Agent Programming
- latex-trim — Tool that allows you to include sub-selections of PDFs in latex documents.
- latex2image — Convert TeX to images on the command line.
- latex2readme — Convert LaTeX documents to README.md files
- linguappt — PPT generator for language learning
- llama-index-readers-confluence — llama-index readers confluence integration