Reverse Dependencies of pdfplumber
The following projects have a declared dependency on pdfplumber:
- aclpubcheck — no summary
- agent-llm — An Artificial Intelligence Automation Platform. AI Instruction management from various providers, has an adaptive memory, and a versatile plugin system with many commands including web browsing. Supports many AI providers and models and growing support every day.
- AGIpdf2json — This package helps users parse PDF files into text and JSON files. It can also parse question-answer pairs into a JSONL document in the prompt-completion format supported by OpenAI.
- agl-report-reader — Read and anonymize medical reports
- AI-ML-Formulas-Recognizer-Extraction — A package for AI recognition tasks developed by Minh Nguyen and Liam.
- ankigengpt — no summary
- appjsonify — An academic paper PDF to JSON conversion toolkit.
- aqua-parser — An amazing aquaparser-parser.
- arfindata — Python package for the acquisition and pre-treatment of China's listed companies' annual reports
- arxiv-astro-summarizer — Scrapes arXiv astro-ph papers, summarizes the abstracts, and returns relevant papers according to a user input.
- bank-statement-reader-altara — no summary
- bankruptcy — A bankruptcy document parser.
- bany — A collection of scripts for personal finance
- beancount-cmb-importer — A beancount importer for CMB.
- bisheng-unstructured — ETLs for LLMs
- cannlytics — 🔥 Cannlytics is a suite of tools that you can use to wrangle, standardize, and analyze cannabis data
- cobralib — A utilities module that contains classes and functions that simplify interfaces with files and databases.
- comwares — This project provides middlewares for a startup company.
- conc_test_report — Generate a concise and brief summary of all concrete test result PDFs, to aid in fast and efficient review
- cv-xtractor — A Python package for extracting information from CVs (resumes).
- data-modori — LMOps Tool for Korean
- deepdoctection — Repository for Document AI
- depdf — PDF table & paragraph extractor
- desktop-env — The package provides a desktop environment for setting and evaluating desktop automation tasks.
- digital-nondigital-pdf-extraction — This module will return whether PDF is Digital, Non-Digital or Mixed.
- disclosure-extractor — A data extraction tool from judge financial disclosures.
- doc-extractor — no summary
- docint — Extracting information from DOCuments INTelligently.
- docqa — DocQA: An easy way to extract information from documents
- docquery — DocQuery: An easy way to extract information from documents
- docquery-test — DocQuery: An easy way to extract information from documents
- documentdataextraction — Proteus data extractor File
- DocumentInsightsGenerator — A package to generate comprehensive insights from documents using NLP techniques.
- dost — DOST is a Python based Utility platform as an Open Source project. We strive to liberate humans from mundane, repetitive tasks, giving them more time to use their intellect and creativity to solve higher-order business challenges and perform knowledge work.
- dp-PDF-Crawler — A custom Flask package with PDF processing tools
- ebank — no summary
- ebanktool — no summary
- efficient-ocr — Efficient OCR
- ezlocalai — ezlocalai is an easy to set up local multimodal artificial intelligence server with OpenAI Style Endpoints.
- fgts-pdf-dados — Extracts data from FGTS PDFs and writes everything to a CSV file ready to use with Inverstorzilla.
- finance-analytics — extract and analyze Bank statements
- friday-agent — A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.
- getpaper — getpaper - papers download made easy!
- gpt-pdf-organizer — no summary
- grag — A simple package for implementing RAG
- invoice-parser — Tools for parsing and extracting information from invoices.
- invoice2data — Python parser to extract data from pdf invoice
- irspdf — A simple information retrieval system for pdf documents
- k-parse-tool — parse and extract data from HTML
- langroid — Harness LLMs with Multi-Agent Programming
- leitor-pdf — no summary
- lemon-rag — no summary
- llm-parse — Parse data from documents optimised for downstream llm tasks.
- llmvm-cli — Command Line LLM with client-side tools support.
- LWD-utils — Renamed version of PaperCrawlerUtil
- makerbean — A small educational purpose package
- mctinctools — Common tools for our organization.
- metabook — rename and organize your pdf book collection
- ml-access-key-extractor — Library for extracting the access key from an invoice (nota fiscal) in PDF
- mmda — MMDA - multimodal document analysis
- MordinezNLP — Powerful python tool for modern NLP processing
- msds-tdm — MSDS Package
- Ocrversion1 — no summary
- Ocrversion2 — no summary
- os-copilot — A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.
- paddle-pipelines — Paddle-Pipelines: An End to End Natural Language Processing Development Kit Based on PaddleNLP
- paper2remarkable — Easily download an academic paper and send it to the reMarkable
- PaperCrawlerUtil — a collection of utils
- papermage — Papermage. Casting magic over scientific PDFs.
- pdf-oralia — no summary
- pdf-scout — automatically create bookmarks in a PDF file
- pdf-scrapper — Pdf Scrapping interface
- pdf-to-wordcloud — Generates a word cloud from a given PDF
- pdf2vectors — A package to interact with vectors DB
- pdflayoutxt — This library helps in extracting text from searchable pdf files by keeping the layout intact.
- phl-courts-scraper — A Python utility to scrape docket sheets and court summaries for Philadelphia courts.
- plagdef — A tool which makes life hard for students who try to make theirs simple.
- PowerScrape — A comprehensive and versatile Python module for web scraping.
- project-to-installer — no summary
- ptol — A Pipeline for Obtaining Relevant Literature Based on Given Keywords
- py-data-juicer — A One-Stop Data Processing System for Large Language Models.
- py-data-modori — LMOps Tool for Korean
- py-financas — Py Finanças is a Python package that simplifies obtaining and using data from the Brazilian financial system.
- Pyostie — A python package to OCR data and extract text with insights too.
- pyrhubarb — A Python framework for multi-modal document understanding with generative AI
- python-core — there is no description available
- querent — The Asynchronous Data Dynamo and Graph Neural Network Catalyst
- refuel-autolabel — Label, clean and enrich text datasets with LLMs
- RegScale-CLI — Command Line Interface (CLI) for bulk processing/loading data into RegScale
- resume-parser — A resume parser used for extracting information from resumes
- resume-parser-upd — A resume parser used for extracting information from resumes
- sapiensqa — SapiensQA (Question and Answer) is a proprietary Machine Learning algorithm for creating Natural Language Processing models where the answers are previously known.
- secrets-to-paper — A command line tool to help with key-to-paper and paper-to-key.
- shinc-lib-sofs — SOF interpreter
- short-activist-predictor — Short activists prediction
- smart-cv — Tools to retrieve and check information and generate information based on CVs
- sparv-pipeline — Språkbanken's text analysis tool
- start-ocr — Applying pdfplumber + opencv + pytesseract to extract content and metadata from formal PDF files.
- tarmtextract — A package for extracting text from PDF/Word files saved in AWS S3
- test-data-modori — LMOps Tool for Korean