Projects
- pdf-regenerator — PDF Regenerator is an open-source reading aid that helps regenerate PDF files with auxiliary information.
- pdf-renamer — A python library/command-line tool to automatically rename the pdf files of scientific publications by looking up the publication metadata on the web.
- pdf_reports — Create nice-looking PDF reports from HTML content.
- pdf-scout — automatically create bookmarks in a PDF file
- pdf-scrap — Extracts important keywords from a PDF document.
- pdf-scraper — no summary
- pdf-scrapper — PDF scraping interface
- pdf-segregation — This module will return whether PDF is Digital, Non-Digital or Mixed.
- pdf-shuffle — A PDF page/image randomizer, or flashcard quiz from a PDF.
- pdf-slashannots — Redact PDF annotation metadata to control disclosure of personal data
- pdf-slicer — GUI for easy PDF splitting
- pdf-split-tool — Pdf Split Tool
- pdf-splitter — Split a PDF file by page ranges or extract all PDF pages to multiple PDF files
- pdf-statement-reader — PDF Statement Reader
- pdf-struct — Logical structure analysis of visually structured documents.
- pdf-subheadings — no summary
- pdf-table-extractor — Extract table data from PDFs
- pdf-table2json — PDF Table to JSON Converter
- pdf-template — Library wrapping pdftk to fill and sign PDFs
- pdf_text_overlay — Python library to write text on top of PDF
- Pdf-Thing — no summary
- pdf-to-cb — PDF to Comic Book format
- pdf-to-markdown — Convert PDF files into markdown files
- pdf-to-scan — A small script to make your PDFs look scanned
- pdf-to-txt-nirbhay.py — A small package to extract text from a pdf and save it in a .txt file.
- pdf-to-wordcloud — Generates a word cloud from a given PDF
- pdf-toc — a pdf ToC CLI tool
- pdf.tocgen — Automatically generate table of contents for pdf files
- pdf-tools-0311 — pdf office tool.
- pdf-tools-sdk — no summary
- pdf-tools-sdk-test — no summary
- pdf-translation-api — Utility to translate PDF files using Google Cloud Translate API
- pdf-unlocker — no summary
- pdf-watermark — A python CLI tool to add watermarks to a PDF
- pdf-wrangler — PDFMiner Wrapper for extractions
- pdf2anki — A Python package to create Anki cards from PDFs using OpenAI.
- pdf2bib — A python library/command-line tool to quickly and automatically generate BibTeX data starting from the pdf file of a scientific publication.
- pdf2chem — A curator for chemistry-related pdf files
- pdf2dataset — Easily convert a subdirectory with a large volume of PDF documents into a dataset; supports extracting text and images
- pdf2dcm — A PDF to Dicom Converter
- pdf2df — Extract data from pdf to a dataframe
- pdf2docx — Open source Python library converting pdf to docx.
- pdf2docx-converter — A Python package to convert PDFs to Word documents (DOCX).
- pdf2docx-headless — Open source Python library converting pdf to docx.
- pdf2doi — A python library/command-line tool to extract the DOI or other identifiers of a scientific paper from a pdf file.
- pdf2doi4frappe-library — extract the DOI or ISBN from a .pdf file
- pdf2ebook — PDF to ebook
- pdf2emb-nlp — NLP tool for scraping text from a corpus of PDF files, embedding the sentences in the text and finding semantically similar sentences to a given search query
- pdf2embeddings — NLP tool for scraping text from a corpus of PDF files, embedding the sentences in the text and finding semantically similar sentences to a given search query.
- pdf2excel — no summary
- pdf2htmldir — convert pdf to html
- pdf2image — A wrapper around the pdftoppm and pdftocairo command line tools to convert PDF to a PIL Image list.
- pdf2image-cli — pdf2image port to a CLI version
- pdf2images — Convert PDF file to image files ROBUSTLY.
- pdf2imgs — sql for excel
- pdf2index — no summary
- pdf2john — A modern refactoring of the legacy pdf2john library
- pdf2jpg — Wrapper to convert PDF files into jpg
- pdf2mbox — Extracts email metadata and text from a PDF file
- pdf2mp3 — Converts PDF to MP3 using Google Text-to-Speech
- pdf2packet — Combines and sorts PDFs with Cover Sheets into Result Packets
- pdf2ppt — A tool to convert PDF documents to PPTX format with an adjustable DPI setting.
- pdf2pptx — Utility to convert a PDF slideshow to Powerpoint PPTX.
- pdf2pptx-cli — convert pdf to 1200 dpi image ppt
- pdf2sb — Upload a PDF file to Gyazo as images, then convert it to Scrapbox format
- pdf2table — pdf2table is a powerful Python tool designed to streamline the extraction of tabular data from PDF documents.
- pdf2tables — Extract tables from PDFs using Camelot; if a page is image-based, use OCR to extract them
- pdf2textbox — A PDF-to-text converter based on pdfminer2
- pdf2textlib — A package to extract text from PDF
- pdf2txt — A better pdf to text extraction toolkit
- pdf2txt-pkg-jeff — Converts a PDF to Text
- pdf2up — A small utility to generate fairly high resolution preview images of PDFs suitable for viewing or sharing to social media
- pdf2vectors — A package to interact with vectors DB
- pdf2word — Convert pdf to docx
- pdf2xlsx — Invoice extraction from zip, order detail transformation
- pdf34 — no summary
- pdf417 — PDF417 2D barcode generator for Python
- pdf417as-str — Create pdf417 barcode by special font without using images
- pdf417decoder — A PDF417 barcode decoder
- pdf417decoder-with-opencv-python-headless — A PDF417 barcode decoder
- pdf417gen — PDF417 2D barcode generator for Python
- PDF4Cat — A simple and powerful tool for processing PDF documents using PyMuPDF
- pdf4llm — PyMuPDF Utilities for LLM/RAG
- pdf4md — convert markdown to pdf
- pdf4py — A PDF parser written in Python3 with no external dependencies.
- pdfa-learning — A Python project template.
- PDFAgent — Production Ready PDF Agent
- pdfalyzer — A PDF analysis toolkit. Scan a PDF with relevant YARA rules, visualize its inner tree-like data structure in living color (lots of colors), force decodes of suspicious font binaries, and more.
- PdfandEmail — Converts any format to PDF and attaches the PDF to an email
- pdfannots — Tool to extract and pretty-print PDF annotations for reviewing
- pdfathom — Query PDFs in natural language from the command-line
- PdfAutoNup — Convert PDF files to 'n-up' PDF files, guessing the output layout.
- pdfbook — Rearrange pages in PDFs for printing books
- pdfbookmarker — Add bookmarks to existing PDF files
- pdfbrain — Parsing PDF files with pdfium
- pdfbta — no summary
- pdfbwan — no summary
- pdfCatalog — Build catalogs for PDF documents automatically.
- PdfCC — PDF cropper and compressor: removes unwanted noise from PDFs and compresses them
- pdfchain — A graphical user interface for the PDF Toolkit