papermage

Wheel Details

Project: papermage
Version: 0.20.0
Filename: papermage-0.20.0-py3-none-any.whl
Size: 109225 bytes
MD5: 9c9c7bae908bf675fff14da5a1c16aeb
SHA256: 74c91813286032f4c9d3bb6d554293988737e959bd4b9a771b4a24378b898348
Uploaded: 2024-04-05 22:30:13 +0000
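
The size and SHA256 listed above can be used to check a downloaded copy of the wheel. The following is a minimal sketch using only the standard library. The helper names `sha256_and_size` and `matches_listing` are illustrative, not part of papermage or pip, and the sketch assumes the Size field is a byte count.

```python
import hashlib
import os
import tempfile


def sha256_and_size(path, chunk_size=1 << 20):
    """Stream a file and return (hex SHA-256 digest, size in bytes)."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest(), size


def matches_listing(path, expected_sha256, expected_size):
    """Return True when the file agrees with the listed digest and size."""
    actual_sha256, actual_size = sha256_and_size(path)
    return actual_sha256 == expected_sha256.lower() and actual_size == expected_size


# Self-check on a throwaway file, since the wheel itself is not bundled here.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_path = tmp.name
demo_ok = matches_listing(
    demo_path,
    "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
    3,
)
os.remove(demo_path)
```

For the wheel described here, the call would be `matches_listing("papermage-0.20.0-py3-none-any.whl", "74c91813286032f4c9d3bb6d554293988737e959bd4b9a771b4a24378b898348", 109225)`, with the first argument pointing at wherever the file was saved.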

dist-info

METADATA

Metadata-Version: 2.1
Name: papermage
Version: 0.20.0
Summary: Papermage. Casting magic over scientific PDFs.
Author-Email: Kyle Lo <kylel[at]allenai.org>, Luca Soldaini <luca[at]soldaini.net>, Shannon Zejiang Shen <zejiangshen[at]gmail.com>, Ben Newman <blnewman[at]stanford.edu>, Russell Authur <russell.authur[at]gmail.com>, Stefan Candra <stefanc[at]allenai.org>, Yoganand Chandrasekhar <yogic[at]allenai.org>, Regan Huff <reganh[at]allenai.org>, Amanpreet Singh <amans[at]allenai.org>, Chris Wilhelm <chrisw[at]allenai.org>, Angele Zamarron <angelez[at]allenai.org>
Maintainer-Email: Kyle Lo <kylel[at]allenai.org>, Luca Soldaini <luca[at]soldaini.net>
Project-Url: Homepage, https://www.github.com/allenai/papermage
Project-Url: Repository, https://www.github.com/allenai/papermage
Project-Url: Bug Tracker, https://www.github.com/allenai/papermage/issues
License: Apache-2.0
Classifier: Development Status :: 3 - Alpha
Classifier: Typing :: Typed
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=3.8
Requires-Dist: tqdm
Requires-Dist: pdf2image
Requires-Dist: pdfplumber (==0.7.4)
Requires-Dist: requests
Requires-Dist: numpy (>=1.23.2)
Requires-Dist: scipy (>=1.9.0)
Requires-Dist: ncls (==0.0.68)
Requires-Dist: necessary (>=0.3.2)
Requires-Dist: grobid-client-python (==0.0.5)
Requires-Dist: charset-normalizer
Requires-Dist: decontext (==0.1.6); extra == "decontext"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-xdist; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: mypy (>=0.971); extra == "dev"
Requires-Dist: thefuzz[speedup]; extra == "predictors"
Requires-Dist: scikit-learn (>=1.3.0); extra == "predictors"
Requires-Dist: xgboost (>=1.6.2); extra == "predictors"
Requires-Dist: spacy (>=3.4.2); extra == "predictors"
Requires-Dist: pysbd (==0.3.4); extra == "predictors"
Requires-Dist: tokenizers (==0.13.3); extra == "predictors"
Requires-Dist: torch (>=2.0.1); extra == "predictors"
Requires-Dist: torchvision (>=0.15.2); extra == "predictors"
Requires-Dist: layoutparser (==0.3.4); extra == "predictors"
Requires-Dist: transformers (==4.31.0); extra == "predictors"
Requires-Dist: smashed (==0.1.10); extra == "predictors"
Requires-Dist: pytorch-lightning (>=2.0.5); extra == "predictors"
Requires-Dist: springs (==1.13.0); extra == "predictors"
Requires-Dist: wandb (>=0.15.7); extra == "predictors"
Requires-Dist: seqeval (==1.2.2); extra == "predictors"
Requires-Dist: effdet (==0.3.0); extra == "predictors"
Requires-Dist: vila (==0.5.0); extra == "predictors"
Requires-Dist: optimum[onnxruntime] (==1.10.0); extra == "production"
Requires-Dist: layoutparser (==0.3.4); extra == "visualizers"
Provides-Extra: decontext
Provides-Extra: dev
Provides-Extra: predictors
Provides-Extra: production
Provides-Extra: visualizers
Description-Content-Type: text/markdown
License-File: LICENSE
[Description omitted; length: 6438 characters]
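
METADATA uses RFC 822 style headers, so the standard library `email` module can read it. The sketch below groups `Requires-Dist` entries by the extra named in their marker. The sample text is a shortened copy of the fields above, `group_requirements` is a hypothetical helper, and the regular expression only handles the simple `extra == "name"` markers seen in this file, not general environment markers.

```python
import re
from collections import defaultdict
from email import message_from_string

# Shortened copy of the METADATA fields listed above.
METADATA_SAMPLE = """\
Metadata-Version: 2.1
Name: papermage
Version: 0.20.0
Requires-Python: >=3.8
Requires-Dist: tqdm
Requires-Dist: pdfplumber (==0.7.4)
Requires-Dist: decontext (==0.1.6); extra == "decontext"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: mypy (>=0.971); extra == "dev"
Provides-Extra: decontext
Provides-Extra: dev
"""

# Matches only the plain `extra == "name"` form used in this file.
EXTRA_RE = re.compile(r'extra\s*==\s*"([^"]+)"')


def group_requirements(metadata_text):
    """Map extra name ("" for the base install) to its requirement strings."""
    message = message_from_string(metadata_text)
    groups = defaultdict(list)
    for line in message.get_all("Requires-Dist", []):
        requirement, _, marker = line.partition(";")
        match = EXTRA_RE.search(marker)
        groups[match.group(1) if match else ""].append(requirement.strip())
    return dict(groups)


groups = group_requirements(METADATA_SAMPLE)
```

With the full METADATA, the `"predictors"` key would hold the machine-learning dependencies, which is why a plain `pip install papermage` pulls in only the ten unconditional requirements.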

WHEEL

Wheel-Version: 1.0
Generator: bdist_wheel (0.43.0)
Root-Is-Purelib: true
Tag: py3-none-any
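
The tag `py3-none-any` reads as Python tag `py3`, ABI tag `none`, platform tag `any`, which is consistent with `Root-Is-Purelib: true`. The same three fields end the wheel filename. The sketch below splits a filename by hand under the wheel naming convention (`name-version[-build]-python-abi-platform.whl`). `WheelName` and `parse_wheel_filename` are illustrative names, and the parser does not validate the individual components.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components: {filename!r}")
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


parsed = parse_wheel_filename("papermage-0.20.0-py3-none-any.whl")
```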

RECORD

Path Digest Size
requirements.txt sha256=ed-qNOU1tXUa5Pb476aKqHPJ_6omOXcrL7xVITFkj-Q 33
examples/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
examples/how_predictors_help_each_other.py sha256=MImXpiKT74YrWriSyVGoJM3lu1wcC1rJrZao0uV4IkM 1424
examples/improving_sections.py sha256=_eSZHKa6v2pxaFywyRlbZZKXb0WtyIWhcLXiGPH50RY 578
examples/visualize_a_doc.py sha256=w8Kj6gP84o4rZhYvZHbuqDM-kQrmbIZCfiWfgevqHx4 2480
papermage/__init__.py sha256=6WhnBQfUTTpHMOPmVkWXcw2h984kqvOIG-UaLZn-AV0 93
papermage/py.typed sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
papermage/magelib/__init__.py sha256=Dm6TPnbK9VH3qJtVDjokQaj7o_sJDUniF5jsk4Dk7nw 1802
papermage/magelib/box.py sha256=vljywu72UrncWlTGMl24KH4rihvcG4JweSjfIdmT0rk 5054
papermage/magelib/document.py sha256=liRbP5kttKMc889z3K39uAPio3bJ4cuRHLio88uAAWU 6858
papermage/magelib/entity.py sha256=iX3F2tAWIiNmbFFtXrapGf7uhh5Nx42jm5boVNKrYTg 7641
papermage/magelib/image.py sha256=IdrCXPlXBkRNm918cArceyoFv5Lebtu_oqRMqjRYv38 3963
papermage/magelib/indexer.py sha256=L1WdxKABFqwYCAvWEISkqck2a2xe3HqTmpo7wXLM_FY 6250
papermage/magelib/layer.py sha256=zhli0AJ0acBARAvYVDoHIF4DcQy4UBFvtwGXZSy4XMw 4263
papermage/magelib/metadata.py sha256=thAaYGWldQ2lebZ-Ycr_2uhixj_VttLWaCb6FAQP4eE 15532
papermage/magelib/names.py sha256=2a63ttqGVbh0nOpryasqwiTRAGRiLz3LwlkNNCKGHW0 877
papermage/magelib/span.py sha256=V8o2pZiJBqObXnbEKcxyaDFSVE-oGujm8YVQzxcoHRU 2051
papermage/parsers/__init__.py sha256=7cjm4mc02q0egvXS8BQCMNks9slpTudirftnDpuOMxk 102
papermage/parsers/grobid_parser.py sha256=ajM0bR_CExv9BEkdSiY6UBl1Ggjp_d_Ki677Ece1VlQ 14955
papermage/parsers/parser.py sha256=UQIL-4CPHIFdaskzgqmJBIGF0O0VItDyhtfWXSAip5k 595
papermage/parsers/pdfplumber_parser.py sha256=c7m4_hxXdf3bYExYyBDzjKm0ro4tTTPCprS75WLobE0 18459
papermage/predictors/__init__.py sha256=Dd3apT8Sf513RP3qVNxA2akhqAcPQdsbC73WJW0yhis 987
papermage/predictors/block_predictors.py sha256=fg3Q1rlSxLgPmocaZTQrlbcoNVsgpwcekuspHvg737s 448
papermage/predictors/formula_predictors.py sha256=88ABd7jE9tr_8a27qo9a9qPwg_JYmyeZyf1_C1xs2rw 436
papermage/predictors/sentence_predictors.py sha256=OV2xUl6mxmY1IjjOPZvN5oa_S8rKI-Slc4yGxD2peUA 3463
papermage/predictors/span_qa_predictors.py sha256=eY6-uAWg5QA8dr0U4QOdL-Dxx8Ryc_MohpF9ceOnFrA 3831
papermage/predictors/token_predictors.py sha256=wFlBGsUZVe4VDAl4k716e6Pl9ExAvqJ3X0hWfu8NfIw 1742
papermage/predictors/vila_predictors.py sha256=nTZMLpa_5KWvbT-156NDfD0c2ZhcR4CKEVo71sjecYE 9228
papermage/predictors/word_predictors.py sha256=HNWpFSVIZ5VO9YSyKJlsR6vwBvXpe6wtmMnEfBNW3Dg 22885
papermage/predictors/base_predictors/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
papermage/predictors/base_predictors/api_predictors.py sha256=gSG5gt5Kmi6iq-AIMEDM6E11N4sxuc6obj8SZzWCXek 17
papermage/predictors/base_predictors/base_predictor.py sha256=rjb_W4HjRZSwlStYxRn-5MzI4fEM1wLIp7l3-RCsJCI 1351
papermage/predictors/base_predictors/hf_predictors.py sha256=t0WvGyrtYd-TwNt3_0HE1OgYuX8B0wrVSsrrC2UZq30 18228
papermage/predictors/base_predictors/lp_predictors.py sha256=yKcZvHtSMikAab1quAfLyT6A8pBGMWS2dM34JARBGmM 3528
papermage/predictors/base_predictors/sklearn_predictors.py sha256=gSG5gt5Kmi6iq-AIMEDM6E11N4sxuc6obj8SZzWCXek 17
papermage/predictors/base_predictors/spacy_predictors.py sha256=gSG5gt5Kmi6iq-AIMEDM6E11N4sxuc6obj8SZzWCXek 17
papermage/rasterizers/__init__.py sha256=tDo4cO1mOG9UaC1IvO7LAhG2C6ts5QWqTPn8e-3A2HE 105
papermage/rasterizers/rasterizer.py sha256=CVc9ji1FyGPdxaVznaWecTY5fp7AYq14TX0oxNlaDSc 2042
papermage/recipes/__init__.py sha256=XSxaz1QCIAX8m3dHyM1qdvuQ4CZeg6K0YQojB8A0QZ8 305
papermage/recipes/core_recipe.py sha256=N0OsyC_X5yU-l-aiVMLWDXp_-peu-7i7OWdzQIytKDQ 5697
papermage/recipes/minimal_pdf_recipe.py sha256=JVe5b0dGgr7yaEQkWrPGkR72R-9Myuhd1yRbk8O4BPs 2690
papermage/recipes/recipe.py sha256=oNihzkDJZFIPEHG6ETKtwSaEHPG-dEF0mkNw0FehjQk 1299
papermage/recipes/text_recipe.py sha256=zwkZVH3_crQsnFZInj1DM4ocmtPuqlsfdSISCHoEyGQ 1178
papermage/trainers/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
papermage/trainers/bio_tagger_predictor_trainer.py sha256=z7_zaRzZgMZ6NLenWSwqnaR-cxMznE-Ot2RWecc8tSw 21793
papermage/utils/__init__.py sha256=W2AwKCJTaoTBdvAiHcnoLRBjGhV7qQ42_7vVaOdNc2Y 171
papermage/utils/annotate.py sha256=uMxaalwIlP6lb-vGHiU9m-mwbF-E_hNqF-IDxDpXX6Y 1883
papermage/utils/merge.py sha256=hHCqF8JYR1JOgXbAL9sDC5Ctnnhwl5U02ol4KAtvb1o 3720
papermage/utils/text.py sha256=dFUTn-8YhTFc-1tZmkD72Jy5-j9z5eN1Z4Q4-8LRU5c 307
papermage/utils/version.py sha256=zsoHJAAFX_mbtSwvvpMXNRYTQnmBZHRFylNcrczbbGQ 1025
papermage/visualizers/__init__.py sha256=dhHcX0s8kyLK1Teg6CRSbq2vDomCF_sYNrGBnrohtUY 83
papermage/visualizers/visualizer.py sha256=zMzXxDYIiXj7UYjPNX2txF81R_77PGDYP2Lufc3iQqo 1812
scripts/evaluation.py sha256=H6TflgUwJv_XKPzuuDwbPjX3SBkK7X2kk5FM_TOzJu8 5027
scripts/get_thumbnails_for_papers.py sha256=etpUcqAIr9dhxTvK8DIFBNWUf1nX4Kk-bZQ7tSkHeSY 802
tests/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
tests/test_magelib/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
tests/test_magelib/test_box.py sha256=E4GzEbhpF83op_PFOp_ObhDgO1nDs2Irc-M4yhhLxB0 4929
tests/test_magelib/test_document.py sha256=EV9nyjOBEGwQ3mad6TsDEODkRiY3Qe_Xdf4AW3eJPGs 11905
tests/test_magelib/test_entity.py sha256=yPzJbluxcHLYjKTfpYDHyM9in2qq-9yUOhmal-6Xojg 4073
tests/test_magelib/test_image.py sha256=Kaj0l2yQeo_mKtOygfFodyUkpXGbifN3B9DTU5zc-rI 2535
tests/test_magelib/test_indexer.py sha256=Oa9eIT8e66xkYCABr3lAldSxs_X6OH9_A5HXYnYaLts 4378
tests/test_magelib/test_layer.py sha256=kXzN6KQLaacsOmXxhLVpGFZ8_4UQs744Z0vJg6Kh11o 1152
tests/test_magelib/test_metadata.py sha256=lglLhnzw1anhQ5kkFo-OBIQh826KtLmTZBDgvopYdrE 1853
tests/test_magelib/test_span.py sha256=4Ozv0d6da0V8TyWr7zI7SfF0_idrsJynzoF3oOTh46o 1938
tests/test_parsers/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
tests/test_parsers/test_pdf_plumber_parser.py sha256=r97o2KqmWVd9KYGm8Rxrw_uE3d3yy8naRrhixeG1Z1s 8971
tests/test_predictors/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
tests/test_predictors/test_bio_tagger_predictor.py sha256=Yg0fIs3uMNy0jSdNPdlQhBg2LlAxa1_eRURLaz1WvqA 5151
tests/test_predictors/test_block_predictors.py sha256=mPvPVvl6QTCZflMmmY7hHknHEnoFVQe_HWS75ls-nQg 421
tests/test_predictors/test_formula_predictors.py sha256=2MK408XVmTNehNdnKOMUIEI_Z6H_0WoAc1tF7FRLBuU 411
tests/test_predictors/test_sentence_predictor.py sha256=PiGdgCZhtMuukNZNcyuAqOiwrfEjtxOy5ycUvrs3PEo 3293
tests/test_predictors/test_span_qa_predictors.py sha256=ODwyemzShcEpi7xHoxaRIMUbFHZxgQZ48k68JdZALco 4010
tests/test_predictors/test_vila_predictor.py sha256=zqGwXAHIMBzZomUivtiBaJvPKPr_QZ8PUo-JvcO_oOo 2636
tests/test_predictors/test_whitespace_predictor.py sha256=2mN2VPNtozEhtlVbmLPwfjjmCILjIAhaSiYStSZgxlg 2220
tests/test_predictors/test_word_predictor.py sha256=RguPDkOwaUxHET1OjEmhCSFnBmWmyUjMSAQqc5FGMMw 4957
tests/test_predictors/test_base_predictors/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
tests/test_predictors/test_base_predictors/test_lp_predictors.py sha256=3IT1RqNBCYRAms2bbw0tkrBGTJWigwFGEuJR9annF-s 1771
tests/test_rasterizers/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
tests/test_rasterizers/test_pdf2image_rasterizer.py sha256=cdQd7flkobOQMZb5J_-RPvqebcujkPY0MHoOX9Wz4IQ 1048
tests/test_recipes/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
tests/test_recipes/test_core_recipe.py sha256=wvaZVS5Wwm7r_ZXhP9IX0SQfjIjmkHqeT8wZCzkrO80 1096
tests/test_recipes/test_minimal_pdf_recipe.py sha256=6yzlZcyJI1MMW7BojYBnSkRKaQw4hM2V1Rz7ig0BghA 1899
tests/test_recipes/test_text_recipe.py sha256=_oqSSPaZfHK4NqtHNc4SE1OE4v1VszspKMBcCqaFuEk 746
tests/test_trainers/test_entity_classification_predictor_trainer.py sha256=B17jdr8RP1EhkIU0wFtvGPHnqNLBusg7KtqMOsm8bn4 7683
tests/test_utils/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
tests/test_utils/test_merge.py sha256=m0JHxcVvKZIRSjmD9XJjZHbmsybzbBh4xA2Qj03XJXM 3492
tests/test_visualizers/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
tests/test_visualizers/test_visualizer.py sha256=oDZv8RntSjPChc45MLGO6SUcuOSVWJFC2yJgBS9QV7k 890
papermage-0.20.0.dist-info/LICENSE sha256=UbiP0-niTtzc4d0UWk_KLc0SkRQH8AsG36b_YA-XM8E 11358
papermage-0.20.0.dist-info/METADATA sha256=qj2fXaUO2OZXpzLg1r2hanF_tXu3wxh4-zIQyx2wq3w 9330
papermage-0.20.0.dist-info/WHEEL sha256=GJ7t_kWBFywbagK5eo9IoUwLW6oyOeTKmQ-9iHFVNxQ 92
papermage-0.20.0.dist-info/top_level.txt sha256=iVlNFjPPIQaRAa8zCYLLLogYL4HJ2KQ1PSWNu3LtRs0 33
papermage-0.20.0.dist-info/RECORD
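
Each digest above is the SHA-256 of the file, encoded as URL-safe base64 with the trailing `=` padding removed. The value repeated for every size-0 file is the digest of empty content. The sketch below reproduces that encoding and checks an archive against its RECORD. It assumes the on-disk RECORD is comma-separated CSV (the listing above shows the columns separated by spaces), and `record_digest` and `verify_record` are hypothetical helper names. The demo builds a tiny in-memory archive rather than reading the real wheel.

```python
import base64
import csv
import hashlib
import io
import zipfile


def record_digest(data: bytes) -> str:
    """Return a RECORD-style digest: URL-safe base64 SHA-256 without padding."""
    raw = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def verify_record(wheel: zipfile.ZipFile, record_path: str) -> list:
    """Return the paths whose digest or size disagrees with RECORD."""
    mismatches = []
    text = wheel.read(record_path).decode("utf-8")
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 3:
            continue
        path, digest, size = row
        if not digest:
            # RECORD lists itself without a digest or size.
            continue
        data = wheel.read(path)
        if record_digest(data) != digest or len(data) != int(size):
            mismatches.append(path)
    return mismatches


# Demo on a made-up, wheel-like archive held in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("pkg/__init__.py", b"")
    zf.writestr(
        "pkg-1.0.dist-info/RECORD",
        f"pkg/__init__.py,{record_digest(b'')},0\npkg-1.0.dist-info/RECORD,,\n",
    )
with zipfile.ZipFile(buffer) as zf:
    bad = verify_record(zf, "pkg-1.0.dist-info/RECORD")
```

For the real wheel, the equivalent call would open `papermage-0.20.0-py3-none-any.whl` with `zipfile.ZipFile` and pass `papermage-0.20.0.dist-info/RECORD` as the record path.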

top_level.txt

examples
papermage
scripts
tests
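
The four names above match the top-level directories that appear in RECORD, so installing this wheel places `examples`, `scripts`, and `tests` beside `papermage` in site-packages. The sketch below derives such a list from RECORD paths. It is a simplification that only considers directories and skips the `.dist-info` folder, `top_level_names` is a hypothetical helper, and the sample paths are a small subset of the RECORD entries.

```python
def top_level_names(record_paths):
    """Return sorted top-level directory names, ignoring *.dist-info."""
    names = set()
    for path in record_paths:
        head, sep, _ = path.partition("/")
        if not sep:
            # A file at the archive root, such as requirements.txt.
            continue
        if head.endswith(".dist-info"):
            continue
        names.add(head)
    return sorted(names)


# A small subset of the paths listed in RECORD.
sample_paths = [
    "requirements.txt",
    "examples/__init__.py",
    "papermage/__init__.py",
    "papermage/magelib/box.py",
    "scripts/evaluation.py",
    "tests/__init__.py",
    "papermage-0.20.0.dist-info/RECORD",
]
names = top_level_names(sample_paths)
```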