newspaper4k

0.9.3.1 newspaper4k-0.9.3.1-py3-none-any.whl

Wheel Details

Project: newspaper4k
Version: 0.9.3.1
Filename: newspaper4k-0.9.3.1-py3-none-any.whl
Download: [link]
Size: 296617 bytes
MD5: d92f4ee4c9c12ea8bf50c033a4a9b09a
SHA256: 42a03b7915d92941a9fe4cc8dab47240219560e0cb8ecb5a291dc5a913eb8aa4
Uploaded: 2024-03-18 21:56:43 +0000
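
The size and digests above can be used to check a downloaded copy of the wheel. The following is a minimal sketch using only the standard library; the function names file_digests and matches_listing are illustrative and not part of newspaper4k or any packaging tool, and the path passed in is whatever local file you downloaded.

```python
import hashlib

# Values copied from the Wheel Details listing above.
EXPECTED_SIZE = 296617
EXPECTED_MD5 = "d92f4ee4c9c12ea8bf50c033a4a9b09a"
EXPECTED_SHA256 = "42a03b7915d92941a9fe4cc8dab47240219560e0cb8ecb5a291dc5a913eb8aa4"


def file_digests(path, chunk_size=65536):
    """Return (size_in_bytes, md5_hex, sha256_hex) for the file at path."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
            size += len(chunk)
    return size, md5.hexdigest(), sha256.hexdigest()


def matches_listing(path):
    """True if the file has the size and both digests listed above."""
    return file_digests(path) == (EXPECTED_SIZE, EXPECTED_MD5, EXPECTED_SHA256)
```

Calling matches_listing("newspaper4k-0.9.3.1-py3-none-any.whl") on a local download would return True only if all three values agree. SHA256 is the stronger of the two checks; MD5 is included only because the listing provides it.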

dist-info

METADATA

Metadata-Version: 2.1
Name: newspaper4k
Version: 0.9.3.1
Summary: Simplified python article discovery & extraction.
Author: Andrei Paraschiv
Author-Email: andrei[at]thephpfactory.com
Home-Page: https://github.com/AndyTheFactory/newspaper4k
Project-Url: Documentation, https://newspaper4k.readthedocs.io/en/latest/
Project-Url: Repository, https://github.com/AndyTheFactory/newspaper4k
License: MIT
Keywords: nlp,scraping,newspaper,article,curation,extraction
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing
Classifier: Topic :: Text Processing :: Markup :: HTML
Requires-Python: >=3.8,<4.0
Requires-Dist: Pillow (>=4.0.0)
Requires-Dist: PyYAML (>=5.1)
Requires-Dist: beautifulsoup4 (>=4.9.3)
Requires-Dist: cloudscraper (>=1.2.0); extra == "cloudflare" or extra == "all"
Requires-Dist: feedparser (>=6.0.0)
Requires-Dist: gnews (>=0.3.6); extra == "gnews" or extra == "all"
Requires-Dist: indic-nlp-library (>=0.90); extra == "bn" or extra == "hi" or extra == "np" or extra == "ta" or extra == "all"
Requires-Dist: jieba (>=0.42.1); extra == "zh" or extra == "all"
Requires-Dist: lxml (>=4.2.0)
Requires-Dist: nltk (>=3.6.6)
Requires-Dist: numpy (<2.0,>=1.24); python_version >= "3.8" and python_version < "3.11"
Requires-Dist: numpy (>=1.25); python_version >= "3.11"
Requires-Dist: pandas (>=1.4); python_version >= "3.8" and python_version < "3.11"
Requires-Dist: pandas (>=2.1.0); python_version >= "3.11"
Requires-Dist: pythainlp (>=2.3.2); extra == "th" or extra == "all"
Requires-Dist: python-dateutil (>=2.6.1)
Requires-Dist: requests (>=2.26.0)
Requires-Dist: tinysegmenter (>=0.4); extra == "ja" or extra == "all"
Requires-Dist: tldextract (>=2.0.1)
Provides-Extra: all
Provides-Extra: bn
Provides-Extra: cloudflare
Provides-Extra: gnews
Provides-Extra: hi
Provides-Extra: ja
Provides-Extra: np
Provides-Extra: ta
Provides-Extra: th
Provides-Extra: zh
Description-Content-Type: text/markdown
[Description omitted; length: 12193 characters]
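
METADATA uses an email-header style layout, so the standard library email parser can read it. The sketch below separates the unconditional Requires-Dist entries from those gated on an extra. SAMPLE is a short excerpt of the fields above, and split_requirements is a hypothetical helper name; the substring test on "extra ==" is a simplification and not a full environment-marker evaluator.

```python
from email.parser import Parser

# A short excerpt of the METADATA fields listed above.
SAMPLE = """\
Metadata-Version: 2.1
Name: newspaper4k
Version: 0.9.3.1
Requires-Dist: lxml (>=4.2.0)
Requires-Dist: jieba (>=0.42.1); extra == "zh" or extra == "all"
Provides-Extra: all
Provides-Extra: zh
"""


def split_requirements(metadata_text):
    """Return (name, required, optional) parsed from METADATA text.

    'optional' holds requirements whose marker mentions an extra;
    everything else (including python_version markers) is 'required'.
    """
    msg = Parser().parsestr(metadata_text)
    required, optional = [], []
    for entry in msg.get_all("Requires-Dist", []):
        requirement, _, marker = entry.partition(";")
        if "extra ==" in marker:
            optional.append(requirement.strip())
        else:
            required.append(requirement.strip())
    return msg["Name"], required, optional
```

Applied to the full METADATA above, the optional group would contain the packages tied to the extras named in the Provides-Extra lines (for example jieba for zh and pythainlp for th).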

WHEEL

Wheel-Version: 1.0
Generator: poetry-core 1.9.0
Root-Is-Purelib: true
Tag: py3-none-any
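
The Tag value is a compatibility tag made of three dash-separated parts: Python tag, ABI tag, and platform tag. A minimal sketch of reading the WHEEL fields and splitting the tag follows; parse_wheel_fields and split_tag are illustrative names, and the parser assumes the simple "Key: value" layout shown above.

```python
# The WHEEL fields exactly as listed above.
WHEEL_TEXT = """\
Wheel-Version: 1.0
Generator: poetry-core 1.9.0
Root-Is-Purelib: true
Tag: py3-none-any
"""


def parse_wheel_fields(text):
    """Map each key to a list of values (Tag may appear more than once)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            fields.setdefault(key, []).append(value)
    return fields


def split_tag(tag):
    """Split an expanded compatibility tag into its three components."""
    python_tag, abi_tag, platform_tag = tag.split("-")
    return {"python": python_tag, "abi": abi_tag, "platform": platform_tag}
```

For this wheel, py3-none-any means any Python 3 interpreter, no ABI requirement, and any platform, which is consistent with Root-Is-Purelib: true.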

RECORD

Path Digest Size (bytes)
newspaper/__init__.py sha256=dV6BtphiAnK7j_dLR0vuf9Syj6w7UAfA0SOIeIqHpyc 2269
newspaper/__main__.py sha256=OznpJ_snRAfodtoWlBFADgcTgSGg5711Je_7-TVmXSk 303
newspaper/api.py sha256=E4g4XcRTG2-5GJvNom_Gq4NoVm8leLc7KyouhL911Kk 4054
newspaper/article.py sha256=ns-rMl0rgiIUtikEz-d2vyhgEJm6Ha3J8jpOKsNj1FQ 32590
newspaper/cleaners.py sha256=oyrYmmk3dDeldfqOcoJl0ysgv89o0vcnVIXGMQ3aZ4M 11248
newspaper/cli.py sha256=vX6-RWWrIkDY9dkzANb39ns12v2xrzHhYi3k2_Ovkvk 8926
newspaper/configuration.py sha256=a8yqMitAsW1gnPxCBvrfWo5MUoFBxDsPIGo17O35GCA 15531
newspaper/exceptions.py sha256=DhbiYcJ3DBqt-fDqAx1JoGtfGOizXwAwwy4YIlSTPes 339
newspaper/extractors/__init__.py sha256=HGPmmDdG55r8Mqbxo_s_Z4cpKEeXJ7Adrhca2Kwq1dc 315
newspaper/extractors/articlebody_extractor.py sha256=DOtLxveM1g6S2okeJS3C3gLcZFA_cCyiAhvp7kDf_Xo 16624
newspaper/extractors/authors_extractor.py sha256=5xZBNSKtVNgckw5mmN0IiXhODXCCH2Qga0jwsRpujnE 6975
newspaper/extractors/categories_extractor.py sha256=f1GYmd6wR-1jWQcgfU96myywBDJylrnqBnZdJ2ay5yA 7170
newspaper/extractors/content_extractor.py sha256=5XDXxbtJJfdbWLv1ccVaNW0o8wQF013Aydvz2Kz76XA 7463
newspaper/extractors/defines.py sha256=IQ-auV5qJPTJ7oM4zN1GD3rQfqiqIN_XacpmufEe4e4 6367
newspaper/extractors/image_extractor.py sha256=eebY32st2WWWEOTmAo0TfhjZ-MaUMDnGn03gQVcQPX4 8617
newspaper/extractors/metadata_extractor.py sha256=x_iQ0WNUPL-HUbx44xoGyhubCFEJG6Y97S3evJtZdZI 6468
newspaper/extractors/pubdate_extractor.py sha256=CLJcyesyDFmfV66NmbKf8ZgBaauJXJ2UF4jBuHGh-Hk 4704
newspaper/extractors/title_extractor.py sha256=15SrwpkbxUgTR8SsT8ycwfV2J0A1hsO5XWd8Pdlrjg0 5740
newspaper/extractors/videos_extractor.py sha256=q3hsLJk2MRZQ53MMrg1jkFxWbUI6iWoiS14UUnt99SM 5351
newspaper/google_news.py sha256=YiQeKkf6XttHsuCIxYuqj4k_T7OuYkeCEZSrWs8jO44 11987
newspaper/languages/__init__.py sha256=yvgNdNjanSvPGIJUt6WmrP3LuhG7xD2u57_haQ2luBg 9672
newspaper/languages/ar.py sha256=jQK2rUg2qjmQblzlP8r1-5Wb5NcBuf76RbIvoCfso-I 1506
newspaper/languages/bn.py sha256=rM7mDP7HpXdkrrGJXnPWOZSERVXcPCbUlNWSRDCIu0M 1906
newspaper/languages/hi.py sha256=7s6RKJUr1Cb-RT1F5WZEtu7FVZwPmSijp1jQ6DXU3kQ 1900
newspaper/languages/ja.py sha256=XBRyItcneArtFrV0GtFWv79_YfUnF1FrdpTjnKbaDAY 1483
newspaper/languages/ko.py sha256=6qF5z-I1lpvZMWTFX-oa2EhoVBJvOpf46Pdu_5NpmYM 1561
newspaper/languages/my.py sha256=IH6ZdQozzzD_Z4iWu4b-Iq3N93Nr0B4NAzw7iqfgU4Q 1602
newspaper/languages/ne.py sha256=UrTKWJOez2YgwqoGzlimdFZC_L0mVzfbWWR9DRm9fYI 1905
newspaper/languages/ta.py sha256=XyTyAGG0vY6iFc6FJ4wO68B-6hwmQV08Q4z5m4ePZ6U 1900
newspaper/languages/th.py sha256=ydw6D5Vib3mPZYJN0VJPl7iFY2BQK42zkQuRjXL5358 1420
newspaper/languages/zh.py sha256=MP4CFNxAwFnko4_9a1dnSy1G4U5dbLaCCjoUlcr4OxQ 1437
newspaper/mthreading.py sha256=TCm3qcvd5v4m7dHASMcPiCDzwB2e5JcExrJrfvAo2Kc 2088
newspaper/network.py sha256=Q3C_O_ro3o_8Z2PbS6KvZ9zKk67_Ju35UlKwZr1ZCws 11634
newspaper/nlp.py sha256=xN3tyCmEVLXBFVN4GWmFuFX3uCncD10lPnSJuOeriXY 6658
newspaper/outputformatters.py sha256=09eYDefiSrXWZdvXOwj91KgeKNzOmXFELFgK2EF5dII 11064
newspaper/parsers.py sha256=5x51xxNFsossQukuamMScDFT3kpN3LcYSp7zp6occLo 13647
newspaper/resources/misc/google_sources.txt sha256=pNu1maJmuEfRU9NoA5zKA_Qky-vJuKcbSMO6bANb-Fg 349280
newspaper/resources/misc/popular_sources.txt sha256=zsdp64fEGJ891ZkysnrGJk2C3HBgvi-qI3gDA-Sm18k 4407
newspaper/resources/misc/useragents.txt sha256=VxHNiGmGw-RpGL8FzdqaXhSQq_m2JxYsZvub8Qv5wZI 4241
newspaper/resources/text/stopwords-af.txt sha256=60yPPPKNgfcxUeAm7RlT02nQEgGgtTlobVCsH5VKgbA 194
newspaper/resources/text/stopwords-ar.txt sha256=MI4IgpiJ_9wioBXoWCalGyeStFkuMlEwD9wSxnRIbgE 1452
newspaper/resources/text/stopwords-as.txt sha256=DEDlOXFoEkbhBTLEqgv0X3vFfuDJN46f6S9D30MtQ-o 1480
newspaper/resources/text/stopwords-be.txt sha256=Sa08G0yBPKJhz2Om1kQ1BQCMc9XvPWQv5enRH1kC858 936
newspaper/resources/text/stopwords-bg.txt sha256=eiIwYk1TU8YcYYPbMPjUzZSZlgd7gl5o7d0LIthzqHQ 2409
newspaper/resources/text/stopwords-bn.txt sha256=oTcoBWj00-FXBJG9WJ5tzTUYENk4I8pRA-_FQwTGT1U 5444
newspaper/resources/text/stopwords-br.txt sha256=dV9CobI-42ifFBrWFfAWEKTUnaJz6lhqwU1UnENg2Xg 7583
newspaper/resources/text/stopwords-ca.txt sha256=mqhx0DTLe9IUnG66q-kKw9KoJ22WIa6aRBogoDFItJ4 1559
newspaper/resources/text/stopwords-cs.txt sha256=lvGOkvERKQGdEcWFdlW2Fjg1JbF1QgjCjM3FNLNm0I4 650
newspaper/resources/text/stopwords-cy.txt sha256=QfAaR6qS83AGq88542KIuFTqh9xPGEqA7xzpvmnKc0I 2652
newspaper/resources/text/stopwords-da.txt sha256=A1tQ6LutIdwsN09YSqZM33DcWFbX6-RndIA29lSEW7M 484
newspaper/resources/text/stopwords-de.txt sha256=tl02XGxiNYyDhaAnvHyf-vI7UJlS2RuJNwW_Jfg3RCM 5968
newspaper/resources/text/stopwords-el.txt sha256=MrNPZccmGguHmadO9WC762dhS2uGJnhBj8kdQoQ-jvU 13903
newspaper/resources/text/stopwords-en.txt sha256=ZOBwf94UJ5GCzV0nDGa3cWbTBfGoeajlW4dM3l6iXvw 4262
newspaper/resources/text/stopwords-eo.txt sha256=XgOG0UdWWNEOXM35MMAxTy8vyDvPI1H21C8hoaPAwCM 854
newspaper/resources/text/stopwords-es.txt sha256=g1uQlrf5_Sk_3oyzxPkA8p1gWOYB-5LvSmoiE91yHMI 2185
newspaper/resources/text/stopwords-et.txt sha256=TQBb3Q388dQ_ZVRjlPQqnrfONORKIC2cBB7UM3mYGa8 189
newspaper/resources/text/stopwords-eu.txt sha256=9agvEls74mENV7--XoOsgi7A98WhNL4oU6g9hmzaryg 576
newspaper/resources/text/stopwords-fa.txt sha256=lrNxTp0sv0QKXpgnyEV8xyflxabJQZGdzfSnmzICPsM 7885
newspaper/resources/text/stopwords-fi.txt sha256=NH7nDTJ5u-hAIKGSe-NzAbXkb-No5uDSqvY3UND_fvo 464
newspaper/resources/text/stopwords-fr.txt sha256=I_rwIrFt7h2I8nE0yWS4QKB1OoOujpx0XFH40mx5Q04 2002
newspaper/resources/text/stopwords-ga.txt sha256=sMIuxHK5xSUw9mvro24HfrNdRul2LL73w7Hz_as1Slc 552
newspaper/resources/text/stopwords-gl.txt sha256=6CcC8sgPiIfwhy5wbyGLFGDmWyJ9qr0NlPzc_OUXmdQ 799
newspaper/resources/text/stopwords-gu.txt sha256=Y5V0wMoFXp7DaVhmWdz35dUcB_-iQL3xvtixH-rPHvs 2695
newspaper/resources/text/stopwords-ha.txt sha256=OMQjwF9_noQxABEknnXqQ2Dqxa9KsrJHTzF3X0Gml8A 156
newspaper/resources/text/stopwords-he.txt sha256=-GFMB-dXEfz-KUFt_jMUrua4PuMssvXNUTYO9tByUuI 1837
newspaper/resources/text/stopwords-hi.txt sha256=vhZ1hkEqxjtWtxysChLpEI_zElQTzhG5lcvtvC7XaKo 2789
newspaper/resources/text/stopwords-hr.txt sha256=QQpdS8w0mlmuvtj0GBWvs3FBpoqT3sbPqw3NNFrqIFM 871
newspaper/resources/text/stopwords-hu.txt sha256=tzx4glpJIa0XJ44a3ggUbrs4neUdfI6dWEb57dJjg7A 2336
newspaper/resources/text/stopwords-hy.txt sha256=lYA8ECIEJfy2kMn2I0OX0CmjAsOGjjouZHczW5xzfFM 297
newspaper/resources/text/stopwords-id.txt sha256=F-NDRwcG5UeZ5NWSapM5qtI4wtrLUwtK_GQNE64DVWw 10500
newspaper/resources/text/stopwords-is.txt sha256=pn2m5rsPtO_CsLby8xAqdCd0-0OOhJ9DLnKASqhCdMo 4827
newspaper/resources/text/stopwords-it.txt sha256=ykDV7p7nQxu1R3z81FWlpe_BMgF1V6Gjgvw7mF8pYQU 1696
newspaper/resources/text/stopwords-ja.txt sha256=962IY0fXtUlS0UeazgxF2RbKWNm_3b1aBPwmSyDhFFk 1007
newspaper/resources/text/stopwords-ka.txt sha256=MA534Ul4_biCliwNZSVLwGzBbPPBXeTguM0GWSCQT-U 4062
newspaper/resources/text/stopwords-kn.txt sha256=kG1qtGyljpAcZY41o8r14hd_J5NF7Eyw3O731hE59-E 1463
newspaper/resources/text/stopwords-ko.txt sha256=DSqMnLbnxymFA6iHoHF4VgrS49Rw0HjFuur35fZWcQ0 460
newspaper/resources/text/stopwords-ku.txt sha256=v11PvAAwKtPfhcncbD82tD3QwzanUsTTnlx3YKoajuw 580
newspaper/resources/text/stopwords-lb.txt sha256=Pk5EWZRN0rpNZITldVfVeiS6YI8qNYQats4kbiCPOYo 1004
newspaper/resources/text/stopwords-lt.txt sha256=eIPp3zR_feL3p_GQlp-Ll3bI93WRBO8EmuxAUNbb8so 764
newspaper/resources/text/stopwords-lv.txt sha256=oJ_9y6iJyECpLT65Ea2gV11yrkXRsFS1Sj5qAAkdF64 1026
newspaper/resources/text/stopwords-mk.txt sha256=CKEzV8NJDb4jq3C97P7MuSd8qwFmcYjODGXwC3HxS78 1504
newspaper/resources/text/stopwords-mn.txt sha256=_vTkGQMBKpaybZbV_I17OIQGKWo0_4zmVX3Z4GSgEk4 310
newspaper/resources/text/stopwords-mr.txt sha256=vW1HHQgg9anOgieaVrPnGu_DXtaeSNuJYbTaov4FzJU 1365
newspaper/resources/text/stopwords-ms.txt sha256=9XvoRhgkZPObpH6T7eag9LEtDJ0V6eNYiR-zONXHxvM 3572
newspaper/resources/text/stopwords-my.txt sha256=79BW7eJnSgfXwHddoISrZR7VjfOkuKuohpaSZkeafQk 7814
newspaper/resources/text/stopwords-nb.txt sha256=VuZbxq0aq66b4MkwezWQqtdPMxGOs8kO7IF2NSqPTSk 587
newspaper/resources/text/stopwords-ne.txt sha256=0Mchs9Vujo0oFZm-quvnXeitLv2hXYJidQDpJxkWnzM 4415
newspaper/resources/text/stopwords-nl.txt sha256=GfMWt-rO7i3IcHRCsPvZUXARVNkor3_dZrz0Tqadrkk 177
newspaper/resources/text/stopwords-no.txt sha256=9hp0ky1DpC463zMMTE6Se9DHuFGXRPUV6injyjlJngQ 514
newspaper/resources/text/stopwords-pa.txt sha256=gyuYH3ALhrst9LA3_r3gL31JiwItSdDaTorsIJTNtrY 6993
newspaper/resources/text/stopwords-pl.txt sha256=z1A4NOX5ZxFrWzUcBMKd92_EB4XgmEnWE78br2GykhU 2016
newspaper/resources/text/stopwords-ps.txt sha256=5mKPAFn_62bXCKq77s89z5ssPCKX5uLZ63Rd4NUi0cM 4410
newspaper/resources/text/stopwords-pt.txt sha256=I1xLWCygpgkd9ZZ8hYSjQ7jr4shuj-3ySSf2hkddwsc 3610
newspaper/resources/text/stopwords-rn.txt sha256=lQkGk0ZjadVPPC8-IBzNWlo47UavN4JKS0pCOLHX0CE 226
newspaper/resources/text/stopwords-ro.txt sha256=E4AZ5e-RhGmKcJxxzCfkUW4SKeRUUaYAWmqVFVgtjgU 1916
newspaper/resources/text/stopwords-ru.txt sha256=soQOPcfR18HOcSoZWzWFkajvSrLG9pj-A4yP1pmrneo 4958
newspaper/resources/text/stopwords-rw.txt sha256=1xlJP0cM6q2WONtdGQQkddIZo4EKrdxw02Po0RDdLtk 319
newspaper/resources/text/stopwords-si.txt sha256=VEGzRDHDhe2EbDWJXMq-nPFSf0S0Spg6jTtT4w71Yn4 2366
newspaper/resources/text/stopwords-sk.txt sha256=IN6F_M2dOETSoQR6TGxFKZXXFpNViIOMqGvGJE2Azh8 2243
newspaper/resources/text/stopwords-sl.txt sha256=H0K15wVGRZn43AeLZsRuF1p0NZvx30mL4o9yGM_RuoM 2436
newspaper/resources/text/stopwords-so.txt sha256=Q0XxZoRcjo1Fh50ILKX1T0-cOaziGLyMScAov8zqWjU 146
newspaper/resources/text/stopwords-sr.txt sha256=X1-W65P7NvABWA8VtINg974KbJn4P1bYhn9CztZ6NWk 2122
newspaper/resources/text/stopwords-st.txt sha256=DVUAHX_iKYZ7e0IwZhGnIRZ2Wcy7RkQ634-bVqjhSMk 110
newspaper/resources/text/stopwords-sv.txt sha256=j99ousYmploqqSZJt-x7BCOstegufTMg2d_ccJ5oXPY 3956
newspaper/resources/text/stopwords-sw.txt sha256=pHuLPf47kDGCjRsImLD-ush8c2nafOci1TID2GAzCxM 407
newspaper/resources/text/stopwords-ta.txt sha256=hEg5ALJX2zL5pA_2KkOQQVXssdlKi0Qh0GSzCegXsxM 1964
newspaper/resources/text/stopwords-te.txt sha256=6hb4yXDVBnQeXcVSj1pMGddn3RYdj0nqPJ3INH6MFNU 1000
newspaper/resources/text/stopwords-th.txt sha256=ZzOqQCVYz6sCTRuY7OaJ5eMbZiF0vFFZRmTpCLGwVtI 1420
newspaper/resources/text/stopwords-tl.txt sha256=vRbXbQ8G9W3Nh8J3pwnRj5937kjspPYT_ZOM102C89U 925
newspaper/resources/text/stopwords-tr.txt sha256=CQLIhb35bsYDvcLa-UpXNI3FuzvvQRGI6VqiGC7fe5k 1368
newspaper/resources/text/stopwords-tt.txt sha256=xEM_cpYXf29p68rV_rEcn2k63VN5XycSrBklyExbDRc 18436
newspaper/resources/text/stopwords-uk.txt sha256=jnA0kjaFBpFCu5C4mCxtLYDMvN7-ds-G-Y-NPK955_g 4030
newspaper/resources/text/stopwords-ur.txt sha256=7wFP7uTxsqxftIc6NzJzmhP2xrvwkEb_fz4FClzObJY 4782
newspaper/resources/text/stopwords-uz.txt sha256=xkLwdjvuSDlErEFfSrV0EXHESk1wGSpU0NpLJoWZ9CE 2758
newspaper/resources/text/stopwords-vi.txt sha256=038u06SJb4rzOIPZAgCkSzvK8t9LH8_clDWluHUEJVo 724
newspaper/resources/text/stopwords-yo.txt sha256=MrnmocqDmpQeWxPKMApoYCZYn2tu6isYAoNPGqY3MSw 348
newspaper/resources/text/stopwords-zh.txt sha256=H40b0HSxc1xjjSZ3k9HkHIXYs8GA5HFzV3-zG9C9vio 625
newspaper/resources/text/stopwords-zu.txt sha256=u2-x3vyIHATlthkK5FjY2tuP6ctG30Qkihq7UkM_eWQ 176
newspaper/settings.py sha256=mUqPfL8yYubMdHEYSpXSOPaA9DI0XHVTVcpvJ3mkfn0 2807
newspaper/source.py sha256=TSpTxAHXtGBNruT3stGQ6UZgug0FT6qrHKkb1B-HCco 22785
newspaper/text.py sha256=uA_WlJN11-PtsKhJlnH6cdVwuXsv4tM7y1G1yQmQlvE 5969
newspaper/urls.py sha256=jzDMu74lgo-sdw5bOrMBwLgRZiR_sqfESZOAsA9umx4 12258
newspaper/utils/__init__.py sha256=lB8FAmENKpnjuH4kwOHSTtAF44Btpu20KRBuoyhjQAI 6393
newspaper/utils/classes.py sha256=-5CowuRhv99mVBP0DLcwf-LXeGhHFrNsoCnPr4nuRIA 2968
newspaper/version.py sha256=uezdhQOKohVNVxhYuvTXImv47cBv9r_EA8M8o-sEuBg 317
newspaper4k-0.9.3.1.dist-info/LICENSE sha256=Zi6plaWPTRE4Eorh9EqRvM1N2xFEGdBERs8WVEK4IXw 1080
newspaper4k-0.9.3.1.dist-info/METADATA sha256=VRIlc4XSByCx5UU0slatb8nMP3UQzz0x82YJYktq3I8 14889
newspaper4k-0.9.3.1.dist-info/WHEEL sha256=sP946D7jFCHeNz5Iq4fL4Lu-PrWrFsgfLXbbkciIZwg 88
newspaper4k-0.9.3.1.dist-info/RECORD
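
Each digest above is the SHA-256 hash of the file, encoded as URL-safe base64 with the trailing "=" padding removed. The RECORD entry itself carries no digest or size. The sketch below recomputes a digest in that form and checks it against an entry; the function names are illustrative, and parse_record_line follows the space-separated layout of this listing, not the comma-separated layout of the RECORD file inside the wheel, so it would not handle paths containing spaces.

```python
import base64
import hashlib


def record_digest(data):
    """Return the RECORD-style digest string for a file's bytes."""
    raw = hashlib.sha256(data).digest()
    # URL-safe base64 alphabet, with the '=' padding stripped.
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def parse_record_line(line):
    """Split a 'path digest size' line as shown in the listing above."""
    parts = line.split()
    if len(parts) == 1:
        # The RECORD file lists itself without a digest or size.
        return parts[0], None, None
    path, digest, size = parts
    return path, digest, int(size)


def verify(data, digest, size):
    """True if the bytes match both the listed size and the listed digest."""
    return len(data) == size and record_digest(data) == digest
```

To check an installed file, read its bytes in binary mode and pass them to verify together with the digest and size from the matching line.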