news-please

Wheel Details

Project: news-please
Version: 1.5.44
Filename: news_please-1.5.44-py3-none-any.whl
Download: [link]
Size: 90336
MD5: acefb552345f13a452bea40a1e14ce05
SHA256: 97a730d6662bf26f975d96c12465ed4b08252fce57e87217a7b604c9f0b74b14
Uploaded: 2023-12-27 15:22:02 +0000
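
The Size and SHA256 values above can be checked against a locally downloaded copy of the wheel. The following is a minimal sketch using only the standard library; the function name `verify_wheel` and the local file path are illustrative assumptions and are not part of news-please.

```python
import hashlib
from pathlib import Path

# Values copied from the wheel details listed above.
EXPECTED_SIZE = 90336
EXPECTED_SHA256 = "97a730d6662bf26f975d96c12465ed4b08252fce57e87217a7b604c9f0b74b14"


def verify_wheel(path, expected_size=EXPECTED_SIZE, expected_sha256=EXPECTED_SHA256):
    """Return True if the file at `path` matches the expected size and SHA256.

    `verify_wheel` is a hypothetical helper written for this example.
    """
    data = Path(path).read_bytes()
    # Cheap check first: a size mismatch already proves the file differs.
    if len(data) != expected_size:
        return False
    return hashlib.sha256(data).hexdigest() == expected_sha256
```

Assuming the wheel was saved under its listed filename, `verify_wheel("news_please-1.5.44-py3-none-any.whl")` should return True for an unmodified copy.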

dist-info

METADATA

Metadata-Version: 2.1
Name: news-please
Version: 1.5.44
Summary: news-please is an open source easy-to-use news extractor that just works.
Author: Felix Hamborg
Author-Email: felix.hamborg[at]uni-konstanz.de
Home-Page: https://github.com/fhamborg/news-please
Download-Url: https://github.com/fhamborg/news-please
License: Apache License 2.0
Keywords: news crawler news scraper news extractor crawler extractor scraper information retrieval
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft
Classifier: Operating System :: POSIX :: Linux
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Topic :: Internet
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Dist: Scrapy (>=1.1.0)
Requires-Dist: PyMySQL (>=0.7.9)
Requires-Dist: psycopg2-binary (>=2.8.4)
Requires-Dist: hjson (>=1.5.8)
Requires-Dist: elasticsearch (>=2.4)
Requires-Dist: beautifulsoup4 (>=4.3.2)
Requires-Dist: readability-lxml (>=0.6.2)
Requires-Dist: newspaper3k (>=0.2.8)
Requires-Dist: langdetect (>=1.0.7)
Requires-Dist: python-dateutil (>=2.4.0)
Requires-Dist: plac (>=0.9.6)
Requires-Dist: dotmap (>=1.2.17)
Requires-Dist: PyDispatcher (>=2.0.5)
Requires-Dist: warcio (>=1.3.3)
Requires-Dist: ago (>=0.0.9)
Requires-Dist: six (>=1.10.0)
Requires-Dist: lxml (>=3.3.5)
Requires-Dist: hurry.filesize (>=0.9)
Requires-Dist: bs4
Requires-Dist: faust-cchardet (>=2.1.18)
Requires-Dist: boto3
Requires-Dist: pywin32 (>=220); sys_platform == "win32"
License-File: LICENSE.txt
[Description omitted; length: 603 characters]
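
Each Requires-Dist line above consists of a project name, an optional version specifier in parentheses, and an optional environment marker after a semicolon. The sketch below splits a line into those three parts. It only handles the forms that appear in this listing and is not a full requirement parser; the function name `parse_requires_dist` is an assumption made for this example. For general use, the third-party `packaging` library is likely a better choice.

```python
import re

# Matches "name", "name (spec)" and "name (spec); marker" as seen in this listing.
_REQUIRES_DIST = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)"  # project name
    r"\s*(?:\((?P<spec>[^)]*)\))?"               # optional "(>=1.0)"
    r"\s*(?:;\s*(?P<marker>.+))?$"               # optional "; marker"
)


def parse_requires_dist(value):
    """Return (name, version_specifier, marker); missing parts are empty strings."""
    match = _REQUIRES_DIST.match(value)
    if match is None:
        raise ValueError(f"unsupported Requires-Dist value: {value!r}")
    return (
        match.group("name"),
        (match.group("spec") or "").strip(),
        (match.group("marker") or "").strip(),
    )
```

For example, the pywin32 line yields the name `pywin32`, the specifier `>=220`, and the marker `sys_platform == "win32"`, which restricts that dependency to Windows.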

WHEEL

Wheel-Version: 1.0
Generator: bdist_wheel (0.42.0)
Root-Is-Purelib: true
Tag: py3-none-any
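
The Tag value above (Python tag, ABI tag, platform tag) is also encoded at the end of the wheel filename, after the distribution name and version. The following sketch splits a filename of that shape into its components. It assumes a filename without the optional build tag, and `WheelName` and `parse_wheel_filename` are names invented for this example.

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split "name-version-python-abi-platform.whl" into its five parts.

    Filenames carrying the optional build tag (six parts) are rejected.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"expected five dash-separated parts: {filename!r}")
    return WheelName(*parts)
```

Applied to `news_please-1.5.44-py3-none-any.whl`, this gives the distribution `news_please`, version `1.5.44`, and the tags `py3`, `none`, and `any`, which agrees with `Root-Is-Purelib: true` (a pure-Python wheel that is not tied to an ABI or platform).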

RECORD

Path Digest Size
newsplease/NewsArticle.py sha256=BWpTNs7LCPZn6VZ2dNUfsVvR5QQeJ80svuT1LAp9grI 1627
newsplease/__init__.py sha256=6C4bHLTFFpqJpHai2WO0oLnEkpZwkI-RijnvLI74yEQ 6448
newsplease/__main__.py sha256=Rub5r-XqJYf37nxqFUKuI1xOwnDCl11W5Bw8oN20DGM 24157
newsplease/config.py sha256=1FPARLJzTz4dbM_AUm6Y4clI3Ifo1cStn2m56t67RBY 9098
newsplease/helper.py sha256=DF6HVhRhLdg6KJ60aYxXYFSIvF4rYDEfi6XofZqlPVI 1262
newsplease/single_crawler.py sha256=H8990_C05sHX1CMrzZel-HX8HD5vhdk5FePa5lONgkA 11221
newsplease/config/config.cfg sha256=KJka3bww9Yccyd1nnN4mDKL7dr86h_tD9s9bIsNJpY0 14641
newsplease/config/config_lib.cfg sha256=F4k9qbsAv7mCgYOorH64M5Cn4IFevqGTtdKgWRD4kEU 14640
newsplease/config/sitelist.hjson sha256=CcYIUWa9sOTAqSXz4Mcac_zchH9DxT3bsHsOQVEqaaw 2127
newsplease/crawler/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/crawler/commoncrawl_crawler.py sha256=1vcOf-qp7QgFZJDsYqXd6FitNTlrWx6SXLKl95-i27o 18927
newsplease/crawler/commoncrawl_extractor.py sha256=bFODqGHSg3MRpD7CCMqp22FPbfUwoyAMZ-M-qkVGdwM 15901
newsplease/crawler/items.py sha256=hVIrdX0sB90QnoQif4S_f5mFRUEfpAUaBE1xt2DnKbQ 1450
newsplease/crawler/response_decoder.py sha256=ZH3gltuwICu_Tu9dwvChDU8DyiQLwsd8WeQET7PFjcQ 1612
newsplease/crawler/simple_crawler.py sha256=ZdWaY6lFAvyGH-r5PZSaHt7b9d0575A2AYI69BWOXi4 3860
newsplease/crawler/spiders/__init__.py sha256=ULwecZkx3_NTphkz7y_qiazBeUoHFnCCWnKSjoDCZj0 161
newsplease/crawler/spiders/download_crawler.py sha256=WjrJUBPpACkl6YysmQqy6vK_U95o0fUybuf62ezr8oE 1242
newsplease/crawler/spiders/gdelt_crawler.py sha256=5qk-x_RMuvwfciR2zAbTLRJECdw4nEvW57ctylZ7Udo 3426
newsplease/crawler/spiders/recursive_crawler.py sha256=yq9FBXA6h90YsjaAj-3G02VF7OJsKhwncloDPnib3V4 1889
newsplease/crawler/spiders/recursive_sitemap_crawler.py sha256=Bz03A080JObpmiDW_F6HFygTymTguvygDb4vjD96Y-8 2133
newsplease/crawler/spiders/rss_crawler.py sha256=ASsEXFGYdYRqlUJtNfVQQJgKkvbWRZotTbG4XYBiWdw 3327
newsplease/crawler/spiders/sitemap_crawler.py sha256=dgW37588Eji_31rN-0CsP9fU2R_dEH-eopmJ73ljkXs 2037
newsplease/examples/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/examples/commoncrawl.py sha256=Av1Ap2YZGLpRPQuKpogklXu9rQGOf_Eyem6mpfnPpgw 9059
newsplease/examples/downloadfromfile.py sha256=Ry_ZmpQFWoBlZFnh_xnAmmry8a_uXfOAZ9qK9Cscih0 707
newsplease/examples/downloadfromurl.py sha256=hGjGoY1PFaN0b5Ut3As_0WOtwM7hHTqELRFwa4BoAfY 534
newsplease/helper_classes/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/helper_classes/class_loader.py sha256=-fuUlMyqaBwSsrF1bJnnMQgDz0Ot7NOxMI3uxER3VUo 702
newsplease/helper_classes/heuristics.py sha256=LQMj4bXXY2d9IDM6HOBx6FUw_kivm51FUn-v5tZJlWo 4819
newsplease/helper_classes/parse_crawler.py sha256=cOvVHt048ZH-9KFejnQfcCGlj_Tk7yES8d5VZh3Huc0 4847
newsplease/helper_classes/savepath_parser.py sha256=yTw_oAGKn6c_ruKJ4RGNKkTmLo8OVYJvaC13XM8D8Q8 11872
newsplease/helper_classes/url_extractor.py sha256=sP84R-iHkPerEUr2RRCZzhD2V3u_AjN1pzqOvCD5jwU 6551
newsplease/helper_classes/sub_classes/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/helper_classes/sub_classes/heuristics_manager.py sha256=bHSlXHOflCB8dT5zs6EDQTOxobehKRXiHNkSSdO3CdI 9811
newsplease/pipeline/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/pipeline/pipelines.py sha256=9rmq95QW5pKNSY_yGF8_QNX-OgNO0OHn1Ru_f3Ci0uE 32549
newsplease/pipeline/extractor/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/pipeline/extractor/article_candidate.py sha256=YSgQSUsKumDjr6D282N2nRw_N7k8xP1RyJ8-tcIhU7w 362
newsplease/pipeline/extractor/article_extractor.py sha256=nVqM3L8z3rSuXGbPE0tn9qQDuVlInPukni8H7w59EoM 2909
newsplease/pipeline/extractor/cleaner.py sha256=pfdE75LUrc3_gJLn4-5ibdKewoSZSlUumD25TgFyY4g 3909
newsplease/pipeline/extractor/comparer/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/pipeline/extractor/comparer/comparer.py sha256=aSQDadWFZn2q45zLEKCCeqtuFW01BLaT_r1l6pRA6mI 1960
newsplease/pipeline/extractor/comparer/comparer_Language.py sha256=08ecx8B9nkRJ32B24X9xMhWajjaO39HEVLmCU2gBRE4 1934
newsplease/pipeline/extractor/comparer/comparer_author.py sha256=yTruRBP_4O3aRbSjFzREUCkg5_7C1CnlshzHfnssiuE 1349
newsplease/pipeline/extractor/comparer/comparer_date.py sha256=gJ6PqK2fZPkq_A9alJxCZDIVQ-yzbs8Z7T1hDgbxJl0 1274
newsplease/pipeline/extractor/comparer/comparer_description.py sha256=MieVaMIAL38VcvZPteQRrQbkKJupcWNUakoWzNUfsqA 1411
newsplease/pipeline/extractor/comparer/comparer_text.py sha256=cbRNgfpAgaOtJ_ktd5aXu5RlbQq5sLWUB4iePnvODuA 3329
newsplease/pipeline/extractor/comparer/comparer_title.py sha256=iyHdZEEi2gyQmsdYLFnuxT6q31eSna2olqjFWuyxCuI 3056
newsplease/pipeline/extractor/comparer/comparer_topimage.py sha256=mQjQ_J916FoRqBUpGITUMYYORaECIwAgxk9s8fAk6E0 1973
newsplease/pipeline/extractor/extractors/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
newsplease/pipeline/extractor/extractors/abstract_extractor.py sha256=pho-PF2T_DswaU_JK0K3ZO2Qe5V3fZwLBP1ZCr3JNIY 2011
newsplease/pipeline/extractor/extractors/date_extractor.py sha256=TBQFm3jEuXHp-I_ALH5i8prmr2J4Dbo-lr1m4uJebfE 8442
newsplease/pipeline/extractor/extractors/lang_detect_extractor.py sha256=vn4j0Ffnrs9B4U_7tq_KcEXITSRogzvSCFag6fpbtSI 2905
newsplease/pipeline/extractor/extractors/newspaper_extractor.py sha256=g-A-N6fHbrNEddaM4io_covsRkj4kEW_COFerM-ddTM 1897
newsplease/pipeline/extractor/extractors/newspaper_extractor_no_images.py sha256=wBPopCzLy7yoEH0LidttcI4swIq0TD9t1NqU42BLywc 178
newsplease/pipeline/extractor/extractors/readability_extractor.py sha256=80shyarFytLD916ZI_dUKpXN4CATa-x8_ZXgkikXhHo 1306
news_please-1.5.44.dist-info/LICENSE.txt sha256=xazLvYVG6Uw0rtJK_miaYXYn0Y7tWmxIJ35I21fCOFE 11356
news_please-1.5.44.dist-info/METADATA sha256=biMBGpXUHK01NyVK76yt-ewwIn0HKavCsmJn_LkvC9E 2405
news_please-1.5.44.dist-info/WHEEL sha256=oiQVh_5PnQM0E3gPdiz09WCNmwiHDMaGer_elqB3coM 92
news_please-1.5.44.dist-info/entry_points.txt sha256=Fpc0Ve-0092RkcfGKNjx6uxeZfFRaxZltNe4eeGn32U 111
news_please-1.5.44.dist-info/top_level.txt sha256=qaFdpp4zmVZSkpY7P4Yr4J5aP9v4D7FsU4Z_WUHEctY 11
news_please-1.5.44.dist-info/RECORD
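
The digests in the RECORD listing above are SHA-256 hashes encoded as URL-safe base64 with the trailing `=` padding removed, prefixed by the algorithm name. The sketch below reproduces that encoding for a byte string; the function name `record_digest` is an assumption made for this example.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return the RECORD-style digest ("sha256=<urlsafe base64, no padding>")."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded
```

For empty input, `record_digest(b"")` returns `sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU`, which is the digest listed for every zero-byte `__init__.py` above. The RECORD file itself carries no digest because it cannot contain its own hash.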

top_level.txt

newsplease

entry_points.txt

news-please = newsplease.__main__:main
news-please-cc = newsplease.examples.commoncrawl:main
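
Each entry point above maps a console command name to a `module:attribute` target that the installer wraps in a script. The following sketch shows how such a line can be split and resolved to a callable. `split_entry_point` and `load_entry_point` are hypothetical helpers for this example, and resolving the news-please targets requires the package to be installed.

```python
import importlib


def split_entry_point(spec):
    """Split "name = module:attr" into (name, module, attr)."""
    name, _, target = spec.partition("=")
    module, _, attr = target.strip().partition(":")
    return name.strip(), module.strip(), attr.strip()


def load_entry_point(spec):
    """Import the module named in `spec` and return the referenced object."""
    _, module_name, attr = split_entry_point(spec)
    obj = importlib.import_module(module_name)
    # The attribute part may be dotted (e.g. "Class.method"); walk it step by step.
    for part in filter(None, attr.split(".")):
        obj = getattr(obj, part)
    return obj
```

With the package installed, `load_entry_point("news-please = newsplease.__main__:main")` would return the `main` function that the `news-please` command invokes.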