cc-net

Wheel Details

Project: cc-net
Version: 0.0.0
Filename: cc_net-0.0.0-py3-none-any.whl
Size: 61383 bytes
MD5: f8bd3705e5e90fc3cc0889d2d52f587d
SHA256: 18f59649809d840187b0ebeda2646ff5bc538a03218bde3826ad8434729cb125
Uploaded: 2019-10-30 22:07:57 +0000
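
The size and digests listed above can be checked against a locally downloaded copy of the wheel. The following is a minimal sketch using only the standard library; the helper names `file_digest` and `verify_wheel` and the local file path are assumptions for illustration and are not part of cc-net.

```python
import hashlib
from pathlib import Path

# Values copied from the Wheel Details listing above.
EXPECTED_SIZE = 61383
EXPECTED_MD5 = "f8bd3705e5e90fc3cc0889d2d52f587d"
EXPECTED_SHA256 = "18f59649809d840187b0ebeda2646ff5bc538a03218bde3826ad8434729cb125"


def file_digest(path, algorithm):
    """Return the hex digest of a file, reading it in 64 KiB chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_wheel(path):
    """Hypothetical helper: True if size and both digests match the listing."""
    path = Path(path)
    return (
        path.stat().st_size == EXPECTED_SIZE
        and file_digest(path, "md5") == EXPECTED_MD5
        and file_digest(path, "sha256") == EXPECTED_SHA256
    )
```

Assuming the wheel was saved under its original filename, `verify_wheel("cc_net-0.0.0-py3-none-any.whl")` would report whether the local copy matches.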

dist-info

METADATA

Metadata-Version: 2.1
Name: cc-net
Version: 0.0.0
Summary: Tools to download and clean Common Crawl
Author: Guillaume Wenzek
Author-Email: guw[at]fb.com
Home-Page: https://github.com/facebookresearch/cc_net
Project-Url: Bug Tracker, https://github.com/facebookresearch/cc_net/issues
Project-Url: Source Code, https://github.com/facebookresearch/cc_net
Keywords: common crawl dataset
Classifier: Development Status :: 4 - Beta
Classifier: Programming Language :: Python :: 3.7
Requires-Python: >=3.7
Requires-Dist: beautifulsoup4 (>=4.7.1)
Requires-Dist: pandas (>=0.23.4)
Requires-Dist: requests (>=2.22.0)
Requires-Dist: fasttext (>=0.9.1)
Requires-Dist: sentencepiece (>=0.1.82)
Requires-Dist: func-argparse (>=1.0.3)
Requires-Dist: sacremoses
Requires-Dist: mypy (>=0.730); extra == "dev"
Requires-Dist: pytest; extra == "dev"
Requires-Dist: black; extra == "dev"
Requires-Dist: isort; extra == "dev"
Requires-Dist: submitit; extra == "slurm"
Requires-Dist: sentence-splitter; extra == "tools"
Provides-Extra: dev
Provides-Extra: slurm
Provides-Extra: tools
[No description]
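
The METADATA fields above use an RFC 822 style header layout, so the standard library `email` parser can read them. The sketch below groups the `Requires-Dist` entries by the extra that enables them; the function name `split_requirements` is hypothetical, the sample text is a shortened copy of the listing, and the marker handling is a deliberate simplification that only understands a trailing `extra == "name"` clause.

```python
from email.parser import Parser

# A shortened copy of the METADATA fields listed above.
METADATA_TEXT = """\
Metadata-Version: 2.1
Name: cc-net
Version: 0.0.0
Requires-Python: >=3.7
Requires-Dist: requests (>=2.22.0)
Requires-Dist: sacremoses
Requires-Dist: pytest; extra == "dev"
Requires-Dist: submitit; extra == "slurm"
Provides-Extra: dev
Provides-Extra: slurm
"""


def split_requirements(metadata_text):
    """Group Requires-Dist entries by the extra that enables them.

    Entries without an `extra == "..."` marker are stored under the key None.
    """
    msg = Parser().parsestr(metadata_text)
    groups = {}
    for entry in msg.get_all("Requires-Dist", []):
        requirement, _, marker = entry.partition(";")
        extra = None
        if "extra ==" in marker:
            # Simplification: take the quoted name that follows `extra ==`.
            extra = marker.split("extra ==", 1)[1].strip().strip("\"'")
        groups.setdefault(extra, []).append(requirement.strip())
    return groups


groups = split_requirements(METADATA_TEXT)
print(groups[None])   # base requirements
print(groups["dev"])  # requirements pulled in by the "dev" extra
```

For full marker support, a dedicated requirement parser would be a better fit than this string splitting.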

WHEEL

Wheel-Version: 1.0
Generator: bdist_wheel (0.32.3)
Root-Is-Purelib: true
Tag: py3-none-any
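
The `Tag` value above is the same python-abi-platform triple that ends the wheel filename. As a rough sketch, the filename can be split into its components by following the wheel naming convention `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`; the names `WheelName` and `parse_wheel_name` are hypothetical and introduced only for this example.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None  # the build tag is optional and absent here
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError("unexpected number of components: %r" % filename)
    return WheelName(dist, version, build, py, abi, plat)


name = parse_wheel_name("cc_net-0.0.0-py3-none-any.whl")
print(name.python_tag, name.abi_tag, name.platform_tag)
```

Here `py3` means any Python 3 interpreter, `none` means no ABI requirement, and `any` means no platform restriction, which is consistent with `Root-Is-Purelib: true`.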

RECORD

Path Digest Size (bytes)
cc_net/__init__.py sha256=JcPipJgiSziQWRMpEpG6fli3FLSWJ0UGhGuoknzEHRw 198
cc_net/__main__.py sha256=RIaYC1kd1YVdlwEAcyccVhvDFXoGZ4bzpj5wlL2tfbY 752
cc_net/dedup.py sha256=Hdcq71K2x_O_VuH2m41o_H7Wa_yHqrZAbsgOA3gOhOk 17323
cc_net/execution.py sha256=OrA9HkhGOxQI7oi9HifSC2j8JFI6u3wcZNVCEgukqVc 5274
cc_net/flat_hash_set.py sha256=UC592PMjEHk0rHBTgx-3zOn2vPP3sEyh3PzxBs4k_58 7425
cc_net/get_wiki_cirrus.py sha256=QQQhkujOHUNET4OehYEcnd5VcIv6rgY16bhQewawIiA 3798
cc_net/jsonql.py sha256=vuz4UOrMGgSTyJk3GLamZ8_FY8LObtwJ8qHuLEWYW1M 42262
cc_net/mine.py sha256=4Od3wYd8xIvq-CNdmz9HbUCCMEy2nUp9FUbNUvpl5Sw 17837
cc_net/minify.py sha256=M1Ji3f8LpmylXAaqWpQIHGXPzrUQ0g9nZMxz3xkG3fo 11276
cc_net/perplexity.py sha256=1j3TdKtZsuZNSJmOVCMimCQv3yK9_NUHGXHrB90fPYs 10943
cc_net/process_wet_file.py sha256=aal1c1s9ks1sZEPzqEufBZfkWjZQsrV3A4hmjumwN4Y 6293
cc_net/regroup.py sha256=vsAZVnBUMgF4NekF4uW9fMCMowSTu25Xd09XuDc1jOc 3511
cc_net/split_by_lang.py sha256=ihNv_TdINdKNNXZwVDKYdhXCCMHX5GnSGZT0mU5V93M 4760
cc_net/text_normalizer.py sha256=zZDEjm7nKXmFxdmsPWsJhyI2o2plG3s9a0J9EU3Fe3M 4524
cc_net/tokenizer.py sha256=xZPBSU5z-thHGHGSr2q3h_Qt3zPIloBAP8NCx0s7qxI 2429
cc_net/data/cutoff.csv sha256=YqL_PFh6EQYQ3X2HUHEsu6naCAiM5PFYeA00msNdQw0 21364
cc_net/data/test_stats.json sha256=8MLuKTB_LLMWC5Ezrx3ttzAGWDCa164So9-ZFtQ-xj4 1173
cc_net-0.0.0.dist-info/LICENSE sha256=o3spUuXPrgFTnlk6ZBY4-6pNXuot5nNmI60b3yGOYEE 19338
cc_net-0.0.0.dist-info/METADATA sha256=mCt01okdpgnF4l_fXTv8H7nEix1BbwWLY-mCB-XHJcI 1110
cc_net-0.0.0.dist-info/WHEEL sha256=_NOXIqFgOaYmlm9RJLPQZ13BJuEIrp5jx5ptRD5uh3Y 92
cc_net-0.0.0.dist-info/top_level.txt sha256=TGXdHaLHZU9hFVEQ3P0A6X7TVYl0JxJUBAXRaAhpQqw 7
cc_net-0.0.0.dist-info/RECORD
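
Each digest above is a SHA-256 hash encoded as URL-safe base64 with the trailing `=` padding removed, and the last row has no digest or size because RECORD cannot hash itself. Inside the wheel, RECORD is stored as a CSV file with `path,digest,size` rows. The sketch below recomputes the digests for an archive and lists any entries that disagree; the function names `record_digest` and `check_record` are assumptions for illustration.

```python
import base64
import csv
import hashlib
import zipfile


def record_digest(data):
    """Format a SHA-256 digest the way RECORD stores it."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=")
    return "sha256=" + encoded.decode("ascii")


def check_record(wheel, record_name):
    """Return the paths whose digest or size disagrees with RECORD.

    `wheel` is a path or a binary file object; `record_name` is the
    archive member holding the RECORD file.
    """
    mismatches = []
    with zipfile.ZipFile(wheel) as zf:
        record_text = zf.read(record_name).decode("utf-8")
        for row in csv.reader(record_text.splitlines()):
            if len(row) != 3:
                continue  # skip blank or malformed rows
            path, digest, size = row
            if not digest:
                continue  # RECORD lists itself without digest or size
            data = zf.read(path)
            if record_digest(data) != digest or len(data) != int(size):
                mismatches.append(path)
    return mismatches
```

For the wheel described here, the call would be `check_record("cc_net-0.0.0-py3-none-any.whl", "cc_net-0.0.0.dist-info/RECORD")`, and an empty list would indicate that every listed file matches.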

top_level.txt

cc_net