Reverse Dependencies of gcsfs
The following projects have a declared dependency on gcsfs:
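A "declared dependency" here means the project lists gcsfs in its package metadata (Requires-Dist). As an illustrative sketch, assuming only the standard library, the helper below checks whether an installed distribution declares a given dependency; the function name and parsing heuristic are ours, not part of any listed package:

```python
import re
from importlib.metadata import requires, PackageNotFoundError

def declares_dependency(dist_name: str, dep: str = "gcsfs") -> bool:
    """Return True if the installed distribution `dist_name` lists `dep`
    among its Requires-Dist entries (extras and markers ignored)."""
    try:
        reqs = requires(dist_name) or []
    except PackageNotFoundError:
        return False
    for req in reqs:
        # A requirement string looks like "gcsfs>=2021.4; extra == 'gcs'";
        # take the bare project name before any version/marker syntax.
        name = re.split(r"[\s<>=!~;\[]", req.strip(), maxsplit=1)[0]
        if name.lower() == dep.lower():
            return True
    return False
```

For example, `declares_dependency("fsspec", "gcsfs")` would report whether the fsspec distribution installed in the current environment declares gcsfs (it normally does not; gcsfs depends on fsspec, not the reverse).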
- 3lc — 3LC Python Package - A tool for model-guided, interactive data debugging and enhancements
- acb — Asynchronous Component Base
- aind-data-transfer — Services for compression and transfer of aind-data to the cloud
- airbyte-source-file — Source implementation for File
- allRank — allRank is a framework for training learning-to-rank neural models
- amora — Amora Data Build Tool
- amtrak — no summary
- analytics-mesh — Facades and common functions necessary for data science and data engineering workflows
- anaml-client — Python SDK for Anaml
- anystore — Store and cache things anywhere
- apache-airflow-providers-google — Provider package apache-airflow-providers-google for Apache Airflow
- arize-phoenix — AI Observability and Evaluation
- arraylake — Python client for ArrayLake
- arti — no summary
- articat — articat: data artifact catalog
- bigframes — BigQuery DataFrames -- scalable analytics and machine learning with BigQuery
- bionic — A Python framework for building, running, and sharing data science workflows
- block-cascade — Library for model training in multi-cloud environment.
- brane — no summary
- buildflow — BuildFlow is an open source framework for building large-scale systems using Python. All you need to do is describe where your input comes from and where your output should be written, and BuildFlow handles the rest.
- bytehub — ByteHub Timeseries Feature Store
- calitp — Shared code for the Cal-ITP data codebases
- calitp-data — Shared code for the Cal-ITP data codebases
- calitp-data-analysis — Shared code for querying Cal-ITP data in notebooks, primarily.
- calitp-data-infra — Shared code for developing data pipelines that process Cal-ITP data.
- calitp-map-utils — no summary
- catalystcoop.pudl — An open data processing pipeline for US energy data
- catalystcoop.pudl-catalog — A catalog of open data related to the US energy system.
- cdp-backend — Data storage utilities and processing pipelines to run on CDP server deployments.
- cdp-data — Data Utilities and Processing Generalized for All CDP Instances
- chelsa-cmip6 — This package contains functions to create monthly high-resolution climatologies for a selected geographic area for min-, max-, and mean temperature, precipitation rate, and bioclimatic variables from anomalies, using CHELSA V2.1 as the baseline high-resolution climatology. Only works for GCMs for which tas, tasmax, tasmin, and pr are available.
- classtree — A toolkit for hierarchical classification
- cleanvision — Find issues in image datasets
- cloudservice — Auto machine learning, deep learning library in Python.
- coiled-runtime — Simple and fast way to get started with Dask
- cromshell — Command Line Interface (CLI) for Cromwell servers
- cromshell-draft-release — Command Line Interface (CLI) for Cromwell servers
- cs-storage — A small package that is used by Compute Studio to read and write model results to google cloud storage.
- cubed — Bounded-memory serverless distributed N-dimensional array processing
- d6tflow — For data scientists and data engineers, d6tflow is a Python library that makes building complex data science workflows easy, fast, and intuitive.
- dapla-toolbelt — Dapla Toolbelt
- dask-bigquery — Dask + BigQuery integration
- data-describe — A Pythonic EDA Accelerator for Data Science
- data-science-common — UNDER CONSTRUCTION: A simple python library to facilitate analysis
- dataflowutil — no summary
- datapipe-core — `datapipe` is a realtime incremental ETL library for Python application
- datasett — no summary
- dbpal — A utility package for pushing around data
- deafrica-tools — Functions and algorithms for analysing Digital Earth Africa data.
- deepsensor — A Python package for modelling xarray and pandas data with neural processes.
- delta-lake-reader — Lightweight wrapper for reading Delta tables without Spark
- delta-sharing — Python Connector for Delta Sharing
- Djaizz — Artificial Intelligence (AI) in Django Applications
- dlt — dlt is an open-source python-first scalable data loading library that does not require any backend to run.
- dql-alpha — DQL
- dslibrary — Data Science Framework & Abstractions
- dvc-gs — gs plugin for dvc
- dvcx — DVCx
- easy-expectations — A package that simplifies usage of Great Expectations tool for Data Validation.
- easy-ge — A package that simplifies usage of Great Expectations tool for Data Validation.
- enigmx — enigmx package
- enrichsdk — Enrich Developer Kit
- etf-scraper — Scrape ETF and Mutual Fund holdings from major providers
- etils — Collection of common python utils
- etl-bq-tools — etl_bq_tools
- fastmeteo — Fast interpolation for ERA5 data with Zarr
- faux-data — Generate fake data from yaml templates
- fcast — A collection of python tools used for forecasting flood events and their impact on transportation infrastructure.
- Feast — Python SDK for Feast
- file-io — Deterministic File Lib to make working with Files across Object Storage easier
- findopendata — A search engine for Open Data.
- FireSpark — FireSpark data processing utility library
- flytekit — Flyte SDK for Python
- followthemoney-predict — no summary
- fondant — Fondant - Large-scale data processing made easy and reusable
- fsspec — File-system specification
- fv3config — FV3Config is used to configure and manipulate run directories for FV3GFS.
- gcp-data-ingestion — Utility Functions for Data Ingestion in GCP
- gcp-python-client-functions — One package that will handle the installation and client generation for the most used GCP Python Client Libraries + methods that avoid repetitive processes.
- gcpts — no summary
- gcscontents — A ContentsManager for managing Google Cloud APIs.
- georeader-spaceml — Lightweight reader for raster files
- ggenerator — A tool capable of generating fake data from a given specification defined as a JSON DSL
- giza-datasets — no summary
- goes-api — Python API for downloading and searching GOES-16/17 satellite data on local and cloud storage.
- google-datacatalog-rdbms-connector — Commons library for ingesting RDBMS metadata into Google Cloud Data Catalog
- graphite-datasets — tensorflow/datasets is a library of datasets ready to use with TensorFlow.
- graphium — Graphium: Scaling molecular GNNs to infinity.
- great-expectations — Always know what to expect from your data.
- great-expectations-cta — Always know what to expect from your data.
- haupt — Lineage metadata API, artifacts streams, sandbox, ML-API, and spaces for Polyaxon.
- hub-v1 — Activeloop Hub
- hydrafloods — HYDrologic Remote sensing Analysis for Floods
- hydromt — HydroMT: Automated and reproducible model building and analysis.
- hyperdb — Hyperdb provides wrapper functions for working with Tableau hyper datasources and moving data between Tableau Server, Google Cloud Platform and Microsoft Azure through a common interface
- idg-metadata-client — Ingestion Framework for OpenMetadata
- instackup — A package to ease interaction with cloud services, DB connections and commonly used functionalities in data analytics.
- jesspack — Project Description
- joblibgcs — a google cloud storage memory backend for joblib
- jouissance — jouissance