Datasets

The Galaxy Witness project provides a set of datasets and a TAP service to query them.

Currently, several datasets and a TAP service are available. We plan to add more datasets covering a larger corpus of data.

Available datasets

In this section, we describe the datasets available in the terminal user interface and their characteristics.

  • Galaxies_400K (Galaxies_400K.csv)

  • Galaxies_1KK (Galaxies_1KK.csv)

TAP service

The TAP (Table Access Protocol) service is a web service that lets you query the available datasets. In this section, we describe the TAP services available in the terminal user interface and their characteristics.
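As an illustration of what a TAP query looks like on the wire, the sketch below builds the URL for a synchronous query using the standard IVOA TAP parameters (REQUEST, LANG, FORMAT, QUERY). The helper name `build_tap_sync_url`, the service address, and the table name are hypothetical and are not taken from the Galaxy Witness project; substitute the values of the TAP service you actually use.

```python
from urllib.parse import urlencode


def build_tap_sync_url(base_url: str, adql: str, fmt: str = "csv") -> str:
    """Build the URL of a synchronous TAP query (the service's /sync endpoint)."""
    params = {
        "REQUEST": "doQuery",  # standard TAP request type
        "LANG": "ADQL",        # query language
        "FORMAT": fmt,         # requested output format
        "QUERY": adql,         # the ADQL query text
    }
    return f"{base_url.rstrip('/')}/sync?{urlencode(params)}"


# Hypothetical service URL and table name, for illustration only.
url = build_tap_sync_url(
    "https://tap.example.org/tap",
    "SELECT TOP 10 * FROM example.galaxies",
)
print(url)
```

Fetching this URL with any HTTP client would return the query result in the requested format, provided the service exists and exposes the table.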

Programming interface for datasets

class galaxywitness.datasets.Dataset(name: str)

Class to handle prepared datasets

Parameters:

name (str) – name of dataset

add_new_dataset(name: str, url: str) → None

Add new dataset by name and by URL where it can be retrieved

Parameters:
  • name (str) – name of dataset

  • url (str) – URL where the dataset can be retrieved

change_dataset_to(name: str) → None

Change current dataset to another by name

Parameters:

name (str) – name of dataset
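The two methods above can be combined to register a dataset and then switch to it. The following is a minimal sketch: the helper `register_and_select` is hypothetical, it only calls the two documented methods, and whether `add_new_dataset` already makes the new dataset current is not stated here, so the explicit switch is an assumption about safe usage.

```python
from typing import Any


def register_and_select(dataset: Any, name: str, url: str) -> None:
    """Register a dataset under `name`, then make it the current one."""
    dataset.add_new_dataset(name, url)  # documented: add by name and URL
    dataset.change_dataset_to(name)     # documented: switch current dataset


# Intended use, assuming galaxywitness is installed. The dataset names and
# the URL below are placeholders, not real resources:
#
#   from galaxywitness.datasets import Dataset
#   ds = Dataset("Galaxies_400K")
#   register_and_select(ds, "My_Galaxies", "https://example.org/my_galaxies.csv")
```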

download(chunk_size=1024) → None

Download current prepared dataset
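The `chunk_size` parameter suggests that the file is streamed in fixed-size pieces rather than read into memory at once. The library's own implementation is not shown here; the sketch below only illustrates the general chunked-copy pattern, with a hypothetical helper `copy_in_chunks` and in-memory buffers standing in for the HTTP response and the output file.

```python
import io
from typing import BinaryIO


def copy_in_chunks(src: BinaryIO, dst: BinaryIO, chunk_size: int = 1024) -> int:
    """Copy src to dst in pieces of at most chunk_size bytes; return bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # an empty read means the source is exhausted
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# In-memory stand-ins for a network response and a destination file.
source = io.BytesIO(b"x" * 2500)
target = io.BytesIO()
copied = copy_in_chunks(source, target, chunk_size=1024)
print(copied)
```

A larger chunk size means fewer read and write calls at the cost of more memory held per step.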

download_via_tap(size: int = 100000) → None

Download current prepared dataset via TAP

Parameters:

size (int) – size of dataset
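A short sketch of choosing between the two download routes documented above. The helper `download_current` is hypothetical and only wraps the documented `download` and `download_via_tap` calls; it takes the dataset object as an argument so that it does not depend on how the object was created.

```python
from typing import Any


def download_current(dataset: Any, via_tap: bool = False, size: int = 100000) -> str:
    """Download the current dataset and report which route was used."""
    if via_tap:
        dataset.download_via_tap(size=size)  # documented default size: 100000
        return "tap"
    dataset.download()  # direct download with the default chunk size
    return "direct"


# Intended use, assuming galaxywitness is installed and "Galaxies_400K"
# is accepted as a dataset name (an assumption based on the list above):
#
#   from galaxywitness.datasets import Dataset
#   download_current(Dataset("Galaxies_400K"), via_tap=True, size=1000)
```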