Stats

Summarize OBIS occurrence records and visualize their distribution over time and across environmental parameters.

pydwcviz.stats.get_records(scientificname=None, taxonid=None, areaid=None, datasetid=None, nodeid=None, startdate=None, enddate=None, startdepth=None, enddepth=None, geometry=None, redlist=None, hab=None, wrims=None, dropped=None, absence=None, flags=None, exclude=None, **kwargs)

Get basic statistics for occurrence records.

Parameters
  • scientificname – [string] Scientific name. Leave empty to include all taxa.

  • taxonid – [string] Taxon AphiaID.

  • areaid – [string] Area ID.

  • datasetid – [string] Dataset UUID.

  • nodeid – [string] Node UUID.

  • startdate – [string] Start date formatted as YYYY-MM-DD.

  • enddate – [string] End date formatted as YYYY-MM-DD.

  • startdepth – [integer] Start depth, in meters.

  • enddepth – [integer] End depth, in meters.

  • geometry – [string] Geometry, formatted as WKT or GeoHash.

  • redlist – [boolean] Red List species only, true/false.

  • hab – [boolean] HAB species only, true/false.

  • wrims – [boolean] WRiMS species only, true/false.

  • dropped – [string] Include dropped records (include) or get dropped records exclusively (true).

  • absence – [string] Include absence records (include) or get absence records exclusively (true).

  • flags – [string] Comma-separated list of quality flags that must be set.

  • exclude – [string] Comma-separated list of quality flags to exclude.

Usage:

from pydwcviz import stats
stats.get_records(scientificname="Mola mola")
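
The filters listed above can be combined freely. The following is a minimal sketch of preparing them before a call; `build_filters` is a hypothetical helper (not part of pydwcviz), and the flag names are illustrative assumptions. It formats dates as YYYY-MM-DD, joins quality flags into one comma-separated string, and drops unset values so the function defaults apply.

```python
from datetime import date


def build_filters(startdate=None, enddate=None, flags=None, **other):
    """Hypothetical helper (not part of pydwcviz): normalize filter values."""
    filters = dict(other)
    # Dates must be strings formatted as YYYY-MM-DD; date.isoformat() yields that.
    if startdate is not None:
        filters["startdate"] = (
            startdate.isoformat() if isinstance(startdate, date) else startdate
        )
    if enddate is not None:
        filters["enddate"] = (
            enddate.isoformat() if isinstance(enddate, date) else enddate
        )
    # Quality flags are passed as a single comma-separated string.
    if flags:
        filters["flags"] = ",".join(flags)
    # Drop unset values; 0 is kept because only None counts as unset.
    return {key: value for key, value in filters.items() if value is not None}


filters = build_filters(
    scientificname="Mola mola",
    taxonid=None,
    startdate=date(2000, 1, 1),
    enddate=date(2010, 12, 31),
    startdepth=0,
    enddepth=200,
    flags=["NO_DEPTH", "ON_LAND"],  # illustrative flag names (assumed)
)
print(filters)
```

The resulting dictionary could then be unpacked into a call, for example `stats.get_records(**filters)`.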

pydwcviz.stats.get_qc(scientificname=None, taxonid=None, areaid=None, datasetid=None, nodeid=None, startdate=None, enddate=None, startdepth=None, enddepth=None, geometry=None, redlist=None, hab=None, wrims=None, dropped=None, absence=None, flags=None, exclude=None, **kwargs)

Get a quality control (QC) summary, including missing or invalid values, the number of records on land, the number of non-marine records, and the number of records without an AphiaID.

Parameters
  • scientificname – [string] Scientific name. Leave empty to include all taxa.

  • taxonid – [string] Taxon AphiaID.

  • areaid – [string] Area ID.

  • datasetid – [string] Dataset UUID.

  • nodeid – [string] Node UUID.

  • startdate – [string] Start date formatted as YYYY-MM-DD.

  • enddate – [string] End date formatted as YYYY-MM-DD.

  • startdepth – [integer] Start depth, in meters.

  • enddepth – [integer] End depth, in meters.

  • geometry – [string] Geometry, formatted as WKT or GeoHash.

  • redlist – [boolean] Red List species only, true/false.

  • hab – [boolean] HAB species only, true/false.

  • wrims – [boolean] WRiMS species only, true/false.

  • dropped – [string] Include dropped records (include) or get dropped records exclusively (true).

  • absence – [string] Include absence records (include) or get absence records exclusively (true).

  • flags – [string] Comma-separated list of quality flags that must be set.

  • exclude – [string] Comma-separated list of quality flags to exclude.

Usage:

from pydwcviz import stats
stats.get_qc(scientificname="Mola mola")
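
The geometry filter accepts a WKT string. As a minimal sketch, the hypothetical helper `bbox_to_wkt` below (not part of pydwcviz) builds a WKT polygon from a bounding box; the coordinates are arbitrary illustrative values.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Hypothetical helper: build a closed WKT polygon for a bounding box."""
    # WKT coordinates are written as "x y", i.e. longitude first.
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON (({ring}))"


wkt = bbox_to_wkt(-10, 40, 5, 55)
print(wkt)  # POLYGON ((-10 40, 5 40, 5 55, -10 55, -10 40))
```

The string could then be passed as the geometry argument, for example `stats.get_qc(geometry=wkt)`.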

pydwcviz.stats.get_env(scientificname=None, taxonid=None, areaid=None, datasetid=None, nodeid=None, startdate=None, enddate=None, startdepth=None, enddepth=None, geometry=None, redlist=None, hab=None, wrims=None, dropped=None, absence=None, flags=None, exclude=None, **kwargs)

Get the number of records per sea surface temperature (SST), sea surface salinity (SSS), or depth bin.

Parameters
  • scientificname – [string] Scientific name. Leave empty to include all taxa.

  • taxonid – [string] Taxon AphiaID.

  • areaid – [string] Area ID.

  • datasetid – [string] Dataset UUID.

  • nodeid – [string] Node UUID.

  • startdate – [string] Start date formatted as YYYY-MM-DD.

  • enddate – [string] End date formatted as YYYY-MM-DD.

  • startdepth – [integer] Start depth, in meters.

  • enddepth – [integer] End depth, in meters.

  • geometry – [string] Geometry, formatted as WKT or GeoHash.

  • redlist – [boolean] Red List species only, true/false.

  • hab – [boolean] HAB species only, true/false.

  • wrims – [boolean] WRiMS species only, true/false.

  • dropped – [string] Include dropped records (include) or get dropped records exclusively (true).

  • absence – [string] Include absence records (include) or get absence records exclusively (true).

  • flags – [string] Comma-separated list of quality flags that must be set.

  • exclude – [string] Comma-separated list of quality flags to exclude.

Usage:

from pydwcviz import stats
stats.get_env(scientificname="Mola mola")

pydwcviz.stats.get_years(scientificname=None, taxonid=None, areaid=None, datasetid=None, nodeid=None, startdate=None, enddate=None, startdepth=None, enddepth=None, geometry=None, redlist=None, hab=None, wrims=None, dropped=None, absence=None, flags=None, exclude=None, **kwargs)

Get the number of presence records per year.

Parameters
  • scientificname – [string] Scientific name. Leave empty to include all taxa.

  • taxonid – [string] Taxon AphiaID.

  • areaid – [string] Area ID.

  • datasetid – [string] Dataset UUID.

  • nodeid – [string] Node UUID.

  • startdate – [string] Start date formatted as YYYY-MM-DD.

  • enddate – [string] End date formatted as YYYY-MM-DD.

  • startdepth – [integer] Start depth, in meters.

  • enddepth – [integer] End depth, in meters.

  • geometry – [string] Geometry, formatted as WKT or GeoHash.

  • redlist – [boolean] Red List species only, true/false.

  • hab – [boolean] HAB species only, true/false.

  • wrims – [boolean] WRiMS species only, true/false.

  • dropped – [string] Include dropped records (include) or get dropped records exclusively (true).

  • absence – [string] Include absence records (include) or get absence records exclusively (true).

  • flags – [string] Comma-separated list of quality flags that must be set.

  • exclude – [string] Comma-separated list of quality flags to exclude.

Usage:

from pydwcviz import stats
stats.get_years(scientificname="Mola mola")
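
Years without any records may simply be absent from a per-year result, so it can help to make those gaps explicit before summarizing or plotting. The sketch below assumes the result is a list of dicts with `year` and `records` keys (an assumption; inspect the actual return value first), uses made-up sample values, and defines a hypothetical helper `fill_missing_years` that is not part of pydwcviz.

```python
# Made-up sample values. The list-of-dicts shape with "year" and "records"
# keys is an assumption about what get_years() returns.
sample = [
    {"year": 2001, "records": 12},
    {"year": 2002, "records": 30},
    {"year": 2004, "records": 7},
]


def fill_missing_years(rows):
    """Return (years, counts) covering every year in range, with 0 for gaps."""
    by_year = {row["year"]: row["records"] for row in rows}
    years = list(range(min(by_year), max(by_year) + 1))
    counts = [by_year.get(year, 0) for year in years]
    return years, counts


years, counts = fill_missing_years(sample)
print(years)   # [2001, 2002, 2003, 2004]
print(counts)  # [12, 30, 0, 7]
```

Note that this helper raises a ValueError on an empty list, since there is no year range to fill.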

pydwcviz.stats.get_composition(scientificname=None, taxonid=None, areaid=None, datasetid=None, nodeid=None, startdate=None, enddate=None, startdepth=None, enddepth=None, geometry=None, redlist=None, hab=None, wrims=None, dropped=None, absence=None, flags=None, exclude=None, **kwargs)

Get an overview of taxonomic composition.

Parameters
  • scientificname – [string] Scientific name. Leave empty to include all taxa.

  • taxonid – [string] Taxon AphiaID.

  • areaid – [string] Area ID.

  • datasetid – [string] Dataset UUID.

  • nodeid – [string] Node UUID.

  • startdate – [string] Start date formatted as YYYY-MM-DD.

  • enddate – [string] End date formatted as YYYY-MM-DD.

  • startdepth – [integer] Start depth, in meters.

  • enddepth – [integer] End depth, in meters.

  • geometry – [string] Geometry, formatted as WKT or GeoHash.

  • redlist – [boolean] Red List species only, true/false.

  • hab – [boolean] HAB species only, true/false.

  • wrims – [boolean] WRiMS species only, true/false.

  • dropped – [string] Include dropped records (include) or get dropped records exclusively (true).

  • absence – [string] Include absence records (include) or get absence records exclusively (true).

  • flags – [string] Comma-separated list of quality flags that must be set.

  • exclude – [string] Comma-separated list of quality flags to exclude.

Usage:

from pydwcviz import stats
stats.get_composition(scientificname="Mola mola")

pydwcviz.stats.dist_years(data, interactive=False, **kwargs)

Plot a bar graph of the number of records per year.

Parameters

data – [Dict] Data returned by get_years().

Returns

A matplotlib Axes object when interactive=False, or a plotly Figure object when interactive=True.

Usage:

from pydwcviz import stats
# returns a matplotlib Axes object when interactive=False
stats.dist_years(stats.get_years(taxonid=1071), interactive=False)

# returns a plotly Figure object when interactive=True
stats.dist_years(stats.get_years(taxonid=1071), interactive=True)

pydwcviz.stats.dist_env(data, parameter, interactive=False, **kwargs)

Plot the distribution of records over one environmental parameter: SST, SSS, or depth.

Parameters
  • data – [Dict] Data returned by get_env().

  • parameter – [String] The parameter whose distribution to plot: one of "sst", "sss", or "depth".

Returns

A matplotlib Axes object when interactive=False, or a plotly Figure object when interactive=True.

Usage:

from pydwcviz import stats
# returns a matplotlib Axes object when interactive=False
stats.dist_env(stats.get_env(taxonid=1071), parameter="sst", interactive=False)

# returns a plotly Figure object when interactive=True
stats.dist_env(stats.get_env(taxonid=1071), parameter="sss", interactive=True)
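
To illustrate how the parameter argument relates to the data, here is a minimal sketch of selecting one parameter's bins. It assumes the environmental data is a dict keyed by "sst", "sss", and "depth", each holding a list of dicts with `bin` and `records` keys; that shape is an assumption, the sample values are made up, and `select_bins` is a hypothetical helper, not the library's implementation.

```python
VALID_PARAMETERS = ("sst", "sss", "depth")


def select_bins(data, parameter):
    """Hypothetical helper: pick one parameter's bins from an env result."""
    if parameter not in VALID_PARAMETERS:
        raise ValueError(
            f"parameter must be one of {VALID_PARAMETERS}, got {parameter!r}"
        )
    rows = data[parameter]
    bins = [row["bin"] for row in rows]
    counts = [row["records"] for row in rows]
    return bins, counts


# Made-up values in the assumed shape.
sample_env = {
    "sst": [{"bin": 10, "records": 4}, {"bin": 15, "records": 9}],
    "sss": [{"bin": 35, "records": 13}],
    "depth": [{"bin": 0, "records": 11}, {"bin": 200, "records": 2}],
}

bins, counts = select_bins(sample_env, "sst")
print(bins, counts)  # [10, 15] [4, 9]
```

Validating the parameter name up front gives a clearer error than a KeyError raised deep inside plotting code.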