Documentation for aalibrary.utils
The utils submodule provides the utility functions needed for interacting with cloud providers. These include obtaining metadata about the data files, such as how many files exist in a particular cruise in NCEI, and more.
Modules:
| Name | Description |
|---|---|
| cloud_utils | This file contains all utility functions for Active Acoustics. |
| discrepancies | This file is used to identify discrepancies between what data exists on local versus what exists on the cloud. |
| frequency_data | This module contains the FrequencyData class. |
| gcp_utils | This file contains code pertaining to auxiliary functions related to parsing through our google storage bucket. |
| helpers | For helper functions. |
| ices | |
| nc_reader | This file is used to get header information out of a NetCDF file. |
| ncei_cache_daily_script | Script to get all objects in the NCEI S3 bucket and cache it to BigQuery. |
| ncei_utils | This file contains code pertaining to auxiliary functions related to parsing through NCEI's s3 bucket. |
| sonar_checker | |
| timings | This script deals with the times associated with ingesting/preprocessing ... |
cloud_utils
This file contains all utility functions for Active Acoustics.
Functions:
| Name | Description |
|---|---|
| bq_query_to_pandas | Takes a SQL query and returns the end result as a DataFrame. |
| check_existence_of_supplemental_files | Checks the existence of supplemental files (idx, bot, etc.) for a raw file. |
| check_if_file_exists_in_gcp | Checks whether a particular file exists in GCP using the file path (blob). |
| check_if_file_exists_in_s3 | Checks to see if a file exists in an s3 bucket. |
| check_if_netcdf_file_exists_in_gcp | Checks if a netcdf file exists in GCP storage. |
| count_objects_in_s3_bucket_location | Counts the number of objects within a bucket location. |
| count_subdirectories_in_s3_bucket_location | Counts the number of subdirectories within a bucket location. |
| create_s3_objs | Creates the s3 objects needed for using boto3 for a particular bucket. |
| delete_file_from_gcp | Deletes a file from the storage bucket. |
| download_file_from_gcp | Downloads a file from the blob storage bucket. |
| download_file_from_gcp_as_string | Downloads a file from the blob storage bucket as a text string. |
| get_data_lake_directory_client | Creates a data lake directory client. |
| get_object_key_for_s3 | Creates an object key for a file within s3. |
| get_service_client_sas | Gets an azure service client using an SAS (shared access signature) token. |
| get_subdirectories_in_s3_bucket_location | Gets a list of all the subdirectories in a specific bucket location. |
| list_all_folders_in_gcp_bucket_location | Lists all of the folders in a GCP storage bucket location. |
| list_all_objects_in_gcp_bucket_location | Gets all of the files within a GCP storage bucket location. |
| list_all_objects_in_s3_bucket_location | Lists all of the objects in an s3 bucket location denoted by prefix. |
| setup_gbq_client_objs | Sets up Google Big Query client objects used to execute queries. |
| setup_gcp_storage_objs | Sets up Google Cloud Platform storage objects for use in accessing and modifying storage buckets. |
| upload_file_to_gcp_bucket | Uploads a file to the blob storage bucket. |
bq_query_to_pandas(client=None, query='')
Takes a SQL query and returns the end result as a DataFrame.
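A minimal usage sketch. The query string and dataset name are hypothetical placeholders; the client comes from setup_gbq_client_objs, documented below:

```python
from aalibrary.utils.cloud_utils import setup_gbq_client_objs, bq_query_to_pandas

# Create the BigQuery client first (see setup_gbq_client_objs below).
bq_client, gcs_file_system = setup_gbq_client_objs()

# Hypothetical query; substitute your own dataset and table.
df = bq_query_to_pandas(
    client=bq_client,
    query="SELECT * FROM `my_dataset.my_table` LIMIT 10",
)
print(df.head())
```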
check_existence_of_supplemental_files(file_name='', file_type='raw', ship_name='', survey_name='', echosounder='', debug=False)
Checks the existence of supplemental files (idx, bot, etc.) for a raw file. Will check for existence in all data sources.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_name | str | The file name (includes extension). Defaults to "". | '' |
| file_type | str | The file type (do not include the dot "."). Defaults to "raw". | 'raw' |
| ship_name | str | The ship name associated with this survey. Defaults to "". | '' |
| survey_name | str | The survey name/identifier. Defaults to "". | '' |
| echosounder | str | The echosounder used to gather the data. Defaults to "". | '' |
| debug | bool | Whether or not to print debug statements. Defaults to False. | False |

Returns:

| Name | Type | Description |
|---|---|---|
| RawFile | RawFile | Returns a RawFile object; existence can be accessed as a boolean via the variables within. Ex. rf.idx_file_exists_in_ncei |
Source code in src\aalibrary\utils\cloud_utils.py
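A usage sketch, borrowing the Reuben Lasker example file used elsewhere in these docs; the file and survey names are illustrative:

```python
from aalibrary.utils.cloud_utils import check_existence_of_supplemental_files

rf = check_existence_of_supplemental_files(
    file_name="2107RL_CW-D20210813-T220732.raw",  # example file from the NCEI docs below
    file_type="raw",
    ship_name="Reuben_Lasker",
    survey_name="RL2107",
    echosounder="EK80",
)
# Existence flags are exposed as booleans on the returned RawFile object.
print(rf.idx_file_exists_in_ncei)
```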
check_if_file_exists_in_gcp(bucket=None, file_path='')
Checks whether a particular file exists in GCP using the file path (blob).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| bucket | Bucket | The bucket object used to check for the file. Defaults to None. | None |
| file_path | str | The blob file path within the bucket. Defaults to "". | '' |

Returns:

| Name | Type | Description |
|---|---|---|
| bool | bool | True if the file already exists, False otherwise. |
Source code in src\aalibrary\utils\cloud_utils.py
check_if_file_exists_in_s3(object_key='', s3_resource=None, s3_bucket_name='')
Checks to see if a file exists in an s3 bucket. Intended for use with NCEI, but will work with other s3 buckets as well.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| object_key | str | The object key (location of the object). Defaults to "". | '' |
| s3_resource | resource | The boto3 resource for this particular bucket. Defaults to None. | None |
| s3_bucket_name | str | The bucket name. Defaults to "". | '' |

Returns:

| Name | Type | Description |
|---|---|---|
| bool | bool | True if the file exists within the bucket, False otherwise. |
Source code in src\aalibrary\utils\cloud_utils.py
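A sketch combining this with create_s3_objs (documented below). The object key follows the NCEI template data/raw/{ship_name}/{survey_name}/{echosounder}/{file_name} described in the helpers section; the file name is illustrative:

```python
from aalibrary.utils.cloud_utils import create_s3_objs, check_if_file_exists_in_s3

# create_s3_objs returns the client, resource, and bucket for the NCEI bucket.
s3_client, s3_resource, s3_bucket = create_s3_objs()

exists = check_if_file_exists_in_s3(
    object_key="data/raw/Reuben_Lasker/RL2107/EK80/2107RL_CW-D20210813-T220732.raw",
    s3_resource=s3_resource,
    s3_bucket_name="noaa-wcsd-pds",
)
print(exists)
```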
check_if_netcdf_file_exists_in_gcp(file_name='', ship_name='', survey_name='', echosounder='', data_source='', gcp_storage_bucket_location='', gcp_bucket=None, debug=False)
Checks if a netcdf file exists in GCP storage. If the bucket location is not specified, it will use the helpers to parse the correct location.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_name | str | The file name (includes extension). Defaults to "". | '' |
| ship_name | str | The ship name associated with this survey. Defaults to "". | '' |
| survey_name | str | The survey name/identifier. Defaults to "". | '' |
| echosounder | str | The echosounder used to gather the data. Defaults to "". | '' |
| data_source | str | The source of the file. Necessary due to the way the storage bucket is organized. Can be one of ["NCEI", "OMAO", "HDD"]. Defaults to "". | '' |
| gcp_storage_bucket_location | str | The string representing the blob's location within the storage bucket. Defaults to "". | '' |
| gcp_bucket | Bucket | The bucket object used for downloading. | None |
| debug | bool | Whether or not to print debug statements. Defaults to False. | False |

Returns:

| Name | Type | Description |
|---|---|---|
| bool | bool | True if the file exists in GCP, False otherwise. |
Source code in src\aalibrary\utils\cloud_utils.py
count_objects_in_s3_bucket_location(prefix='', bucket=None)
Counts the number of objects within a bucket location. NOTE: This DOES NOT include folders, as those do not count as objects.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prefix | str | The bucket location. Defaults to "". | '' |
| bucket | resource | The bucket resource object. Defaults to None. | None |

Returns:

| Name | Type | Description |
|---|---|---|
| int | int | The count of objects within the location. |
Source code in src\aalibrary\utils\cloud_utils.py
count_subdirectories_in_s3_bucket_location(prefix='', bucket=None)
Counts the number of subdirectories within a bucket location.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prefix | str | The bucket location. Defaults to "". | '' |
| bucket | resource | The bucket resource object. Defaults to None. | None |

Returns:

| Name | Type | Description |
|---|---|---|
| int | int | The count of subdirectories within the location. |
Source code in src\aalibrary\utils\cloud_utils.py
create_s3_objs(bucket_name='noaa-wcsd-pds')
Creates the s3 objects needed for using boto3 for a particular bucket.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| bucket_name | str | The bucket you want to refer to. The default points to the NCEI bucket. Defaults to "noaa-wcsd-pds". | 'noaa-wcsd-pds' |

Returns:

| Name | Type | Description |
|---|---|---|
| Tuple | Tuple | The s3 client (used for certain portions of the boto3 api), the s3 resource (the newer, more widely used object for accessing s3 buckets), and the actual s3 bucket itself. |
Source code in src\aalibrary\utils\cloud_utils.py
delete_file_from_gcp(gcp_bucket, blob_file_path)
Deletes a file from the storage bucket.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| gcp_bucket | bucket | The bucket object to delete from. | required |
| blob_file_path | str | The blob's file path. Ex. "data/itds/logs/execute_rasp_ii/temp.csv" NOTE: This must include the file name as well as the extension. | required |

Raises:

| Type | Description |
|---|---|
| AssertionError | If the file does not exist in GCP. |
| Exception | If there is an error deleting the file. |
Source code in src\aalibrary\utils\cloud_utils.py
download_file_from_gcp(gcp_bucket, blob_file_path, local_file_path, debug=False)
Downloads a file from the blob storage bucket.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| gcp_bucket | bucket | The bucket object used for downloading from. | required |
| blob_file_path | str | The blob's file path. Ex. "data/itds/logs/execute_rasp_ii/temp.csv" NOTE: This must include the file name as well as the extension. | required |
| local_file_path | str | The local file path you wish to download the blob to. | required |
| debug | bool | Whether or not to print debug statements. | False |
Source code in src\aalibrary\utils\cloud_utils.py
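A sketch that pairs this with setup_gcp_storage_objs (documented below); the blob path is the example from the docstring and the local path is arbitrary:

```python
from aalibrary.utils.cloud_utils import setup_gcp_storage_objs, download_file_from_gcp

# Set up the storage client, bucket name, and bucket object.
storage_client, bucket_name, gcp_bucket = setup_gcp_storage_objs()

download_file_from_gcp(
    gcp_bucket=gcp_bucket,
    blob_file_path="data/itds/logs/execute_rasp_ii/temp.csv",  # example blob from the docstring
    local_file_path="temp.csv",
    debug=True,
)
```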
download_file_from_gcp_as_string(gcp_bucket, blob_file_path)
Downloads a file from the blob storage bucket as a text string.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| gcp_bucket | bucket | The bucket object used for downloading from. | required |
| blob_file_path | str | The blob's file path. Ex. "data/itds/logs/execute_rasp_ii/temp.csv" NOTE: This must include the file name as well as the extension. | required |
Source code in src\aalibrary\utils\cloud_utils.py
get_data_lake_directory_client(config_file_path='')
Creates a data lake directory client. Returns an object of type DataLakeServiceClient.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| config_file_path | str | The location of the config file. Needs a connection string. Defaults to "". | '' |

Returns:

| Name | Type | Description |
|---|---|---|
| DataLakeServiceClient | DataLakeServiceClient | An object of type DataLakeServiceClient, connected to the connection string described in the config. |
Source code in src\aalibrary\utils\cloud_utils.py
get_object_key_for_s3(file_url='', file_name='', ship_name='', survey_name='', echosounder='')
Creates an object key for a file within s3 given the parameters above.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_url | str | The entire url to the file resource in s3. Starts with "https://" or "s3://". Defaults to "". NOTE: If this is specified, there is no need to provide the other parameters. | '' |
| file_name | str | The file name (includes extension). Defaults to "". | '' |
| ship_name | str | The ship name associated with this survey. Defaults to "". | '' |
| survey_name | str | The survey name/identifier. Defaults to "". | '' |
| echosounder | str | The echosounder used to gather the data. Defaults to "". | '' |
Source code in src\aalibrary\utils\cloud_utils.py
get_service_client_sas(account_name, sas_token)
Gets an azure service client using an SAS (shared access signature) token. The token must be created in Azure.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| account_name | str | The name of the account you are trying to create a service client with. This is usually a storage account that is attached to the container. | required |
| sas_token | str | The complete SAS token. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| DataLakeServiceClient | DataLakeServiceClient | An object of type DataLakeServiceClient, with connection to the container/file the SAS allows access to. |
Source code in src\aalibrary\utils\cloud_utils.py
get_subdirectories_in_s3_bucket_location(prefix='', s3_client=None, return_full_paths=False, bucket_name='noaa-wcsd-pds')
Gets a list of all the subdirectories in a specific bucket location (called a prefix). The return can be with full paths (root to folder inclusive), or just the folder names.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prefix | str | The bucket folder location. Defaults to "". | '' |
| s3_client | client | The bucket client object. Defaults to None. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the subdirectory names listed. Defaults to False. | False |
| bucket_name | str | The bucket name. Defaults to "noaa-wcsd-pds". | 'noaa-wcsd-pds' |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings, each being a subdirectory. Whether these are full paths or just folder names is specified by the return_full_paths parameter. |
Source code in src\aalibrary\utils\cloud_utils.py
list_all_folders_in_gcp_bucket_location(location='', gcp_bucket=None, return_full_paths=True)
Lists all of the folders in a GCP storage bucket location.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| location | str | The blob location you would like to get the folders of. Defaults to "". | '' |
| gcp_bucket | bucket | The gcp bucket to use. Defaults to None. | None |
| return_full_paths | bool | Whether or not to return full paths. Defaults to True. | True |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings containing the folder names or full paths. |
Source code in src\aalibrary\utils\cloud_utils.py
list_all_objects_in_gcp_bucket_location(location='', gcp_bucket=None)
Gets all of the files within a GCP storage bucket location.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| location | str | The location to search for files. Ex. "NCEI/Reuben_Lasker/RL2107". Defaults to "". | '' |
| gcp_bucket | bucket | The gcp bucket to use. Defaults to None. | None |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings containing all URIs for each file in the bucket. |
Source code in src\aalibrary\utils\cloud_utils.py
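A sketch using the example location from the docstring; it assumes a bucket object from setup_gcp_storage_objs:

```python
from aalibrary.utils.cloud_utils import (
    setup_gcp_storage_objs,
    list_all_objects_in_gcp_bucket_location,
)

_, _, gcp_bucket = setup_gcp_storage_objs()

# Example location taken from the docstring above.
uris = list_all_objects_in_gcp_bucket_location(
    location="NCEI/Reuben_Lasker/RL2107",
    gcp_bucket=gcp_bucket,
)
print(len(uris))
```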
list_all_objects_in_s3_bucket_location(prefix='', s3_resource=None, return_full_paths=False, bucket_name='noaa-wcsd-pds')
Lists all of the objects in an s3 bucket location denoted by prefix. Returns a list of strings; you get full paths if you set the return_full_paths parameter.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| prefix | str | The bucket location. Defaults to "". | '' |
| s3_resource | resource | The bucket resource object. Defaults to None. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the subdirectory names listed. Defaults to False. | False |
| bucket_name | str | The bucket name. Defaults to "noaa-wcsd-pds". | 'noaa-wcsd-pds' |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings containing either the object's name or path, dependent on the return_full_paths parameter. |
Source code in src\aalibrary\utils\cloud_utils.py
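A sketch against the default NCEI bucket; the prefix follows the data/raw/{ship_name}/{survey_name}/ layout described in the helpers section:

```python
from aalibrary.utils.cloud_utils import create_s3_objs, list_all_objects_in_s3_bucket_location

# Only the s3 resource is needed here.
_, s3_resource, _ = create_s3_objs()

object_paths = list_all_objects_in_s3_bucket_location(
    prefix="data/raw/Reuben_Lasker/RL2107/",
    s3_resource=s3_resource,
    return_full_paths=True,
)
print(object_paths[:5])
```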
setup_gbq_client_objs(location='US', project_id='ggn-nmfs-aa-dev-1')
Sets up Google Big Query client objects used to execute queries and such.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| location | str | The location of the big-query tables/database. This is usually set when creating the database in big query. Defaults to "US". | 'US' |
| project_id | str | The project id that the big query instance belongs to. Defaults to "ggn-nmfs-aa-dev-1". | 'ggn-nmfs-aa-dev-1' |

Returns:

| Name | Type | Description |
|---|---|---|
| Tuple | Tuple[Client, GCSFileSystem] | The big query client object, along with an object for the Google Cloud Storage file system. |
Source code in src\aalibrary\utils\cloud_utils.py
setup_gcp_storage_objs(project_id='ggn-nmfs-aa-dev-1', gcp_bucket_name='ggn-nmfs-aa-dev-1-data')
Sets up Google Cloud Platform storage objects for use in accessing and modifying storage buckets.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| project_id | str | The project id of the project you want to access. Defaults to "ggn-nmfs-aa-dev-1". | 'ggn-nmfs-aa-dev-1' |
| gcp_bucket_name | str | The name of the exact bucket you want to access. Defaults to "ggn-nmfs-aa-dev-1-data". | 'ggn-nmfs-aa-dev-1-data' |

Returns:

| Type | Description |
|---|---|
| Tuple[Client, str, bucket] | The storage client, followed by the GCP bucket name (str), and then the actual bucket object itself (which will be executing the commands used in this api). |
Source code in src\aalibrary\utils\cloud_utils.py
upload_file_to_gcp_bucket(bucket, blob_file_path, local_file_path, debug=False)
Uploads a file to the blob storage bucket.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| bucket | bucket | The bucket object used for uploading. | required |
| blob_file_path | str | The blob's file path. Ex. "data/itds/logs/execute_code_files/temp.csv" NOTE: This must include the file name as well as the extension. | required |
| local_file_path | str | The local file path you wish to upload to the blob. | required |
| debug | bool | Whether or not to print debug statements. | False |
Source code in src\aalibrary\utils\cloud_utils.py
discrepancies
This file is used to identify discrepancies between what data exists on local versus what exists on the cloud. It considers the following things when comparing:

* Number of files per cruise
* File Name/Types
* File Sizes
* Checksum
Functions:
| Name | Description |
|---|---|
| compare_local_cruise_files_to_cloud | Compares the locally stored cruise files (per echosounder) to what exists on the cloud. |
| get_local_file_size | Gets the size of a local file in bytes. |
| get_local_sha256_checksum | Calculates the SHA256 checksum of a file. |
compare_local_cruise_files_to_cloud(local_cruise_file_path='', ship_name='', survey_name='', echosounder='')
Compares the locally stored cruise files (per echosounder) to what exists on the cloud by number of files, file sizes, and checksums. Reports any discrepancies in the console.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| local_cruise_file_path | str | The folder path for the locally stored cruise data. Defaults to "". | '' |
| ship_name | str | The ship name that the cruise falls under. Defaults to "". | '' |
| survey_name | str | The survey/cruise name. Defaults to "". | '' |
| echosounder | str | The specific echosounder you want to check. Defaults to "". | '' |
Source code in src\aalibrary\utils\discrepancies.py
get_local_file_size(local_file_path)
Gets the size of a local file in bytes.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| local_file_path | str | The local file path. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| int | int | The size of the file in bytes. |
Source code in src\aalibrary\utils\discrepancies.py
get_local_sha256_checksum(local_file_path, chunk_size=65536)
Calculates the SHA256 checksum of a file.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| local_file_path | str | The path to the file. | required |
| chunk_size | int | The size of chunks to read the file in (in bytes). Larger chunks can be more efficient for large files. | 65536 |

Returns:

| Name | Type | Description |
|---|---|---|
| str | str | The SHA256 checksum of the file as a hexadecimal string. |
Source code in src\aalibrary\utils\discrepancies.py
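The chunked approach described above can be sketched as follows. This is a hypothetical re-implementation for illustration, not the library's source:

```python
import hashlib

def sha256_checksum(local_file_path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files never load fully into memory."""
    digest = hashlib.sha256()
    with open(local_file_path, "rb") as f:
        # iter() with a sentinel stops once read() returns the empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```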
frequency_data
This module contains the FrequencyData class.
Classes:
| Name | Description |
|---|---|
| FrequencyData | Given some dataset 'Sv', list all frequencies available. This class offers methods which help map out frequencies and channels plus additional utilities. |

Functions:

| Name | Description |
|---|---|
| main | Opens a sample netCDF file and constructs a FrequencyData object to extract frequency information from it. |
FrequencyData
Given some dataset 'Sv', list all frequencies available. This class offers methods which help map out frequencies and channels plus additional utilities.
Methods:
| Name | Description |
|---|---|
| __init__ | Initializes class object and parses the frequencies available within the echodata object. |
| construct_frequency_list | Parses the frequencies available in the xarray 'Sv'. |
| construct_frequency_map | Either using a channel_list or a frequency_list, provides one which satisfies all requirements of this class structure. |
| construct_frequency_pair_combination_list | Returns a list of tuple elements containing frequency combinations. |
| construct_frequency_set_combination_list | Constructs a list of available frequency set permutations. |
| powerset | Generates combinations of elements of iterables. |
| print_frequency_list | Prints each frequency element available in Sv. |
| print_frequency_pair_combination_list | Prints frequency combination list one element at a time. |
| print_frequency_set_combination_list | Prints frequency combination list one element at a time. |
Source code in src\aalibrary\utils\frequency_data.py
__init__(Sv)
Initializes class object and parses the frequencies available within the echodata object (xarray.Dataset) 'Sv'.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| Sv | Dataset | The 'Sv' echodata object. | required |
Source code in src\aalibrary\utils\frequency_data.py
construct_frequency_list()
Parses the frequencies available in the xarray 'Sv'
Source code in src\aalibrary\utils\frequency_data.py
construct_frequency_map(frequencies_provided=True)
Either using a channel_list or a frequency_list, this function provides one which satisfies all requirements of this class structure. In particular, the channels and frequencies involved have to be known and mapped to one another.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| frequencies_provided | boolean | Whether a frequency_list was provided at object creation: 'True' if so, 'False' if a channel_list was used instead. | True |
Source code in src\aalibrary\utils\frequency_data.py
construct_frequency_pair_combination_list()
Returns a list of tuple elements containing frequency combinations which is useful for the KMeansOperator class.
Returns:

| Type | Description |
|---|---|
| List[Tuple] | A list of frequency-pair tuples. |
Source code in src\aalibrary\utils\frequency_data.py
construct_frequency_set_combination_list()
Constructs a list of available frequency set permutations. Example : [ ('18 kHz',), ('38 kHz',), ('120 kHz',), ('200 kHz',), ('18 kHz', '38 kHz'), ('18 kHz', '120 kHz'), ('18 kHz', '200 kHz'), ('38 kHz', '120 kHz'), ('38 kHz', '200 kHz'), ('120 kHz', '200 kHz'), ('18 kHz', '38 kHz', '120 kHz'), ('18 kHz', '38 kHz', '200 kHz'),('18 kHz', '120 kHz', '200 kHz'), ('38 kHz', '120 kHz', '200 kHz'), ('18 kHz', '38 kHz', '120 kHz', '200 kHz')]
Returns:

| Type | Description |
|---|---|
| List[Tuple] | A list of frequency-set tuples. |
Source code in src\aalibrary\utils\frequency_data.py
powerset(iterable)
Generates combinations of elements of iterables ; powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| iterable | iterable | A list. | required |
Returns combinations of elements of iterables.
Source code in src\aalibrary\utils\frequency_data.py
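The documented behavior matches the classic itertools powerset recipe; a self-contained sketch:

```python
from itertools import chain, combinations

def powerset(iterable):
    # All combinations of every length, from the empty tuple up to the full set.
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

print(list(powerset([1, 2, 3])))
# [(), (1,), (2,), (3,), (1, 2), (1, 3), (2, 3), (1, 2, 3)]
```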
print_frequency_list()
Prints each frequency element available in Sv.
Source code in src\aalibrary\utils\frequency_data.py
print_frequency_pair_combination_list()
Prints frequency combination list one element at a time.
Source code in src\aalibrary\utils\frequency_data.py
print_frequency_set_combination_list()
Prints frequency combination list one element at a time.
Source code in src\aalibrary\utils\frequency_data.py
main()
Opens a sample netCDF file and constructs a FrequencyData object to extract frequency information from it.
Source code in src\aalibrary\utils\frequency_data.py
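A minimal sketch of constructing a FrequencyData object, assuming echopype is used to produce the 'Sv' dataset; the raw file name and sonar model are hypothetical:

```python
import echopype as ep
from aalibrary.utils.frequency_data import FrequencyData

# Hypothetical EK60 raw file; compute_Sv yields the xarray.Dataset 'Sv'.
ed = ep.open_raw("sample.raw", sonar_model="EK60")
Sv = ep.calibrate.compute_Sv(ed)

fd = FrequencyData(Sv)
fd.print_frequency_list()
pairs = fd.construct_frequency_pair_combination_list()
```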
gcp_utils
This file contains code pertaining to auxiliary functions related to parsing through our google storage bucket.
Functions:
| Name | Description |
|---|---|
| get_all_echosounders_in_a_survey_in_storage_bucket | Gets all of the echosounders in a survey in a GCP storage bucket. |
| get_all_ship_names_in_gcp_bucket | Gets all of the ship names within a GCP storage bucket. |
| get_all_survey_names_from_a_ship_in_storage_bucket | Gets all of the survey names from a particular ship in a GCP storage bucket. |
| get_all_surveys_in_storage_bucket | Gets all of the surveys in a GCP storage bucket. |
get_all_echosounders_in_a_survey_in_storage_bucket(ship_name='', survey_name='', project_id='ggn-nmfs-aa-dev-1', gcp_bucket_name='ggn-nmfs-aa-dev-1-data', gcp_bucket=None, return_full_paths=False)
Gets all of the echosounders in a survey in a GCP storage bucket.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship's name you want to get all surveys from. Will get normalized to GCP standards. Defaults to "". | '' |
| survey_name | str | The survey name/identifier. Defaults to "". | '' |
| project_id | str | The GCP project ID that the storage bucket resides in. Defaults to "ggn-nmfs-aa-dev-1". | 'ggn-nmfs-aa-dev-1' |
| gcp_bucket_name | str | The GCP storage bucket name. Defaults to "ggn-nmfs-aa-dev-1-data". | 'ggn-nmfs-aa-dev-1-data' |
| gcp_bucket | bucket | The GCP storage bucket client object. If None, one will be created for you based on the project_id and gcp_bucket_name provided. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the echosounder names listed. Defaults to False. | False |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings containing the echosounder names that exist in a survey. |
Source code in src\aalibrary\utils\gcp_utils.py
get_all_ship_names_in_gcp_bucket(project_id='ggn-nmfs-aa-dev-1', gcp_bucket_name='ggn-nmfs-aa-dev-1-data', gcp_bucket=None, return_full_paths=False)
Gets all of the ship names within a GCP storage bucket.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| project_id | str | The GCP project ID that the storage bucket resides in. Defaults to "ggn-nmfs-aa-dev-1". | 'ggn-nmfs-aa-dev-1' |
| gcp_bucket_name | str | The GCP storage bucket name. Defaults to "ggn-nmfs-aa-dev-1-data". | 'ggn-nmfs-aa-dev-1-data' |
| gcp_bucket | bucket | The GCP storage bucket client object. If None, one will be created for you based on the project_id and gcp_bucket_name provided. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the subdirectory names listed. Defaults to False. NOTE: You can set this parameter to ... | False |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings containing the ship names. |
Source code in src\aalibrary\utils\gcp_utils.py
get_all_survey_names_from_a_ship_in_storage_bucket(ship_name='', project_id='ggn-nmfs-aa-dev-1', gcp_bucket_name='ggn-nmfs-aa-dev-1-data', gcp_bucket=None, return_full_paths=False)
Gets all of the survey names from a particular ship in a GCP storage bucket.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship's name you want to get all surveys from. Will get normalized to GCP standards. Defaults to "". | '' |
| project_id | str | The GCP project ID that the storage bucket resides in. Defaults to "ggn-nmfs-aa-dev-1". | 'ggn-nmfs-aa-dev-1' |
| gcp_bucket_name | str | The GCP storage bucket name. Defaults to "ggn-nmfs-aa-dev-1-data". | 'ggn-nmfs-aa-dev-1-data' |
| gcp_bucket | bucket | The GCP storage bucket client object. If None, one will be created for you based on the project_id and gcp_bucket_name provided. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the survey names listed. Defaults to False. | False |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings containing the survey names. |
Source code in src\aalibrary\utils\gcp_utils.py
get_all_surveys_in_storage_bucket(project_id='ggn-nmfs-aa-dev-1', gcp_bucket_name='ggn-nmfs-aa-dev-1-data', gcp_bucket=None, return_full_paths=False)
Gets all of the surveys in a GCP storage bucket.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| project_id | str | The GCP project ID that the storage bucket resides in. Defaults to "ggn-nmfs-aa-dev-1". | 'ggn-nmfs-aa-dev-1' |
| gcp_bucket_name | str | The GCP storage bucket name. Defaults to "ggn-nmfs-aa-dev-1-data". | 'ggn-nmfs-aa-dev-1-data' |
| gcp_bucket | bucket | The GCP storage bucket client object. If None, one will be created for you based on the project_id and gcp_bucket_name provided. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the survey names listed. Defaults to False. | False |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings containing the survey names. |
Source code in src\aalibrary\utils\gcp_utils.py
helpers
For helper functions.
Functions:
| Name | Description |
|---|---|
| check_for_assertion_errors | Checks for errors in the kwargs provided. |
| create_azure_config_file | Creates an empty config file for azure storage keys. |
| get_all_objects_in_survey_from_ncei | Gets all of the object keys from a ship survey from the NCEI database. |
| get_all_ship_objects_from_ncei | Gets all of the object keys from a ship from the NCEI database. |
| get_file_name_from_url | Extracts the file name from a given storage bucket url, including the file extension. |
| get_file_paths_via_json_link | Gets the links from a json request and parses the contents of that url into a json object. |
| get_netcdf_gcp_location_from_raw_gcp_location | Gets the netcdf location of a raw file within GCP. |
| normalize_ship_name | Normalizes a ship's name for a deterministic file structure within our GCP storage bucket. |
| parse_correct_gcp_storage_bucket_location | Calculates the correct gcp storage location based on data source, file type, and whether the file is metadata. |
| parse_variables_from_ncei_file_url | Gets the file variables associated with a file url in NCEI. |
check_for_assertion_errors(**kwargs)
Checks for errors in the kwargs provided.
Source code in src\aalibrary\utils\helpers.py
create_azure_config_file(download_directory='')
Creates an empty config file for azure storage keys.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| download_directory | str | The directory to store the azure config file. Defaults to "". | '' |
Source code in src\aalibrary\utils\helpers.py
get_all_objects_in_survey_from_ncei(ship_name='', survey_name='', s3_bucket=None)
Gets all of the object keys from a ship survey from the NCEI database.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The name of the ship. Must be title-case and have spaces substituted for underscores. Defaults to "". | '' |
| survey_name | str | The name of the survey. Must match what we have in the NCEI database. Defaults to "". | '' |
| s3_bucket | resource | The boto3 bucket resource for the bucket that the ship data resides in. Defaults to None. | None |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings, each one being an object key (path to the object inside of the bucket). |
Source code in src\aalibrary\utils\helpers.py
get_all_ship_objects_from_ncei(ship_name='', bucket=None)
Gets all of the object keys from a ship from the NCEI database.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The name of the ship. Must be title-case and have spaces substituted for underscores. Defaults to "". | '' |
| bucket | resource | The boto3 bucket resource for the bucket that the ship data resides in. Defaults to None. | None |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings, each one being an object key (path to the object inside of the bucket). |
Source code in src\aalibrary\utils\helpers.py
get_file_name_from_url(url='')
Extracts the file name from a given storage bucket url. Includes the file extension.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| url | str | The full url of the storage object. Defaults to "". Example: "https://noaa-wcsd-pds.s3.amazonaws.com/data/raw/Reuben_Lasker/RL2107/EK80/2107RL_CW-D20210813-T220732.raw" | '' |

Returns:

| Name | Type | Description |
|---|---|---|
| str | str | The file name. Example: 2107RL_CW-D20210813-T220732.raw |
Source code in src\aalibrary\utils\helpers.py
get_file_paths_via_json_link(link='')
This function helps in getting the links from a json request, parsing the contents of that url into a json object. The output is a json of the filename, and the cloud path link (s3 bucket link). Code from: https://www.ngdc.noaa.gov/mgg/wcd/S3_download.html
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| link | str | The link to the json url. Defaults to "". | '' |
Source code in src\aalibrary\utils\helpers.py
get_netcdf_gcp_location_from_raw_gcp_location(gcp_storage_bucket_location='')
Gets the netcdf location of a raw file within GCP.
Source code in src\aalibrary\utils\helpers.py
normalize_ship_name(ship_name='')
Normalizes a ship's name. This is necessary for creating a deterministic file structure within our GCP storage bucket. The ship name is returned as a Title_Cased_And_Snake_Cased ship name, with no punctuation. Ex. HENRY B. BIGELOW will return Henry_B_Bigelow.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship name string. Defaults to "". | '' |

Returns:

| Name | Type | Description |
|---|---|---|
| str | str | The formatted and normalized version of the ship name. |
Source code in src\aalibrary\utils\helpers.py
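For example (the expected output is taken from the docstring above):

```python
from aalibrary.utils.helpers import normalize_ship_name

print(normalize_ship_name(ship_name="HENRY B. BIGELOW"))
# Henry_B_Bigelow
```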
parse_correct_gcp_storage_bucket_location(file_name='', file_type='', ship_name='', survey_name='', echosounder='', data_source='', is_metadata=False, is_survey_metadata=False, debug=False)
Calculates the correct gcp storage location based on data source, file type, and if the file is metadata or not.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_name | str | The file name (includes extension). Defaults to "". | '' |
| file_type | str | The file type (do not include the dot "."). Defaults to "". | '' |
| ship_name | str | The ship name associated with this survey. Defaults to "". | '' |
| survey_name | str | The survey name/identifier. Defaults to "". | '' |
| echosounder | str | The echosounder used to gather the data. Defaults to "". | '' |
| data_source | str | The source of the data. Can be one of ["NCEI", "OMAO"]. Defaults to "". | '' |
| is_metadata | bool | Whether or not the file is a metadata file. Necessary since files that are considered metadata (metadata json, or readmes) are stored in a separate directory. Defaults to False. | False |
| is_survey_metadata | bool | Whether or not the file is a metadata file associated with a survey. The files are stored at the survey level, in the ... Defaults to False. | False |
| debug | bool | Whether or not to print debug statements. Defaults to False. | False |

Returns:

| Name | Type | Description |
|---|---|---|
| str | str | The correctly parsed GCP storage bucket location. |
Source code in src\aalibrary\utils\helpers.py
parse_variables_from_ncei_file_url(url='')
Gets the file variables associated with a file url in NCEI. File urls in NCEI follow this template: data/raw/{ship_name}/{survey_name}/{echosounder}/{file_name}
NOTE: file_name will include the extension.
Source code in src\aalibrary\utils\helpers.py
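A sketch using the template above; the exact return shape is not documented here, so treat the printed result as an assumption:

```python
from aalibrary.utils.helpers import parse_variables_from_ncei_file_url

# The url follows data/raw/{ship_name}/{survey_name}/{echosounder}/{file_name}.
parsed = parse_variables_from_ncei_file_url(
    url="data/raw/Reuben_Lasker/RL2107/EK80/2107RL_CW-D20210813-T220732.raw"
)
print(parsed)
```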
ices
Functions:
| Name | Description |
|---|---|
| correct_dimensions_ices | Extracts angle data from echopype DataArray. |
| echopype_ek60_raw_to_ices_netcdf | Writes echodata Beam_group ds to a Beam_groupX netcdf file. |
| echopype_ek80_raw_to_ices_netcdf | Writes echodata Beam_group ds to a Beam_groupX netcdf file. |
| ragged_data_type_ices | Transforms a gridded 4 dimensional variable from an Echodata object into a ragged array representation. |
| write_ek60_beamgroup_to_netcdf | Writes echopype Beam_group ds to a Beam_groupX netcdf file. |
| write_ek80_beamgroup_to_netcdf | Writes echodata Beam_group ds to a Beam_groupX netcdf file. |
correct_dimensions_ices(echodata, variable_name='')
Extracts angle data from echopype DataArray.
Args:

* echodata (echopype.DataArray): Echopype echodata object containing data.
* variable_name (str): The name of the variable that needs to be transformed to a ragged array representation.

Returns: np.array with the correct dimensions as specified by the ICES netcdf convention.
Source code in src\aalibrary\utils\ices.py
echopype_ek60_raw_to_ices_netcdf(echodata, export_file)
Writes echodata Beam_group ds to a Beam_groupX netcdf file.
Args:

* echodata (echopype.echodata): Echopype echodata object containing the beam_group_data to be written.
* export_file (str or Path): Path to the NetCDF file.
Source code in src\aalibrary\utils\ices.py
echopype_ek80_raw_to_ices_netcdf(echodata, export_file)
Writes echodata Beam_group ds to a Beam_groupX netcdf file.
Args:

* echodata (echopype.echodata): Echopype echodata object containing the beam_group_data to be written.
* export_file (str or Path): Path to the NetCDF file.
Source code in src\aalibrary\utils\ices.py
ragged_data_type_ices(echodata, variable_name='')
Transforms a gridded 4 dimensional variable from an Echodata object into a ragged array representation.
Args:

* echodata (echopype.Echodata): Echopype echodata object containing a variable in the Beam_group1.
* variable_name (str): The name of the variable that needs to be transformed to a ragged array representation.

Returns: ICES-compliant np array of type object.
Source code in src\aalibrary\utils\ices.py
write_ek60_beamgroup_to_netcdf(echodata, export_file)
Writes echopype Beam_group ds to a Beam_groupX netcdf file.
Args:

* echodata (echopype.DataArray): Echopype DataArray to be written.
* export_file (str or Path): Path to the output NetCDF file.
Source code in src\aalibrary\utils\ices.py
write_ek80_beamgroup_to_netcdf(echodata, export_file)
Writes echodata Beam_group ds to a Beam_groupX netcdf file.
Args:

* echodata (echopype.echodata): Echopype echodata object containing the beam_group_data to be written.
* export_file (str or Path): Path to the NetCDF file.
Source code in src\aalibrary\utils\ices.py
nc_reader
This file is used to get header information out of a NetCDF file. The code reads a .nc file and returns a dict with all of the attributes gathered.
Functions:
| Name | Description |
|---|---|
| get_netcdf_header | Reads a NetCDF file and returns its header as a dictionary. |
get_netcdf_header(file_path)
Reads a NetCDF file and returns its header as a dictionary.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Path to the NetCDF file. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| dict | dict | Dictionary containing global attributes, dimensions, and variables. |
Source code in src\aalibrary\utils\nc_reader.py
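A minimal sketch; the .nc path is hypothetical, and the exact dictionary keys are whatever the reader gathers from the file:

```python
from aalibrary.utils.nc_reader import get_netcdf_header

header = get_netcdf_header(file_path="sample.nc")  # hypothetical local NetCDF file
for key, value in header.items():
    print(key, value)
```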
ncei_cache_daily_script
Script to get all objects in the NCEI S3 bucket and cache it to BigQuery. Ideally, should run every time a file is updated, however, it is set to run daily via a cronjob.
Cron job command: 0 1 * * * /usr/bin/python3 /path/to/aalibrary/src/aalibrary/utils/ncei_cache_daily_script.py
ncei_utils
This file contains code pertaining to auxiliary functions related to parsing through NCEI's s3 bucket.
Functions:
| Name | Description |
|---|---|
| check_if_tugboat_metadata_json_exists_in_survey | Checks whether a Tugboat metadata JSON file exists within a survey. |
| download_single_file_from_aws | Safely downloads a file from AWS storage bucket, aka the NCEI repository. |
| download_specific_folder_from_ncei | Downloads a specific folder and all of its contents from NCEI to a local directory. |
| get_all_echosounders_in_a_survey | Gets all of the echosounders in a particular survey from NCEI. |
| get_all_echosounders_that_exist_in_ncei | Gets a list of all possible echosounders from NCEI. |
| get_all_file_names_from_survey | Gets all of the file names from a particular NCEI survey. |
| get_all_file_names_in_a_surveys_echosounder_folder | Gets all of the file names from a particular NCEI survey's echosounder folder. |
| get_all_metadata_files_in_survey | Gets all of the metadata file names from a particular NCEI survey. |
| get_all_raw_file_names_from_survey | Gets all of the raw file names from a particular NCEI survey. |
| get_all_ship_names_in_ncei | Gets all of the ship names from NCEI, based on the folders listed under the data/raw/ prefix. |
| get_all_survey_names_from_a_ship | Gets a list of all of the survey names that exist under a ship name. |
| get_all_surveys_in_ncei | Gets a list of all of the possible survey names from NCEI. |
| get_checksum_sha256_from_s3 | Gets the SHA-256 checksum of the s3 object. |
| get_closest_ncei_formatted_ship_name | Gets the closest NCEI formatted ship name to the given ship name. |
| get_echosounder_from_raw_file | Gets the echosounder used for a particular raw file. |
| get_file_size_from_s3 | Gets the file size of an object in s3. |
| get_folder_size_from_s3 | Gets the folder size in bytes from S3. |
| get_random_raw_file_from_ncei | Creates a test raw file for NCEI. This is used for testing purposes only. |
| search_ncei_file_objects_for_string | Searches NCEI for a file type's object keys that contain a particular string. |
| search_ncei_objects_for_string | Searches NCEI for object keys that contain a particular string. |
check_if_tugboat_metadata_json_exists_in_survey(ship_name='', survey_name='', s3_bucket=None)
Checks whether a Tugboat metadata JSON file exists within a survey. Returns the file's object key or None if it does not exist.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship's name. Defaults to "". NOTE: The ship's name MUST be spelled exactly as it is in NCEI; use the get_closest_ncei_formatted_ship_name function to find it. | '' |
| survey_name | str | The survey name exactly as it is in NCEI. Defaults to "". | '' |
| s3_bucket | resource | The bucket resource object. Defaults to None. | None |

Returns:

| Type | Description |
|---|---|
| Union[str, None] | Returns the file's object key string, or None if it does not exist. |
Source code in src\aalibrary\utils\ncei_utils.py
download_single_file_from_aws(file_url='', download_location='')
Safely downloads a file from AWS storage bucket, aka the NCEI repository.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_url | str | The file url. Defaults to "". | '' |
| download_location | str | The local download location for the file. Defaults to "". | '' |
Source code in src\aalibrary\utils\ncei_utils.py
download_specific_folder_from_ncei(folder_prefix='', download_directory='', debug=False)
Downloads a specific folder and all of its contents from NCEI to a local directory.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| folder_prefix | str | The folder's path in the s3 bucket. Ex. 'data/raw/Reuben_Lasker/'. Defaults to "". | '' |
| download_directory | str | The directory you want to download the folder and all of its contents to. Defaults to "". | '' |
| debug | bool | Whether or not to print debug information. Defaults to False. | False |
Source code in src\aalibrary\utils\ncei_utils.py
get_all_echosounders_in_a_survey(ship_name='', survey_name='', s3_client=None, return_full_paths=False)
Gets all of the echosounders in a particular survey from NCEI.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship's name you want to get all surveys from. Defaults to "". NOTE: The ship's name MUST be spelled exactly as it is in NCEI; use the get_closest_ncei_formatted_ship_name function to find it. | '' |
| survey_name | str | The survey name exactly as it is in NCEI. Defaults to "". | '' |
| s3_client | client | The client used to perform this operation. Defaults to None, in which case a client is created for you. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the subdirectory names listed. Defaults to False. | False |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings, each being an echosounder name. Whether these are full paths or just folder names is specified by the return_full_paths parameter. |
Source code in src\aalibrary\utils\ncei_utils.py
get_all_echosounders_that_exist_in_ncei(s3_client=None)
Gets a list of all possible echosounders from NCEI.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| s3_client | client | The client used to perform this operation. Defaults to None, in which case a client is created for you. | None |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings, each being an echosounder name. |
Source code in src\aalibrary\utils\ncei_utils.py
get_all_file_names_from_survey(ship_name='', survey_name='', s3_resource=None, return_full_paths=False)
Gets all of the file names from a particular NCEI survey.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship's name you want to get files from. Defaults to "". NOTE: The ship's name MUST be spelled exactly as it is in NCEI; use the get_closest_ncei_formatted_ship_name function to find it. | '' |
| survey_name | str | The survey name exactly as it is in NCEI. Defaults to "". | '' |
| s3_resource | resource | The resource used to perform this operation. Defaults to None, in which case one is created for you. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the file names listed. Defaults to False. | False |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings, each being a file name. Whether these are full paths or just file names is specified by the return_full_paths parameter. |
Source code in src\aalibrary\utils\ncei_utils.py
get_all_file_names_in_a_surveys_echosounder_folder(ship_name='', survey_name='', echosounder='', s3_resource=None, return_full_paths=False)
Gets all of the file names from a particular NCEI survey's echosounder folder.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship's name you want to get files from. Defaults to "". NOTE: The ship's name MUST be spelled exactly as it is in NCEI; use the get_closest_ncei_formatted_ship_name function to find it. | '' |
| survey_name | str | The survey name exactly as it is in NCEI. Defaults to "". | '' |
| echosounder | str | The echosounder used. Defaults to "". | '' |
| s3_resource | resource | The resource used to perform this operation. Defaults to None, in which case one is created for you. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the file names listed. Defaults to False. | False |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings, each being a file name. Whether these are full paths or just file names is specified by the return_full_paths parameter. |
Source code in src\aalibrary\utils\ncei_utils.py
get_all_metadata_files_in_survey(ship_name='', survey_name='', s3_resource=None, return_full_paths=False)
Gets all of the metadata file names from a particular NCEI survey.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship's name you want to get metadata files from. Defaults to "". NOTE: The ship's name MUST be spelled exactly as it is in NCEI; use the get_closest_ncei_formatted_ship_name function to find it. | '' |
| survey_name | str | The survey name exactly as it is in NCEI. Defaults to "". | '' |
| s3_resource | resource | The resource used to perform this operation. Defaults to None, in which case one is created for you. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the file names listed. Defaults to False. | False |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings, each being a metadata file name. Whether these are full paths or just file names is specified by the return_full_paths parameter. |
Source code in src\aalibrary\utils\ncei_utils.py
get_all_raw_file_names_from_survey(ship_name='', survey_name='', echosounder='', s3_resource=None, return_full_paths=False)
Gets all of the file names from a particular NCEI survey.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship's name you want to get raw files from. Defaults to "". NOTE: The ship's name MUST be spelled exactly as it is in NCEI; use the get_closest_ncei_formatted_ship_name function to find it. | '' |
| survey_name | str | The survey name exactly as it is in NCEI. Defaults to "". | '' |
| echosounder | str | The echosounder used. Defaults to "". | '' |
| s3_resource | resource | The resource used to perform this operation. Defaults to None, in which case one is created for you. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the file names listed. Defaults to False. | False |

Returns:

| Type | Description |
|---|---|
| List[str] | A list of strings, each being a raw file name. Whether these are full paths or just file names is specified by the return_full_paths parameter. |
Source code in src\aalibrary\utils\ncei_utils.py
get_all_ship_names_in_ncei(normalize=False, s3_client=None, return_full_paths=False)
Gets all of the ship names from NCEI. This is based on all of the
folders listed under the data/raw/ prefix.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| normalize | bool | Whether or not to normalize the ship_name attribute to how GCP stores it. Defaults to False. | False |
| s3_client | client | The client used to perform this operation. Defaults to None, in which case a client is created for you. | None |
| return_full_paths | bool | Whether or not you want a full path from bucket root to the subdirectory returned. Set to false if you only want the subdirectory names listed. Defaults to False. | False |
Source code in src\aalibrary\utils\ncei_utils.py
get_all_survey_names_from_a_ship(ship_name='', s3_client=None, return_full_paths=False)
Gets a list of all of the survey names that exist under a ship name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship's name you want to get all surveys from. NOTE: The ship's name MUST be spelled exactly as it is in NCEI. Use the … | '' |
| s3_client | client | The client used to perform this operation. Defaults to None, in which case a client is created for you. | None |
| return_full_paths | bool | Whether or not you want the full path from the bucket root to the subdirectory returned. Set to False if you only want the subdirectory names listed. Defaults to False. | False |
Returns:
List[str]: A list of strings, each being the survey name. Whether these are full paths or just folder names is specified by the return_full_paths parameter.
Source code in src\aalibrary\utils\ncei_utils.py
get_all_surveys_in_ncei(s3_client=None, return_full_paths=False)
Gets a list of all of the possible survey names from NCEI.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| s3_client | client | The client used to perform this operation. Defaults to None, in which case a client is created for you. | None |
| return_full_paths | bool | Whether or not you want the full path from the bucket root to the subdirectory returned. Set to False if you only want the subdirectory names listed. Defaults to False. | False |
Returns:
List[str]: A list of strings, each being the survey name. Whether these are full paths or just folder names is specified by the return_full_paths parameter.
Source code in src\aalibrary\utils\ncei_utils.py
get_checksum_sha256_from_s3(object_key, s3_resource)
Gets the SHA-256 checksum of the s3 object.
get_closest_ncei_formatted_ship_name(ship_name='', s3_client=None)
Gets the closest NCEI-formatted ship name to the given ship name.
NOTE: Only use if the data_source=="NCEI".
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ship_name | str | The ship name to search the closest match for. Defaults to "". | '' |
| s3_client | client | The client used to perform this operation. Defaults to None, in which case a client is created for you. | None |
Returns:
| Type | Description |
|---|---|
| Union[str, None] | The NCEI-formatted ship name, or None if nothing matched. |
Source code in src\aalibrary\utils\ncei_utils.py
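A sketch of resolving a loosely spelled ship name (import path assumed as above):

```python
from aalibrary.utils.ncei_utils import get_closest_ncei_formatted_ship_name

match = get_closest_ncei_formatted_ship_name(ship_name="reuben lasker")
if match is None:
    print("No NCEI ship name matched.")
else:
    print(match)  # the NCEI spelling, e.g. "Reuben_Lasker"
```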
get_echosounder_from_raw_file(file_name='', ship_name='', survey_name='', echosounders=None, s3_client=None, s3_resource=None, s3_bucket=None)
Gets the echosounder used for a particular raw file.
Source code in src\aalibrary\utils\ncei_utils.py
get_file_size_from_s3(object_key, s3_resource)
Gets the file size of an object in s3.
get_folder_size_from_s3(folder_prefix, s3_resource)
Gets the folder size in bytes from S3.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| folder_prefix | str | The object key prefix of the folder in S3. | required |
| s3_resource | resource | The resource used to perform this operation. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| int | int | The total size of the folder in bytes. |
Source code in src\aalibrary\utils\ncei_utils.py
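A sketch of sizing a survey folder. The boto3 resource is assumed to come from cloud_utils.create_s3_objs (listed in this documentation); the exact shape of its return value is an assumption here:

```python
from aalibrary.utils.cloud_utils import create_s3_objs
from aalibrary.utils.ncei_utils import get_folder_size_from_s3

# Assumed unpacking: create_s3_objs is documented only as creating the
# s3 objects needed for boto3, so adjust to its actual return value.
s3_client, s3_resource, s3_bucket = create_s3_objs()

size_bytes = get_folder_size_from_s3(
    folder_prefix="data/raw/Reuben_Lasker/RL2107/",  # illustrative prefix
    s3_resource=s3_resource,
)
print(f"{size_bytes / 1e9:.2f} GB")
```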
get_random_raw_file_from_ncei()
Selects a random raw file from NCEI. This is used for testing purposes only. Retries automatically if an error occurs.
Returns:
| Type | Description |
|---|---|
| List[str] | A list of strings denoting each parameter required for creating a raw file object, e.g. [random_ship_name, random_survey_name, random_echosounder, random_raw_file]. |
Source code in src\aalibrary\utils\ncei_utils.py
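A sketch of unpacking the returned parameters in a test (import path assumed as above):

```python
from aalibrary.utils.ncei_utils import get_random_raw_file_from_ncei

# The four strings parameterize a raw file object, in this order:
ship_name, survey_name, echosounder, raw_file = get_random_raw_file_from_ncei()
print(ship_name, survey_name, echosounder, raw_file)
```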
search_ncei_file_objects_for_string(search_param='', file_extension='.raw')
Searches NCEI for a file type's object keys that contain a particular string. This string can be anything, such as an echosounder name, ship name, survey name, or even a partial file name. The file type can be specified by the file_extension parameter. NOTE: This function takes a long time to run, as it has to search through ALL of NCEI's objects.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| search_param | str | The string to search for. Defaults to "". | '' |
| file_extension | str | The file extension to filter results by. Defaults to ".raw". | '.raw' |
Returns:
| Type | Description |
|---|---|
| List[str] | A list of strings, each being an object key that contains the search parameter. |
Source code in src\aalibrary\utils\ncei_utils.py
search_ncei_objects_for_string(search_param='')
Searches NCEI for object keys that contain a particular string. This string can be anything, such as an echosounder name, ship name, survey name, or even a partial file name. NOTE: This function takes a long time to run, as it has to search through ALL of NCEI's objects. NOTE: Use a folder name as the search_param to get all object keys that contain that folder name (e.g. '/EK80/').
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| search_param | str | The string to search for. Defaults to "". | '' |
Returns:
| Type | Description |
|---|---|
| List[str] | A list of strings, each being an object key that contains the search parameter. |
Source code in src\aalibrary\utils\ncei_utils.py
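A sketch of both searches (import path assumed as above; both scan every NCEI object key, so expect a long runtime):

```python
from aalibrary.utils.ncei_utils import (
    search_ncei_file_objects_for_string,
    search_ncei_objects_for_string,
)

# Folder-name search, per the NOTE above.
ek80_keys = search_ncei_objects_for_string(search_param="/EK80/")

# Restrict matches to .raw files only.
raw_keys = search_ncei_file_objects_for_string(
    search_param="Reuben_Lasker", file_extension=".raw"
)
print(len(ek80_keys), len(raw_keys))
```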
sonar_checker
Modules:
| Name | Description |
|---|---|
| ek_date_conversion | Code originally developed for pyEcholab |
| ek_raw_io | Code originally developed for pyEcholab |
| ek_raw_parsers | Code originally developed for pyEcholab |
| log | |
| misc | |
| sonar_checker | |
ek_date_conversion
Code originally developed for pyEcholab (https://github.com/CI-CMG/pyEcholab) by Rick Towler rick.towler@noaa.gov at NOAA AFSC.
Contains functions to convert date information.
TODO: merge necessary function into ek60.py or group everything into a class.
TODO: fix docstring.
Functions:
| Name | Description |
|---|---|
| nt_to_unix | :param nt_timestamp_tuple: Tuple of two longs representing the NT date |
| unix_to_nt | Given a date, return the 2-element tuple used for timekeeping with SIMRAD echosounders |
datetime_to_unix(datetime_obj)
:param datetime_obj: datetime object to convert
:type datetime_obj: :class:datetime.datetime
:param tz: Timezone to use for converted time -- if None, uses timezone information contained within datetime_obj
:type tz: :class:datetime.tzinfo
from pytz import utc
from datetime import datetime
epoch = datetime(1970, 1, 1, tzinfo=utc)
assert datetime_to_unix(epoch) == 0
Source code in src\aalibrary\utils\sonar_checker\ek_date_conversion.py
nt_to_unix(nt_timestamp_tuple, return_datetime=True)
:param nt_timestamp_tuple: Tuple of two longs representing the NT date
:type nt_timestamp_tuple: (long, long)
:param return_datetime: Return a datetime object instead of float
:type return_datetime: bool
Returns a datetime.datetime object w/ UTC timezone calculated from the nt time tuple
lowDateTime, highDateTime = nt_timestamp_tuple
The timestamp is a 64bit count of 100ns intervals since the NT epoch broken into two 32bit longs, least significant first:
dt = nt_to_unix((19496896L, 30196149L))
match_dt = datetime.datetime(2011, 12, 23, 20, 54, 3, 964000, pytz_utc)
assert abs(dt - match_dt) <= dt.resolution
Source code in src\aalibrary\utils\sonar_checker\ek_date_conversion.py
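The underlying conversion is simple integer arithmetic. A library-independent sketch (the helper name is hypothetical) that reassembles the two 32-bit longs and offsets from the NT epoch (1601-01-01 UTC):

```python
import datetime

UTC = datetime.timezone.utc
NT_EPOCH = datetime.datetime(1601, 1, 1, tzinfo=UTC)

def nt_tuple_to_datetime(low: int, high: int) -> datetime.datetime:
    """Hypothetical helper: (low, high) 32-bit longs -> UTC datetime."""
    intervals_100ns = (high << 32) + low   # reassemble the 64-bit count
    microseconds = intervals_100ns // 10   # 100 ns intervals -> microseconds
    return NT_EPOCH + datetime.timedelta(microseconds=microseconds)

# Matches the docstring example above, to within datetime resolution:
print(nt_tuple_to_datetime(19496896, 30196149))  # ~2011-12-23 20:54:03.964 UTC
```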
unix_to_datetime(unix_timestamp)
:param unix_timestamp: Number of seconds since unix epoch (1/1/1970)
:type unix_timestamp: float
:param tz: timezone to use for conversion (default None = UTC)
:type tz: None or tzinfo object (see datetime docs)
:returns: datetime object
:raises: ValueError if unix_timestamp is not of type float or datetime
Returns a datetime object from a unix timestamp. Simple wrapper for :func:datetime.datetime.fromtimestamp
from pytz import utc
from datetime import datetime
epoch = unix_to_datetime(0.0, tz=utc)
assert epoch == datetime(1970, 1, 1, tzinfo=utc)
Source code in src\aalibrary\utils\sonar_checker\ek_date_conversion.py
unix_to_nt(unix_timestamp)
Given a date, return the 2-element tuple used for timekeeping with SIMRAD echosounders
Simple conversion:
dt = datetime.datetime(2011, 12, 23, 20, 54, 3, 964000, pytz_utc)
assert (19496896L, 30196149L) == unix_to_nt(dt)
Converting back and forth between the two standards may not yield the exact original date, but it will be within the datetime's precision:
orig_dt = datetime.datetime.now(tz=pytz_utc)
nt_tuple = unix_to_nt(orig_dt)
back_to_dt = nt_to_unix(nt_tuple)
d_mu_seconds = abs(orig_dt - back_to_dt).microseconds
mu_sec_resolution = orig_dt.resolution.microseconds
assert d_mu_seconds <= mu_sec_resolution
Source code in src\aalibrary\utils\sonar_checker\ek_date_conversion.py
ek_raw_io
Code originally developed for pyEcholab (https://github.com/CI-CMG/pyEcholab) by Rick Towler rick.towler@noaa.gov at NOAA AFSC.
Contains low-level functions called by ./ek_raw_parsers.py
Classes:
| Name | Description |
|---|---|
| RawSimradFile | A low-level extension of the built-in Python file object allowing the reading/writing of SIMRAD RAW files on a datagram-by-datagram basis |
RawSimradFile
Bases: BufferedReader
A low-level extension of the built-in Python file object allowing the reading/writing of SIMRAD RAW files on a datagram-by-datagram basis (instead of at the byte level).
Calls to the read method return parsed datagrams as dicts.
Methods:
| Name | Description |
|---|---|
| __next__ | Returns the next datagram (synonymous with self.read(1)) |
| iter_dgrams | Iterates through the file, repeatedly calling self.next() until the end of file is reached |
| peek | Returns the header of the next datagram in the file. The file position is reset back to the original location afterwards. |
| prev | Returns the previous datagram 'behind' the current file pointer position |
| read | :param k: Number of datagrams to read |
| readall | Reads the entire file from the beginning and returns a list of datagrams. |
| readline | aliased to self.next() |
| readlines | aliased to self.read(-1) |
| seek | Performs the familiar 'seek' operation using datagram offsets |
| skip | Skips forward to the next datagram without reading the contents of the current one |
| skip_back | Skips backwards to the previous datagram without reading its contents |
| tell | Returns the current file pointer offset by datagram number |
Source code in src\aalibrary\utils\sonar_checker\ek_raw_io.py
__next__()
iter_dgrams()
Iterates through the file, repeatedly calling self.next() until the end of file is reached
Source code in src\aalibrary\utils\sonar_checker\ek_raw_io.py
peek()
Returns the header of the next datagram in the file. The file position is reset back to the original location afterwards.
:returns: [dgram_size, dgram_type, (low_date, high_date)]
Source code in src\aalibrary\utils\sonar_checker\ek_raw_io.py
prev()
Returns the previous datagram 'behind' the current file pointer position
read(k)
:param k: Number of datagrams to read
:type k: int
Reads the next k datagrams. A list of datagrams is returned if k > 1. If k < 0, the entire file is read from the CURRENT POSITION (not necessarily from the beginning of the file if previous datagrams were read).
Source code in src\aalibrary\utils\sonar_checker\ek_raw_io.py
readall()
Reads the entire file from the beginning and returns a list of datagrams.
Source code in src\aalibrary\utils\sonar_checker\ek_raw_io.py
readline()
readlines()
seek(offset, whence)
Performs the familiar 'seek' operation using datagram offsets instead of raw bytes.
Source code in src\aalibrary\utils\sonar_checker\ek_raw_io.py
skip()
Skips forward to the next datagram without reading the contents of the current one
Source code in src\aalibrary\utils\sonar_checker\ek_raw_io.py
skip_back()
Skips backwards to the previous datagram without reading its contents
Source code in src\aalibrary\utils\sonar_checker\ek_raw_io.py
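As a quick orientation, here is a minimal sketch of datagram-level iteration, assuming the pyEcholab-style constructor RawSimradFile(path, mode) and a hypothetical local file name:

```python
from aalibrary.utils.sonar_checker.ek_raw_io import RawSimradFile

# Hypothetical local file; any SIMRAD .raw file would do.
with RawSimradFile("2107RL_CW-D20210813-T220732.raw", "r") as fid:
    first = fid.read(1)                 # a single parsed datagram dict
    print(first["type"], first["timestamp"])
    for dgram in fid.iter_dgrams():     # walk the remaining datagrams
        print(dgram["type"])
```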
ek_raw_parsers
Code originally developed for pyEcholab (https://github.com/CI-CMG/pyEcholab) by Rick Towler rick.towler@noaa.gov at NOAA AFSC.
The code has been modified to handle split-beam data and channel-transducer structure from different EK80 setups.
Classes:
| Name | Description |
|---|---|
| SimradAnnotationParser | ER60 Annotation datagram contains the following keys: |
| SimradBottomParser | Bottom Detection datagram contains the following keys: |
| SimradConfigParser | Simrad Configuration Datagram parser operates on dictionaries with the following keys: |
| SimradDepthParser | ER60 Depth Detection datagram (from .bot files) contains the following keys: |
| SimradFILParser | EK80 FIL datagram contains the following keys: |
| SimradIDXParser | ER60/EK80 IDX datagram contains the following keys: |
| SimradMRUParser | EK80 MRU datagram contains the following keys: |
| SimradNMEAParser | ER60 NMEA datagram contains the following keys: |
| SimradRawParser | Sample Data Datagram parser operates on dictionaries with the following keys: |
| SimradXMLParser | EK80 XML datagram contains the following keys: |
SimradAnnotationParser
Bases: _SimradDatagramParser
ER60 Annotation datagram contains the following keys:
type: string == 'TAG0'
low_date: long uint representing LSBytes of 64bit NT date
high_date: long uint representing MSBytes of 64bit NT date
timestamp: datetime.datetime object of NT date, assumed to be UTC
text: Annotation
The following methods are defined:
from_string(str): parse a raw ER60 Annotation datagram
(with leading/trailing datagram size stripped)
to_string(): Returns the datagram as a raw string
(including leading/trailing size fields)
ready for writing to disk
Source code in src\aalibrary\utils\sonar_checker\ek_raw_parsers.py
SimradBottomParser
Bases: _SimradDatagramParser
Bottom Detection datagram contains the following keys:
type: string == 'BOT0'
low_date: long uint representing LSBytes of 64bit NT date
high_date: long uint representing MSBytes of 64bit NT date
datetime: datetime.datetime object of NT date converted to UTC
transceiver_count: long uint with number of transceivers
depth: [float], one value for each active channel
The following methods are defined:
from_string(str): parse a raw ER60 Bottom datagram
(with leading/trailing datagram size stripped)
to_string(): Returns the datagram as a raw string
(including leading/trailing size fields)
ready for writing to disk
Source code in src\aalibrary\utils\sonar_checker\ek_raw_parsers.py
SimradConfigParser
Bases: _SimradDatagramParser
Simrad Configuration Datagram parser operates on dictionaries with the following keys:
type: string == 'CON0'
low_date: long uint representing LSBytes of 64bit NT date
high_date: long uint representing MSBytes of 64bit NT date
timestamp: datetime.datetime object of NT date, assumed to be UTC
survey_name [str]
transect_name [str]
sounder_name [str]
version [str]
spare0 [str]
transceiver_count [long]
transceivers [list] List of dicts representing Transducer Configs:
ME70 Data contains the following additional values (data contained w/in first 14
bytes of the spare0 field)
multiplexing [short] Always 0
time_bias [long] difference between UTC and local time in min.
sound_velocity_avg [float] [m/s]
sound_velocity_transducer [float] [m/s]
beam_config [str] Raw XML string containing beam config. info
Transducer Config Keys (ER60/ES60/ES70 sounders):
channel_id [str] channel ident string
beam_type [long] Type of channel (0 = Single, 1 = Split)
frequency [float] channel frequency
equivalent_beam_angle [float] dB
beamwidth_alongship [float]
beamwidth_athwartship [float]
angle_sensitivity_alongship [float]
angle_sensitivity_athwartship [float]
angle_offset_alongship [float]
angle_offset_athwartship [float]
pos_x [float]
pos_y [float]
pos_z [float]
dir_x [float]
dir_y [float]
dir_z [float]
pulse_length_table [float[5]]
spare1 [str]
gain_table [float[5]]
spare2 [str]
sa_correction_table [float[5]]
spare3 [str]
gpt_software_version [str]
spare4 [str]
Transducer Config Keys (ME70 sounders):
channel_id [str] channel ident string
beam_type [long] Type of channel (0 = Single, 1 = Split)
reserved1 [float] channel frequency
equivalent_beam_angle [float] dB
beamwidth_alongship [float]
beamwidth_athwartship [float]
angle_sensitivity_alongship [float]
angle_sensitivity_athwartship [float]
angle_offset_alongship [float]
angle_offset_athwartship [float]
pos_x [float]
pos_y [float]
pos_z [float]
beam_steering_angle_alongship [float]
beam_steering_angle_athwartship [float]
beam_steering_angle_unused [float]
pulse_length [float]
reserved2 [float]
spare1 [str]
gain [float]
reserved3 [float]
spare2 [str]
sa_correction [float]
reserved4 [float]
spare3 [str]
gpt_software_version [str]
spare4 [str]
from_string(str): parse a raw config datagram (with leading/trailing datagram size stripped)
to_string(dict): Returns raw string (including leading/trailing size fields) ready for writing to disk
Source code in src\aalibrary\utils\sonar_checker\ek_raw_parsers.py
SimradDepthParser
Bases: _SimradDatagramParser
ER60 Depth Detection datagram (from .bot files) contains the following keys:
type: string == 'DEP0'
low_date: long uint representing LSBytes of 64bit NT date
high_date: long uint representing MSBytes of 64bit NT date
timestamp: datetime.datetime object of NT date, assumed to be UTC
transceiver_count: [long uint] with number of transceivers
depth: [float], one value for each active channel
reflectivity: [float], one value for each active channel
unused: [float], unused value for each active channel
The following methods are defined:
from_string(str): parse a raw ER60 Depth datagram
(with leading/trailing datagram size stripped)
to_string(): Returns the datagram as a raw string
(including leading/trailing size fields)
ready for writing to disk
Source code in src\aalibrary\utils\sonar_checker\ek_raw_parsers.py
SimradFILParser
Bases: _SimradDatagramParser
EK80 FIL datagram contains the following keys:
type: string == 'FIL1'
low_date: long uint representing LSBytes of 64bit NT date
high_date: long uint representing MSBytes of 64bit NT date
timestamp: datetime.datetime object of NT date, assumed to be UTC
stage: int
channel_id: string
n_coefficients: int
decimation_factor: int
coefficients: np.complex64
The following methods are defined:
from_string(str): parse a raw EK80 FIL datagram
(with leading/trailing datagram size stripped)
to_string(): Returns the datagram as a raw string
(including leading/trailing size fields)
ready for writing to disk
Source code in src\aalibrary\utils\sonar_checker\ek_raw_parsers.py
SimradIDXParser
Bases: _SimradDatagramParser
ER60/EK80 IDX datagram contains the following keys:
type: string == 'IDX0'
low_date: long uint representing LSBytes of 64bit NT date
high_date: long uint representing MSBytes of 64bit NT date
timestamp: datetime.datetime object of NT date, assumed to be UTC
ping_number: int
distance : float
latitude: float
longitude: float
file_offset: int
The following methods are defined:
from_string(str): Parse a raw ER60/EK80 IDX datagram
(with leading/trailing datagram size stripped)
to_string(): Returns the datagram as a raw string (including leading/trailing size
fields) ready for writing to disk
Source code in src\aalibrary\utils\sonar_checker\ek_raw_parsers.py
SimradMRUParser
Bases: _SimradDatagramParser
EK80 MRU datagram contains the following keys:
type: string == 'MRU0'
low_date: long uint representing LSBytes of 64bit NT date
high_date: long uint representing MSBytes of 64bit NT date
timestamp: datetime.datetime object of NT date, assumed to be UTC
heave: float
roll : float
pitch: float
heading: float
Version 1 contains (from https://www3.mbari.org/products/mbsystem/formatdoc/KongsbergKmall/EMdgmFormat_RevH/html/kmBinary.html):
| Field | Unit | Type | Size |
|---|---|---|---|
| Status word | See 1) | uint32 | 4U |
| Latitude | deg | double | 8F |
| Longitude | deg | double | 8F |
| Ellipsoid height | m | float | 4F |
| Roll | deg | float | 4F |
| Pitch | deg | float | 4F |
| Heading | deg | float | 4F |
| Heave | m | float | 4F |
| Roll rate | deg/s | float | 4F |
| Pitch rate | deg/s | float | 4F |
| Yaw rate | deg/s | float | 4F |
| North velocity | m/s | float | 4F |
| East velocity | m/s | float | 4F |
| Down velocity | m/s | float | 4F |
| Latitude error | m | float | 4F |
| Longitude error | m | float | 4F |
| Height error | m | float | 4F |
| Roll error | deg | float | 4F |
| Pitch error | deg | float | 4F |
| Heading error | deg | float | 4F |
| Heave error | m | float | 4F |
| North acceleration | m/s2 | float | 4F |
| East acceleration | m/s2 | float | 4F |
| Down acceleration | m/s2 | float | 4F |
| Delayed heave: | - | - | - |
| UTC seconds | s | uint32 | 4U |
| UTC nanoseconds | ns | uint32 | 4U |
| Delayed heave | m | float | 4F |
The following methods are defined:
from_string(str): parse a raw EK80 MRU datagram
(with leading/trailing datagram size stripped)
to_string(): Returns the datagram as a raw string (including
leading/trailing size fields) ready for writing to disk
Source code in src\aalibrary\utils\sonar_checker\ek_raw_parsers.py
SimradNMEAParser
Bases: _SimradDatagramParser
ER60 NMEA datagram contains the following keys:
type: string == 'NME0'
low_date: long uint representing LSBytes of 64bit NT date
high_date: long uint representing MSBytes of 64bit NT date
timestamp: datetime.datetime object of NT date, assumed to be UTC
nmea_string: full (original) NMEA string
The following methods are defined:
from_string(str): parse a raw ER60 NMEA datagram
(with leading/trailing datagram size stripped)
to_string(): Returns the datagram as a raw string
(including leading/trailing size fields)
ready for writing to disk
Source code in src\aalibrary\utils\sonar_checker\ek_raw_parsers.py
SimradRawParser
Bases: _SimradDatagramParser
Sample Data Datagram parser operates on dictionaries with the following keys:
type: string == 'RAW0'
low_date: long uint representing LSBytes of 64bit NT date
high_date: long uint representing MSBytes of 64bit NT date
timestamp: datetime.datetime object of NT date, assumed to be UTC
channel [short] Channel number
mode [short] 1 = Power only, 2 = Angle only 3 = Power & Angle
transducer_depth [float]
frequency [float]
transmit_power [float]
pulse_length [float]
bandwidth [float]
sample_interval [float]
sound_velocity [float]
absorption_coefficient [float]
heave [float]
roll [float]
pitch [float]
temperature [float]
heading [float]
transmit_mode [short] 0 = Active, 1 = Passive, 2 = Test, -1 = Unknown
spare0 [str]
offset [long]
count [long]
power [numpy array] Unconverted power values (if present)
angle [numpy array] Unconverted angle values (if present)
from_string(str): parse a raw sample datagram (with leading/trailing datagram size stripped)
to_string(dict): Returns raw string (including leading/trailing size fields) ready for writing to disk
Source code in src\aalibrary\utils\sonar_checker\ek_raw_parsers.py
SimradXMLParser
Bases: _SimradDatagramParser
EK80 XML datagram contains the following keys:
type: string == 'XML0'
low_date: long uint representing LSBytes of 64bit NT date
high_date: long uint representing MSBytes of 64bit NT date
timestamp: datetime.datetime object of NT date, assumed to be UTC
subtype: string representing Simrad XML datagram type:
configuration, environment, or parameter
[subtype]: dict containing the data specific to the XML subtype.
The following methods are defined:
from_string(str): parse a raw EK80 XML datagram
(with leading/trailing datagram size stripped)
to_string(): Returns the datagram as a raw string
(including leading/trailing size fields)
ready for writing to disk
Source code in src\aalibrary\utils\sonar_checker\ek_raw_parsers.py
log
Functions:
| Name | Description |
|---|---|
verbose |
Set the verbosity for echopype print outs. |
verbose(logfile=None, override=False)
Set the verbosity for echopype print outs. If called it will output logs to terminal by default.
Parameters
logfile : str, optional
Optional string path to the desired log file.
override : bool
Boolean flag to override verbosity; verbosity is turned off when the value is False. Default is False.
Returns
None
Source code in src\aalibrary\utils\sonar_checker\log.py
misc
Functions:
| Name | Description |
|---|---|
camelcase2snakecase |
Convert string from CamelCase to snake_case |
depth_from_pressure |
Convert pressure to depth using UNESCO 1983 algorithm. |
camelcase2snakecase(camel_case_str)
Convert string from CamelCase to snake_case, e.g. CamelCase becomes camel_case.
Source code in src\aalibrary\utils\sonar_checker\misc.py
depth_from_pressure(pressure, latitude=30.0, atm_pres_surf=0.0)
Convert pressure to depth using UNESCO 1983 algorithm.
UNESCO. 1983. Algorithms for computation of fundamental properties of seawater (Pressure to Depth conversion, pages 25-27). Prepared by Fofonoff, N.P. and Millard, R.C. UNESCO technical papers in marine science, 44. http://unesdoc.unesco.org/images/0005/000598/059832eb.pdf
Parameters
pressure : Union[float, FloatSequence]
Pressure in dbar.
latitude : Union[float, FloatSequence], default=30.0
Latitude in decimal degrees.
atm_pres_surf : Union[float, FloatSequence], default=0.0
Atmospheric pressure at the surface in dbar. Use the default 0.0 value if pressure is corrected to be 0 at the surface. Otherwise, enter a correction for pressure due to air, sea ice and any other medium that may be present.
Returns
depth : NDArray[float]
Depth in meters.
Source code in src\aalibrary\utils\sonar_checker\misc.py
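A usage sketch (assuming the function is importable from aalibrary.utils.sonar_checker.misc, per the source-code note above):

```python
from aalibrary.utils.sonar_checker.misc import depth_from_pressure

# 1000 dbar at 30 degrees latitude is roughly 990 m of seawater.
depth_m = depth_from_pressure(pressure=1000.0, latitude=30.0)
print(depth_m)
```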
sonar_checker
Functions:
| Name | Description |
|---|---|
| is_AD2CP | Check if the provided file has a .ad2cp extension. |
| is_AZFP | Check if the specified XML file contains an … |
| is_AZFP6 | Check if the provided file has a .azfp extension. |
| is_EK60 | Check if a raw data file is from Simrad EK60 echosounder. |
| is_EK80 | Check if a raw data file is from Simrad EK80 echosounder. |
| is_ER60 | Check if a raw data file is from Simrad EK60 echosounder. |
is_AD2CP(raw_file)
Check if the provided file has a .ad2cp extension.
Parameters:
raw_file (str): The name of the file to check.
Returns:
bool: True if the file has a .ad2cp extension, False otherwise.
Source code in src\aalibrary\utils\sonar_checker\sonar_checker.py
is_AZFP(raw_file)
Check if the specified XML file contains an …
Parameters:
raw_file (str): The base name of the XML file (with or without extension).
Returns:
bool: True if …
Source code in src\aalibrary\utils\sonar_checker\sonar_checker.py
is_AZFP6(raw_file)
Check if the provided file has a .azfp extension.
Parameters:
raw_file (str): The name of the file to check.
Returns:
bool: True if the file has a .azfp extension, False otherwise.
Source code in src\aalibrary\utils\sonar_checker\sonar_checker.py
is_EK60(raw_file, storage_options)
Check if a raw data file is from Simrad EK60 echosounder.
Source code in src\aalibrary\utils\sonar_checker\sonar_checker.py
is_EK80(raw_file, storage_options)
Check if a raw data file is from Simrad EK80 echosounder.
Source code in src\aalibrary\utils\sonar_checker\sonar_checker.py
is_ER60(raw_file, storage_options)
Check if a raw data file is from Simrad EK60 echosounder.
Source code in src\aalibrary\utils\sonar_checker\sonar_checker.py
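A sketch of routing a file by sonar type (import path assumed from the source-code notes; the file name is hypothetical, and storage_options is passed through for remote reads):

```python
from aalibrary.utils.sonar_checker import sonar_checker

raw_file = "2107RL_CW-D20210813-T220732.raw"  # hypothetical local file
storage_options = {}  # e.g. credentials for remote storage; empty for local

if sonar_checker.is_EK80(raw_file, storage_options):
    print("EK80 file")
elif sonar_checker.is_EK60(raw_file, storage_options):
    print("EK60 file")
elif sonar_checker.is_AD2CP(raw_file):
    print("AD2CP file")
```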
timings
"This script deals with the times associated with ingesting/preprocessing data from various sources. It works as follows: * A large file (usually 1 GB) is selected to repeatedly be downloaded and uploaded to a GCP bucket. * Download and upload times are recorded for each of these n iterations. * The average of these times are presented.
Functions:
| Name | Description |
|---|---|
time_ingestion_and_upload_from_ncei |
Used for timing the ingestion from the NCEI AWS S3 bucket. |
time_ingestion_and_upload_from_ncei(n=10, ncei_file_url='https://noaa-wcsd-pds.s3.amazonaws.com/data/raw/Reuben_Lasker/RL2107/EK80/2107RL_CW-D20210813-T220732.raw', ncei_bucket='noaa-wcsd-pds', download_location='./')
Used for timing the ingestion from the NCEI AWS S3 bucket.
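A usage sketch with the documented defaults spelled out (import path assumed from the module layout above):

```python
from aalibrary.utils.timings import time_ingestion_and_upload_from_ncei

# Three timed download/upload round trips of the default ~1 GB EK80 file.
time_ingestion_and_upload_from_ncei(
    n=3,
    ncei_file_url=(
        "https://noaa-wcsd-pds.s3.amazonaws.com/data/raw/"
        "Reuben_Lasker/RL2107/EK80/2107RL_CW-D20210813-T220732.raw"
    ),
    ncei_bucket="noaa-wcsd-pds",
    download_location="./",
)
```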