Learning to Code: Get ESA Data
During my years-long journey to learn to code, progress has been slow. With the help of #ChatGPT4, the barriers to learning have crumbled. This experience is not unique to me; everyone is learning to code, creating web apps and chatbots, dipping their toes into Python and JavaScript...and doing some questionable jailbreaking to test the boundaries of these new sleepless #AI assistants.
Developing web apps and #chatbots is fun practice, but when the fun is over and #YouTube stops boosting "vanilla latte" #webapp tutorials...what then? Leverage your personal skills, interests, and expertise. See how you can contribute to a larger, more complex project at work...see if you can automate a simple task you do every day and enjoy a practical benefit of #LLMs.
I am a Space Domain Awareness (#SDA) professional - a field with global participation where data science fundamentals are highly valuable. Instead of showing off my latest chatbot-webapp...(ahem...I did make one this week), I developed a script that lets users retrieve data from the European Space Agency's (#ESA) "#discosweb" database, a large repository containing #satellite names, aliases, launch information, size and shape estimates, and more. It is a great data set for learning about "what's up there", and an incremental step toward #automating personal workflows so I stay informed about SDA developments.
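Before diving into the full scripts, here is a minimal sketch of what a single #discosweb call looks like, assuming you have generated a personal access token on the DISCOSweb site (the token string below is a placeholder):

import requests

api_token = "YOUR_TOKEN_HERE"  # placeholder - use your own DISCOSweb token
response = requests.get(
    "https://discosweb.esoc.esa.int/api/objects",
    headers={"Authorization": f"Bearer {api_token}", "DiscosWeb-Api-Version": "2"},
)
print(response.status_code)        # 200 on success
print(response.json()["data"][0])  # first object on the first page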
The script comes in two versions below: verbose and concise. The verbose version includes comments and print statements that show how the data pull is progressing, good for #learning! The concise version is about half the length, combines variables, and streamlines the while loop for efficiency. Take a look and give me your feedback...or take the script and use it on your own #datascience journey! I am enthusiastic about the potential for growth and the opportunities ahead!
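For context on the while loops below: each DISCOSweb response carries a "data" list of objects plus a "links" section whose "next" entry points to the following page, and the scripts simply collect each object's attributes and follow "next" until it disappears. A rough, illustrative sketch of that response shape (the attribute values here are invented for the example):

# Illustrative response shape only - the attribute values are made up
page = {
    "data": [
        {"type": "object", "id": "1", "attributes": {"name": "Sputnik 1"}},
    ],
    "links": {
        "self": "/api/objects?page[number]=1",
        "next": "/api/objects?page[number]=2",  # missing on the last page
    },
}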
Cheers to you on your #DataScience journeys!
# CONCISE
import requests
import csv
import time
from functools import lru_cache

@lru_cache(maxsize=128)
def get_data(api_token, url):
    headers = {"Authorization": f"Bearer {api_token}", "DiscosWeb-Api-Version": "2"}
    return requests.get(url, headers=headers)

def save_to_file(data, filename):
    with open(filename, 'w', encoding='utf-8', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=list(data[0].keys()))
        writer.writeheader()
        writer.writerows(data)

def main(starting_page, save_interval, max_iterations, api_token):
    base_url = "https://discosweb.esoc.esa.int"
    objects_endpoint = f"{base_url}/api/objects?page[number]={starting_page}"
    response, all_data, current_iteration = get_data(api_token, objects_endpoint), [], 0
    try:
        while response.status_code == 200 and current_iteration < max_iterations:
            content = response.json()
            all_data += [item.get("attributes", {}) for item in content.get("data", [])]
            next_page_url, current_iteration = content['links'].get('next'), current_iteration + 1
            if current_iteration % save_interval == 0 and all_data:
                temp_filename = f"esa_data_{starting_page}_{starting_page + current_iteration - 1}.csv"
                save_to_file(all_data, temp_filename)
                print(f"Data saved to {temp_filename}")
            if not next_page_url:
                break  # last page reached
            time.sleep(60 / 20)  # stay under the 20-calls-per-minute rate limit
            response = get_data(api_token, f"{base_url}{next_page_url}")
    except KeyboardInterrupt:
        print("Interrupted by user. Saving current data...")
    finally:
        if all_data:
            save_to_file(all_data, f"esa_data_{starting_page}-{starting_page + current_iteration - 1}.csv")
        else:
            print("Failed to retrieve data. Status code:", response.status_code)

if __name__ == "__main__":
    starting_page, save_interval, max_iterations, api_token = 1, 50, 2500, ""
    main(starting_page, save_interval, max_iterations, api_token)
# VERBOSE
import requests
import csv
import time
from functools import lru_cache

@lru_cache(maxsize=128)
def get_data(api_token, url):
    headers = {
        "Authorization": f"Bearer {api_token}",
        "DiscosWeb-Api-Version": "2"
    }
    response = requests.get(url, headers=headers)
    return response

def extract_data(content):
    data = content.get("data", [])
    return [item.get("attributes", {}) for item in data]

def save_to_file(data, filename):
    fieldnames = list(data[0].keys())
    with open(filename, 'w', encoding='utf-8', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(data)

starting_page = 1939
save_interval = 20
max_iterations = 1000
api_token = ""
obj_per_page = 60
max_calls_per_min = 20
sleep_time = 60 / max_calls_per_min  # seconds between calls to stay at the 20-calls-per-minute limit
obj_per_min = obj_per_page * max_calls_per_min
print(f"Sleep time between queries: {sleep_time} seconds")
total_objects = 77156
requested_num_obj = obj_per_page * max_iterations
min_to_complete = requested_num_obj / obj_per_min
objs_not_gotten = total_objects - requested_num_obj
print(f"Total objects: {total_objects}")
print(f"Objects requested: {requested_num_obj}")
print(f"Objects not requested: {objs_not_gotten}")
print(f"Minutes to complete: {min_to_complete}")

def main(starting_page, save_interval, max_iterations, api_token):
    current_iteration = 0
    base_url = "https://discosweb.esoc.esa.int"
    objects_endpoint = f"{base_url}/api/objects?page[number]={starting_page}"
    response = get_data(api_token, objects_endpoint)
    all_data = []
    try:
        if response.status_code == 200:
            content = response.json()
            all_data += extract_data(content)  # on a 200 response, the extracted attributes are added to the list "all_data"
            next_page_url = content['links'].get('next')
            current_page = starting_page  # current page starts equal to the starting page
            # Pagination: get data from all pages
            while next_page_url and current_iteration < max_iterations:
                current_iteration += 1
                print(f"Fetching data from {base_url}{next_page_url}")
                time.sleep(sleep_time)
                response = get_data(api_token, f"{base_url}{next_page_url}")
                if response.status_code == 200:
                    content = response.json()
                    all_data += extract_data(content)
                    next_page_url = content['links'].get('next')
                    current_page += 1  # increment "current_page" after each successful page
                    if current_iteration % save_interval == 0:
                        print(f"Start Page: {starting_page}")
                        print(f"Stop Page: {current_page - 1}")
                        print(f"Curr Page: {current_page}")
                        temp_filename = f"esa_data_{starting_page}_{current_page - 1}.csv"
                        save_to_file(all_data, temp_filename)
                        print(f"Data saved to {temp_filename}")
                else:
                    print("Failed to retrieve data from the next page. Status code:", response.status_code)
                    break
        else:
            print("Failed to retrieve data. Status code:", response.status_code)
    except KeyboardInterrupt:
        print("Interrupted by user. Saving current data...")
    finally:
        if all_data:
            filename = f"esa_data_{starting_page}-{current_page - 1}.csv"
            save_to_file(all_data, filename)
            print(f"All data saved to {filename}")
        else:
            print("Failed to retrieve data. Status code:", response.status_code)

if __name__ == "__main__":
    main(starting_page, save_interval, max_iterations, api_token)
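One practical note on running either version: rather than pasting your token into the api_token variable, you can read it from an environment variable so the secret never ends up in shared code (a point a commenter below also raises). A minimal sketch, assuming you export a variable I've named DISCOS_API_TOKEN (the name is my choice, not part of the API):

import os

# Read the token from the environment; set it in your shell first, e.g.
#   export DISCOS_API_TOKEN="your-token-here"
api_token = os.environ.get("DISCOS_API_TOKEN", "")
if not api_token:
    raise SystemExit("Set the DISCOS_API_TOKEN environment variable before running.")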
Fork and spoon operator from Sector 7G (1y): Question: do you understand what every line of code does? I know a few people starting to implement GPT code and was curious if people just start running code without understanding what it actually does. Seems like it could be a security risk?
Software Developer | Supra Coder | Space Systems Operator | USSF (1y): I'm currently studying Python in my spare time, with the goal of developing a wrapper to script out actions in STK. Getting the data you need in STK can be a pain; hopefully I'll be able to make it easy for operators to get the job done quickly. Thanks for sharing your journey!
Chief Engineer at SkySentry, LLC (1y): Good work. Inquiring minds create. Closed minds destroy. (Feel free to use that.) ESA exposes much metadata. USSF includes almost none. I marvel that TLEs still include checksums and parameters needed only by PAVE PAWS west. Those are non-information. Just getting ESA data is good but not enough. How would you register US and ESA info to a common coordinate framework and time scale?
CEO and Founder at Exa Research (1y): Nice job sharing your progress. You may want to sanitize the API key though. Otherwise you risk everyone using your API key rather than their own.
Chief Growth Officer & GM (1y): I'm doing scheduling Tetris for Space Symposium, similar?