Learning to Code: Get ESA Data

During my years-long journey to learn to code, progress has been slow. With the help of #ChatGPT4, the barriers to learning have crumbled. This experience is not unique to me; everyone is learning to code, creating web apps and chatbots, dipping their toes into Python and JavaScript...and doing some questionable jailbreaking to test the boundaries of these new sleepless #AI assistants.

Developing web apps and #chatbots is fun practice, but when the fun is over and #YouTube stops boosting "vanilla latte" #webapp tutorials...what then? Leverage your personal skills, interests, and expertise. See how you can contribute to a larger, more complex project at work...see if you can automate a simple task you do every day and enjoy a practical benefit of #LLMs.

I am a Space Domain Awareness (#SDA) professional - a field with global participation where data science fundamentals are genuinely valuable. Instead of showing off my latest chatbot-webapp...(ahem...I did make one this week), I developed a script that retrieves data from the European Space Agency's (#ESA) "#discosweb" database: a large repository of #satellite names, aliases, launch information, size and shape estimates, and more. It is a great data set for learning about "what's up there." This is an incremental step toward #automating personal workflows so I stay informed about SDA developments.

The script comes in two versions below: verbose and concise. The verbose version includes comments and print statements that show how the data pull is progressing - good for #learning! The concise version is roughly half the length, combines variables, and streamlines the while loop. Take a look and give me your feedback...or take the script and use it on your own #datascience journey! I am enthusiastic about the potential for growth and the opportunities ahead!

Cheers to you on your #DataScience journeys!
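
For anyone who wants to see what a single #discosweb call looks like before reading the full scripts, here is a minimal sketch. The token is a placeholder (get your own from the DISCOSweb portal), and the "name" attribute printed at the end is just one example field from the response:

import requests

API_TOKEN = "YOUR_DISCOSWEB_TOKEN"  # placeholder - never hard-code or publish a real token
URL = "https://discosweb.esoc.esa.int/api/objects"
HEADERS = {
    "Authorization": f"Bearer {API_TOKEN}",
    "DiscosWeb-Api-Version": "2",
}

response = requests.get(URL, headers=HEADERS)
response.raise_for_status()  # raise an error for any non-2xx status
for item in response.json().get("data", [])[:5]:
    # "name" is one attribute among many (launch info, size/shape estimates, etc.)
    print(item.get("attributes", {}).get("name"))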

# CONCISE
import requests
import csv
import time
from functools import lru_cache
@lru_cache(maxsize=128)
def get_data(api_token, url):
    headers = {"Authorization": f"Bearer {api_token}", "DiscosWeb-Api-Version": "2"}
    return requests.get(url, headers=headers)
def save_to_file(data, filename):
    with open(filename, 'w', encoding='utf-8', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=list(data[0].keys()))
        writer.writeheader()
        writer.writerows(data)
def main(starting_page, save_interval, max_iterations, api_token):
    base_url = "https://discosweb.esoc.esa.int"
    objects_endpoint = f"{base_url}/api/objects?page[number]={starting_page}"
    response, all_data, current_iteration = get_data(api_token, objects_endpoint), [], 0
    try:
        while response is not None and response.status_code == 200 and current_iteration < max_iterations:
            content = response.json()
            all_data += [item.get("attributes", {}) for item in content.get("data", [])]
            next_page_url, current_iteration = content['links'].get('next'), current_iteration + 1
            if next_page_url:
                time.sleep(60 / 20)  # pace requests to stay under ~20 calls per minute
                response = get_data(api_token, f"{base_url}{next_page_url}")
            else:
                response = None  # no next page: the while condition ends the loop
            if current_iteration % save_interval == 0 and all_data:
                temp_filename = f"esa_data_{starting_page}_{starting_page + current_iteration - 1}.csv"
                save_to_file(all_data, temp_filename)
                print(f"Data saved to {temp_filename}")
    except KeyboardInterrupt:
        print("Interrupted by user. Saving current data...")
    finally:
        filename = f"esa_data_{starting_page}-{starting_page + current_iteration - 1}.csv"
        save_to_file(all_data, filename) if all_data else print("Failed to retrieve data. Status code:", response.status_code)
if __name__ == "__main__":
    starting_page, save_interval, max_iterations, api_token = 1, 50, 2500, ""
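
Both versions pace themselves with a fixed sleep, but neither reacts if the API actively tells them to slow down. A small wrapper like this (a sketch only, assuming #discosweb signals rate limiting with HTTP 429 and, optionally, a Retry-After header) could back off and retry before giving up:

import time
import requests

def get_data_with_backoff(api_token, url, max_retries=3):
    # Sketch: retries on HTTP 429, waiting Retry-After seconds (or 60 s if the header is absent).
    headers = {"Authorization": f"Bearer {api_token}", "DiscosWeb-Api-Version": "2"}
    response = None
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers)
        if response.status_code != 429:
            return response
        wait = int(response.headers.get("Retry-After", 60))
        print(f"Rate limited; waiting {wait} seconds (attempt {attempt + 1}/{max_retries})")
        time.sleep(wait)
    return response  # still rate-limited after max_retries attempts

Swapping this in for get_data would trade the @lru_cache memoization for retry behavior, so treat it as an option to weigh rather than a straight upgrade.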
        

# VERBOSE
import requests
import csv
import time
from functools import lru_cache

@lru_cache(maxsize=128)
def get_data(api_token, url):
    headers = {
        "Authorization": f"Bearer {api_token}",
        "DiscosWeb-Api-Version": "2"
    }
    response = requests.get(url, headers=headers)
    return response

def extract_data(content):
    data = content.get("data", [])
    return [item.get("attributes", {}) for item in data]

def save_to_file(data, filename):
    fieldnames = list(data[0].keys())

    with open(filename, 'w', encoding='utf-8', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(data)

starting_page = 1939   # page to start from (handy for resuming a partial pull)
save_interval = 20     # save a checkpoint CSV every N pages
max_iterations = 1000  # maximum number of pages to fetch this run
api_token = ""         # paste your own DISCOSweb API token here

obj_per_page = 60             # DISCOSweb returns up to 60 objects per page
max_calls_per_min = 20        # API rate limit used for pacing
sleep_time = 60 / max_calls_per_min  # seconds to wait between calls
obj_per_min = obj_per_page * max_calls_per_min
print(f"Sleep time between queries: {sleep_time} seconds")

total_objects = 77156  # total objects in the DISCOS catalogue at the time of this run (hard-coded)
requested_num_obj = (obj_per_page * max_iterations)
min_to_complete = requested_num_obj/obj_per_min
objs_not_gotten = total_objects - requested_num_obj
print(f"Total objects: {total_objects}")
print(f"Objects requested: {requested_num_obj}")
print(f"Objects not requested: {objs_not_gotten}")
print(f"Minutes to complete: {min_to_complete}")

def main(starting_page, save_interval, max_iterations, api_token):
    current_iteration = 0
    base_url = "https://discosweb.esoc.esa.int"
    objects_endpoint = f"{base_url}/api/objects?page[number]={starting_page}"
    response = get_data(api_token, objects_endpoint)
    all_data = []

    try:
        if response.status_code == 200:
            content = response.json()
            all_data += extract_data(content)  # on a 200 response, append the extracted attributes to all_data
            next_page_url = content['links'].get('next')
            current_page = starting_page  # current_page begins at the starting page and is incremented as pages succeed

            # Pagination: Get data from all pages
            while next_page_url and current_iteration < max_iterations:
                current_iteration += 1
                print(f"Fetching data from {base_url}{next_page_url}")
                time.sleep(sleep_time)
                response = get_data(api_token, f"{base_url}{next_page_url}")

                if response.status_code == 200:
                    content = response.json()
                    all_data += extract_data(content)
                    next_page_url = content['links'].get('next')
                    current_page += 1 # if the next page gets 200 response, increment "current page" variable by 1

                    if current_iteration % save_interval == 0:
                        print(F"Start Page: {starting_page}")
                        print(f"Stop Page: {current_page - 1}")
                        print(f"Curr Page: {current_page}")
                        temp_filename = f"esa_data_{starting_page}_{current_page - 1}.csv"
                        save_to_file(all_data, temp_filename)
                        print(f"Data saved to {temp_filename}")
                else:
                    print("Failed to retrieve data from the next page. Status code:", response.status_code)
                    break
        else:
            print("Failed to retrieve data. Status code:", response.status_code)

    except KeyboardInterrupt:
        print("Interrupted by user. Saving current data...")

    finally:
        if all_data:
            filename = f"esa_data_{starting_page}-{current_page - 1}.csv"
            save_to_file(all_data, filename)
            print(f"All data saved to {filename}")
        else:
            print("Failed to retrieve data. Status code:", response.status_code)

if __name__ == "__main__":
    main(starting_page, save_interval, max_iterations, api_token)
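
Once a run finishes (or you interrupt it), the saved CSVs are easy to explore with the same csv module the scripts already use. A quick sketch - the filename below is just an example from a short run, and the columns you see will depend on which attributes DISCOSweb returned for your pull:

import csv

with open("esa_data_1939-1958.csv", newline='', encoding='utf-8') as f:  # example filename
    reader = csv.DictReader(f)
    rows = list(reader)

print(f"{len(rows)} objects loaded")
print("Columns:", reader.fieldnames)
print("First object:", rows[0] if rows else "none")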
Jason Broussard

Fork and spoon operator from Sector 7G

1y

Question: do you understand what every line of code does? I know a few people starting to implement GPT code and was curious if people just start running code without understanding what it actually does. Seems like it could be a security risk?

Lorenzo Ross

Software Developer | Supra Coder | Space Systems Operator | USSF

1y

I’m currently studying Python in my spare time, with the goal of developing a wrapper to script out actions in STK. Getting the data you need in STK can be a pain; hopefully I’ll be able to make it easy for operators to get the job done quickly. Thanks for sharing your journey!

David Finkleman

Chief Engineer at SkySentry, LLC

1y

Good work. Inquiring minds create. Closed minds destroy. (Feel free to use that.) ESA exposes much metadata. USSF includes almost none. I marvel that TLEs still include checksums and parameters needed only by PAVE PAWS west. Those are non-information. Just getting ESA data is good but not enough. How would you register US and ESA info to a common coordinate framework and time scale?

Tom Johnson

CEO and Founder at Exa Research

1y

Nice job sharing your progress. You may want to sanitize the API key though. Otherwise you risk everyone using your API key rather than their own.

John Moberly

Chief Growth Officer & GM

1y

I’m doing scheduling Tetris for Space Symposium, similar?
