<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Web3 Technical Topics Explained]]></title><description><![CDATA[I break down web3 technical topics using analogies you can relate to and help you digest with much ease...]]></description><link>https://thedataengineerblog.com</link><generator>RSS for Node</generator><lastBuildDate>Mon, 13 Apr 2026 10:57:46 GMT</lastBuildDate><atom:link href="https://thedataengineerblog.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[How to Install and Set the Latest Python Version as Default on macOS Sequoia and Sonoma]]></title><description><![CDATA[If you're a Mac user struggling to install the latest Python version and make it your system's default, here is an easy to follow solution for you. 
Managing multiple Python versions on macOS can be a bit tricky, especially when you need the latest f...]]></description><link>https://thedataengineerblog.com/how-to-install-and-set-the-latest-python-version-as-default-on-macos-sequoia-and-sonoma</link><guid isPermaLink="true">https://thedataengineerblog.com/how-to-install-and-set-the-latest-python-version-as-default-on-macos-sequoia-and-sonoma</guid><category><![CDATA[Python]]></category><category><![CDATA[Python 3]]></category><category><![CDATA[installation guide]]></category><category><![CDATA[pyenv]]></category><category><![CDATA[sonoma]]></category><category><![CDATA[Apple]]></category><category><![CDATA[python beginner]]></category><dc:creator><![CDATA[Tony Kipkemboi]]></dc:creator><pubDate>Thu, 19 Sep 2024 19:55:25 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1726775069683/0dbd9c0d-9d3b-41b1-850f-1b112b28f799.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you're a Mac user struggling to install the latest Python version and make it your system's default, here is an easy to follow solution for you. </p>
<p>Managing multiple Python versions on macOS can be a bit tricky, especially when you need the latest features or compatibility with new libraries. This tutorial will guide you through the process step-by-step, ensuring you have the latest Python version up and running as your default interpreter.</p>
<h2 id="heading-why-update-python-on-your-mac">Why Update Python on Your Mac?</h2>
<p>macOS comes with a pre-installed version of Python, but it's often outdated and not suitable for development purposes. Installing the latest Python version allows you to:</p>
<ul>
<li>Access new language features.</li>
<li>Ensure compatibility with the latest libraries and frameworks.</li>
<li>Improve performance and security.</li>
</ul>
<h2 id="heading-prerequisites">Prerequisites</h2>
<ul>
<li><strong>macOS Sonoma or Sequoia</strong>: While these steps should work on earlier versions, they are tailored for macOS Sonoma and Sequoia.</li>
<li><strong>Homebrew Installed</strong>: Homebrew is a package manager for macOS that simplifies the installation of software.</li>
</ul>
<h2 id="heading-step-by-step-guide">Step-by-Step Guide</h2>
<h4 id="heading-1-check-your-current-python-version">1. Check Your Current Python Version</h4>
<p>Before making any changes, it's good to know which Python version you're currently using.</p>
<pre><code class="lang-bash">python3 --version
</code></pre>
<p><strong>Example Output</strong>:</p>
<pre><code class="lang-bash">Python 3.9.6
</code></pre>
<h4 id="heading-2-install-homebrew-if-not-already-installed">2. Install Homebrew (If not already installed)</h4>
<p>If you don't have Homebrew installed, you can install it using the following command:</p>
<pre><code class="lang-bash">/bin/bash -c <span class="hljs-string">"<span class="hljs-subst">$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)</span>"</span>
</code></pre>
<p>For more detailed instructions, visit the <a target="_blank" href="https://brew.sh">Homebrew website</a>.</p>
<h4 id="heading-3-update-homebrew">3. Update Homebrew</h4>
<p>Before installing new packages, ensure Homebrew is up to date:</p>
<pre><code class="lang-bash">brew update
</code></pre>
<h4 id="heading-4-install-the-latest-python-version">4. Install the Latest Python Version</h4>
<p>As of this writing, the latest stable Python version is <strong>Python 3.12</strong>. Install it using Homebrew:</p>
<pre><code class="lang-bash">brew install python@3.12
</code></pre>
<p>This command downloads and installs Python 3.12 and its associated tools.</p>
<h4 id="heading-5-adjust-your-path-environment-variable">5. Adjust Your PATH Environment Variable</h4>
<p>To make sure your system uses the newly installed Python version by default, you need to adjust your <code>PATH</code> environment variable.</p>
<p><strong>Edit Your Shell Configuration File</strong>
Depending on the shell you're using (likely <code>zsh</code> or <code>bash</code>), you'll need to edit either <code>~/.zshrc</code> or <code>~/.bash_profile</code>.</p>
<p><strong>For Zsh (default on macOS Catalina and later)</strong>:</p>
<pre><code class="lang-bash">nano ~/.zshrc
</code></pre>
<p><strong>For Bash</strong>:</p>
<pre><code class="lang-bash">nano ~/.bash_profile
</code></pre>
<p><strong>Add Homebrew's Python to Your PATH</strong>
Add the following line at the top of the file:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">export</span> PATH=<span class="hljs-string">"/opt/homebrew/bin:<span class="hljs-variable">$PATH</span>"</span>
</code></pre>
<p><strong>Note</strong>: On Intel-based Macs, use <code>/usr/local/bin</code> instead of <code>/opt/homebrew/bin</code>.</p>
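<p>If you're not sure which prefix applies to your machine, you can check the CPU architecture from Python. A minimal sketch (the helper name <code>brew_bin_dir</code> is mine, not part of any tool):</p>

```python
import platform

def brew_bin_dir(machine: str) -> str:
    """Return the Homebrew bin directory for a given CPU architecture."""
    # Apple Silicon Macs report "arm64"; Intel Macs report "x86_64".
    return "/opt/homebrew/bin" if machine == "arm64" else "/usr/local/bin"

# Check the current machine's architecture and the matching prefix.
print(platform.machine(), "->", brew_bin_dir(platform.machine()))
```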
<p><strong>Save and Exit</strong></p>
<ul>
<li>In Nano editor, press <code>CTRL + X</code>, then <code>Y</code>, and hit <code>Enter</code> to save the changes.</li>
</ul>
<p><strong>Reload Your Shell Configuration</strong>
Apply the changes immediately by running:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">source</span> ~/.zshrc
</code></pre>
<p>Or for Bash:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">source</span> ~/.bash_profile
</code></pre>
<h4 id="heading-6-verify-the-installation">6. Verify the Installation</h4>
<p>Check that <code>python3</code> now points to the latest version:</p>
<pre><code class="lang-bash">python3 --version
</code></pre>
<p><strong>Expected Output</strong>:</p>
<pre><code class="lang-bash">Python 3.12.0
</code></pre>
<h4 id="heading-7-check-the-python-executable-path">7. Check the Python Executable Path</h4>
<p>Ensure that the <code>python3</code> command is pointing to the correct executable:</p>
<pre><code class="lang-bash"><span class="hljs-built_in">which</span> python3
</code></pre>
<p>Expected Output:</p>
<pre><code class="lang-bash">/opt/homebrew/bin/python3
</code></pre>
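<p>You can also confirm from inside Python itself which interpreter is running, which is handy when the shell output is ambiguous. A small sketch:</p>

```python
import sys

# Show which binary is actually running and its version.
print(sys.executable)
print(".".join(map(str, sys.version_info[:3])))

# A non-fatal check: warn if an older interpreter is still first on PATH.
if sys.version_info >= (3, 12):
    print("python3 points at a current interpreter")
else:
    print("an older interpreter is still first on your PATH")
```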
<h4 id="heading-8-handling-multiple-python-versions">8. Handling Multiple Python Versions</h4>
<p>If you have multiple Python versions installed, you might want to manage them without uninstalling older versions.</p>
<p><strong>Use Python Version Managers</strong>
Consider using a Python version manager like <code>pyenv</code> to switch between different Python versions easily.</p>
<p><strong>Install <code>pyenv</code> via Homebrew</strong>:</p>
<pre><code class="lang-bash">brew install pyenv
</code></pre>
<p><strong>Install Python 3.12 Using <code>pyenv</code></strong>:</p>
<pre><code class="lang-bash">pyenv install 3.12.0
</code></pre>
<p><strong>Set Global Python Version</strong>:</p>
<pre><code class="lang-bash">pyenv global 3.12.0
</code></pre>
<h4 id="heading-9-optional-uninstall-older-python-versions">9. Optional: Uninstall Older Python Versions</h4>
<p>If you're certain you no longer need the older Python version, you can uninstall it to free up space.</p>
<p><strong>Uninstall via Homebrew</strong>:</p>
<pre><code class="lang-bash">brew uninstall python@3.9
</code></pre>
<p><strong>Note</strong>: If the older Python version wasn't installed via Homebrew, you'll need to remove it manually, which requires caution to avoid deleting system-critical files.</p>
<h4 id="heading-10-troubleshooting-tips">10. Troubleshooting Tips</h4>
<ul>
<li><strong>Command not found errors</strong>: If you encounter errors like <code>command not found</code>, ensure that you've correctly updated your <code>PATH</code> and reloaded your shell configuration.</li>
<li><strong>Permission issues</strong>: Avoid using <code>sudo</code> with <code>brew</code> commands, as Homebrew manages permissions internally.</li>
<li><strong>Conflicting Python versions</strong>: Use <code>which -a python3</code> to list all <code>python3</code> executables in your <code>PATH</code> and identify any conflicts.</li>
</ul>
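<p>If you'd rather inspect conflicts programmatically, the same <code>which -a</code> idea can be reproduced in a few lines of Python. A sketch (the helper name <code>find_all</code> is mine):</p>

```python
import os

def find_all(cmd: str) -> list[str]:
    """Mimic `which -a`: list every match for cmd on PATH, in order."""
    hits = []
    for d in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(d, cmd)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits

for path in find_all("python3"):
    print(path)  # the first entry is the one your shell will run
```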
<h2 id="heading-conclusion">Conclusion</h2>
<p>By following these steps, you should have the latest Python version installed on your Mac and set as the default interpreter. This setup will allow you to take advantage of the newest Python features and ensure compatibility with up-to-date libraries.</p>
<h2 id="heading-additional-resources">Additional Resources</h2>
<ul>
<li>Homebrew Documentation: <a target="_blank" href="https://docs.brew.sh/">https://docs.brew.sh/</a></li>
<li>Python Downloads: <a target="_blank" href="https://www.python.org/downloads/">https://www.python.org/downloads/</a></li>
<li>pyenv GitHub Repository: <a target="_blank" href="https://github.com/pyenv/pyenv">https://github.com/pyenv/pyenv</a></li>
</ul>
<p><em>Feel free to share your thoughts or ask questions in the comments below!</em></p>
<p><em>Happy coding!</em></p>
]]></content:encoded></item><item><title><![CDATA[What is an API?]]></title><description><![CDATA[What are APIs and why do they matter?
Imagine you’re ordering food at a restaurant. You look at the menu, pick your dishes, and tell the waiter your order. The waiter then goes to the kitchen, puts in your order, and brings back the prepared dishes.
...]]></description><link>https://thedataengineerblog.com/what-is-an-api-0743f6189cbb</link><guid isPermaLink="true">https://thedataengineerblog.com/what-is-an-api-0743f6189cbb</guid><category><![CDATA[APIs]]></category><category><![CDATA[software development]]></category><category><![CDATA[Data Science]]></category><category><![CDATA[data]]></category><category><![CDATA[REST API]]></category><category><![CDATA[GraphQL]]></category><category><![CDATA[Hashnode]]></category><category><![CDATA[technology]]></category><dc:creator><![CDATA[Tony Kipkemboi]]></dc:creator><pubDate>Fri, 29 Sep 2023 17:42:29 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/uIE9f6gV8rI/upload/701c54ad955f448876cb92ea1b0c873c.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-what-are-apis-and-why-do-they-matter">What are APIs and why do they matter?</h3>
<p>Imagine you’re ordering food at a restaurant. You look at the menu, pick your dishes, and tell the waiter your order. The waiter then goes to the kitchen, puts in your order, and brings back the prepared dishes.</p>
<p>The waiter serves as an interface between you and the kitchen. You don’t need to know how the kitchen prepares the food — you simply rely on the waiter to communicate your order and deliver the results.</p>
<p>This is similar to how APIs work in software. The API serves as an interface between two applications. One app makes a request to the API (the order), without needing to know how it’s implemented on the server side. The API then returns the response (the dishes).</p>
<p>For example, a weather app shows forecasts without knowing how weather data is aggregated and analyzed on the back-end. It simply calls the weather API to get the needed results.</p>
<p>So in essence, APIs act as intermediaries that handle communication and data exchange between different applications, abstracting away underlying complexities.</p>
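<p>To make the analogy concrete, here is a toy sketch in Python with no real network involved; the "kitchen" function stands in for the server-side implementation the client never sees, and all the names are invented for illustration:</p>

```python
def kitchen_prepare(dish: str) -> str:
    """Server-side implementation detail: the client never calls this directly."""
    return f"plate of {dish}"

def waiter_api(order: str) -> dict:
    """The 'waiter': a stable interface that hides how the kitchen works."""
    return {"status": 200, "body": kitchen_prepare(order)}

# The client only knows the interface, not the implementation.
response = waiter_api("pasta")
print(response)  # {'status': 200, 'body': 'plate of pasta'}
```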
<h3 id="heading-a-brief-history-of-api-development">A brief history of API development</h3>
<p>APIs have evolved significantly from their early beginnings to the integral role they play today.</p>
<p>In the 1960s and 1970s, APIs were mostly internal, providing connectivity between mainframe systems and custom software within organizations. For example, <a target="_blank" href="https://en.wikipedia.org/wiki/Sabre_%28travel_reservation_system%29#:~:text=Sabre%20Global%20Distribution%20System%2C%20owned%20by%20Sabre,independent%20of%20the%20airline%20in%20March%202000."><strong>American Airlines</strong></a> developed an API in the 1960s for its Sabre airline reservation system.</p>
<p>The advent of service-oriented architecture in the 1990s led to the growth of web APIs, also called web services. Companies like eBay (1997), Amazon (2002), PayPal (2000), and Salesforce (2000) offered public APIs for payments, shopping carts, and infrastructure services.</p>
<p>The launch of the Google Maps API in 2005 and the Twitter API in 2006 accelerated the opening of public web APIs. They enabled new mashups and applications built on established platforms. Facebook, YouTube, and others followed this open API model.</p>
<p>The mobile app explosion in the late 2000s further drove API adoption. Uber (2009), Airbnb (2008), and Instagram (2010) relied on APIs to connect mobile apps to back-end services. The client-server separation became a standard pattern.</p>
<p>Today, API-first companies like Twilio (2008), Stripe (2010), and Plaid (2013) are disrupting industries by offering core functionality through APIs rather than traditional applications. The API economy continues to thrive.</p>
<h3 id="heading-key-api-concepts-and-architecture">Key API concepts and architecture</h3>
<p>While APIs come in many forms, there are some common architectural concepts and components that apply to many web-based APIs.</p>
<p>At a high level, APIs allow client applications to access data or functionality from a server via API calls. The client makes requests to the API’s endpoints (URLs) and receives back responses.</p>
<p>For web APIs, requests and responses are typically sent over HTTP or HTTPS. APIs use HTTP request methods like <strong><em>GET</em></strong>, <strong><em>POST</em></strong>, <strong><em>PUT</em></strong>, and <strong><em>DELETE</em></strong> to perform operations. Response codes like <strong><em>200</em> OK</strong>, <strong><em>400</em> Bad Request</strong>, and <strong><em>500</em> Internal Server Error</strong> indicate the request status.</p>
<p>Most modern web APIs return data in lightweight JSON format rather than XML. Developer documentation and SDKs make it easier to integrate with APIs.</p>
<p>Other common API architecture components include:</p>
<ul>
<li><p>Authentication like API keys to identify applications</p>
</li>
<li><p>Rate limiting to prevent abuse</p>
</li>
<li><p>Versioning to evolve APIs without breaking changes</p>
</li>
<li><p>Caching to improve performance</p>
</li>
<li><p>Status monitoring to track uptime</p>
</li>
</ul>
<p>These architectural concepts power the seamless exchange of data between applications through modern web APIs.</p>
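<p>These pieces come together in a typical request/response cycle. A minimal sketch of handling a JSON API response in Python (the payload and status code here are made up for illustration):</p>

```python
import json

# A made-up raw response body, as a server might return it.
raw_body = '{"city": "Nairobi", "temp_c": 24, "conditions": "sunny"}'
status_code = 200

if 200 <= status_code < 300:
    data = json.loads(raw_body)        # JSON text -> Python dict
    print(data["city"], data["temp_c"])
elif status_code == 400:
    print("Bad Request: fix the request parameters")
else:
    print("Server error or unexpected status:", status_code)
```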
<h3 id="heading-benefits-of-apis">Benefits of APIs</h3>
<p>There are many benefits to building software and connecting systems using APIs:</p>
<ol>
<li><p><strong>Modularity</strong> — APIs allow code to be separated into reusable modules with clearly defined interfaces for communication. This breaks down system complexity.</p>
</li>
<li><p><strong>Developer Experience</strong> — Well-designed APIs improve developer experience by being easy to integrate with. API documentation and SDKs make APIs more accessible.</p>
</li>
<li><p><strong>Scalability</strong> — APIs enable systems to scale by separating front-end and back-end components across servers. Back-ends can be expanded as needed.</p>
</li>
<li><p><strong>Code Reuse</strong> — APIs allow code to be reused across multiple platforms. For example, a payment API can be integrated across web, mobile, etc.</p>
</li>
<li><p><strong>Innovation</strong> — Public APIs enable new products and services by allowing developers to tap into functionality. The API economy thrives on this ecosystem.</p>
</li>
</ol>
<p>There are also business benefits like additional revenue streams from API monetization and faster time-to-market by building on existing APIs. Overall, APIs done well provide a multitude of architecture and business benefits.</p>
<h3 id="heading-introduction-to-rest-and-graphql">Introduction to REST and GraphQL</h3>
<p>There are two leading architectural approaches for designing web APIs — REST and GraphQL.</p>
<p><strong>REST (Representational State Transfer)</strong> is one of the most prevalent API architectures. REST APIs rely on standard HTTP methods and status codes to access and manipulate textual data representations.</p>
<p>In a REST API, clients make requests to predefined endpoints at URLs representing individual resources. REST uses HTTP features like caching, content negotiation, and hypermedia controls.</p>
<p><strong>GraphQL</strong> is a newer API architecture that was developed to address shortcomings of REST. Instead of accessing pre-set endpoints, GraphQL APIs allow clients to declare and retrieve structured data through a single endpoint using a query language.</p>
<p>In GraphQL, the client requests define the structure of the response rather than the server. This allows retrieving data in flexible shapes optimized for the client. GraphQL also uses a strongly typed schema system.</p>
<p>While REST is better suited for simple use cases, GraphQL improves efficiency for accessing complex nested data and enabling frequent client-driven changes. We’ll dive deeper into the technical comparison in upcoming posts.</p>
<p>This introduces the high-level concepts of REST and GraphQL. Both play major roles in modern API architecture and have their merits depending on your use case.</p>
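<p>The difference is easiest to see in the shape of the requests themselves. A sketch (the endpoint paths and fields are invented for illustration):</p>

```python
# REST: the URL identifies the resource; the server decides the response shape.
rest_request = "GET /users/42/posts?limit=2"

# GraphQL: one endpoint; the client declares exactly which fields it wants.
graphql_request = {
    "url": "POST /graphql",
    "query": """
    {
      user(id: 42) {
        name
        posts(limit: 2) { title }
      }
    }
    """,
}

print(rest_request)
print(graphql_request["query"].strip())
```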
<h3 id="heading-summary">Summary</h3>
<p>APIs have become a critical part of enabling functionality, data access, and connectivity in the software landscape. As usage continues to grow, API architecture has evolved from early internal origins to the wide array of public APIs fueling innovative applications we see today.</p>
<p>Core API architecture concepts like endpoints, HTTP methods, status codes, and authentication power the seamless data exchanges between clients and servers. Benefits like modularity, code reuse, and scalability have made APIs integral to modern software design.</p>
<p>REST and GraphQL have emerged as two leading architectural styles for crafting APIs optimized for different use cases. REST’s simplicity and wide adoption make it a solid default choice, while GraphQL offers efficiency gains for complex data structures.</p>
]]></content:encoded></item><item><title><![CDATA[Build a Streamr Node Dashboard with Streamlit using Python]]></title><description><![CDATA[Let’s say this is the first time you’ve heard or read about Streamr Network and Streamlit; if you are already aware, then bear with me as I give a TLDR:

Streamr Network: is a decentralized, peer-to-peer network for real-time data publishing and subs...]]></description><link>https://thedataengineerblog.com/build-a-streamr-node-dashboard-with-streamlit-using-python-bbf0b52a29cb</link><guid isPermaLink="true">https://thedataengineerblog.com/build-a-streamr-node-dashboard-with-streamlit-using-python-bbf0b52a29cb</guid><category><![CDATA[streamr]]></category><category><![CDATA[streamlit]]></category><category><![CDATA[dashboard]]></category><category><![CDATA[Python]]></category><category><![CDATA[Data Science]]></category><category><![CDATA[data]]></category><category><![CDATA[realtime]]></category><category><![CDATA[software development]]></category><dc:creator><![CDATA[Tony Kipkemboi]]></dc:creator><pubDate>Thu, 22 Jun 2023 21:17:14 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1708532313970/55e13569-01ae-4a3e-8702-63dcd8dc1694.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Let’s say this is the first time you’ve heard or read about Streamr Network and Streamlit; if you are already aware, then bear with me as I give a TLDR:</p>
<ol>
<li><p><a target="_blank" href="https://streamr.network/?ref=thedataengineerblog.com"><strong>Streamr Network</strong></a><strong>:</strong> is a decentralized, peer-to-peer network for real-time data publishing and subscription, providing a scalable, robust, and secure platform for data exchange without reliance on a central server.</p>
</li>
<li><p><a target="_blank" href="https://streamlit.io/?ref=thedataengineerblog.com"><strong>Streamlit</strong></a><strong>:</strong> is an open-source Python library for rapidly creating and deploying interactive web apps for data science and machine learning without needing web development skills.</p>
</li>
</ol>
<p>Streamr Node Dashboard is an application built using Streamlit inspired by <a target="_blank" href="https://brubeckscan.app/?ref=thedataengineerblog.com">BrubeckScan</a> (R.I.P), a Streamr node and rewards monitoring dApp built and maintained by Streamr community member <a target="_blank" href="https://www.adamvo.dev/?ref=thedataengineerblog.com">Adam Phi Vo</a>. The application is built around the concept of a Streamr Node, an entity in the network that processes and stores data. We will go through the process of building this application step-by-step.</p>
<p>The main features of the application are:</p>
<ul>
<li><p>Fetching data from the Streamr API endpoints</p>
</li>
<li><p>Displaying details about a specific Streamr node</p>
</li>
<li><p>Displaying payouts and the latest claimed reward codes for a Streamr node</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1708465293096/7e9d385f-f276-4643-aac6-44a71bf1703d.gif" alt class="image--center mx-auto" /></p>
<h3 id="heading-dependencies">Dependencies</h3>
<p>The dependencies for this project are fairly standard for a Python data app:</p>
<ul>
<li><p><code>concurrent.futures</code> - for running multiple requests concurrently</p>
</li>
<li><p><code>io</code> - for handling byte streams</p>
</li>
<li><p><code>logging</code> - for logging messages</p>
</li>
<li><p><code>math</code> and <code>re</code> - for numerical and regular expression operations, respectively</p>
</li>
<li><p><code>datetime</code> - for handling datetime objects</p>
</li>
<li><p><code>pytz</code> - for handling timezone conversions</p>
</li>
<li><p><code>requests</code> - for making HTTP requests</p>
</li>
<li><p><code>streamlit</code> - for the web app framework</p>
</li>
<li><p><code>PIL</code> (Pillow), <code>reportlab</code>, and <code>svglib</code> - for handling images and SVGs</p>
</li>
<li><p><code>config</code> - a custom module containing configuration parameters (like API base URLs)</p>
</li>
<li><pre><code class="lang-python">  <span class="hljs-keyword">import</span> logging
  <span class="hljs-keyword">import</span> math
  <span class="hljs-keyword">import</span> re
  <span class="hljs-keyword">import</span> pytz
  <span class="hljs-keyword">import</span> requests
  <span class="hljs-keyword">import</span> config
  <span class="hljs-keyword">import</span> streamlit <span class="hljs-keyword">as</span> st
  <span class="hljs-keyword">import</span> concurrent.futures
  <span class="hljs-keyword">import</span> io

  <span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime
  <span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Optional
  <span class="hljs-keyword">from</span> PIL <span class="hljs-keyword">import</span> Image
  <span class="hljs-keyword">from</span> reportlab.graphics <span class="hljs-keyword">import</span> renderPM
  <span class="hljs-keyword">from</span> svglib.svglib <span class="hljs-keyword">import</span> svg2rlg
</code></pre>
</li>
</ul>
<h3 id="heading-streamlit-app-configuration">Streamlit App Configuration</h3>
<p>Streamlit’s <code>set_page_config</code> function is used to customize the app's page settings, including the title, icon, layout, initial sidebar state, and menu items.</p>
<pre><code class="lang-python"><span class="hljs-comment"># Streamlit page config MUST be the first Streamlit command</span>
<span class="hljs-comment"># used in your app, and MUST only be set once</span>
st.set_page_config(
    page_title=<span class="hljs-string">"Streamr BrubeckScan Dashboard App"</span>,
    page_icon=<span class="hljs-string">":lightning:"</span>,
    layout=<span class="hljs-string">"wide"</span>,
    initial_sidebar_state=<span class="hljs-string">"expanded"</span>,
    menu_items={
        <span class="hljs-string">'Get help'</span>: <span class="hljs-string">'https://www.thedataengineerblog.com/'</span>,
        <span class="hljs-string">'About'</span>: <span class="hljs-string">"# This is a Streamlit clone version of the official Streamr BrubeckScan dashboard."</span>}
)

<span class="hljs-comment"># Set up logging</span>
logging.basicConfig(filename=<span class="hljs-string">'app.log'</span>, 
                    level=logging.INFO,
                    format=<span class="hljs-string">'%(asctime)s - %(levelname)s - %(message)s'</span>
)
</code></pre>
<h3 id="heading-fetching-data">Fetching Data</h3>
<p>The <code>fetch_data()</code> function fetches data from a given endpoint and handles any errors that might occur:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fetch_data</span>(<span class="hljs-params">endpoint: str</span>) -&gt; dict:</span>
    <span class="hljs-string">"""
    Fetch data from a given endpoint.

    Args:
    endpoint: The URL of the endpoint to fetch data from.

    Returns:
    The JSON response from the endpoint as a dictionary.
    Returns None if the request fails.
    """</span>
    <span class="hljs-keyword">try</span>:
        response = requests.get(endpoint)
        response.raise_for_status()
        <span class="hljs-keyword">return</span> response.json()
    <span class="hljs-keyword">except</span> requests.exceptions.RequestException <span class="hljs-keyword">as</span> e:
        logging.error(<span class="hljs-string">f"Request to <span class="hljs-subst">{endpoint}</span> failed: <span class="hljs-subst">{e}</span>"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>
</code></pre>
<p>The <code>fetch_node_data</code> function is a specialized function for fetching data about a specific Streamr node from the Streamr API:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fetch_node_data</span>(<span class="hljs-params">node_address: str</span>) -&gt; dict:</span>
    <span class="hljs-string">"""
    Fetch data for a specific Streamr node.

    Args:
    node_address: The Ethereum address of the Streamr node.

    Returns:
    The data for the Streamr node as a dictionary.
    Returns None if the request fails.
    """</span>
    logging.info(<span class="hljs-string">f"Fetching node data for address <span class="hljs-subst">{node_address}</span>"</span>)
    <span class="hljs-keyword">return</span> fetch_data(<span class="hljs-string">f"<span class="hljs-subst">{config.API_BASE}</span>/nodes/<span class="hljs-subst">{node_address}</span>"</span>)
</code></pre>
<p>The <code>get_metrics_data</code> function fetches metrics data for a specific Streamr node. It uses a <code>ThreadPoolExecutor</code> from the <code>concurrent.futures</code> module to fetch data from multiple endpoints concurrently; this is overkill, but why not? 😂:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_metrics_data</span>(<span class="hljs-params">node_address: str</span>) -&gt; dict:</span>
    <span class="hljs-string">"""
    Fetch metrics data for a specific Streamr node.

    Args:
    node_address: The Ethereum address of the Streamr node.

    Returns:
    The metrics data for the Streamr node as a dictionary.
    Returns None if any of the requests fail.
    """</span>
    logging.info(<span class="hljs-string">f"Getting metrics data for node <span class="hljs-subst">{node_address}</span>"</span>)
    data = {
        <span class="hljs-string">"acc_rewards"</span>: <span class="hljs-string">f"<span class="hljs-subst">{config.DATA_REWARDS_BASE}</span>/<span class="hljs-subst">{node_address}</span>"</span>,
        <span class="hljs-string">"claimed_rewards"</span>: <span class="hljs-string">f"<span class="hljs-subst">{config.CLAIMED_REWARDS_BASE}</span>/<span class="hljs-subst">{node_address}</span>"</span>,
        <span class="hljs-string">"apr_apy"</span>: config.APR_APY_BASE
    }

    <span class="hljs-keyword">with</span> concurrent.futures.ThreadPoolExecutor() <span class="hljs-keyword">as</span> executor:
        future_to_url = {executor.submit(
            fetch_data, url): key <span class="hljs-keyword">for</span> key, url <span class="hljs-keyword">in</span> data.items()}
        results = {future_to_url[future]: future.result(
        ) <span class="hljs-keyword">for</span> future <span class="hljs-keyword">in</span> concurrent.futures.as_completed(future_to_url)}

    <span class="hljs-comment"># Exclude any endpoints that failed to respond</span>
    <span class="hljs-keyword">return</span> {k: v <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> results.items() <span class="hljs-keyword">if</span> v <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>}
</code></pre>
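<p>The submit-then-collect pattern above can feel dense on first read, so here is the same pattern with a toy function standing in for <code>fetch_data</code> (the <code>fake_fetch</code> name and <code>example.com</code> URLs are invented), so you can see the mechanics without any network calls:</p>

```python
import concurrent.futures

def fake_fetch(url: str) -> str:
    """Stand-in for fetch_data(): returns a canned 'response' for a URL."""
    return f"response from {url}"

endpoints = {
    "acc_rewards": "https://example.com/rewards",
    "claimed_rewards": "https://example.com/claimed",
}

with concurrent.futures.ThreadPoolExecutor() as executor:
    # Map each running future back to the key it was submitted for...
    future_to_key = {executor.submit(fake_fetch, url): key
                     for key, url in endpoints.items()}
    # ...then collect results as they complete, keyed by name.
    results = {future_to_key[f]: f.result()
               for f in concurrent.futures.as_completed(future_to_key)}

print(results["acc_rewards"])  # response from https://example.com/rewards
```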
<h3 id="heading-data-transformation">Data Transformation</h3>
<p>The functions below are handy for displaying times in a user’s local timezone. The goal is to make it easier for users to understand when the node events occurred in their timezone. We will use the functions to build the dashboard display later.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">convert_time_to_user_tz</span>(<span class="hljs-params">time_str: str, user_tz: str</span>) -&gt; str:</span>
    <span class="hljs-string">"""
    Convert a time string to a given timezone and format it.

    Args:
    time_str: The time string to convert. It should be in ISO 8601 format (i.e., "YYYY-MM-DDTHH:MM:SS.sssZ").
    user_tz: The timezone to convert the time to.

    Returns:
    The time converted to the user's timezone and formatted as a string.
    """</span>
    utc = pytz.timezone(<span class="hljs-string">'UTC'</span>)
    user_tz = pytz.timezone(user_tz)

    <span class="hljs-comment"># Convert the string to a datetime object</span>
    dt = datetime.strptime(time_str, <span class="hljs-string">"%Y-%m-%dT%H:%M:%S.%fZ"</span>)

    <span class="hljs-comment"># Set the timezone to UTC (since the original time is in UTC)</span>
    dt = utc.localize(dt)

    <span class="hljs-comment"># Convert to user selected timezone</span>
    dt_user_tz = dt.astimezone(user_tz)

    <span class="hljs-comment"># Format the time in the desired way (12-hour time)</span>
    formatted_time = dt_user_tz.strftime(<span class="hljs-string">"%I:%M:%S %p"</span>)

    <span class="hljs-keyword">return</span> formatted_time

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">convert_dt_to_user_tz</span>(<span class="hljs-params">dt: datetime, user_tz: str</span>) -&gt; str:</span>
    <span class="hljs-string">"""
    Convert a datetime object to a given timezone and format it.

    Args:
    dt: The datetime object to convert. It should be naive (i.e., timezone-unaware).
    user_tz: The timezone to convert the datetime to.

    Returns:
    The datetime converted to the user's timezone and formatted as a string.
    """</span>
    utc = pytz.timezone(<span class="hljs-string">'UTC'</span>)
    user_tz = pytz.timezone(user_tz)

    <span class="hljs-comment"># Set the timezone to UTC (since the original time is in UTC)</span>
    dt = utc.localize(dt)

    <span class="hljs-comment"># Convert to user selected timezone</span>
    dt_user_tz = dt.astimezone(user_tz)

    <span class="hljs-comment"># Format the datetime in the desired way (day, date, time, and timezone)</span>
    formatted_time = dt_user_tz.strftime(<span class="hljs-string">"%a, %d %b %Y %H:%M:%S %Z"</span>)

    <span class="hljs-keyword">return</span> formatted_time
</code></pre>
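<p>As a side note, if you would rather avoid the <code>pytz</code> dependency, the standard library's <code>zoneinfo</code> module (Python 3.9+) can do the same conversion. A minimal sketch — the helper name is mine, not part of the app:</p>
<pre><code class="lang-python">from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # stdlib alternative to pytz (Python 3.9+)

def convert_time_stdlib(time_str: str, user_tz: str):
    # Parse the ISO 8601 string, mark it as UTC, then convert and format
    dt = datetime.strptime(time_str, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    return dt.astimezone(ZoneInfo(user_tz)).strftime("%I:%M:%S %p")

print(convert_time_stdlib("2023-06-13T14:30:00.000Z", "US/Eastern"))  # 10:30:00 AM (EDT)
</code></pre>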
<h3 id="heading-display-functions">Display Functions</h3>
<p>The <code>check_status</code> method takes the node's boolean status and returns a colored status label: <code>OK</code> means the node is up and operational, while <code>NO</code> means it is not.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">check_status</span>(<span class="hljs-params">status: bool</span>) -&gt; str:</span>
    <span class="hljs-string">"""
    Check the status of a Streamr node.

    Args:
    status: The status of the Streamr node.

    Returns:
    A string representing the status of the Streamr node.
    """</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">":green[OK]"</span> <span class="hljs-keyword">if</span> status <span class="hljs-keyword">else</span> <span class="hljs-string">":red[NO]"</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1708465294920/72face6e-6778-4b7c-9230-607140d80fa8.png" alt="Node status" class="image--center mx-auto" /></p>
<p>The <code>display_node_info()</code> function shows specific metrics about the Streamr node.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_node_info</span>(<span class="hljs-params">node_address: str, node_data: dict</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-string">"""
    Display information about a specific Streamr node.

    Args:
    node_address: The Ethereum address of the Streamr node.
    node_data: The data for the Streamr node.

    Returns: 
    None
    """</span>
    st.divider()
    col1, col2, col3 = st.columns(<span class="hljs-number">3</span>)
    col1.image(node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'identiconURL'</span>],
               caption=<span class="hljs-string">'Node Identicon'</span>)
    col2.metric(<span class="hljs-string">"Node Address"</span>, node_address[:<span class="hljs-number">4</span>] + <span class="hljs-string">"..."</span>)
    col1.markdown(
        <span class="hljs-string">f"Status: **<span class="hljs-subst">{check_status(node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'status'</span>])}</span>**"</span>)
    col3.metric(<span class="hljs-string">"Staked $DATA"</span>, node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'staked'</span>])
    col2.metric(<span class="hljs-string">"To be Received"</span>, round(
        node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'toBeReceived'</span>], <span class="hljs-number">2</span>))
    col2.metric(<span class="hljs-string">"Total rewards"</span>, node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'rewards'</span>])
    col3.metric(<span class="hljs-string">"Claim Count"</span>, node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'claimCount'</span>])
    col3.metric(<span class="hljs-string">"Percentage of received claims %"</span>, round(
        node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'claimPercentage'</span>], <span class="hljs-number">2</span>))
</code></pre>
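<p>One thing to watch: <code>display_node_info()</code> indexes <code>node_data['data']['node'][...]</code> directly, so an unexpected API response raises a <code>KeyError</code>. If you want to be defensive, a small helper built on <code>dict.get</code> (hypothetical, not in the original code) can supply defaults instead:</p>
<pre><code class="lang-python">def get_node_field(node_data: dict, field: str, default=None):
    # Walk the nested response safely; return default if any level is missing
    return node_data.get('data', {}).get('node', {}).get(field, default)

sample = {'data': {'node': {'staked': 1000}}}
print(get_node_field(sample, 'staked'))     # 1000
print(get_node_field({}, 'claimCount', 0))  # 0
</code></pre>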
<p>When you run a Streamr Broker node, you receive <code>reward codes</code> at random intervals. These codes verify your node's activity and eligibility to receive rewards. For more detail, read <a target="_blank" href="https://docs.streamr.network/node-runners/mining-on-streamr/?ref=thedataengineerblog.com">Mining on Streamr</a>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1708465296235/814c924a-29f7-4585-9462-4ed07e9bf3ea.jpeg" alt class="image--center mx-auto" /></p>
<p>The <code>display_latest_codes</code> function displays the latest reward codes received for the Streamr node.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_latest_codes</span>(<span class="hljs-params">node_data: dict, col: st.delta_generator.DeltaGenerator</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-string">"""
    Display the latest claimed reward codes for a Streamr node.

    Args:
    node_data: The data for the Streamr node.
    col: The Streamlit column to display the codes in.

    Returns:
    None
    """</span>
    all_timezones = pytz.all_timezones
    selected_tz = col.selectbox(<span class="hljs-string">"Select your timezone"</span>, 
                                all_timezones, 
                                index=all_timezones.index(<span class="hljs-string">'US/Eastern'</span>)
    )

    <span class="hljs-keyword">for</span> code <span class="hljs-keyword">in</span> node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'claimedRewardCodes'</span>]:
        formatted_time = convert_time_to_user_tz(code[<span class="hljs-string">'claimTime'</span>], 
                                                 selected_tz
        )
        col.write(<span class="hljs-string">f"<span class="hljs-subst">{code[<span class="hljs-string">'id'</span>]}</span> → <span class="hljs-subst">{formatted_time}</span>"</span>)
</code></pre>
<p>The <code>display_payouts</code> function shows the node's historical payouts: the $DATA token rewards earned by running the node.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_payouts</span>(<span class="hljs-params">node_data: dict</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-string">"""
    Display the payouts for a Streamr node.

    Args:
    node_data: The data for the Streamr node.

    Returns:
    None
    """</span>
    <span class="hljs-comment"># Create placeholders for headers</span>
    st.divider()
    header1, header2 = st.columns(<span class="hljs-number">2</span>)

    header1.header(<span class="hljs-string">"Payouts"</span>)
    header2.header(<span class="hljs-string">"Latest codes"</span>)

    <span class="hljs-comment"># Create columns for the contents</span>
    cols = st.columns([<span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">12</span>])

    utc = pytz.timezone(<span class="hljs-string">'UTC'</span>)
    payouts = node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'payouts'</span>]
    payouts.reverse()
    <span class="hljs-keyword">for</span> payout <span class="hljs-keyword">in</span> payouts:
        <span class="hljs-comment"># Convert the timestamp to a datetime object</span>
        payout_time = datetime.utcfromtimestamp(int(payout[<span class="hljs-string">'timestamp'</span>]))
        <span class="hljs-comment"># Use convert_dt_to_user_tz() since payout_time is already a datetime object</span>
        formatted_time = convert_dt_to_user_tz(payout_time, <span class="hljs-string">'UTC'</span>)
        rounded_payout = math.ceil(float(payout[<span class="hljs-string">'value'</span>]))

        <span class="hljs-comment"># Use the first column for the text and the second for the SVG</span>
        cols[<span class="hljs-number">0</span>].markdown(<span class="hljs-string">f"<span class="hljs-subst">{formatted_time}</span> → <span class="hljs-subst">{rounded_payout}</span>"</span>)
        display_svg(cols[<span class="hljs-number">1</span>], <span class="hljs-string">"assets/data_token.svg"</span>, width=<span class="hljs-number">20</span>, height=<span class="hljs-number">20</span>)

    <span class="hljs-comment"># Display the latest codes in the third column</span>
    display_latest_codes(node_data, cols[<span class="hljs-number">2</span>])
    st.divider()
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1708465297355/bd8ee437-6780-4f16-addc-82540a5f95fb.jpeg" alt="Historical payouts" class="image--center mx-auto" /></p>
<blockquote>
<p>💡 I used the <code>display_svg</code> function to display the Streamr SVG image beside the payout information. If you are following this tutorial step-by-step, make sure the SVG image is saved at the <code>assets/data_token.svg</code> path in your project directory.</p>
</blockquote>
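<p>A quick fail-fast check for the asset can save you a confusing traceback from <code>svg2rlg</code> later. This helper is my own addition, not part of the app:</p>
<pre><code class="lang-python">from pathlib import Path

def ensure_asset(path: str):
    # Fail fast with a clear message if the SVG asset is missing
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"Missing asset: {p}. Save the $DATA token SVG there first.")
    return p
</code></pre>
<p>Call <code>ensure_asset("assets/data_token.svg")</code> once at startup, before any payout rows are rendered.</p>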
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_svg</span>(<span class="hljs-params">col: st.delta_generator.DeltaGenerator, path: str, width: Optional[int] = None, height: Optional[int] = None</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-string">"""
    Display an SVG image in a Streamlit column.

    Args:
        col: The Streamlit column to display the image in.
        path: The path to the SVG file.
        width: The width to resize the image to. If None, the original width of the image is used.
        height: The height to resize the image to. If None, the original height of the image is used.

    Returns:
    None
    """</span>
    <span class="hljs-comment"># Load the SVG file and convert it to a ReportLab Drawing</span>
    drawing = svg2rlg(path)

    <span class="hljs-comment"># Convert the Drawing to a PIL image</span>
    pil_image = renderPM.drawToPIL(drawing)

    <span class="hljs-comment"># Resize the image if width and height are provided</span>
    <span class="hljs-keyword">if</span> width <span class="hljs-keyword">and</span> height:
        pil_image = pil_image.resize((width, height))

    <span class="hljs-comment"># Convert the PIL image to an IO Bytes object so Streamlit can display it</span>
    image_stream = io.BytesIO()
    pil_image.save(image_stream, format=<span class="hljs-string">'PNG'</span>)
    pil_image = Image.open(image_stream)

    <span class="hljs-comment"># Display the image</span>
    col.image(pil_image, use_column_width=<span class="hljs-literal">False</span>)
</code></pre>
<h3 id="heading-final-step">Final Step</h3>
<p><strong>Nobody</strong>: …</p>
<p><strong>Python’s <code>if __name__ == "__main__"</code></strong>:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1708465298693/cc1cbe90-fe78-47bc-989d-ed63b7f9d183.gif" alt class="image--center mx-auto" /></p>
<p>Time to glue everything together:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> concurrent.futures
<span class="hljs-keyword">import</span> io
<span class="hljs-keyword">import</span> logging
<span class="hljs-keyword">import</span> math
<span class="hljs-keyword">import</span> re
<span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Optional

<span class="hljs-keyword">import</span> pytz
<span class="hljs-keyword">import</span> requests
<span class="hljs-keyword">import</span> streamlit <span class="hljs-keyword">as</span> st
<span class="hljs-keyword">import</span> streamlit.components.v1 <span class="hljs-keyword">as</span> components
<span class="hljs-keyword">from</span> PIL <span class="hljs-keyword">import</span> Image
<span class="hljs-keyword">from</span> reportlab.graphics <span class="hljs-keyword">import</span> renderPM
<span class="hljs-keyword">from</span> svglib.svglib <span class="hljs-keyword">import</span> svg2rlg

<span class="hljs-keyword">import</span> config

<span class="hljs-comment"># Streamlit page config MUST be the first Streamlit command</span>
<span class="hljs-comment"># used in your app, and MUST only be set once</span>
st.set_page_config(
    page_title=<span class="hljs-string">"Streamr Node Dashboard App"</span>,
    page_icon=<span class="hljs-string">":lightning:"</span>,
    layout=<span class="hljs-string">"wide"</span>,
    initial_sidebar_state=<span class="hljs-string">"expanded"</span>,
    menu_items={
        <span class="hljs-string">'Get help'</span>: <span class="hljs-string">'https://www.thedataengineerblog.com/'</span>,
        <span class="hljs-string">'About'</span>: <span class="hljs-string">"# This is a Streamlit clone version of the official Streamr BrubeckScan dashboard."</span>
    }
)

<span class="hljs-comment"># Set up logging</span>
logging.basicConfig(filename=<span class="hljs-string">'app.log'</span>, 
                    level=logging.INFO,
                    format=<span class="hljs-string">'%(asctime)s - %(levelname)s - %(message)s'</span>
)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fetch_data</span>(<span class="hljs-params">endpoint: str</span>) -&gt; dict:</span>
    <span class="hljs-string">"""
    Fetch data from a given endpoint.

    Args:
    endpoint: The URL of the endpoint to fetch data from.

    Returns:
    The JSON response from the endpoint as a dictionary. Returns None if the request fails.
    """</span>
    <span class="hljs-keyword">try</span>:
        response = requests.get(endpoint)
        response.raise_for_status()
        <span class="hljs-keyword">return</span> response.json()
    <span class="hljs-keyword">except</span> requests.exceptions.RequestException <span class="hljs-keyword">as</span> e:
        logging.error(<span class="hljs-string">f"Request to <span class="hljs-subst">{endpoint}</span> failed: <span class="hljs-subst">{e}</span>"</span>)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">None</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fetch_node_data</span>(<span class="hljs-params">node_address: str</span>) -&gt; dict:</span>
    <span class="hljs-string">"""
    Fetch data for a specific Streamr node.

    Args:
    node_address: The Ethereum address of the Streamr node.

    Returns:
    The data for the Streamr node as a dictionary. Returns None if the request fails.
    """</span>
    logging.info(<span class="hljs-string">f"Fetching node data for address <span class="hljs-subst">{node_address}</span>"</span>)
    <span class="hljs-keyword">return</span> fetch_data(<span class="hljs-string">f"<span class="hljs-subst">{config.API_BASE}</span>/nodes/<span class="hljs-subst">{node_address}</span>"</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_metrics_data</span>(<span class="hljs-params">node_address: str</span>) -&gt; dict:</span>
    <span class="hljs-string">"""
    Fetch metrics data for a specific Streamr node.

    Args:
    node_address: The Ethereum address of the Streamr node.

    Returns:
    The metrics data for the Streamr node as a dictionary. Returns None if any of the requests fail.
    """</span>
    logging.info(<span class="hljs-string">f"Getting metrics data for node <span class="hljs-subst">{node_address}</span>"</span>)
    data = {
        <span class="hljs-string">"acc_rewards"</span>: <span class="hljs-string">f"<span class="hljs-subst">{config.DATA_REWARDS_BASE}</span>/<span class="hljs-subst">{node_address}</span>"</span>,
        <span class="hljs-string">"claimed_rewards"</span>: <span class="hljs-string">f"<span class="hljs-subst">{config.CLAIMED_REWARDS_BASE}</span>/<span class="hljs-subst">{node_address}</span>"</span>,
        <span class="hljs-string">"apr_apy"</span>: config.APR_APY_BASE
    }

    <span class="hljs-keyword">with</span> concurrent.futures.ThreadPoolExecutor() <span class="hljs-keyword">as</span> executor:
        future_to_url = {executor.submit(
            fetch_data, url): key <span class="hljs-keyword">for</span> key, url <span class="hljs-keyword">in</span> data.items()}
        results = {future_to_url[future]: future.result(
            ) <span class="hljs-keyword">for</span> future <span class="hljs-keyword">in</span> concurrent.futures.as_completed(future_to_url)}

    <span class="hljs-comment"># Exclude any endpoints that failed to respond</span>
    <span class="hljs-keyword">return</span> {k: v <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> results.items() <span class="hljs-keyword">if</span> v <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>}

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">convert_time_to_user_tz</span>(<span class="hljs-params">time_str: str, user_tz: str</span>) -&gt; str:</span>
    <span class="hljs-string">"""
    Convert a time string to a given timezone and format it.

    Args:
    time_str: The time string to convert. It should be in ISO 8601 format (i.e., "YYYY-MM-DDTHH:MM:SS.sssZ").
    user_tz: The timezone to convert the time to.

    Returns:
    The time converted to the user's timezone and formatted as a string.
    """</span>
    utc = pytz.timezone(<span class="hljs-string">'UTC'</span>)
    user_tz = pytz.timezone(user_tz)

    <span class="hljs-comment"># Convert the string to a datetime object</span>
    dt = datetime.strptime(time_str, <span class="hljs-string">"%Y-%m-%dT%H:%M:%S.%fZ"</span>)

    <span class="hljs-comment"># Set the timezone to UTC (since the original time is in UTC)</span>
    dt = utc.localize(dt)

    <span class="hljs-comment"># Convert to user selected timezone</span>
    dt_user_tz = dt.astimezone(user_tz)

    <span class="hljs-comment"># Format the time in the desired way (12-hour time)</span>
    formatted_time = dt_user_tz.strftime(<span class="hljs-string">"%I:%M:%S %p"</span>)

    <span class="hljs-keyword">return</span> formatted_time

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">convert_dt_to_user_tz</span>(<span class="hljs-params">dt: datetime, user_tz: str</span>) -&gt; str:</span>
    <span class="hljs-string">"""
    Convert a datetime object to a given timezone and format it.

    Args:
    dt: The datetime object to convert. It should be naive (i.e., timezone-unaware).
    user_tz: The timezone to convert the datetime to.

    Returns:
    The datetime converted to the user's timezone and formatted as a string.
    """</span>
    utc = pytz.timezone(<span class="hljs-string">'UTC'</span>)
    user_tz = pytz.timezone(user_tz)

    <span class="hljs-comment"># Set the timezone to UTC (since the original time is in UTC)</span>
    dt = utc.localize(dt)

    <span class="hljs-comment"># Convert to user selected timezone</span>
    dt_user_tz = dt.astimezone(user_tz)

    <span class="hljs-comment"># Format the datetime in the desired way (day, date, time, and timezone)</span>
    formatted_time = dt_user_tz.strftime(<span class="hljs-string">"%a, %d %b %Y %H:%M:%S %Z"</span>)

    <span class="hljs-keyword">return</span> formatted_time

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">check_status</span>(<span class="hljs-params">status: bool</span>) -&gt; str:</span>
    <span class="hljs-string">"""
    Check the status of a Streamr node.

    Args:
    status: The status of the Streamr node.

    Returns:
    A string representing the status of the Streamr node.
    """</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">":green[OK]"</span> <span class="hljs-keyword">if</span> status <span class="hljs-keyword">else</span> <span class="hljs-string">":red[NO]"</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_node_info</span>(<span class="hljs-params">node_address: str, node_data: dict</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-string">"""
    Display information about a specific Streamr node.

    Args:
        node_address: The Ethereum address of the Streamr node.
        node_data: The data for the Streamr node.

    Returns:
    None
    """</span>
    st.divider()
    col1, col2, col3 = st.columns(<span class="hljs-number">3</span>)
    col1.image(node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'identiconURL'</span>],
               caption=<span class="hljs-string">'Node Identicon'</span>)
    col2.metric(<span class="hljs-string">"Node Address"</span>, node_address[:<span class="hljs-number">4</span>] + <span class="hljs-string">"..."</span>)
    col1.markdown(
        <span class="hljs-string">f"Status: **<span class="hljs-subst">{check_status(node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'status'</span>])}</span>**"</span>)
    col3.metric(<span class="hljs-string">"Staked $DATA"</span>, node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'staked'</span>])
    col2.metric(<span class="hljs-string">"To be Received"</span>, round(
        node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'toBeReceived'</span>], <span class="hljs-number">2</span>))
    col2.metric(<span class="hljs-string">"Total rewards"</span>, node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'rewards'</span>])
    col3.metric(<span class="hljs-string">"Claim Count"</span>, node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'claimCount'</span>])
    col3.metric(<span class="hljs-string">"Percentage of received claims %"</span>, round(
        node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'claimPercentage'</span>], <span class="hljs-number">2</span>))

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_latest_codes</span>(<span class="hljs-params">node_data: dict, col: st.delta_generator.DeltaGenerator</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-string">"""
    Display the latest claimed reward codes for a Streamr node.

    Args:
    node_data: The data for the Streamr node.
    col: The Streamlit column to display the codes in.

    Returns:
    None
    """</span>
    all_timezones = pytz.all_timezones
    selected_tz = col.selectbox(
        <span class="hljs-string">"Select your timezone"</span>, all_timezones, index=all_timezones.index(<span class="hljs-string">'US/Eastern'</span>))

    <span class="hljs-keyword">for</span> code <span class="hljs-keyword">in</span> node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'claimedRewardCodes'</span>]:
        formatted_time = convert_time_to_user_tz(
            code[<span class="hljs-string">'claimTime'</span>], selected_tz)
        col.write(<span class="hljs-string">f"<span class="hljs-subst">{code[<span class="hljs-string">'id'</span>]}</span> → <span class="hljs-subst">{formatted_time}</span>"</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_svg</span>(<span class="hljs-params">col: st.delta_generator.DeltaGenerator, path: str, width: Optional[int] = None, height: Optional[int] = None</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-string">"""
    Display an SVG image in a Streamlit column.

    Args:
    col: The Streamlit column to display the image in.
    path: The path to the SVG file.
    width: The width to resize the image to. If None, the original width of the image is used.
    height: The height to resize the image to. If None, the original height of the image is used.

    Returns:
    None
    """</span>
    <span class="hljs-comment"># Load the SVG file and convert it to a ReportLab Drawing</span>
    drawing = svg2rlg(path)

    <span class="hljs-comment"># Convert the Drawing to a PIL image</span>
    pil_image = renderPM.drawToPIL(drawing)

    <span class="hljs-comment"># Resize the image if width and height are provided</span>
    <span class="hljs-keyword">if</span> width <span class="hljs-keyword">and</span> height:
        pil_image = pil_image.resize((width, height))

    <span class="hljs-comment"># Convert the PIL image to an IO Bytes object so Streamlit can display it</span>
    image_stream = io.BytesIO()
    pil_image.save(image_stream, format=<span class="hljs-string">'PNG'</span>)
    pil_image = Image.open(image_stream)

    <span class="hljs-comment"># Display the image</span>
    col.image(pil_image, use_column_width=<span class="hljs-literal">False</span>)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">display_payouts</span>(<span class="hljs-params">node_data: dict</span>) -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-string">"""
    Display the payouts for a Streamr node.

    Args:
    node_data: The data for the Streamr node.

    Returns:
    None
    """</span>
    <span class="hljs-comment"># Create placeholders for headers</span>
    st.divider()
    header1, header2 = st.columns(<span class="hljs-number">2</span>)

    header1.header(<span class="hljs-string">"Payouts"</span>)
    header2.header(<span class="hljs-string">"Latest codes"</span>)

    <span class="hljs-comment"># Create columns for the contents</span>
    cols = st.columns([<span class="hljs-number">4</span>, <span class="hljs-number">2</span>, <span class="hljs-number">12</span>])

    utc = pytz.timezone(<span class="hljs-string">'UTC'</span>)
    payouts = node_data[<span class="hljs-string">'data'</span>][<span class="hljs-string">'node'</span>][<span class="hljs-string">'payouts'</span>]
    payouts.reverse()
    <span class="hljs-keyword">for</span> payout <span class="hljs-keyword">in</span> payouts:
        <span class="hljs-comment"># Convert the timestamp to a datetime object</span>
        payout_time = datetime.utcfromtimestamp(int(payout[<span class="hljs-string">'timestamp'</span>]))
        <span class="hljs-comment"># Use convert_dt_to_user_tz() since payout_time is already a datetime object</span>
        formatted_time = convert_dt_to_user_tz(payout_time, <span class="hljs-string">'UTC'</span>)
        rounded_payout = math.ceil(float(payout[<span class="hljs-string">'value'</span>]))

        <span class="hljs-comment"># Use the first column for the text and the second for the SVG</span>
        cols[<span class="hljs-number">0</span>].markdown(<span class="hljs-string">f"<span class="hljs-subst">{formatted_time}</span> → <span class="hljs-subst">{rounded_payout}</span>"</span>)
        display_svg(cols[<span class="hljs-number">1</span>], <span class="hljs-string">"assets/data_token.svg"</span>, width=<span class="hljs-number">20</span>, height=<span class="hljs-number">20</span>)

        <span class="hljs-comment"># Display the latest codes in the third column</span>
        display_latest_codes(node_data, cols[<span class="hljs-number">2</span>])
        st.divider()

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">main</span>() -&gt; <span class="hljs-keyword">None</span>:</span>
    <span class="hljs-string">"""
    The main function of the Streamlit app. 
    It asks the user for a Streamr node Ethereum address, fetches data for the node, and displays it.

    Returns:
    None
    """</span>
    st.title(<span class="hljs-string">"⚡ Streamr Node Dashboard App ⚡"</span>)
    node_address = st.text_input(
        <span class="hljs-string">"Enter a Streamr Node Ethereum address here"</span>, placeholder=<span class="hljs-string">"0x4a2A3501e50759250828ACd85E7450fb55A10a69"</span>, max_chars=<span class="hljs-number">42</span>)
    <span class="hljs-keyword">with</span> st.expander(<span class="hljs-string">'Copy the address in this expander and paste above for testing 🎉'</span>):
        st.code(<span class="hljs-string">'''0x4a2A3501e50759250828ACd85E7450fb55A10a69'''</span>)
    <span class="hljs-keyword">if</span> node_address:
        logging.info(<span class="hljs-string">f"Processing node address <span class="hljs-subst">{node_address}</span>"</span>)
        <span class="hljs-keyword">if</span> re.match(<span class="hljs-string">"^0x[a-fA-F0-9]{40}$"</span>, node_address):
            node_data = fetch_node_data(node_address)
            <span class="hljs-keyword">if</span> node_data <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span> <span class="hljs-keyword">and</span> <span class="hljs-string">'data'</span> <span class="hljs-keyword">in</span> node_data <span class="hljs-keyword">and</span> <span class="hljs-string">'node'</span> <span class="hljs-keyword">in</span> node_data[<span class="hljs-string">'data'</span>]:
                get_metrics_data(node_address)
                display_node_info(node_address, node_data)
                display_payouts(node_data)
            <span class="hljs-keyword">else</span>:
                logging.error(
                <span class="hljs-string">f"Failed to fetch data for address <span class="hljs-subst">{node_address}</span>. Please make sure it is a valid Streamr node address."</span>)
                st.error(
                <span class="hljs-string">"Failed to fetch data for the given Ethereum address. Please make sure it is a valid Streamr node address."</span>)
        <span class="hljs-keyword">else</span>:
        logging.error(
        <span class="hljs-string">f"Invalid Ethereum address: <span class="hljs-subst">{node_address}</span>. It should be 42 characters long (including '0x') and hexadecimal."</span>)
        st.error(
        <span class="hljs-string">"Invalid Ethereum address. It should be 42 characters long (including '0x') and hexadecimal."</span>)
    <span class="hljs-keyword">else</span>:
        logging.warning(
            <span class="hljs-string">"No Streamr node Ethereum address provided..."</span>)
        st.warning(
            <span class="hljs-string">"Please enter a Streamr node Ethereum address to fetch data..."</span>)

    st.markdown(<span class="hljs-string">"🔗 **Useful Links**"</span>)
    st.markdown(<span class="hljs-string">"- [Streamr Network](https://streamr.network/)"</span>)
    st.markdown(<span class="hljs-string">"- [Streamr Hub](https://streamr.network/projects)"</span>)
    st.markdown(<span class="hljs-string">"- [Earn $DATA](https://frens.streamr.network/intro)"</span>)
    st.markdown(<span class="hljs-string">"- [Streamr Twitter](https://twitter.com/streamr)"</span>)
    st.markdown(
        <span class="hljs-string">"💡 **Remember:** Keep building and shipping for a robust decentralized data economy!"</span>)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:
    main()
</code></pre>
<blockquote>
<p><strong><em>Congratulations, you’ve created a stunning dashboard! 🎉</em></strong></p>
</blockquote>
<iframe src="https://streamr.streamlit.app/?embed=true" width="560" height="630"></iframe>

<p>Feel free to clone the full codebase in the <a target="_blank" href="https://github.com/tonykipkemboi/StreamrDashboard.git?ref=thedataengineerblog.com">repo</a>.</p>
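<p>To run the app locally, something along these lines should work (the <code>requirements.txt</code> and <code>app.py</code> names are assumptions — adjust to whatever the repo actually contains):</p>
<pre><code class="lang-bash">git clone https://github.com/tonykipkemboi/StreamrDashboard.git
cd StreamrDashboard
pip install -r requirements.txt   # assumed filename; check the repo
streamlit run app.py              # assumed entry point; check the repo
</code></pre>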
<p>If you run into any issues, don’t hesitate to reach out to me or post in the comments!</p>
<p>This article was originally published on <a target="_blank" href="https://blog.streamr.network/build-a-streamr-node-dashboard-with-streamlit/">Streamr Blog</a> on June 13, 2023.</p>
]]></content:encoded></item><item><title><![CDATA[The 4 Common Data Formats in Data Engineering]]></title><description><![CDATA[Introduction
Choosing the right data format is an integral part of data engineering. The decision significantly influences data storage, processing speed, and interoperability.
This article dissects four popular data formats: CSV, JSON, Parquet, and ...]]></description><link>https://thedataengineerblog.com/the-4-common-data-formats-in-data-engineering-e42917729af8</link><guid isPermaLink="true">https://thedataengineerblog.com/the-4-common-data-formats-in-data-engineering-e42917729af8</guid><category><![CDATA[Data Science]]></category><category><![CDATA[data structures]]></category><category><![CDATA[data]]></category><category><![CDATA[csv]]></category><category><![CDATA[Apache Avro]]></category><category><![CDATA[json]]></category><category><![CDATA[Parquet]]></category><category><![CDATA[data-engineering]]></category><category><![CDATA[Python]]></category><category><![CDATA[software development]]></category><category><![CDATA[Software Engineering]]></category><dc:creator><![CDATA[Tony Kipkemboi]]></dc:creator><pubDate>Tue, 30 May 2023 13:01:08 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/hXrPSgGFpqQ/upload/143dcf857977dc3bcc1eab3b4ede0530.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-introduction">Introduction</h3>
<p>Choosing the right data format is an integral part of data engineering. The decision significantly influences data storage, processing speed, and interoperability.</p>
<p>This article dissects four popular data formats: <em>CSV, JSON, Parquet</em>, and <em>Avro</em>, each with unique strengths and ideal use cases. It also includes Python code snippets demonstrating reading and writing in each file format.</p>
<h3 id="heading-csv-comma-separated-values">CSV (Comma-Separated Values)</h3>
<p>CSV is a simple file format that organizes data into tabular form. Each line in a CSV file represents a record, and commas separate individual fields.</p>
<p><strong>Structure:</strong> CSV files represent data in a tabular format. Each row corresponds to a data record, and each column represents a data field.</p>
<p><strong>Pros:</strong></p>
<ul>
<li><p>CSV files are simple, lightweight, and human-readable.</p>
</li>
<li><p>They are broadly supported across platforms and programming languages.</p>
</li>
<li><p>Parsing CSV files is straightforward due to their simple structure.</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p>CSV files lack a standard schema, leading to potential inconsistencies in data interpretation.</p>
</li>
<li><p>They do not support complex data types or hierarchical or relational data.</p>
</li>
<li><p>They are inefficient for large datasets due to slower read/write speeds.</p>
</li>
</ul>
<p><strong>Use Cases:</strong> CSV is a practical choice for simple, flat data structures and smaller datasets where human readability is crucial.</p>
<p><strong>Fun Fact:</strong> CSV was first supported by IBM Fortran in 1972, largely because it was easier to type CSV lists on punched cards​.</p>
<p><strong>Reading CSV:</strong></p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd

data = pd.read_csv(<span class="hljs-string">'data.csv'</span>)
print(data.head())
</code></pre>
<p><strong>Writing CSV:</strong></p>
<pre><code class="lang-python">data.to_csv(<span class="hljs-string">'new_data.csv'</span>, index=<span class="hljs-literal">False</span>)
</code></pre>
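<p>The "no standard schema" con is easy to demonstrate: because CSV stores everything as text, readers must guess each column's type, and the guess can lose information. Below is a minimal sketch with pandas (the column names are invented for illustration):</p>

```python
import io

import pandas as pd

# CSV carries no schema, so readers must infer each column's type.
csv_text = "employee_id,zip_code\n007,08540\n042,10001\n"

# By default, pandas infers integers and silently drops the leading zeros.
inferred = pd.read_csv(io.StringIO(csv_text))
print(inferred["zip_code"].tolist())  # [8540, 10001]

# Forcing string dtypes preserves the values exactly as written.
preserved = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(preserved["zip_code"].tolist())  # ['08540', '10001']
```

<p>Passing an explicit <code>dtype</code> (or a per-column mapping) is the usual guard against this kind of silent reinterpretation.</p>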
<h3 id="heading-json-javascript-object-notation">JSON (JavaScript Object Notation)</h3>
<p>JSON is a data-interchange format that uses human-readable text to store and transmit objects comprising attribute-value pairs and array data types.</p>
<p><strong>Structure:</strong> JSON data is represented as key-value pairs and supports complex nested structures. It allows the use of arrays and objects, enabling a flexible and dynamic schema.</p>
<p><strong>Pros:</strong></p>
<ul>
<li><p>JSON files support complex data structures, including nested objects and arrays.</p>
</li>
<li><p>They are language-independent, interoperable, and widely used in web APIs due to their compatibility with JavaScript.</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p>JSON files can be inefficient for large datasets because of their verbose nature and repeated field names.</p>
</li>
<li><p>They are not ideal for binary data storage.</p>
</li>
</ul>
<p><strong>Use Cases:</strong> JSON is the go-to data format for data interchange between web applications and APIs, especially when dealing with complex data structures.</p>
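<p>As a quick illustration of the nested structures JSON supports, here is a minimal sketch using only the standard library (the record contents are invented for the example):</p>

```python
import json

# A record with a nested object and an array -- structures flat CSV cannot express.
record = {
    "user": {"name": "Ada", "roles": ["engineer", "admin"]},
    "active": True,
}

# Serialize to text and parse it back; the nesting survives the round trip.
text = json.dumps(record)
restored = json.loads(text)
print(restored["user"]["roles"])  # ['engineer', 'admin']
```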
<p><strong>Fun Fact:</strong> Douglas Crockford and Chip Morningstar sent the first JSON message in <a target="_blank" href="https://en.wikipedia.org/wiki/JSON?ref=thedataengineerblog.com">April 2001</a>​.</p>
<p><strong>Reading JSON:</strong></p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> json

<span class="hljs-keyword">with</span> open(<span class="hljs-string">'data.json'</span>) <span class="hljs-keyword">as</span> f:
    data = json.load(f)
    print(data)
</code></pre>
<p><strong>Writing JSON:</strong></p>
<pre><code class="lang-python"><span class="hljs-keyword">with</span> open(<span class="hljs-string">'new_data.json'</span>, <span class="hljs-string">'w'</span>) <span class="hljs-keyword">as</span> f:
    json.dump(data, f)
</code></pre>
<h3 id="heading-parquet">Parquet</h3>
<p>Apache Parquet is a columnar storage file format available to any project in the Hadoop ecosystem.</p>
<p><strong>Structure:</strong> Parquet arranges data by columns, allowing efficient read operations on a subset of the columns. It offers advanced data compression and encoding schemes.</p>
<p><strong>Pros:</strong></p>
<ul>
<li><p>Parquet files offer efficient disk I/O and are suitable for query performance due to their columnar storage.</p>
</li>
<li><p>They support complex nested data structures and offer high compression, reducing storage costs.</p>
</li>
<li><p>They are compatible with many data processing frameworks, such as Apache Hadoop, Apache Spark, and Google BigQuery.</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p>Parquet files are not human-readable.</p>
</li>
<li><p>They have slower write operations due to compressing and encoding data overhead.</p>
</li>
</ul>
<p><strong>Use Cases:</strong> Parquet is the preferred choice for analytical queries and big data operations, where efficient columnar reads are more crucial than write performance.</p>
<p><strong>Fun Fact:</strong> Parquet was designed to improve the Trevni columnar storage format created by Doug Cutting, the creator of Hadoop. The first version, Apache Parquet 1.0, was released in <a target="_blank" href="https://en.wikipedia.org/wiki/Apache_Parquet">July 2013​</a>.</p>
<p><strong>Reading Parquet (requires the </strong><code>pyarrow</code><strong> or </strong><code>fastparquet</code><strong> library):</strong></p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd

data = pd.read_parquet(<span class="hljs-string">'data.parquet'</span>)
print(data.head())
</code></pre>
<p><strong>Writing Parquet:</strong></p>
<pre><code class="lang-python">data.to_parquet(<span class="hljs-string">'new_data.parquet'</span>)
</code></pre>
<h3 id="heading-avro">Avro</h3>
<p>Apache Avro is a row-based storage format designed for data serialization in big data applications.</p>
<p><strong>Structure:</strong> Avro stores data definition in JSON format and data in binary format, facilitating compact, fast binary serialization and deserialization. It also supports schema evolution.</p>
<p><strong>Pros:</strong></p>
<ul>
<li><p>Avro files provide a compact binary data format that supports schema evolution.</p>
</li>
<li><p>They offer fast read/write operations, making them suitable for real-time processing.</p>
</li>
<li><p>They are widely used with Kafka and Hadoop for data serialization.</p>
</li>
</ul>
<p><strong>Cons:</strong></p>
<ul>
<li><p>Avro files are not human-readable.</p>
</li>
<li><p>They require the schema to read the data, adding a layer of complexity.</p>
</li>
</ul>
<p><strong>Use Cases:</strong> Avro is the optimal choice for big data applications requiring fast serialization/deserialization and for systems that need the flexibility of schema evolution.</p>
<p><strong>Fun Fact:</strong> Avro was developed by the creator of Apache Hadoop, Doug Cutting, specifically to address big data challenges. The initial release of Avro was on <a target="_blank" href="https://en.wikipedia.org/wiki/Apache_Avro">November 2, 2009​</a>.</p>
<p><strong>Reading Avro (requires the </strong><code>avro</code><strong> library):</strong></p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> avro.datafile <span class="hljs-keyword">import</span> DataFileReader
<span class="hljs-keyword">from</span> avro.io <span class="hljs-keyword">import</span> DatumReader

<span class="hljs-keyword">with</span> DataFileReader(open(<span class="hljs-string">"data.avro"</span>, <span class="hljs-string">"rb"</span>), DatumReader()) <span class="hljs-keyword">as</span> reader:
    <span class="hljs-keyword">for</span> record <span class="hljs-keyword">in</span> reader:
        print(record)
</code></pre>
<p><strong>Writing Avro:</strong></p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> avro.datafile <span class="hljs-keyword">import</span> DataFileWriter
<span class="hljs-keyword">from</span> avro.io <span class="hljs-keyword">import</span> DatumWriter
<span class="hljs-keyword">from</span> avro.schema <span class="hljs-keyword">import</span> parse

<span class="hljs-comment"># you need a schema to write Avro</span>
schema = parse(open(<span class="hljs-string">"data_schema.avsc"</span>, <span class="hljs-string">"rb"</span>).read()) 

<span class="hljs-keyword">with</span> DataFileWriter(open(<span class="hljs-string">"new_data.avro"</span>, <span class="hljs-string">"wb"</span>), DatumWriter(), schema) <span class="hljs-keyword">as</span> writer:
    writer.append({<span class="hljs-string">"name"</span>: <span class="hljs-string">"test"</span>, <span class="hljs-string">"favorite_number"</span>: <span class="hljs-number">7</span>, <span class="hljs-string">"favorite_color"</span>: <span class="hljs-string">"red"</span>})
</code></pre>
<p>**Note:**<em>For Avro, you need a schema to write data. The schema is a JSON object that defines the data structure.</em></p>
<h3 id="heading-choosing-the-right-format">Choosing the Right Format</h3>
<p>The decision to select a data format isn’t a one-size-fits-all situation. It largely depends on several factors, such as the nature and volume of your data, the type of operations you’ll perform, and the storage capacity.</p>
<p>While CSV and JSON are excellent for simplicity and interoperability, Parquet and Avro stand out when dealing with big data due to their read operations and serialization efficiencies.</p>
<p>To make a well-informed decision:</p>
<ul>
<li><p><strong>Evaluate the structure of your data</strong>: Is it flat or nested? Simple or complex?</p>
</li>
<li><p><strong>Consider the volume of data</strong>: Large datasets may require efficient formats like Parquet or Avro.</p>
</li>
<li><p><strong>Think about the operations</strong>: Are you performing more read operations or write operations? Do you need real-time processing?</p>
</li>
<li><p><strong>Consider the storage</strong>: Columnar formats like Parquet offer high compression, reducing storage costs.</p>
</li>
<li><p><strong>Consider interoperability</strong>: Do you need to share this data with other systems?</p>
</li>
</ul>
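<p>These heuristics can be condensed into a toy helper function. This is purely illustrative shorthand for the checklist above, not a real decision procedure:</p>

```python
def suggest_format(nested: bool, large: bool, write_heavy: bool,
                   human_readable: bool) -> str:
    """Toy heuristic condensing the checklist above; not a hard rule."""
    if human_readable:
        # A human in the loop favors text formats.
        return "JSON" if nested else "CSV"
    if large and write_heavy:
        # Fast row-based serialization with schema evolution.
        return "Avro"
    if large:
        # Efficient columnar reads and high compression.
        return "Parquet"
    return "JSON" if nested else "CSV"

print(suggest_format(nested=False, large=True, write_heavy=False,
                     human_readable=False))  # Parquet
```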
<h3 id="heading-conclusion">Conclusion</h3>
<p>Understanding data formats and their strengths is important in the data engineering process. Whether you choose CSV, JSON, Parquet, or Avro, it’s about picking the right tool for your specific use case. As a data engineer, your role is to balance the trade-offs and choose the format that best serves your data, performance requirements, and business needs.</p>
<p>I hope this deep dive into CSV, JSON, Parquet, and Avro will guide you in your data format selection process.</p>
<p>Stay tuned for more technical content, and don’t forget to subscribe to receive updates when it ships!</p>
<h3 id="heading-further-reading">Further reading</h3>
<ol>
<li><p><a target="_blank" href="https://en.wikipedia.org/wiki/Comma-separated_values?ref=thedataengineerblog.com">CSV</a></p>
</li>
<li><p><a target="_blank" href="https://en.wikipedia.org/wiki/JSON?ref=thedataengineerblog.com">JSON</a></p>
</li>
<li><p><a target="_blank" href="https://parquet.apache.org/?ref=thedataengineerblog.com">Apache Parquet</a></p>
</li>
<li><p><a target="_blank" href="https://avro.apache.org/?ref=thedataengineerblog.com">Avro</a></p>
</li>
</ol>
]]></content:encoded></item><item><title><![CDATA[I am Leaving Data Engineering for Developer Relations]]></title><description><![CDATA[I am stepping away from my Data Engineering role at Booz Allen Hamilton and joining Snowflake as a Developer Relations Associate working on Streamlit.
I am humbled and excited to have the opportunity to work with an open-source tool that I am passion...]]></description><link>https://thedataengineerblog.com/i-am-leaving-data-engineering-for-developer-relations</link><guid isPermaLink="true">https://thedataengineerblog.com/i-am-leaving-data-engineering-for-developer-relations</guid><category><![CDATA[Developer]]></category><category><![CDATA[DevRel]]></category><category><![CDATA[Python]]></category><category><![CDATA[streamlit]]></category><category><![CDATA[snowflake]]></category><category><![CDATA[data-engineering]]></category><category><![CDATA[marketing]]></category><category><![CDATA[product]]></category><category><![CDATA[engineering]]></category><dc:creator><![CDATA[Tony Kipkemboi]]></dc:creator><pubDate>Wed, 09 Nov 2022 12:45:45 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1667866006424/WEJfLgnYB.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I am stepping away from my Data Engineering role at <a target="_blank" href="https://www.boozallen.com/">Booz Allen Hamilton</a> and joining <a target="_blank" href="https://www.snowflake.com/en/">Snowflake</a> as a Developer Relations Associate working on Streamlit.</p>
<p>I am humbled and excited to have the opportunity to work with an open-source tool that I am passionate about and a great team.</p>
<p>First, let me take you back to where it all started👇🏿</p>
<h2 id="heading-my-time-in-the-military">My Time in the Military</h2>
<p>I served in the U.S. Army for seven years before transitioning out in September 2021. Apart from being a soldier, my <a target="_blank" href="https://www.goarmy.com/careers-and-jobs/career-match/science-medicine/research/68k-medical-laboratory-specialist.html">Military Occupational Specialty</a> (a.k.a. MOS)—think job title—was a Medical Laboratory Technician.</p>
<p><a target="_blank" href="https://labscientists.wordpress.com/medical-laboratory-humor/"><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1667511162751/njwEhr4Yn.jpg" alt="med-tech_new-grad.jpg" class="image--center mx-auto" /></a></p>
<p>In my first four years of service, I worked at a military health center; my daily tasks included checking in patients, making blood smears, streak-plating bacterial cultures, packaging samples for shipment, and performing phlebotomy.</p>
<p>My final three years were at a military research institute (<a target="_blank" href="https://usamriid.health.mil/">USAMRIID</a>). It is during my time here that I found my interest in coding. I worked in Genomics for my assignment and got the chance to interface with Bioinformaticians using Python to analyze the genomic data that I and other research assistants generated from the wet lab. I will save the granular details for another article.</p>
<p>Serving in the military gave me many wonderful experiences and an education to be appreciated. The valuable experiences I acquired while serving will stay with me forever.</p>
<h2 id="heading-transition-to-private-sector">Transition to Private Sector</h2>
<p>I knew it was time to transition out of the military early in my last tour of service. I remember feeling very stressed and lost between late 2019 and April 2020. The military is all I had known for seven years, and that was my first job.</p>
<p><a target="_blank" href="https://veteranlife.com/lifestyle/veteran-memes/"><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1667510156719/Ejt0Yssve.jpg" alt="mos-skill-translator.jpg" class="image--center mx-auto" /></a></p>
<p>I entertained the idea of pursuing medicine as a career after service since that was my goal for a long time. After shadowing several doctors, I decided that was not for me anymore.</p>
<p>I gave tech a chance while working in Genomics and interacting with the Bioinformaticians who urged me to continue learning and even gave me simple automation project ideas I could implement at work, motivating me to learn Python. A few months of self-teaching went by, and after applying to grad school, I got my acceptance to the University of Pennsylvania’s <a target="_blank" href="https://online.seas.upenn.edu/degrees/mcit-online/">Masters in Computer and Information Technology</a>.</p>
<p>Six months before my transition, I had the opportunity to intern at <a target="_blank" href="https://www.merck.com/">Merck &amp; Co.</a> as a Data Engineer. I had a fantastic supervisor who was very supportive and still is to this date. During my internship, I was interviewing for a role at <a target="_blank" href="https://www.bloomberg.com/company/">Bloomberg L.P.</a>, where I got an offer and had to relocate to the New York City area. I could not have found a better landing coming out of the military.</p>
<p>My time at Bloomberg was pivotal to my journey, and I need a dedicated article to cover all the highlights. Working at Bloomberg was my first tech role and first private sector job. I made great friends and learned much about the financial markets and data. Due to family and personal reasons, we had to relocate out of the area; tl;dr raising a child in the city is not easy for us folks used to open spaces and quietness.</p>
<p>I found my next role with Booz Allen Hamilton as a Data Engineer. Working in the government consulting space was a little familiar from my time in service, and like anything in the government, things moved a little slower compared to the private sector. I enjoyed the teamwork and the flexibility to learn new technologies like cloud services.</p>
<h2 id="heading-why-pivot-to-developer-relations">Why Pivot to Developer Relations?</h2>
<p>Up to this point in my tech career, I have focused on continuous learning by building side projects on top of my day job. In late 2021, I started a YouTube channel where I create blockchain and Python tutorials. I have since started writing articles and taking on speaking engagements to share my journey and give coding workshops.</p>
<p>Honestly, I have felt happier letting my thoughts pour out in written and video format. I enjoy teaching and aspire to break down technical concepts into digestible bits, especially for newbies.</p>
<p>What is Developer Relations (”DevRel”), you might wonder? DevRel is an umbrella term covering different roles, such as:</p>
<ul>
<li><p>Developer Experience (DevX)</p>
</li>
<li><p>Developer Advocate</p>
</li>
<li><p>Developer Evangelist</p>
</li>
<li><p>Developer Marketing</p>
</li>
</ul>
<p>From the roles above, you can deduce that DevRel is a multifaceted role sitting at the intersection of engineering, product, and marketing.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1667506775079/-SDwDVkVh.png" alt="devrel_venn_1.png" /></p>
<p>The scope of DevRel responsibilities varies by company, but you can expect to work on some combination of writing documentation and articles, building tools and code samples, producing video content, speaking at conferences, and helping developers with blockers they encounter via channels like Stack Overflow and Reddit, to name a few. DevRel enables organizations to collect user input, creating a feedback-loop mechanism that helps improve the product for its users.</p>
<h2 id="heading-why-streamlit-at-snowflake">Why Streamlit🎈 at Snowflake❄️?</h2>
<p>TL;DR: Streamlit:</p>
<ul>
<li><p>It is an open-source Python library that makes it easy to create and share beautiful, custom web apps for machine learning and data science.</p>
</li>
<li><p>It turns data scripts into shareable web apps in minutes.</p>
</li>
<li><p>All in pure Python.</p>
</li>
<li><p>No front-end experience is required!</p>
</li>
</ul>
<iframe width="781" height="438" src="https://s3-us-west-2.amazonaws.com/assets.streamlit.io/videos/hero-video.mp4"></iframe>

<p><em>Streamlit video demo: https://streamlit.io/</em></p>
<p>In this video, <a target="_blank" href="https://www.linkedin.com/in/adrien-treuille-52215718/">Adrien Treuille</a>, Head of Streamlit at Snowflake, demonstrates how to build interactive apps in Snowflake using Streamlit.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=e8kZQDKeNwk">https://www.youtube.com/watch?v=e8kZQDKeNwk</a></div>
<p> </p>
<p>When I started learning Python in 2020, I found that I needed to make my projects interactive and user-facing. Naturally, I tried using Flask and Django, but it was a bit painful as a newbie. I stumbled upon Streamlit, and since then, it has been my go-to module. Here are some of the apps I have built using Streamlit;</p>
<ul>
<li><p><a target="_blank" href="https://tonykipkemboi-scrapetwitter-appdriver-qevmgf.streamlit.app/">Twitter Scraper using “snscrape” Module</a>: A user enters a phrase in the search box, the number of records to return, and a date range. There’s an option to download the dataset to a CSV file.</p>
</li>
<li><p><a target="_blank" href="https://tonykipkemboi-mvrvdashboardapp-app-zgy2ml.streamlitapp.com/">MVRV (Market Value to Realized Value) Dashboard App</a>: This app I built during a web3 hackathon a few months ago. I used <a target="_blank" href="https://glassnode.com/">Glassnode’s API</a> to get the data and display it on the app. I won a bounty for the app! The app shows the MVRV ratio (tl;dr the ratio tells us if the price of a token/stock is fair or not) of Bitcoin, Ethereum, or Litecoin.</p>
</li>
<li><p><a target="_blank" href="https://tonykipkemboi-sentimentanalysisapp-streamlit-app-i5a9o9.streamlitapp.com/">Yelp Reviews Sentiment Analysis WebApp</a>: One of the first Streamlit apps I built. The app scrapes Yelp reviews and scores the sentiment using Hugging Faces’ BERT model.</p>
</li>
</ul>
<p>My MVRV app was featured in the Streamlit Weekly Roundup in March, and I got fantastic stickers and a handwritten letter; those hardly come by these days!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1667864527943/tkjjYq1mU.jpg" alt="streamlit_app.jpg" /></p>
<p>Considering my career path and interests, being a DevRel for Streamlit at Snowflake is an exciting and obvious decision.</p>
<p>I am excited to work alongside a talented team that includes one of my favorite data content creators <a target="_blank" href="https://twitter.com/thedataprof">Chanin Nantasenamat, a.k.a. “The Data Professor</a>.” I recommend you subscribe to his <a target="_blank" href="https://www.youtube.com/dataprofessor">YouTube</a> channel.</p>
<p>If you are new to Python or a seasoned developer looking to start with Streamlit, I recommend you start <a target="_blank" href="https://docs.streamlit.io/library/get-started">here</a>. Please don’t hesitate to <a target="_blank" href="https://twitter.com/_townee">reach out and connect with me anytime</a>.</p>
<p><em>Happy Streamlit-ing🎈</em></p>
]]></content:encoded></item><item><title><![CDATA[How to Query The Graph Protocol for Onchain Data using Python]]></title><description><![CDATA[In this tutorial, we will query the ENS Subgraph using two methods; raw GraphQL query and Subgrounds library by Playgrounds.
The goal is for you to be able to:

query any Subgraph data using Python

understand the two querying methods



What are Sub...]]></description><link>https://thedataengineerblog.com/how-to-query-the-graph-protocol-for-onchain-data-using-python</link><guid isPermaLink="true">https://thedataengineerblog.com/how-to-query-the-graph-protocol-for-onchain-data-using-python</guid><category><![CDATA[Web3]]></category><category><![CDATA[Blockchain]]></category><category><![CDATA[GraphQL]]></category><category><![CDATA[Python]]></category><category><![CDATA[#thegraph]]></category><category><![CDATA[web3.0]]></category><category><![CDATA[Ethereum]]></category><category><![CDATA[software development]]></category><category><![CDATA[Software Engineering]]></category><category><![CDATA[Data Science]]></category><category><![CDATA[data-engineering]]></category><dc:creator><![CDATA[Tony Kipkemboi]]></dc:creator><pubDate>Fri, 09 Sep 2022 10:30:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1662677533071/E7f0dP14x.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this tutorial, we will query the ENS Subgraph using two methods; raw GraphQL query and Subgrounds library by Playgrounds.</p>
<p>The goal is for you to be able to:</p>
<ul>
<li><p>query any Subgraph data using Python</p>
</li>
<li><p>understand the two querying methods</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1662672090296/i-UaTPbFD.png" alt="subgraph_query.png" /></p>
<h2 id="heading-what-are-subgraphs-anyway">What are Subgraphs anyway?</h2>
<p>The TL;DR definition of <a target="_blank" href="https://thegraph.com/en/">The Graph</a> protocol is described by <a target="_blank" href="https://www.tegankline.com/about">Tegan Kline</a>, a Co-Founder of The Graph protocol, in her tweet;</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1662672196937/4_Pe1lPhQ.png" alt="what_is_The_Graph.png" /></p>
<p>The Graph is a decentralized indexing protocol that allows developers to build open-source APIs called subgraphs to query networks like Ethereum, IPFS, and other supported chains. At the time of writing, there are 522 subgraphs deployed, and the number keeps growing. Anyone can query subgraphs for on-chain data, as we will in this tutorial.</p>
<p>Now let’s get querying!</p>
<h2 id="heading-tech-stack">Tech Stack</h2>
<ul>
<li><p>Coding Language: Python</p>
</li>
<li><p>IDE/Coding Platform: Jupyter Notebook (Anaconda)</p>
</li>
<li><p>GraphQL: an open-source data query language (the ”QL” in GraphQL) used for APIs</p>
</li>
<li><p>Blockchain API: a Subgraph (the ENS Subgraph, for demo purposes)</p>
</li>
</ul>
<h2 id="heading-getting-started">Getting Started</h2>
<p><strong>Requirements</strong>: To follow this tutorial, you will need <a target="_blank" href="https://www.python.org/downloads/">Python 3.10</a> and <a target="_blank" href="https://www.anaconda.com/">Anaconda</a> installed on your system.</p>
<h2 id="heading-using-raw-graphql-to-query-subgraphs-with-python">Using Raw GraphQL to Query Subgraphs with Python</h2>
<h3 id="heading-step-1-setup-coding-environment">Step 1: Setup coding environment</h3>
<p>Once you have installed Python and Anaconda, open your command line, create a folder, and change the directory into your folder:</p>
<pre><code class="lang-bash">mkdir &lt;your_folder_name&gt; &amp;&amp; cd &lt;your_folder_name&gt;
</code></pre>
<p>Create a Python virtual environment to keep our project dependencies isolated:</p>
<pre><code class="lang-bash">python -m venv env
</code></pre>
<p>Activate the virtual environment (env); after successful activation, you should see the environment name prefixed to your prompt, like <code>(env) C:\</code> (on macOS/Linux, use <code>source env/bin/activate</code> instead):</p>
<pre><code class="lang-bash">.\env\Scripts\activate
</code></pre>
<p>Now that we have our environment up and ready, let’s install some libraries that our project will depend on for querying data:</p>
<pre><code class="lang-bash">pip install pandas requests
</code></pre>
<p>To confirm you have the needed packages (pandas and requests), use pip to check:</p>
<pre><code class="lang-bash">pip freeze
</code></pre>
<p>Since we will be using the virtual environment in Jupyter Notebook, we need to add it as such:</p>
<ul>
<li>install the <a target="_blank" href="https://github.com/ipython/ipykernel">ipykernel</a> package, which provides the IPython kernel for Jupyter:</li>
</ul>
<pre><code class="lang-bash">  pip install --user ipykernel
</code></pre>
<ul>
<li>add virtual environment to Jupyter by typing:</li>
</ul>
<pre><code class="lang-bash">  python -m ipykernel install --name=env
</code></pre>
<p>After running the command above, you should see something like this:</p>
<p><code>Installed kernelspec env in C:\ProgramData\jupyter\kernels\env</code></p>
<p>The final step of the setup is to open Jupyter Notebook; run this command:</p>
<pre><code class="lang-bash">jupyter notebook
</code></pre>
<p>A tab will open in your browser with Jupyter on localhost.</p>
<p>Locate the “New” tab and choose <code>env</code> to open a notebook with your created virtual environment.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1662673096335/0ICJ1yKKJ.png" alt="choose_env.png" /></p>
<p>Now we are ready to roll!</p>
<h3 id="heading-step-2-prepare-raw-query">Step 2: Prepare raw query</h3>
<p>For this tutorial, we will query the <a target="_blank" href="https://thegraph.com/hosted-service/subgraph/ensdomains/ens?selected=playground">ENS(Ethereum Name Service) Subgraph</a> for some fun data. <a target="_blank" href="https://ens.domains">ENS</a> is a naming system based on the Ethereum blockchain. The common use of ENS has been to map human-readable domain names like <code>vitalik.eth</code> to machine-readable Ethereum addresses. ENS is very popular in the Web3 space; hence the high activity of registrations as seen on <a target="_blank" href="https://etherscan.io/address/0x283af0b28c62c092c9727f1ee09c02ca627eb7f5">ENS Etherscan Registrar Controller</a>.</p>
<p>Let’s say we are interested in obtaining the latest data about registered domains to answer these questions;</p>
<ul>
<li><p>what are the latest domain names registered?</p>
</li>
<li><p>who are the registrants of the names (hexadecimal ETH addresses)?</p>
</li>
<li><p>when did the registration happen?</p>
</li>
<li><p>what was the cost of registration in ETH?</p>
</li>
<li><p>what is the expiry date of the domain names?</p>
</li>
</ul>
<p>The Graph protocol provides a <a target="_blank" href="https://thegraph.com/hosted-service/subgraph/ensdomains/ens">playground</a> and <a target="_blank" href="https://api.thegraph.com/subgraphs/name/ensdomains/ens/graphql?query=%0A++++%23%0A++++%23+Welcome+to+The+GraphiQL%0A++++%23%0A++++%23+GraphiQL+is+an+in-browser+tool+for+writing%2C+validating%2C+and%0A++++%23+testing+GraphQL+queries.%0A++++%23%0A++++%23+Type+queries+into+this+side+of+the+screen%2C+and+you+will+see+intelligent%0A++++%23+typeaheads+aware+of+the+current+GraphQL+type+schema+and+live+syntax+and%0A++++%23+validation+errors+highlighted+within+the+text.%0A++++%23%0A++++%23+GraphQL+queries+typically+start+with+a+%22%7B%22+character.+Lines+that+start%0A++++%23+with+a+%23+are+ignored.%0A++++%23%0A++++%23+An+example+GraphQL+query+might+look+like%3A%0A++++%23%0A++++%23+++++%7B%0A++++%23+++++++field%28arg%3A+%22value%22%29+%7B%0A++++%23+++++++++subField%0A++++%23+++++++%7D%0A++++%23+++++%7D%0A++++%23%0A++++%23+Keyboard+shortcuts%3A%0A++++%23%0A++++%23++Prettify+Query%3A++Shift-Ctrl-P+%28or+press+the+prettify+button+above%29%0A++++%23%0A++++%23+++++Merge+Query%3A++Shift-Ctrl-M+%28or+press+the+merge+button+above%29%0A++++%23%0A++++%23+++++++Run+Query%3A++Ctrl-Enter+%28or+press+the+play+button+above%29%0A++++%23%0A++++%23+++Auto+Complete%3A++Ctrl-Space+%28or+just+start+typing%29%0A++++%23%0A++">explorer</a> where anyone can write custom queries to pull data from any given subgraph.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1662673275771/xwBDHhrMp.png" alt="ENS playground on The Graph website" /></p>
<p>Compared to the playground, where you must type queries manually, the explorer is much easier to use because you can build queries by selecting fields with its radio buttons.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1662674239005/Hzy3ED1gV.png" alt="ENS explorer" /></p>
<p>We will play around with the explorer by selecting entities and querying data until we find a query that returns the data to answer our questions above; this is the final query:</p>
<pre><code class="lang-graphql">query ENSData {
  <span class="hljs-comment"># latest 1000 ENS registrations</span>
  registrations(first:<span class="hljs-number">1000</span>, orderBy:registrationDate, orderDirection:desc){
    domain{
      name <span class="hljs-comment"># like `vitalik.eth`</span>
    }
    registrant {
      id <span class="hljs-comment"># hexadecimal address</span>
    }
    registrationDate 
    cost 
    expiryDate 
  }
}
</code></pre>
<h3 id="heading-step-3-query-with-python-in-jupyter-notebook">Step 3: Query with Python in Jupyter Notebook</h3>
<p>It is time to use Python to get data using the query we prepared above.</p>
<p>In the Jupyter Notebook we created earlier, use the first cell to import the dependencies we will need:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> pandas <span class="hljs-keyword">as</span> pd
<span class="hljs-keyword">import</span> requests
</code></pre>
<p>Now let’s save our query above in a variable and create a function that will handle sending the payload (query) to make an API call to the ENS subgraph and receive data:</p>
<pre><code class="lang-python"><span class="hljs-comment"># variable holding the query payload</span>
query = <span class="hljs-string">"""
{
    registrations(first:1000, orderBy:registrationDate, orderDirection:desc){
        domain{
            name
        }
        registrant {
            id
        }
        registrationDate
        cost
        expiryDate
    }
}
"""</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_data</span>(<span class="hljs-params">query</span>):</span>
    <span class="hljs-string">"""This function posts a request to make an API call to ENS Subgraph URL
    parameters:
    ------------
    query: payload containing specific data we need
    return:
    -------
    response.json(): queried data in JSON format
    """</span>

    response = requests.post(<span class="hljs-string">'https://api.thegraph.com/subgraphs/name/ensdomains/ens'</span>,
                             json={<span class="hljs-string">"query"</span>: query})

    <span class="hljs-keyword">if</span> response.status_code == <span class="hljs-number">200</span>: <span class="hljs-comment"># code 200 means no errors</span>
        <span class="hljs-keyword">return</span> response.json()
    <span class="hljs-keyword">else</span>: <span class="hljs-comment"># otherwise, raise with the status code for debugging</span>
        <span class="hljs-keyword">raise</span> Exception(<span class="hljs-string">"Query failed with return code {}"</span>.format(response.status_code))
</code></pre>
<p>Make sure you run each cell up to this point using the run button on the Notebook or <code>Ctrl + Enter</code> on the keyboard. The final step to get data is to invoke the function and pass in the query:</p>
<pre><code class="lang-python">data = get_data(query)
display(data)
</code></pre>
<p>Your output from running the function will be a nested structure of JSON objects and lists.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1662674331571/evMgtKtZx.png" alt="JSON data output" /></p>
<p>And voilà, we have queried the latest ENS registration data that we need to answer our initial questions!</p>
<p>The next steps you can take would be to flatten the data, load it into a pandas DataFrame, and clean it up. The epoch times (<code>registrationDate</code> and <code>expiryDate</code>) are <a target="_blank" href="https://en.wikipedia.org/wiki/Unix_time">Unix timestamps</a> and would need to be converted to a human-readable datetime format. The same goes for converting <code>cost</code> from <code>Wei</code> to <code>ETH</code>.</p>
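<p>As a sketch of those next steps, the nested JSON can be flattened with <code>pandas.json_normalize</code> and the timestamp and cost fields converted. The single-registration sample below is hypothetical and shaped like the response above; the values are made up for illustration:</p>

```python
import pandas as pd

# hypothetical sample shaped like the subgraph response above
data = {
    "data": {
        "registrations": [
            {
                "domain": {"name": "vitalik.eth"},
                "registrant": {"id": "0xd8da6bf26964af9d7eed9e03e53415d37aa96045"},
                "registrationDate": "1580803619",
                "cost": "5000000000000000000",
                "expiryDate": "1739412419",
            }
        ]
    }
}

# flatten the nested JSON into one row per registration;
# nested keys become dotted column names like "domain.name"
df = pd.json_normalize(data["data"]["registrations"])

# convert the Unix timestamps to human-readable datetimes
for col in ["registrationDate", "expiryDate"]:
    df[col] = pd.to_datetime(df[col].astype(int), unit="s")

# convert cost from Wei to ETH (1 ETH = 10^18 Wei)
df["cost"] = df["cost"].astype(float) / 10**18
```

From here the DataFrame is ready for renaming columns and further analysis.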
<p>Now that we have queried data the 'more manual' way, let’s look at an 'easier' way of doing the same but with more robust advantages using Subgrounds.</p>
<h2 id="heading-using-subgrounds-python-library-to-query-subgraphs">Using Subgrounds Python Library to Query Subgraphs</h2>
<h3 id="heading-what-is-subgrounds">What is Subgrounds?</h3>
<p>Subgrounds is an open-source data access layer for querying, manipulating, and visualizing subgraph data. The library makes it easy for data professionals—or anyone—to access on-chain data in a familiar web2 stack. These are some of the highlighted benefits of using Subgrounds:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1662732331322/RBZbs8lMg.png" alt="play.png" /></p>
<h3 id="heading-step-1-import-and-initialize-subgrounds-in-jupyter">Step 1: Import and initialize subgrounds in Jupyter</h3>
<p>In a new Jupyter Notebook cell below the last code above, import subgrounds and run the cell:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> subgrounds <span class="hljs-keyword">import</span> Subgrounds
</code></pre>
<p>Initialize Subgrounds:</p>
<pre><code class="lang-python">sg = Subgrounds()
</code></pre>
<h3 id="heading-step-2-load-ens-subgraph-url-and-create-field-paths">Step 2: Load ENS subgraph URL and create Field Paths</h3>
<p>Load the ENS subgraph using its API URL, which you can find here:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1662674482326/cIBT9pQWG.png" alt="http.png" /></p>
<pre><code class="lang-python"><span class="hljs-comment"># load ENS subgraph </span>
ens = sg.load_subgraph(<span class="hljs-string">'https://api.thegraph.com/subgraphs/name/ensdomains/ens'</span>)
</code></pre>
<p>Subgrounds provides options for getting data in different formats: <code>query</code>, <code>query_df</code>, and <code>query_json</code>. Since we need to do some analysis with our data, we will have our data in a pandas DataFrame using <code>query_df</code>.</p>
<p>We will also have our data normalized by the library and use the <code>SyntheticField</code> class to define a human-readable timestamp transformation before querying. Let’s import <code>SyntheticField</code> from Subgrounds and define synthetic fields for both <code>registrationDate</code> and <code>expiryDate</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> datetime
<span class="hljs-keyword">from</span> subgrounds.subgraph <span class="hljs-keyword">import</span> SyntheticField

<span class="hljs-comment"># registrationdate synthetic field</span>
ens.Registration.registrationdate = SyntheticField(
    <span class="hljs-keyword">lambda</span> registrationDate: str(datetime.fromtimestamp(registrationDate)),
    SyntheticField.STRING,
    ens.Registration.registrationDate
)

<span class="hljs-comment"># expirydate synthetic field</span>
ens.Registration.expirydate = SyntheticField(
    <span class="hljs-keyword">lambda</span> expiryDate: str(datetime.fromtimestamp(expiryDate)),
    SyntheticField.STRING,
    ens.Registration.expiryDate
)
</code></pre>
<p>Now we are ready to add Field Paths, the main building blocks used to construct Subgrounds queries. A Field Path is a translation of the raw GraphQL schema, starting from the root query entity down to a scalar leaf field:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Select the latest 1000 registration names by registration datetime</span>
registrations = ens.Query.registrations(
    first=<span class="hljs-number">1000</span>, <span class="hljs-comment"># latest 1000 registrations</span>
    orderBy=ens.Registration.registrationDate, <span class="hljs-comment"># order registrations by time</span>
    orderDirection=<span class="hljs-string">"desc"</span> <span class="hljs-comment"># latest registration data will be first</span>
)

field_paths = [
    registrations.domain.name, <span class="hljs-comment"># ens domain like "vitalik.eth"</span>
    registrations.registrant.id, <span class="hljs-comment"># hexadecimal eth address</span>
    registrations.registrationdate, <span class="hljs-comment"># human-readable datetime (synthetic field)</span>
    registrations.cost, <span class="hljs-comment"># price for registration</span>
    registrations.expirydate <span class="hljs-comment"># expiry date of domain</span>
]
</code></pre>
<h3 id="heading-step-3-query-time">Step 3: Query Time!</h3>
<p>Now that we have the payload ready, let’s send the request for data and display the first five results:</p>
<pre><code class="lang-python"><span class="hljs-comment"># get data</span>
df = sg.query_df(field_paths)

<span class="hljs-comment"># print the first five results</span>
df.head()
</code></pre>
<p>The output will look like this:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1662673616824/OPi35cqfB.png" alt="output.png" /></p>
<p>Boom! We got ourselves some interesting near real-time on-chain data!</p>
<h3 id="heading-step-4-perform-some-transformations-on-the-data">Step 4: Perform some Transformations on the Data</h3>
<p>Now that we have completed the Extract part of our ETL (<code>Extract</code>, <code>Transform</code>, and <code>Load</code>) process, we proceed to Transform. FYI, we won’t be performing the Load stage of ETL, but you can if you like.</p>
<p>The first item in our transformation is to convert the <code>registrations_cost</code> column values from <code>Wei</code> (the smallest denomination of ether) to <code>ether</code>:</p>
<pre><code class="lang-python"><span class="hljs-comment"># Convert `registrations_cost` column from wei to ether </span>
<span class="hljs-comment"># 1 ether = 1,000,000,000,000,000,000 wei (10^18)</span>
df[<span class="hljs-string">'registrations_cost'</span>] = df[<span class="hljs-string">'registrations_cost'</span>] / (<span class="hljs-number">10</span>**<span class="hljs-number">18</span>)
</code></pre>
<p>The next item would be to rename the columns for simplicity and standardization:</p>
<pre><code class="lang-python"><span class="hljs-comment"># rename columns for simplicity</span>
df = df.rename(columns={<span class="hljs-string">'registrations_domain_name'</span>: <span class="hljs-string">'ens_name'</span>,
                        <span class="hljs-string">'registrations_registrant_id'</span>: <span class="hljs-string">'owner_address'</span>,
                        <span class="hljs-string">'registrations_registrationdate'</span>: <span class="hljs-string">'registration_date'</span>,
                        <span class="hljs-string">'registrations_cost'</span>: <span class="hljs-string">'registration_cost_ether'</span>,
                        <span class="hljs-string">'registrations_expirydate'</span>: <span class="hljs-string">'expiry_date'</span>
                        })
<span class="hljs-comment"># inspect the changes in df</span>
df.head()
</code></pre>
<p>Final dataframe sample:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1663092197201/gT3UMEz1f.png" alt="DataFrame output" /></p>
<h2 id="heading-whats-next">What’s Next?</h2>
<p>If you’re interested in digging into the data to derive some fun insights, put on your data analyst monocle and dive into the data! You can interrogate the data to find the average registration cost over a period or create a dashboard to track trending topics extracted from registered names.</p>
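<p>As a starting point for that kind of analysis, here is a minimal pandas sketch using the column names from Step 4 on a small made-up sample (the names, dates, and costs are hypothetical):</p>

```python
import pandas as pd

# hypothetical sample with the column names from Step 4
df = pd.DataFrame({
    "ens_name": ["alice.eth", "bob.eth", "carol.eth"],
    "registration_date": ["2022-09-08 10:00:00",
                          "2022-09-08 11:30:00",
                          "2022-09-09 09:15:00"],
    "registration_cost_ether": [0.002, 0.004, 0.006],
})
df["registration_date"] = pd.to_datetime(df["registration_date"])

# average registration cost over the whole sample
avg_cost = df["registration_cost_ether"].mean()

# registrations per day
daily = df.set_index("registration_date").resample("D")["ens_name"].count()
print(f"average cost: {avg_cost:.4f} ETH")
print(daily)
```

The same pattern extends naturally to rolling averages or grouping by name suffixes once you run it on the real 1,000-row DataFrame.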
<p>Share your hacks and learnings! You can find me on Twitter <a target="_blank" href="https://twitter.com/ynot_kip">@ynot_kip</a> if you have any questions or just want to say hi!</p>
<h2 id="heading-github-repo">GitHub Repo</h2>
<ul>
<li><a target="_blank" href="https://github.com/tonykipkemboi/ENS_subgraph_data">ENS_subgraph_data</a></li>
</ul>
<h2 id="heading-more-resources">More Resources</h2>
<ul>
<li><p>The Graph:<br />  <a target="_blank" href="https://thegraph.com/docs/en/about/#how-the-graph-works">About The Graph and How The Graph Works</a> and <a target="_blank" href="https://thegraph.com/docs/en/querying/querying-best-practices/">Querying Best Practices</a></p>
</li>
<li><p>Subgrounds:<br />  <a target="_blank" href="https://playgrounds-analytics.gitbook.io/playgrounds-docs/subgrounds/tutorials/subgrounds-workshop">Subgrounds Workshop</a> and <a target="_blank" href="https://playgrounds-analytics.gitbook.io/playgrounds-docs/subgrounds/the-basics">Subgrounds Docs</a></p>
</li>
<li><p>GraphQL:<br />  <a target="_blank" href="https://graphql.org/learn/">Introduction to GraphQL</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[What is a Governance Token?]]></title><description><![CDATA[Centralized Governance
In the centralized world, governance is delegated to a select few individuals we trust to represent our values and advocate for change on our behalf.
A good example is how we elect officials to government office. 
The officials...]]></description><link>https://thedataengineerblog.com/what-is-a-governance-token</link><guid isPermaLink="true">https://thedataengineerblog.com/what-is-a-governance-token</guid><category><![CDATA[dao-governance]]></category><category><![CDATA[DAOs]]></category><category><![CDATA[token]]></category><category><![CDATA[Governance]]></category><category><![CDATA[governance-token]]></category><dc:creator><![CDATA[Tony Kipkemboi]]></dc:creator><pubDate>Thu, 04 Aug 2022 18:03:09 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/unsplash/YOCDD-D4oOM/upload/v1659634275969/zsFL1gRcQ.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-centralized-governance">Centralized Governance</h3>
<p>In the <em>centralized world</em>, governance is delegated to a select few individuals we trust to represent our values and advocate for change on our behalf.</p>
<p>A good example is how we elect officials to government office. </p>
<p>The officials are our delegates. We trust them to represent our views and vote on matters that reflect our choices, but as we all know, this does not always end up the way we want it to. Big corporations often end up swaying the votes to fit a different agenda.</p>
<blockquote>
<p><em>governance = power</em></p>
</blockquote>
<h3 id="heading-decentralized-governance">Decentralized Governance</h3>
<p>Enter the <em>decentralized world</em> of <a target="_blank" href="https://www.investopedia.com/terms/b/blockchain.asp">blockchain</a>! </p>
<p>Blockchain protocols are run by code; code is the law in the decentralized world. Instructions and consequent behaviors are coded into the protocols, which automatically execute once specific parameters are met. </p>
<p>This differs from centralized governance, where a few individuals decide and execute changes affecting millions of people.</p>
<p>A good example of decentralized governance where everyone participates in the management of the organization is known as a <strong><a target="_blank" href="https://ethereum.org/en/dao/">Decentralized Autonomous Organization (DAO)</a></strong>. Many DAOs focus on almost all aspects of our centralized world; explore existing DAOs on the <a target="_blank" href="https://app.daohaus.club/explore">DAOhaus</a>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1659634957439/fNuZo7sPJ.png" alt="A sample of the 732 DAOs you can explore on DAOhaus" /></p>
<h3 id="heading-governance-tokens">Governance Tokens</h3>
<p>There are several models for DAO membership, but I will focus on <em>Token-Based Membership</em>. One way to become a member of a DAO is by acquiring its tokens on a decentralized exchange.</p>
<p>We will dive deeper into how governance tokens work in a DAO by using an example of a <strong><a target="_blank" href="https://en.wikipedia.org/wiki/Chama_(investment)#:~:text=A%20Chama%20is%20an%20informal,group%22%20or%20%22body%22.">Chama</a></strong>; and no, it's not the village in Rio Arriba County in New Mexico. Chama is a Swahili word and a common term in East Africa, especially in Kenya, where it means an informal cooperative society where a group of people pools their financial resources for investment purposes. The most common one is the Merry Go Round (Rotating Savings and Credit Associations), where a fixed amount of money is collected from every member periodically—usually monthly—and paid out to one of the members on a rotating schedule. </p>
<p>The members at the end of the payment rotation have the highest risk because those paid early in the schedule have no incentive to keep up with payments. The Rotating Savings and Credit Association (ROSCA) solves this issue by front-loading the most trusted members in the payment rotation and placing the least trusted at the end.</p>
<p>The ROSCA methodology is better than a random structure but still presents risks. A DAO could solve most of the trust issues in this case. </p>
<p>I propose <strong>ChamaDAO</strong> as a solution. To be a member, you mint a free Chama NFT that grants you access to a token-gated community website and Discord channel. The NFTs give you the right to vote on proposals and are later replaced by a <strong>$CHAMA</strong>—an ERC-20 token. In this example, you need a specific amount of  <strong>$CHAMA</strong> tokens to be a member and consequently have voting rights in the DAO. As a member, you can trade the Chama token on decentralized exchanges allowing new members to join the DAO.</p>
<p>The <strong>$CHAMA is a governance token</strong> that represents each member's stake in ChamaDAO. By distributing control among members, we achieve <strong>on-chain governance</strong>. The members, as a collective, hold the power to change the DAO protocol, its foundational code, and have collective ownership and control over the DAO treasury. Suppose ChamaDAO members propose to buy real estate property in the suburbs of Nairobi, Kenya. Each member then votes <strong>for</strong> or <strong>against</strong> the proposal, and the voting result determines whether the contract automatically executes it. If the majority vote YES, the code executes accordingly and disburses funds from the DAO treasury to purchase the property; if they vote NO, nothing happens.</p>
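<p>The token-weighted voting described here can be sketched in a few lines of Python. This is a simplified, off-chain illustration only, not real smart-contract code; the member addresses, balances, and simple-majority rule are all hypothetical:</p>

```python
# hypothetical token-weighted vote tally for a ChamaDAO proposal:
# each member's vote is weighted by their $CHAMA balance
votes = {
    "0xMemberA": ("for", 150),      # (choice, $CHAMA balance)
    "0xMemberB": ("against", 40),
    "0xMemberC": ("for", 60),
}

# sum the token weight behind each side
for_weight = sum(bal for choice, bal in votes.values() if choice == "for")
against_weight = sum(bal for choice, bal in votes.values() if choice == "against")

# the proposal executes only if the token-weighted majority is "for"
proposal_passes = for_weight > against_weight
print(f"for: {for_weight}, against: {against_weight}, passes: {proposal_passes}")
```

Real DAO frameworks add quorum requirements, voting deadlines, and delegation on top of this basic tally, but the core idea is the same: voting power is proportional to token holdings.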
<h3 id="heading-takeaways">Takeaways</h3>
<p>Governance tokens present a new paradigm for community governance, but it is still too early to draw conclusions. Blockchain technology is very young, and it will likely change over the years to come as more people are onboarded and governments carve out new policies.</p>
<p>Make sure to do <strong>your</strong> due diligence before investing in this new ecosystem.</p>
<h3 id="heading-article-sources">Article Sources</h3>
<ol>
<li>Investopedia. "What Is a Blockchain?, <a target="_blank" href="https://www.investopedia.com/terms/b/blockchain.asp">https://www.investopedia.com/terms/b/blockchain.asp</a>" Accessed Mar. 9, 2022.</li>
<li>Ethereum. "Decentralized autonomous organizations (DAOs), <a target="_blank" href="https://ethereum.org/en/dao/">https://ethereum.org/en/dao/</a>" Accessed Mar. 9, 2022.</li>
<li>DAOhaus. "Explore DAOs, <a target="_blank" href="https://app.daohaus.club/explore">https://app.daohaus.club/explore</a>" Accessed Mar. 9, 2022.</li>
<li>Wikipedia. "Chama (investment), <a target="_blank" href="https://en.wikipedia.org/wiki/Chama_(investment)#:~:text=A%20Chama%20is%20an%20informal,group%22%20or%20%22body%22">https://en.wikipedia.org/wiki/Chama_(investment)</a>" Accessed Mar. 9, 2022.</li>
</ol>
]]></content:encoded></item></channel></rss>