Week 1 of the year 2025

January 5, 2025 - 3 minutes read - 615 words

This Week’s Challenges:

Building the Codehub: A Journey into GitHub Data and Static Site Generation (Engineering)

Building the Codehub: A Journey into GitHub Data and Static Site Generation

Purpose of the Project

The primary goal of this project is to create a semi-dynamic and automated zero-cost web portal, Codehub, that aggregates and displays GitHub data for a list of authors (basically the engineers in my org). This portal is designed to provide insights into the contributions and activities of these authors, making it easier to track and showcase their work.

The main idea here is to use Codehub to iterate over the open PRs in the Scrum standup meeting instead of relying solely on the JIRA board. The rationale behind this is that work happens in the PRs, not in JIRAs. When a JIRA ticket is in the “in-progress” state, there should be a PR behind it, and this is what we would like to discuss.

Implemented Process

The process of building the Codehub site involves several steps:

Initial Data Load:
- A bash script to load GitHub data for the specified authors (basically my team for v1). This script fetches data for the last five years and stores it in JSON files. Most of this script was generated with the help of Copilot, and this is the first project where I use Copilot as my primary programming tool.
Periodic Data Refresh:
- A bash script to update the site with the latest data. This script runs an infinite loop, periodically fetching new data, rebuilding the site, and publishing it.
Building the Site:
- Hugo build with custom template developed to transfort GitHub JSON data (in Hugo ./site/data) to a set of web pages.
Publishing the Site:
- The site is published using GitHub Pages.

Tech Stack

The project leverages several tools and technologies:

GitHub CLI (gh):
- Used to pull GitHub data and store it in JSON files within the ./site/data directory. See gh
jq:
- A command-line JSON processor used to manipulate and process the JSON data fetched from GitHub. See jq
Hugo:
- A static site generator that transforms JSON data into static web pages using templates. See Hugo

Main Challenges

Data Fetching and Processing:
- Handling large volumes of data efficiently, especially during the initial load, which can take significant time (e.g., about 30 minutes for 12 authors).
- Using delta load, ensuring the data is up-to-date and accurately reflects the authors’ activities.
Automation and Scheduling:
- Implementing a reliable and efficient periodic refresh mechanism to keep the site updated with minimal manual intervention.
Site Generation and Deployment:
- Configuring GitHub Pages correctly to host the generated site and ensuring it is accessible to users. I’ve ended up with copying ./site/public folter into /docs if the entire process complited successfully. Then committing /docs changes and enabling GitHub pages on this folder.
Converting UTC Time to Local Timezone:
- Hugo does not effectively handle timezone conversions, which posed a challenge for displaying timestamps in the local timezone.
- To overcome this, custom JavaScript was written and embedded in the template files to convert UTC times to the local timezone dynamically on the client side.

function printLocalTime(strUtcTime) {
    var ts_e = Date.parse(strUtcTime);
    var ts_u = new Date();
    ts_u.setTime(ts_e);
    var ts_u_str = ts_u.toLocaleString('en-US', { day: 'numeric', month: 'short', year: 'numeric', hour: 'numeric', minute: 'numeric', timeZoneName: 'short' });
    document.write('<time datetime="' + ts_u_str + '">' + ts_u_str + '</time>');
}

Conclusion

This project showcases the power of combining GitHub’s API, command-line tools, and static site generation to create a (semi-)dynamic and automated web portal. By leveraging tools like gh, jq, and Hugo, we can efficiently fetch, process, and display GitHub data, providing valuable insights into the contributions of various authors.

Happy Coding!