Real-time GitHub Analytics with ClickHouse and Redpanda

Published Fri, May 9, 2025 ∙ Next.js, Educational, Templates ∙ by Olivia Kane

I wanted to find a cool example to build a real-time analytical backend. One of my friends at a venture firm had created a real-time GitHub analytics tool their investors use to source potential open-source startup investment opportunities. I decided to whip up something similar to show how Moose can help you build this in a matter of minutes.

Overview of what I did

  • Ingest GitHub events in real time through Redpanda streams
  • Transform and enrich data with TypeScript
  • Store data in ClickHouse tables
  • Expose a fast, parameterized API for your frontend
  • Generate a type-safe TypeScript SDK using OpenAPI Generator CLI
  • Deploy everything to production with Boreal (backend) and Vercel (frontend)

Deployment steps are not in this post but are shown in the video tutorial starting at 9:27.

Preview the live dashboard here

Setup & Data Model

Let's start with the data model. Moose uses TypeScript interfaces to define the shape of your data. Here's how we model GitHub events and enriched repo star events:

// moose-backend/app/ingest/models.ts
import { Key } from "@514labs/moose-lib"; // Key marks the primary-key column

export interface IGhEvent {
  eventType: string;
  eventId: Key<string>;
  createdAt: Date;
  actorLogin: string;
  actorId: number;
  actorUrl: string;
  actorAvatarUrl: string;
  repoUrl: string;
  repoId: number;
  repoOwner: string;
  repoName: string;
  repoFullName: string;
}

export interface IRepoStarEvent extends IGhEvent {
  repoDescription: string;
  repoTopics: string[];
  repoLanguage: string;
  repoStars: number;
  repoForks: number;
  repoWatchers: number;
  repoOpenIssues: number;
  repoCreatedAt: Date;
  repoOwnerLogin: string;
  repoOwnerId: number;
  repoOwnerUrl: string;
  repoOwnerAvatarUrl: string;
  repoOwnerType: string;
  repoOrgId: number;
  repoOrgUrl: string;
  repoOrgLogin: string;
  repoHomepage: string;
}

We use plain TypeScript interface extension to build the data model: IRepoStarEvent extends IGhEvent, adding all the rich metadata we want to track about starred repositories. This pattern makes it easy to enrich data as it flows through your pipeline: you just extend the base interface with new fields.
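To make the shape concrete, here is a sketch of how a raw payload from GitHub's public events API could be mapped into IGhEvent. The `toGhEvent` mapper and the `RawGhEvent` subset are hypothetical names for illustration, and Key is stood in with a plain alias so the snippet runs on its own; the template may structure this differently.

```typescript
// Stand-in for Moose's Key marker type so this sketch is self-contained.
type Key<T> = T;

interface IGhEvent {
  eventType: string;
  eventId: Key<string>;
  createdAt: Date;
  actorLogin: string;
  actorId: number;
  actorUrl: string;
  actorAvatarUrl: string;
  repoUrl: string;
  repoId: number;
  repoOwner: string;
  repoName: string;
  repoFullName: string;
}

// Subset of the fields returned by api.github.com/events.
interface RawGhEvent {
  id: string;
  type: string;
  created_at: string;
  actor: { login: string; id: number; url: string; avatar_url: string };
  repo: { id: number; name: string; url: string };
}

// Hypothetical mapper from the raw API shape to the pipeline's model.
function toGhEvent(raw: RawGhEvent): IGhEvent {
  // repo.name arrives as "owner/name"; split it into its two parts.
  const [repoOwner, repoName] = raw.repo.name.split("/");
  return {
    eventType: raw.type,
    eventId: raw.id,
    createdAt: new Date(raw.created_at),
    actorLogin: raw.actor.login,
    actorId: raw.actor.id,
    actorUrl: raw.actor.url,
    actorAvatarUrl: raw.actor.avatar_url,
    repoUrl: raw.repo.url,
    repoId: raw.repo.id,
    repoOwner,
    repoName,
    repoFullName: raw.repo.name,
  };
}
```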

Ingesting Data

With these data models in place, I can declare my pipelines to support the ingest and transformation of this data. Here's how it looks:

// moose-backend/app/index.ts
import { IngestPipeline } from "@514labs/moose-lib";
import { IGhEvent, IRepoStarEvent } from "./ingest/models";
import { transformGhEvent } from "./ingest/transform";

export const GhEvent = new IngestPipeline<IGhEvent>("GhEvent", {
  ingest: true,
  table: true,
  stream: true,
});

export const RepoStarEvent = new IngestPipeline<IRepoStarEvent>("RepoStar", {
  ingest: false,
  stream: true,
  table: true,
});

GhEvent.stream!.addTransform(RepoStarEvent.stream!, transformGhEvent);

Three lines, three systems wired together. IngestPipeline creates the ingest endpoint, stream, and table from the same type.
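Because ingest: true exposes an HTTP endpoint that accepts JSON matching IGhEvent, any client can feed the pipeline with a plain POST. A minimal sketch, assuming Moose's dev-server defaults for the port (4000) and the /ingest/&lt;PipelineName&gt; path convention; check the `moose dev` output for the real URL:

```typescript
// Sketch: posting an event to the GhEvent ingest endpoint.
// The port and path are assumptions based on Moose dev defaults.
const INGEST_URL = "http://localhost:4000/ingest/GhEvent";

async function ingestGhEvent(event: object): Promise<number> {
  const res = await fetch(INGEST_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(event), // body must match the IGhEvent interface
  });
  return res.status;
}
```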

Processing / Transformation

See that transformGhEvent function we added to our stream with addTransform? Let's take a look at it. Transformations are just TypeScript functions that take an event and return a new event with enriched data on the fly:

// moose-backend/app/ingest/transform.ts
// octokit, GitHubEventType, and RepoResponseType are set up elsewhere in the template.
export async function transformGhEvent(
  event: IGhEvent
): Promise<IRepoStarEvent | undefined> {
  if (event.eventType === GitHubEventType.Watch) {
    const repo: RepoResponseType = await octokit.rest.repos.get({
      owner: event.repoOwner,
      repo: event.repoName,
    });
    const repoData = repo.data;
    return {
      ...event,
      repoDescription: repoData.description ?? "",
      repoTopics: repoData.topics ?? [],
      repoLanguage: repoData.language ?? "",
      repoStars: repoData.stargazers_count,
      repoForks: repoData.forks_count,
      repoWatchers: repoData.watchers_count,
      repoOpenIssues: repoData.open_issues_count,
      repoCreatedAt: repoData.created_at
        ? new Date(repoData.created_at)
        : new Date(),
      repoOwnerLogin: repoData.owner.login,
      repoOwnerId: repoData.owner.id,
      repoOwnerUrl: repoData.owner.url,
      repoOwnerAvatarUrl: repoData.owner.avatar_url,
      repoOwnerType: repoData.owner.type,
      repoOrgId: repoData.organization?.id ?? 0,
      repoOrgUrl: repoData.organization?.url ?? "",
      repoOrgLogin: repoData.organization?.login ?? "",
      repoHomepage: repoData.homepage ?? "",
    };
  }
  // Non-Watch events fall through and return undefined, dropping them from the stream.
  return undefined;
}

The transformer is plain TypeScript: pull repo metadata with Octokit, merge it into the event, return the result. Moose handles retries, back-pressure, and streaming semantics for you; you just focus on mapping input to output.

Exposing Results

We use ConsumptionAPI to expose the transformed data as a parameterized API. Here's the handler for the trending topics timeseries endpoint:

// moose-backend/app/apis/topicTimeseries.ts
export async function getTopicTimeseries(
  { interval = "minute", limit = 10, exclude = "" }: QueryParams,
  { client, sql }: ConsumptionUtil
): Promise<ResponseBody[]> {
  // ...
  const query = sql`
    SELECT
      time,
      arrayMap(
        (topic, events, repos, users) -> map(
          'topic', topic,
          'eventCount', toString(events),
          'uniqueRepos', toString(repos),
          'uniqueUsers', toString(users)
        ),
        groupArray(topic),
        groupArray(totalEvents),
        groupArray(uniqueReposCount),
        groupArray(uniqueUsersCount)
      ) AS topicStats
    FROM (
      SELECT
        /* ... */
      FROM ${RepoStarEvent.table!}
      /* ... */
    )
    GROUP BY time
    ORDER BY time;
  `;
  const resultSet = await client.query.execute<ResponseBody>(query);
  return await resultSet.json();
}

A normal function, but Moose turns it into a REST endpoint and an OpenAPI spec automatically. The sql tagged template embeds ClickHouse SQL alongside your TypeScript types, so your IDE can autocomplete column and table names while you write queries.
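For reference, the QueryParams and ResponseBody types the handler destructures might look like the sketch below. The field names are inferred from the handler's defaults and the query's output columns, not copied from the template, so treat them as assumptions:

```typescript
// Sketch of the types the handler references (names inferred, not canonical).
interface QueryParams {
  interval?: "minute" | "hour" | "day"; // bucketing granularity (assumed set)
  limit?: number; // max topics per bucket
  exclude?: string; // comma-separated topics to skip
}

interface ResponseBody {
  time: string; // bucket start, ISO 8601
  topicStats: Record<string, string>[]; // ClickHouse map() of stringified stats
}

// Defaults applied the same way the handler destructures its parameters.
function withDefaults(p: QueryParams): Required<QueryParams> {
  return {
    interval: p.interval ?? "minute",
    limit: p.limit ?? 10,
    exclude: p.exclude ?? "",
  };
}
```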

Type-Safe Frontend with OpenAPI Generator CLI

Here's where things get fun: Moose automatically generates an OpenAPI spec for your Ingest and Consumption APIs. I use the OpenAPI Generator CLI to turn that spec into a TypeScript fetch SDK. This SDK is imported directly into the Next.js frontend, so every API call is fully type-checked—no more guessing at request or response shapes.

To generate the SDK, just run:

openapi-generator-cli generate -i http://localhost:5001/openapi.yaml -g typescript-fetch -o dashboard/generated-client

Now, in the frontend, you can call your backend API like this:

// dashboard/components/trending-topics-chart.tsx
const result = await mooseClient.consumptionTopicTimeseriesGet({
  interval: "hour",
  limit: 10,
  exclude: "typescript,python",
});

And you get full type safety on both the request and the response, straight from your API models.

A sample JSON response (note that the counts come back as strings, since the query builds a ClickHouse map of stringified values):

[
  {
    "time": "2025-05-07T12:00:00Z",
    "topicStats": [
      {
        "topic": "react",
        "eventCount": "42",
        "uniqueRepos": "10",
        "uniqueUsers": "8"
      },
      {
        "topic": "nextjs",
        "eventCount": "35",
        "uniqueRepos": "8",
        "uniqueUsers": "7"
      }
    ]
  }
]

This lets the frontend render a real-time chart of trending topics, filtered and grouped by any interval, with zero risk of type mismatches.
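Because the query wraps each stat in toString() to fit ClickHouse's map type, the counts arrive as strings. A small client-side helper (hypothetical, not part of the generated SDK) can coerce them back to numbers before charting:

```typescript
// Hypothetical helper: convert stringified stats from the ClickHouse map
// into numbers for charting. Not part of the generated SDK.
interface TopicStat {
  topic: string;
  eventCount: number;
  uniqueRepos: number;
  uniqueUsers: number;
}

function parseTopicStats(raw: Record<string, string>[]): TopicStat[] {
  return raw.map((m) => ({
    topic: m.topic,
    eventCount: Number(m.eventCount),
    uniqueRepos: Number(m.uniqueRepos),
    uniqueUsers: Number(m.uniqueUsers),
  }));
}
```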

Wrapping up

  • Build a real-time analytics dashboard in < 100 lines
  • Get type safety and infra automation out of the box
  • Focus on business logic, not boilerplate
  • Deploy to production in minutes
  • Enjoy type-safe API calls from backend to frontend

Build it yourself

Source code is on GitHub: 514-labs/moose/tree/main/templates/github-dev-trends.

  • Use the Moose CLI to download the project: moose init your-app-name github-dev-trends

  • Run it: moose dev

  • ⭐️ Consider giving us a star

  • Join our Moose Community Slack to share what you build or ask questions!
