---
title: "Org Batch Markdown Exporter Job"
created: "2025-09-13T19:34:22.410Z"
label: "RFC1"
author: "Thomas Hunter II"
status: "review"
visibility: "public"
summary: null
image: null
tags:
  - "markdown"
  - "user-experience"
watchers:
  - "Thomas Hunter II"
  - "Morticia Addams"
reviewers:
  - user: "Thomas Hunter II"
    status: "reject"
  - user: "Morticia Addams"
    status: "await"
  - user: "Gomez Addams"
    status: "comment"
links:
  - rfc: "RFC3"
    relation: "is obsoleted by"
  - rfc: "RFC7"
    relation: "relates to"
  - rfc: "RFC2"
    relation: "relates to"
---

# Abstract

Allow org admins to export one Markdown file for each RFC in their organization as part of a single-click-initiated batch job.

# Background

Currently, RFC Hub is a monorepo with a single scalable homogeneous process that handles both web rendering and API calls. Deployments are kicked off via GitHub Actions.

Backing up RFC content by hand would mean visiting every RFC and copying and pasting its contents, which is tedious and error-prone. Overall it wouldn't be a good user experience.

# Problem

Users of RFC Hub expect the ability to export RFCs in an industry-standard format to avoid vendor lock-in. Supporting this will boost confidence and hopefully increase adoption. Since the underlying representation for RFCs is already Markdown, this RFC proposes that the export process generate one Markdown file per RFC.

# Proposal

Exporting synchronously as part of an HTTP request could degrade the overall performance of the rfchub.app service. Therefore we will perform the export process, file generation, and compression of RFCs in a separate process. This separate job process will be an hourly GitHub Action. The job will read from a database table, perform the export operation, and email the RFCs. Access to database connection details will be provided via environment variables.
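As a rough sketch, the hourly job could be wired up like this. The workflow name, script path, and secret names are illustrative, not decided:

```yaml
# .github/workflows/rfc-export.yml (illustrative sketch)
name: rfc-export
on:
  schedule:
    - cron: "0 * * * *" # hourly
jobs:
  export:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
      - run: npm ci
      # Hypothetical job entry point; connection details come in
      # via environment variables as described above.
      - run: node jobs/export-rfcs.js
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}
          MAILGUN_API_KEY: ${{ secrets.MAILGUN_API_KEY }}
```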

> [!NOTE]
> We still need to figure out how to let the GitHub Action service bypass the database firewall restrictions. I think this can be done via an API call made to the cloud provider.

```mermaid
architecture-beta
    group api[Architecture]

    service db(database)[Exports Table] in api
    service github(server)[GitHub Action] in api
    service mailgun(disk)[Email] in api
    service server(server)[Monolith Server] in api

    db:L <-- R:server
    github:T --> B:db
    mailgun:L <-- R:github
```

Metadata will be attached by way of [frontmatter](https://github.com/Kernix13/markdown-cheatsheet/blob/master/frontmatter.md), a YAML preface to the Markdown document. This convention is widely supported and most Markdown parsers will ignore the preface. Overall it should be a convenient format for users and one that can be easily transformed into other formats.

## Frontmatter Content

Filename: `RFC123: Overall title of RFC.md`

```yaml
---
title: "Overall title of RFC"
created: "2025-09-13T13:37:00Z"
label: "rfc123"
author: "Thomas Hunter II"
status: review
visibility: public
tags:
  - database
  - performance
watchers:
  - "Thomas Hunter II"
  - "Rupert Styx"
reviewers:
  - user: "Thomas Hunter II"
    status: approved
links:
  - rfc: "rfc123"
    type: obsoletes
---

# Synopsis

Markdown content is everything after the set of `---` characters.
```
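A minimal sketch of per-RFC file generation, assuming the export job runs on Node.js like the rest of the system. The field names mirror the frontmatter example above; the `RfcExport` type, the helper name, and the filename sanitization are all assumptions, not settled API:

```typescript
// Shape of one exported RFC; fields mirror the frontmatter example above.
interface RfcExport {
  title: string;
  created: string;    // ISO 8601 timestamp
  label: string;      // e.g. "rfc123"
  author: string;
  status: string;
  visibility: string;
  body: string;       // the raw Markdown content
}

// A real implementation should use a YAML library for correct escaping;
// JSON.stringify works here because valid JSON strings are valid YAML.
function toMarkdownFile(rfc: RfcExport): { filename: string; content: string } {
  const front = [
    "---",
    `title: ${JSON.stringify(rfc.title)}`,
    `created: ${JSON.stringify(rfc.created)}`,
    `label: ${JSON.stringify(rfc.label)}`,
    `author: ${JSON.stringify(rfc.author)}`,
    `status: ${rfc.status}`,
    `visibility: ${rfc.visibility}`,
    "---",
  ].join("\n");
  // Colons and slashes are not allowed in filenames on some platforms,
  // so strip them from the title (a deviation from the literal
  // "RFC123: Title.md" example above).
  const safeTitle = rfc.title.replace(/[:/\\]/g, "");
  return {
    filename: `${rfc.label.toUpperCase()} ${safeTitle}.md`,
    content: `${front}\n\n${rfc.body}`,
  };
}
```

The frontmatter emitted here intentionally covers only scalar fields; lists such as `tags` and `reviewers` would follow the same pattern via a proper YAML serializer.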

# User Interface

Display a button on the org interface if the currently logged-in user is an org admin. Clicking the button adds a row to the database table signaling that a backup needs to happen. Create a unique constraint on the table so that only a single pending export row can exist per org at a time, to avoid abuse. Once the export is complete the user receives an email with the export attached.
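The "one pending export at a time" rule can be mirrored at the application layer. This sketch uses an in-memory `Set` as a stand-in for the database table (function names are hypothetical); in production the guard would be an `INSERT` that fails on the unique-constraint violation rather than a lookup:

```typescript
// Stand-in for the exports table keyed by org ID. In production this is
// an INSERT guarded by the unique constraint described above, so a
// second pending row for the same org fails at the database.
const pendingExports = new Set<string>();

// Returns true if a new export was queued, false if one is already
// pending for the org (mirroring a unique-constraint violation).
function requestExport(orgId: string): boolean {
  if (pendingExports.has(orgId)) return false;
  pendingExports.add(orgId);
  return true;
}

// Called by the job after the export email has been sent, freeing the
// org to request another export later.
function completeExport(orgId: string): void {
  pendingExports.delete(orgId);
}
```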

> [!NOTE]
> We need to convey to admins visiting the page that an export is underway so they won't attempt a second one.

# Definition of success

A developer will deploy the feature to production, trigger an export against a test organization, and verify that the emailed archive contains a well-formed Markdown file for every RFC.

# Alternatives Considered

## Overly Complicated Export Architecture

We also considered this horizontally scaling, auto-sharded, worker-queue-based approach:

![Considered horizontally scaling auto sharded worker queue architecture](https://user-content.rfchub.com/f78e19c4-7293-4079-9d4c-03b9d0eca0e2/fake-export-diagram-1BOQ27I0.png)

With this approach, when the user clicks the mass-export button, the Node.js-based monolith server adds a new entry to a MySQL database with details about the export job. It also looks up all of the RFC IDs and inserts them into a Kafka queue. Node.js worker jobs run in parallel, triggered as they consume messages from the queue, hydrating data about the export operation from the MySQL database. Once the final task completes, the last worker kicks off the job to send an email via Mailgun.

This approach was abandoned because customer load isn't high enough to justify the expensive architecture.

## Do Nothing

Nobody likes a walled garden.

## Export HTML

While this format is more universal, it isn't necessarily easier to convert to other formats. Converting from Markdown would also incur additional CPU overhead.

## Export .doc / .docx

These formats are proprietary to Microsoft but compatible with tools like Google Docs and OpenOffice. Again, CPU overhead from conversion is a concern.

## Export to Google Drive

This is only useful to organizations that already have a Google account. It would also require an integration with Google APIs and require maintenance.

## Perform Export in Main Server Process

Presumably the export could be fairly fast; a user could theoretically click a link and have their download ready a few seconds later. That said, orgs with thousands of RFCs would bog down the system, and the endpoint could be abused for DDoS attacks.

# Future Improvements

Include linked materials such as images.