CRUMB a card from devarno-cloud

Knowledge Domain Model

tektree intermediate 6 min read

ELI5

The knowledge tree is a library: Areas are the shelves (with parent shelves and breadcrumbs back to the lobby), Questions are the index cards on each shelf, Answers are the slips clipped to each question, Insights are the staff-written feature cards displayed at the front, and Resources are the external books a card might point to.

Technical Deep Dive

Knowledge-service owns five collections and is the densest data-model surface in tektree. The canonical field shapes are documented in docs/docs/architecture/DATA_MODELS.md; the Go structs live in services/knowledge-service/internal/models/models.go (lean form) and services/api/models/{area,question,insight}.models.go (legacy production form).

ER Diagram

erDiagram
AREA ||--o{ AREA : "parent_id"
AREA ||--o{ QUESTION : "areas[]"
AREA ||--o{ INSIGHT : "areas[]"
AREA ||--o{ RESOURCE : "areas[]"
QUESTION ||--o{ ANSWER : "question_id"
QUESTION }o--|| USER : "created_by"
INSIGHT }o--|| USER : "created_by"
INSIGHT }o--o{ USER : "co_authors[]"
ANSWER }o--|| USER : "created_by"
AREA {
string title
string slug
string parent_id
string details
string[] tags
string[] breadcrumbs
string visibility
string created_by
int follower_count
int content_count
}
QUESTION {
string title
string slug
string body
string[] tags
string[] areas
string visibility
string created_by
int view_count
int upvote_count
int downvote_count
int answer_count
string accepted_answer_id
string status
}
ANSWER {
string question_id
string body
json[] code_snippets
string created_by
int upvote_count
bool is_accepted
}
INSIGHT {
string title
string slug
string body
string insight_type
string[] tags
string[] areas
string[] co_authors
string visibility
string created_by
time published_at
int view_count
int like_count
bool featured
}
RESOURCE {
string url
string title
string description
string[] areas
string created_by
}

Routes

services/knowledge-service/cmd/server/main.go:28-57 registers full CRUD per entity:

GET|POST /api/{areas,questions,insights,resources}
GET|PUT|DELETE /api/{areas,questions,insights,resources}/:id

The api-gateway maps these onto /api/v1/{areas,questions,insights,resources}/*path (services/api-gateway/internal/routes/routes.go).
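
To make the two route templates concrete, the sketch below expands them into the full route list and shows one way a gateway path could be rewritten to a service path. Both helpers (`serviceRoutes`, `toServicePath`) are hypothetical, and the assumption that the gateway simply drops the `v1` segment is an illustration, not something confirmed by routes.go.

```go
package main

import (
	"fmt"
	"strings"
)

// entities are the four route groups listed in the card.
var entities = []string{"areas", "questions", "insights", "resources"}

// serviceRoutes expands the two templates into "METHOD path" strings.
func serviceRoutes() []string {
	var routes []string
	for _, e := range entities {
		for _, m := range []string{"GET", "POST"} {
			routes = append(routes, m+" /api/"+e)
		}
		for _, m := range []string{"GET", "PUT", "DELETE"} {
			routes = append(routes, m+" /api/"+e+"/:id")
		}
	}
	return routes
}

// toServicePath rewrites /api/v1/<entity>/... to /api/<entity>/...
// (assumed mapping). It reports false for paths outside the four entities.
func toServicePath(gatewayPath string) (string, bool) {
	const prefix = "/api/v1/"
	if !strings.HasPrefix(gatewayPath, prefix) {
		return "", false
	}
	rest := strings.TrimPrefix(gatewayPath, prefix)
	entity := rest
	if i := strings.Index(rest, "/"); i >= 0 {
		entity = rest[:i]
	}
	for _, e := range entities {
		if e == entity {
			return "/api/" + rest, true
		}
	}
	return "", false
}

func main() {
	for _, r := range serviceRoutes() {
		fmt.Println(r)
	}
	p, ok := toServicePath("/api/v1/questions/42")
	fmt.Println(p, ok)
}
```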

Indexes (from DATA_MODELS.md)

Collection   Indexes
areas        parent_id + created_at, tags, slug, text search (title, details)
questions    created_by + created_at, areas + created_at, status + created_at, text search
answers      question_id + created_at, created_by, is_accepted
insights     created_by + published_at, areas + published_at, featured + published_at, text search

Insight vs Answer

An Answer is bound to a Question (question_id FK) and is the user-generated reply that may be marked accepted. An Insight is a free-standing long-form post that may reference Areas and have co-authors and a featured flag — closer to a blog post than a forum reply. They live in different collections and are emitted with different events (knowledge.answer.submitted vs knowledge.insight.published).

Areas Hierarchy

Areas are a tree via parent_id (string id, not ObjectID — see services/api/models/area.models.go). Each area carries a breadcrumbs array so reads do not need recursive lookups; this denormalisation must be rebuilt when an area is moved.
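
The rebuild on move can be sketched in memory as follows. The `Area` type here is a trimmed, hypothetical stand-in for the real model, and it assumes breadcrumbs hold ancestor titles ordered root-first and excluding the area itself; the card does not say what the array actually stores.

```go
package main

import "fmt"

// Area is a minimal stand-in for the real model (hypothetical shape).
type Area struct {
	ID          string
	Title       string
	ParentID    string   // "" marks a root area
	Breadcrumbs []string // assumed: ancestor titles, root first
}

// rebuildBreadcrumbs recomputes breadcrumbs for the area and every descendant.
func rebuildBreadcrumbs(areas map[string]*Area, id string) {
	children := make(map[string][]string)
	for _, a := range areas {
		children[a.ParentID] = append(children[a.ParentID], a.ID)
	}
	var walk func(id string, trail []string)
	walk = func(id string, trail []string) {
		a := areas[id]
		a.Breadcrumbs = append([]string(nil), trail...)
		next := append(append([]string(nil), trail...), a.Title)
		for _, c := range children[id] {
			walk(c, next)
		}
	}
	var trail []string
	if p, ok := areas[areas[id].ParentID]; ok {
		trail = append(append([]string(nil), p.Breadcrumbs...), p.Title)
	}
	walk(id, trail)
}

// moveArea reparents an area, refusing moves that would create a cycle.
func moveArea(areas map[string]*Area, id, newParentID string) error {
	a, ok := areas[id]
	if !ok {
		return fmt.Errorf("unknown area %q", id)
	}
	for cur := newParentID; cur != ""; {
		if cur == id {
			return fmt.Errorf("cannot move %q under itself or a descendant", id)
		}
		p, ok := areas[cur]
		if !ok {
			return fmt.Errorf("unknown area %q", cur)
		}
		cur = p.ParentID
	}
	a.ParentID = newParentID
	rebuildBreadcrumbs(areas, id)
	return nil
}

func main() {
	areas := map[string]*Area{
		"lang": {ID: "lang", Title: "Languages"},
		"sys":  {ID: "sys", Title: "Systems"},
		"go":   {ID: "go", Title: "Go", ParentID: "lang", Breadcrumbs: []string{"Languages"}},
		"conc": {ID: "conc", Title: "Concurrency", ParentID: "go", Breadcrumbs: []string{"Languages", "Go"}},
	}
	if err := moveArea(areas, "go", "sys"); err != nil {
		panic(err)
	}
	fmt.Println(areas["go"].Breadcrumbs, areas["conc"].Breadcrumbs)
}
```

In a real store the same walk would become a batched update over the moved subtree, ideally in one transaction so readers never see a half-rebuilt trail.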

Key Terms

  • Area → topical shelf; has parent + breadcrumbs.
  • Question → user-asked entry, gathers Answers; can have accepted_answer_id.
  • Insight → long-form post; featured flag drives the home feed.
  • Resource → external link with metadata, tagged into Areas.
  • areas[] → array of area ids on Question/Insight/Resource — the join key for “everything in topic X”.

Q&A

Q: A question has 12 answers but answer_count says 11. What is the most likely cause? A: The consumer of the knowledge.answer.submitted event most likely missed the increment for the latest answer. Counts are denormalised aggregates, so reconcile by counting the answers whose question_id matches and overwriting answer_count with that total.

Q: Moving an area to a new parent — what must be touched besides parent_id? A: The breadcrumbs array on the moved area and on every descendant area, plus any cached search-index documents that embed those breadcrumbs.

Q: Why is parent_id a string instead of ObjectID? A: The legacy schema used slugs and external ids inconsistently; the string column accommodates both. New code should treat it opaquely and validate via lookups, not by parsing.
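
The reconciliation from the first answer can be sketched as a recount over in-memory slices. The types and the `reconcileAnswerCounts` helper are hypothetical; a real job would run the equivalent count against the answers collection.

```go
package main

import "fmt"

// Minimal stand-ins for the real models (hypothetical shapes).
type Answer struct {
	ID         string
	QuestionID string
}

type Question struct {
	ID          string
	AnswerCount int // denormalised aggregate that can drift
}

// reconcileAnswerCounts recounts answers per question_id, overwrites any
// drifted answer_count, and returns the ids of the questions it corrected.
func reconcileAnswerCounts(questions []*Question, answers []Answer) []string {
	actual := make(map[string]int)
	for _, a := range answers {
		actual[a.QuestionID]++
	}
	var fixed []string
	for _, q := range questions {
		if q.AnswerCount != actual[q.ID] {
			q.AnswerCount = actual[q.ID]
			fixed = append(fixed, q.ID)
		}
	}
	return fixed
}

func main() {
	q := &Question{ID: "q1", AnswerCount: 11}
	var answers []Answer
	for i := 0; i < 12; i++ {
		answers = append(answers, Answer{ID: fmt.Sprintf("a%d", i), QuestionID: "q1"})
	}
	fixed := reconcileAnswerCounts([]*Question{q}, answers)
	fmt.Println(fixed, q.AnswerCount)
}
```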

Examples

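Listing all Insights tagged into the “distributed-systems” area on the global feed: query insights with areas: <area_id>, exclude private posts (visibility != private), and sort by published_at descending. The compound index areas + published_at lets the database seek to that area's entries in O(log n) and read them already ordered by published_at, so no separate sort step is needed. Because visibility is not part of that index, the visibility filter is applied to the matched entries afterwards.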
