Edit only the files in backend/data/knowledge/. They are the single source of truth for both the website UI and the Avocado chatbot knowledge base. After any edit, run npm run sync from frontend/ (or restart npm run dev). Never edit frontend/src/data/knowledge/ directly: those files are overwritten automatically on every sync.
Name, bio, tagline, location, contact
backend/data/knowledge/profile.json
name, tagline, bio, summary, obsession, previous, prev_domain, interested_domain, location, email, phone, github, linkedin, resume
Work experience — roles, companies, bullet points
backend/data/knowledge/experience.json
role, company, location, start, end, description, bullets[]
Education — degrees, institutions, highlights
backend/data/knowledge/education.json
institution, school, degree, field, location, start, end, gpa, highlights[]
Projects — title, description, tags, links
backend/data/knowledge/projects.json
title, description, tags[], featured, award, sourceLinks[{label,url}], note
Skills & tools — categories and items
backend/data/knowledge/skills.json
category, items[]
Testimonials — name, role, company, quote
backend/data/knowledge/testimonials.json
name, designation, company, linkedin, description, givenAt, source
⚠ Resume link is hardcoded in Nav.tsx
The resume Google Drive URL in profile.json powers the chatbot and home page, but components/Nav.tsx has a separate hardcoded copy in both the desktop and the mobile nav. When the resume changes, update profile.json and both copies in Nav.tsx.
Create a new .mdx file — the filename becomes the URL slug. No sync needed; GitHub Actions auto-generates blog.json on push so the chatbot indexes the new post automatically.
New post file
frontend/src/content/blog/my-post.mdx
Filename → URL slug. Required frontmatter: title, date, publishedAt, description, tags[]
Post images
frontend/public/blog/
Place image files here. Reference as /blog/filename.jpg in MDX.
Auto-generated chatbot index
backend/data/knowledge/blog.json
Do not edit — auto-generated by scripts/sync-knowledge.mjs. GH Actions commits it on push; Railway re-ingests on deploy.
publishedAt vs date
publishedAt is the sort key — set it once and never change it. date is the display date — update freely (e.g. after a major revision).
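A minimal sketch of why publishedAt has to stay fixed: the list order comes from publishedAt, while date is only what the reader sees. The post dictionaries and the newest-first order are assumptions for illustration, not the site's actual code.

```python
from datetime import date

# Hypothetical frontmatter for two posts; field names follow the required frontmatter.
posts = [
    {"title": "Older post, revised", "publishedAt": "2026-01-05", "date": "2026-04-20"},
    {"title": "Newer post", "publishedAt": "2026-03-01", "date": "2026-03-01"},
]


def sort_posts(items):
    """Order posts by publishedAt, newest first (assumed order)."""
    return sorted(
        items,
        key=lambda post: date.fromisoformat(post["publishedAt"]),
        reverse=True,
    )


ordered = sort_posts(posts)
for post in ordered:
    # The display date can be newer than publishedAt without moving the post.
    print(post["date"], post["title"])
```

The revised post shows a newer display date but keeps its original position, which is the behavior the two fields are meant to separate.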
2b · Lab: Living System Docs
Lab is for living, in-progress project documentation: architecture, decisions, and progress logs updated as the project evolves. Files live at frontend/src/content/lab/[slug].mdx. Filename = URL slug.
Frontmatter (required)
---
title: "My Project"
status: "active" # active | paused | shipped
description: "One-line summary shown on lab index card."
startedAt: "2026-01-01"
updatedAt: "2026-04-22" # ← update this every time you edit
tech: [Next.js, FastAPI, PostgreSQL]
---
status: active
Green badge with pulse animation. Sorted to top of lab index. Use while actively building.
status: paused
Amber badge. Sorted second. Use when work is on hold.
status: shipped
Indigo badge. Sorted last. Use when the project is complete and deployed.
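The status ordering above (active, then paused, then shipped) could be expressed roughly as follows. This is a sketch only: the entry dictionaries are hypothetical, and how the real index orders entries that share a status is not specified here.

```python
# Rank for each status, matching the documented order on the lab index.
STATUS_ORDER = {"active": 0, "paused": 1, "shipped": 2}


def sort_lab_entries(entries):
    """Sort by status rank; sorted() is stable, so ties keep their input order."""
    return sorted(entries, key=lambda entry: STATUS_ORDER[entry["status"]])


# Hypothetical entries for illustration.
entries = [
    {"title": "C", "status": "shipped"},
    {"title": "A", "status": "paused"},
    {"title": "B", "status": "active"},
    {"title": "D", "status": "active"},
]
sorted_titles = [entry["title"] for entry in sort_lab_entries(entries)]
```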
Always update updatedAt
The lab index card shows "last updated [date]". Set updatedAt to today's date every time you make changes; otherwise the card shows a stale date.
Lab MDX components
<Status status="active" />
Inline status badge — same colors as the index card. Put it near the top of the document so status is visible in the post.
<Stack items={["Next.js", "Python"]} />
Renders a row of monospace tech tags. Use for a full tech stack listing inside the document body (separate from the frontmatter tech[] chips in the header).
<Metric value="99%" label="uptime" />
Highlighted stat box. Use for key numbers — latency, users, accuracy, uptime. Group multiple Metrics in a flex row for a dashboard effect.
<Decision date="2026-01-10" title="Why X over Y">...</Decision>
Timeline entry with indigo dot. Use for architectural decisions, technology choices, or design tradeoffs. Children text is the reasoning.
<Update date="2026-04-22">...</Update>
Lighter timeline entry with zinc dot. Use for progress notes, milestone completions, or status changes over time. Add a new Update entry each time you revisit the project.
Architecture diagrams
```arch
┌─────────────┐     ┌─────────────┐
│  Frontend   │────▶│   Backend   │
└─────────────┘     └─────────────┘
```
Always use fenced ```arch blocks for diagrams — never a JSX component. Characters like <, >, and {} inside JSX children cause an MDX acorn parse error.
Typical update workflow
Create a new lab entry
Add frontend/src/content/lab/my-project.mdx with required frontmatter → commit + push → deploys automatically.
Update an existing entry
Edit the MDX file, update updatedAt in frontmatter → commit + push. No manual sync step is needed: the site reads lab files directly at build time.
Mark a project shipped
Change status to "shipped" in frontmatter, update updatedAt, add a final <Update> timeline note → commit + push.
Chatbot indexing
Lab entries are indexed into ChromaDB via lab.json (auto-generated by sync-knowledge.mjs on every push). Avocado can answer questions about active lab projects, tech stack, and decisions.
3 · Deploy Pipeline (auto)
Everything is automated: just commit and push.
Update portfolio data
Edit any backend/data/knowledge/*.json → commit + push → GH Actions runs sync-knowledge.mjs (copies all JSON to frontend/src/data/knowledge/) → Railway redeploys → chatbot re-indexes (hash changed).
Publish a new blog post
Write MDX → commit + push → GH Actions runs sync-knowledge.mjs → generates blog.json + copies all JSON → auto-commits with [skip ci] → Railway redeploys → chatbot indexes the new post.
What sync-knowledge.mjs does
1) Reads all *.mdx from frontend/src/content/blog/, strips MDX, writes blog.json. 2) Copies ALL backend/data/knowledge/*.json → frontend/src/data/knowledge/. Run: node scripts/sync-knowledge.mjs from repo root.
GH Actions auto-commit
Workflow (deploy.yml) needs contents: write, pages: write, id-token: write permissions. Auto-commits synced files with [skip ci] tag to prevent infinite loops.
Chatbot re-ingest (hash-based)
Backend computes SHA-256 of all knowledge JSON files at startup. Re-ingests only when the hash changes — fast startup if nothing changed. Hash stored at chroma_db/.ingest_hash.
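The hash check described above can be sketched as follows. This is an illustration under assumptions, not the backend's actual code: the function names are hypothetical, and whether the real hash also covers file names is not stated.

```python
import hashlib
from pathlib import Path


def knowledge_hash(knowledge_dir: Path) -> str:
    """SHA-256 over every *.json file, visited in sorted order so the digest is stable."""
    digest = hashlib.sha256()
    for path in sorted(knowledge_dir.glob("*.json")):
        # Including the file name is an assumption; it makes renames count as changes.
        digest.update(path.name.encode("utf-8"))
        digest.update(path.read_bytes())
    return digest.hexdigest()


def needs_reingest(knowledge_dir: Path, hash_file: Path) -> bool:
    """Return True (and record the new hash) when the knowledge files changed."""
    current = knowledge_hash(knowledge_dir)
    previous = hash_file.read_text().strip() if hash_file.exists() else None
    if current == previous:
        return False  # nothing changed: skip ingestion for a fast startup
    hash_file.parent.mkdir(parents=True, exist_ok=True)
    hash_file.write_text(current)
    return True
```

At startup this would be called with something like needs_reingest(Path("data/knowledge"), Path("chroma_db/.ingest_hash")), re-ingesting only when it returns True.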
Static site deployment
Frontend builds as a static export and deploys to GitHub Pages (sabarishreddy99.github.io). Backend deploys to Railway. Both trigger on push to main.
Tracked automatically. No config needed for new posts — engagement starts recording as soon as a reader opens the post.
Views — unique per visitor per post
Auto-recorded when a reader opens a post. One view per IP address per post. Shown on the post page and blog index.
Claps — up to 50 per visitor per post
Reader clicks the clap icon button. Clicks batch with a 1.5s debounce before saving. Total shown on index card and post page.
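The 50-clap cap can be pictured as a simple clamp on each batched request. This is a sketch only: the function name is hypothetical, and whether the real cap is enforced in the browser, on the server, or both is an assumption left open here.

```python
MAX_CLAPS_PER_VISITOR = 50  # documented cap per visitor per post


def accepted_claps(already_given: int, requested: int) -> int:
    """Return how many of the requested claps still fit under the cap."""
    remaining = max(0, MAX_CLAPS_PER_VISITOR - already_given)
    return max(0, min(requested, remaining))
```

For example, a visitor who has already given 48 claps and sends a batch of 5 would have only 2 of them counted.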
Storage — SQLite analytics.db
Stored in chroma_db/analytics.db. IPs are SHA-256 hashed — never stored raw. On Railway: set ANALYTICS_DB_PATH=/data/analytics.db with a persistent volume so counts survive redeploys.
Persistence on Railway
Without a volume, counts reset on every deploy. Add a Volume (Pro plan) mounted at /data and set ANALYTICS_DB_PATH=/data/analytics.db in backend environment variables.
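A minimal sketch of how hashed-IP view tracking could work with SQLite, assuming a hypothetical views table; the real schema, and whether the hash is salted, are not documented here. The path lookup mirrors the ANALYTICS_DB_PATH behavior described above.

```python
import hashlib
import os
import sqlite3

# Documented default location; on Railway, ANALYTICS_DB_PATH points at the volume.
DEFAULT_DB_PATH = "chroma_db/analytics.db"


def open_analytics_db(path=None):
    """Open the analytics database, preferring ANALYTICS_DB_PATH when it is set."""
    db_path = path or os.environ.get("ANALYTICS_DB_PATH", DEFAULT_DB_PATH)
    conn = sqlite3.connect(db_path)
    # Hypothetical schema: the primary key enforces one view per visitor per post.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS views ("
        "slug TEXT NOT NULL, ip_hash TEXT NOT NULL, "
        "PRIMARY KEY (slug, ip_hash))"
    )
    return conn


def hash_ip(ip: str) -> str:
    """SHA-256 hex digest of the address, so the raw IP is never written to disk."""
    return hashlib.sha256(ip.encode("utf-8")).hexdigest()


def record_view(conn, slug: str, ip: str) -> bool:
    """Insert a view; return False when this visitor already viewed the post."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO views (slug, ip_hash) VALUES (?, ?)",
        (slug, hash_ip(ip)),
    )
    conn.commit()
    return cur.rowcount == 1


def view_count(conn, slug: str) -> int:
    """Number of unique visitors recorded for a post."""
    row = conn.execute("SELECT COUNT(*) FROM views WHERE slug = ?", (slug,)).fetchone()
    return row[0]
```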
API endpoints
POST /blog/{slug}/view · POST /blog/{slug}/clap (body: {count}) · GET /blog/{slug}/stats · GET /blog/stats/summary
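For testing the endpoints by hand, requests can be built with the standard library as sketched below. The base URL is the placeholder used elsewhere in this guide, and the JSON content type for the clap body is an assumption.

```python
import json
from urllib.request import Request

# Placeholder backend URL; substitute your own Railway backend URL.
API_BASE = "https://your-backend.up.railway.app"


def view_request(slug: str) -> Request:
    """POST /blog/{slug}/view with an empty body."""
    return Request(f"{API_BASE}/blog/{slug}/view", data=b"", method="POST")


def clap_request(slug: str, count: int) -> Request:
    """POST /blog/{slug}/clap with a {count} body (JSON encoding assumed)."""
    body = json.dumps({"count": count}).encode("utf-8")
    return Request(
        f"{API_BASE}/blog/{slug}/clap",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def stats_request(slug: str) -> Request:
    """GET /blog/{slug}/stats."""
    return Request(f"{API_BASE}/blog/{slug}/stats", method="GET")
```

Each Request can be sent with urllib.request.urlopen(...), for example urlopen(stats_request("my-post")).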
Response count & unique visitors
Tracked on every chat response. Shown in the chatbot footer. Stored in the same analytics.db, so the Railway persistence caveat described above applies here too.
Model indicator badge
Green pill shows which Gemini model answered (e.g. gemini-2.5-flash). Updates automatically if a fallback was used.
Swap the AI model
Change GEMINI_MODEL in Railway environment variables. No code change needed.
Model fallback chain
Primary: GEMINI_MODEL. Fallbacks: GEMINI_FALLBACK_MODELS (comma-separated). On a 503 or 429 capacity error, the backend retries with each fallback in the listed order.
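The fallback behavior can be sketched as a loop over the configured models. This is an illustration under assumptions: ModelCallError and the call_model callback are hypothetical stand-ins for the real Gemini client, and only the environment variable names and defaults come from this guide.

```python
import os

RETRYABLE_STATUS = {429, 503}  # capacity errors that trigger a fallback
DEFAULT_FALLBACKS = "gemini-2.0-flash,gemini-2.0-flash-lite,gemini-flash-latest"


class ModelCallError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed model call."""

    def __init__(self, status: int):
        super().__init__(f"model call failed with status {status}")
        self.status = status


def model_chain(env=os.environ):
    """Primary model first, then the comma-separated fallbacks in order."""
    primary = env.get("GEMINI_MODEL", "gemini-2.5-flash")
    raw = env.get("GEMINI_FALLBACK_MODELS", DEFAULT_FALLBACKS)
    fallbacks = [name.strip() for name in raw.split(",") if name.strip()]
    return [primary, *fallbacks]


def generate_with_fallback(prompt, call_model, models):
    """Try each model in turn; return (model_used, response)."""
    last_error = None
    for model in models:
        try:
            return model, call_model(model, prompt)
        except ModelCallError as err:
            if err.status not in RETRYABLE_STATUS:
                raise  # not a capacity error: do not mask it with a fallback
            last_error = err
    if last_error is None:
        raise ValueError("no models configured")
    raise last_error
```

Returning the model name alongside the response is what would let the UI badge show which model actually answered.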
Knowledge base — ChromaDB
ChromaDB persists to backend/chroma_db/ (git-ignored). On Railway: mount a persistent volume at /data and point the Chroma path at it, either with a symlink or by setting the path. Without a volume, the DB rebuilds on every deploy (works, just slower startup ~30–60s).
RAG pipeline
Hybrid: ChromaDB dense (all-MiniLM-L6-v2 embeddings) + BM25 lexical → RRF merge → cross-encoder rerank (ms-marco-MiniLM-L-6-v2). Retrieves top 5 chunks → fed as context to Gemini.
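The RRF merge step in this pipeline combines the dense and BM25 rankings by rank position rather than raw score. A minimal sketch of reciprocal rank fusion follows; the constant k=60 is a commonly used default and an assumption here, and the chunk ids are made up. The cross-encoder rerank and the top-5 cut would come after this step.

```python
def rrf_merge(rankings, k=60):
    """Fuse ranked lists of chunk ids: score(d) = sum of 1 / (k + rank) over lists."""
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical result lists from the two retrievers, best match first.
dense_hits = ["a", "b", "c"]
bm25_hits = ["b", "d", "a"]
fused = rrf_merge([dense_hits, bm25_hits])
```

Chunks that appear in both lists ("a" and "b") rise above chunks found by only one retriever, which is the point of the hybrid setup.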
Startup warmup
Embedding model and cross-encoder load at startup. First response may be ~1–2s slower. Models download on first start and are cached in Railway's ephemeral storage, so the cache may not survive a redeploy.
6 · Environment Variables
Backend vars → Railway → your backend service → Variables. Frontend vars → GitHub Actions secrets (used at build time).
Backend (Railway)
GOOGLE_API_KEY
Required. Google AI API key for Gemini. Chat endpoints return 503 without this.
GEMINI_MODEL
Primary model. Default: gemini-2.5-flash. Change here to swap models without code changes.
GEMINI_FALLBACK_MODELS
Comma-separated fallbacks tried on 503/429. Default: gemini-2.0-flash,gemini-2.0-flash-lite,gemini-flash-latest
ANALYTICS_DB_PATH
SQLite file path. Set to /data/analytics.db with a Railway persistent volume, otherwise counts reset on every deploy.
FRONTEND_ORIGIN
CORS allowed origins (comma-separated). Must include production frontend URL or browser requests will be blocked. Default includes localhost:3000 and GitHub Pages.
APP_ENV
dev or prod. Default: dev. Controls logging and debug behavior.
Frontend (GitHub Actions secrets / .env.local)
NEXT_PUBLIC_API_BASE_URL
Backend URL the browser calls. Set to your Railway backend URL in production (e.g. https://your-backend.up.railway.app). Required — chat and blog stats break without it.
NEXT_PUBLIC_BLOG_FONT
Blog reading font. Default: Source_Serif_4. Must match the font statically imported in frontend/src/app/layout.tsx.