docs: add PRD for privacy-friendly analytics
This commit is contained in:
parent
9403cd047c
commit
4df84addfa
1 changed files with 913 additions and 0 deletions
913
prd/PRD-privacy-friendly-analytics.md
Normal file
913
prd/PRD-privacy-friendly-analytics.md
Normal file
|
|
@ -0,0 +1,913 @@
|
||||||
|
# Product Requirements Document: Privacy-Friendly Analytics
|
||||||
|
|
||||||
|
## Overview
|
||||||
|
|
||||||
|
Implement a self-hosted, privacy-first analytics system to track content engagement without using third-party services like Google Analytics. The system will provide insight into which posts, photos, albums, and projects resonate with visitors while respecting user privacy and complying with GDPR/privacy regulations.
|
||||||
|
|
||||||
|
## Goals
|
||||||
|
|
||||||
|
- Track page views for all content types (Posts, Photos, Albums, Projects)
|
||||||
|
- Provide actionable insights about content performance
|
||||||
|
- Maintain user privacy (no cookies, no PII, no tracking across sites)
|
||||||
|
- Leverage existing infrastructure (PostgreSQL + Redis)
|
||||||
|
- Build admin dashboard for viewing analytics
|
||||||
|
- Keep system lightweight and performant
|
||||||
|
|
||||||
|
## Privacy-First Principles
|
||||||
|
|
||||||
|
### What We Track
|
||||||
|
- Content views (which pages are accessed)
|
||||||
|
- Referrer sources (where traffic comes from)
|
||||||
|
- Approximate unique visitors (session-based deduplication)
|
||||||
|
- Timestamp of visits
|
||||||
|
|
||||||
|
### What We DON'T Track
|
||||||
|
- Personal Identifying Information (PII)
|
||||||
|
- User cookies or local storage
|
||||||
|
- IP addresses (only hashed for deduplication)
|
||||||
|
- User behavior across sessions
|
||||||
|
- Cross-site tracking
|
||||||
|
- Device fingerprinting beyond basic deduplication
|
||||||
|
|
||||||
|
### Privacy Guarantees
|
||||||
|
- **No cookies**: Zero client-side storage
|
||||||
|
- **IP hashing**: IPs hashed with daily salt, never stored
|
||||||
|
- **User-agent hashing**: Combined with IP for session deduplication
|
||||||
|
- **Short retention**: Raw data kept for 90 days, then aggregated
|
||||||
|
- **GDPR compliant**: No consent needed (legitimate interest)
|
||||||
|
- **No third parties**: All data stays on our servers
|
||||||
|
|
||||||
|
## Technical Architecture
|
||||||
|
|
||||||
|
### Database Schema
|
||||||
|
|
||||||
|
#### PageView Table (Detailed Tracking)
|
||||||
|
|
||||||
|
```prisma
|
||||||
|
model PageView {
|
||||||
|
id Int @id @default(autoincrement())
|
||||||
|
contentType String @db.VarChar(50) // "post", "photo", "album", "project"
|
||||||
|
contentId Int // ID of the content
|
||||||
|
contentSlug String @db.VarChar(255) // Slug for reference
|
||||||
|
|
||||||
|
// Privacy-preserving visitor identification
|
||||||
|
sessionHash String @db.VarChar(64) // SHA-256(IP + User-Agent + daily_salt)
|
||||||
|
|
||||||
|
// Metadata
|
||||||
|
referrer String? @db.VarChar(500) // Where visitor came from
|
||||||
|
timestamp DateTime @default(now())
|
||||||
|
|
||||||
|
@@index([contentType, contentId])
|
||||||
|
@@index([timestamp])
|
||||||
|
@@index([sessionHash, timestamp])
|
||||||
|
@@index([contentType, timestamp])
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
#### AggregatedView Table (Long-term Storage)
|
||||||
|
|
||||||
|
```prisma
|
||||||
|
model AggregatedView {
|
||||||
|
id Int @id @default(autoincrement())
|
||||||
|
contentType String @db.VarChar(50)
|
||||||
|
contentId Int
|
||||||
|
contentSlug String @db.VarChar(255)
|
||||||
|
|
||||||
|
// Aggregated metrics
|
||||||
|
date DateTime @db.Date // Day of aggregation
|
||||||
|
viewCount Int @default(0) // Total views that day
|
||||||
|
uniqueCount Int @default(0) // Approximate unique visitors
|
||||||
|
|
||||||
|
@@unique([contentType, contentId, date])
|
||||||
|
@@index([contentType, contentId])
|
||||||
|
@@index([date])
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### API Endpoints
|
||||||
|
|
||||||
|
#### Tracking Endpoint (Public)
|
||||||
|
|
||||||
|
**`POST /api/analytics/track`**
|
||||||
|
- **Purpose**: Record a page view
|
||||||
|
- **Request Body**:
|
||||||
|
```typescript
|
||||||
|
{
|
||||||
|
contentType: 'post' | 'photo' | 'album' | 'project',
|
||||||
|
contentId: number,
|
||||||
|
contentSlug: string
|
||||||
|
}
|
||||||
|
```
|
||||||
|
- **Server-side Processing**:
|
||||||
|
- Extract IP address from request
|
||||||
|
- Extract User-Agent from headers
|
||||||
|
- Extract Referrer from headers
|
||||||
|
- Generate daily-rotated salt
|
||||||
|
- Create sessionHash: `SHA-256(IP + UserAgent + salt)`
|
||||||
|
- Insert PageView record (never store raw IP)
|
||||||
|
- **Response**: `{ success: true }`
|
||||||
|
- **Rate limiting**: Max 10 requests per minute per session
|
||||||
|
|
||||||
|
#### Admin Analytics Endpoints
|
||||||
|
|
||||||
|
**`GET /api/admin/analytics/overview`**
|
||||||
|
- **Purpose**: Dashboard overview statistics
|
||||||
|
- **Query Parameters**:
|
||||||
|
- `period`: '7d' | '30d' | '90d' | 'all'
|
||||||
|
- **Response**:
|
||||||
|
```typescript
|
||||||
|
{
|
||||||
|
totalViews: number,
|
||||||
|
uniqueVisitors: number,
|
||||||
|
topContent: [
|
||||||
|
{ type, id, slug, title, views, uniqueViews }
|
||||||
|
],
|
||||||
|
viewsByDay: [
|
||||||
|
{ date, views, uniqueVisitors }
|
||||||
|
]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**`GET /api/admin/analytics/content`**
|
||||||
|
- **Purpose**: Detailed analytics for specific content
|
||||||
|
- **Query Parameters**:
|
||||||
|
- `type`: 'post' | 'photo' | 'album' | 'project'
|
||||||
|
- `id`: content ID
|
||||||
|
- `period`: '7d' | '30d' | '90d' | 'all'
|
||||||
|
- **Response**:
|
||||||
|
```typescript
|
||||||
|
{
|
||||||
|
contentInfo: { type, id, slug, title },
|
||||||
|
totalViews: number,
|
||||||
|
uniqueVisitors: number,
|
||||||
|
viewsByDay: [{ date, views, uniqueVisitors }],
|
||||||
|
topReferrers: [{ referrer, count }]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**`GET /api/admin/analytics/trending`**
|
||||||
|
- **Purpose**: Find trending content
|
||||||
|
- **Query Parameters**:
|
||||||
|
- `type`: 'post' | 'photo' | 'album' | 'project' | 'all'
|
||||||
|
- `days`: number (default 7)
|
||||||
|
- `limit`: number (default 10)
|
||||||
|
- **Response**:
|
||||||
|
```typescript
|
||||||
|
[
|
||||||
|
{
|
||||||
|
type, id, slug, title,
|
||||||
|
recentViews: number,
|
||||||
|
previousViews: number,
|
||||||
|
growthPercent: number
|
||||||
|
}
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
**`GET /api/admin/analytics/referrers`**
|
||||||
|
- **Purpose**: Traffic source analysis
|
||||||
|
- **Query Parameters**:
|
||||||
|
- `period`: '7d' | '30d' | '90d' | 'all'
|
||||||
|
- **Response**:
|
||||||
|
```typescript
|
||||||
|
[
|
||||||
|
{
|
||||||
|
referrer: string,
|
||||||
|
views: number,
|
||||||
|
uniqueVisitors: number,
|
||||||
|
topContent: [{ type, id, slug, title, views }]
|
||||||
|
}
|
||||||
|
]
|
||||||
|
```
|
||||||
|
|
||||||
|
### Redis Caching Strategy
|
||||||
|
|
||||||
|
**Cache Keys**:
|
||||||
|
- `analytics:overview:{period}` - Dashboard overview (TTL: 10 minutes)
|
||||||
|
- `analytics:content:{type}:{id}:{period}` - Content details (TTL: 10 minutes)
|
||||||
|
- `analytics:trending:{type}:{days}` - Trending content (TTL: 5 minutes)
|
||||||
|
- `analytics:referrers:{period}` - Referrer stats (TTL: 15 minutes)
|
||||||
|
- `analytics:salt:{date}` - Daily salt for hashing (TTL: 24 hours)
|
||||||
|
|
||||||
|
**Cache Invalidation**:
|
||||||
|
- Automatic TTL expiration (stale data acceptable for analytics)
|
||||||
|
- Manual flush on data aggregation (daily job)
|
||||||
|
- Progressive cache warming during admin page load
|
||||||
|
|
||||||
|
### Frontend Integration
|
||||||
|
|
||||||
|
#### Client-side Tracking Hook
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
// src/lib/utils/analytics.ts
|
||||||
|
export async function trackPageView(
|
||||||
|
contentType: 'post' | 'photo' | 'album' | 'project',
|
||||||
|
contentId: number,
|
||||||
|
contentSlug: string
|
||||||
|
): Promise<void> {
|
||||||
|
try {
|
||||||
|
await fetch('/api/analytics/track', {
|
||||||
|
method: 'POST',
|
||||||
|
headers: { 'Content-Type': 'application/json' },
|
||||||
|
body: JSON.stringify({ contentType, contentId, contentSlug }),
|
||||||
|
// Fire and forget - don't block page render
|
||||||
|
keepalive: true
|
||||||
|
});
|
||||||
|
} catch (error) {
|
||||||
|
// Silently fail - analytics shouldn't break the page
|
||||||
|
console.debug('Analytics tracking failed:', error);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
#### Page Integration Examples
|
||||||
|
|
||||||
|
**Universe Post Page** (`/universe/[slug]/+page.svelte`):
|
||||||
|
```svelte
|
||||||
|
<script lang="ts">
|
||||||
|
import { onMount } from 'svelte';
|
||||||
|
import { trackPageView } from '$lib/utils/analytics';
|
||||||
|
|
||||||
|
const { data } = $props();
|
||||||
|
|
||||||
|
onMount(() => {
|
||||||
|
trackPageView('post', data.post.id, data.post.slug);
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
```
|
||||||
|
|
||||||
|
**Photo Page** (`/photos/[id]/+page.svelte`):
|
||||||
|
```svelte
|
||||||
|
<script lang="ts">
|
||||||
|
import { onMount } from 'svelte';
|
||||||
|
import { trackPageView } from '$lib/utils/analytics';
|
||||||
|
|
||||||
|
const { data } = $props();
|
||||||
|
|
||||||
|
onMount(() => {
|
||||||
|
trackPageView('photo', data.photo.id, data.photo.slug || String(data.photo.id));
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
```
|
||||||
|
|
||||||
|
**Album Page** (`/albums/[slug]/+page.svelte`):
|
||||||
|
```svelte
|
||||||
|
<script lang="ts">
|
||||||
|
import { onMount } from 'svelte';
|
||||||
|
import { trackPageView } from '$lib/utils/analytics';
|
||||||
|
|
||||||
|
const { data } = $props();
|
||||||
|
|
||||||
|
onMount(() => {
|
||||||
|
trackPageView('album', data.album.id, data.album.slug);
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
```
|
||||||
|
|
||||||
|
**Project Page** (`/work/[slug]/+page.svelte`):
|
||||||
|
```svelte
|
||||||
|
<script lang="ts">
|
||||||
|
import { onMount } from 'svelte';
|
||||||
|
import { trackPageView } from '$lib/utils/analytics';
|
||||||
|
|
||||||
|
const { data } = $props();
|
||||||
|
|
||||||
|
onMount(() => {
|
||||||
|
trackPageView('project', data.project.id, data.project.slug);
|
||||||
|
});
|
||||||
|
</script>
|
||||||
|
```
|
||||||
|
|
||||||
|
### Admin Dashboard UI
|
||||||
|
|
||||||
|
#### Main Analytics Page (`/admin/analytics/+page.svelte`)
|
||||||
|
|
||||||
|
**Layout**:
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────┐
|
||||||
|
│ Analytics Overview │
|
||||||
|
│ [7 Days] [30 Days] [90 Days] [All Time] │
|
||||||
|
├─────────────────────────────────────────────────┤
|
||||||
|
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
|
||||||
|
│ │ 5,432 │ │ 2,891 │ │ 3.2 │ │
|
||||||
|
│ │ Views │ │ Visitors│ │ Avg/Day │ │
|
||||||
|
│ └──────────┘ └──────────┘ └──────────┘ │
|
||||||
|
├─────────────────────────────────────────────────┤
|
||||||
|
│ Views Over Time │
|
||||||
|
│ [Line Chart: Views per day] │
|
||||||
|
├─────────────────────────────────────────────────┤
|
||||||
|
│ Top Content │
|
||||||
|
│ 1. Photo: Sunset in Tokyo 234 views │
|
||||||
|
│ 2. Post: New Design System 189 views │
|
||||||
|
│ 3. Project: Mobile Redesign 156 views │
|
||||||
|
│ 4. Album: Japan 2024 142 views │
|
||||||
|
│ ... │
|
||||||
|
├─────────────────────────────────────────────────┤
|
||||||
|
│ Top Referrers │
|
||||||
|
│ 1. Direct / Bookmark 45% │
|
||||||
|
│ 2. twitter.com 23% │
|
||||||
|
│ 3. linkedin.com 15% │
|
||||||
|
│ ... │
|
||||||
|
└─────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
**Components**:
|
||||||
|
- Period selector (tabs or dropdown)
|
||||||
|
- Stat cards (total views, unique visitors, avg views/day)
|
||||||
|
- Time series chart (using simple SVG or chart library)
|
||||||
|
- Top content table (clickable to view detailed analytics)
|
||||||
|
- Top referrers table
|
||||||
|
- Loading states and error handling
|
||||||
|
|
||||||
|
#### Content Detail Page (`/admin/analytics/[type]/[id]/+page.svelte`)
|
||||||
|
|
||||||
|
**Layout**:
|
||||||
|
```
|
||||||
|
┌─────────────────────────────────────────────────┐
|
||||||
|
│ ← Back to Overview │
|
||||||
|
│ Analytics: "Sunset in Tokyo" (Photo) │
|
||||||
|
│ [7 Days] [30 Days] [90 Days] [All Time] │
|
||||||
|
├─────────────────────────────────────────────────┤
|
||||||
|
│ ┌──────────┐ ┌──────────┐ │
|
||||||
|
│ │ 234 │ │ 187 │ │
|
||||||
|
│ │ Views │ │ Unique │ │
|
||||||
|
│ └──────────┘ └──────────┘ │
|
||||||
|
├─────────────────────────────────────────────────┤
|
||||||
|
│ Views Over Time │
|
||||||
|
│ [Line Chart: Daily views] │
|
||||||
|
├─────────────────────────────────────────────────┤
|
||||||
|
│ Traffic Sources │
|
||||||
|
│ 1. Direct 89 views │
|
||||||
|
│ 2. twitter.com/user/post 45 views │
|
||||||
|
│ 3. reddit.com/r/photography 23 views │
|
||||||
|
│ ... │
|
||||||
|
└─────────────────────────────────────────────────┘
|
||||||
|
```
|
||||||
|
|
||||||
|
**Features**:
|
||||||
|
- Content preview/link
|
||||||
|
- Period selector
|
||||||
|
- View count and unique visitor count
|
||||||
|
- Daily breakdown chart
|
||||||
|
- Detailed referrer list with clickable links
|
||||||
|
- Export data option (CSV)
|
||||||
|
|
||||||
|
#### Integration with Existing Admin
|
||||||
|
|
||||||
|
Add analytics link to admin navigation:
|
||||||
|
- Navigation item: "Analytics"
|
||||||
|
- Badge showing today's view count
|
||||||
|
- Quick stats in admin dashboard overview
|
||||||
|
|
||||||
|
### Data Retention & Cleanup
|
||||||
|
|
||||||
|
#### Daily Aggregation Job
|
||||||
|
|
||||||
|
**Cron job** (runs at 2 AM daily):
|
||||||
|
```typescript
|
||||||
|
// scripts/aggregate-analytics.ts
|
||||||
|
async function aggregateOldData() {
|
||||||
|
const cutoffDate = new Date();
|
||||||
|
cutoffDate.setDate(cutoffDate.getDate() - 90);
|
||||||
|
|
||||||
|
// 1. Group PageViews older than 90 days by (contentType, contentId, date)
|
||||||
|
const oldViews = await prisma.pageView.groupBy({
|
||||||
|
by: ['contentType', 'contentId', 'contentSlug'],
|
||||||
|
where: { timestamp: { lt: cutoffDate } },
|
||||||
|
_count: { id: true },
|
||||||
|
_count: { sessionHash: true } // Approximate unique
|
||||||
|
});
|
||||||
|
|
||||||
|
// 2. Insert/update AggregatedView records
|
||||||
|
for (const view of oldViews) {
|
||||||
|
await prisma.aggregatedView.upsert({
|
||||||
|
where: {
|
||||||
|
contentType_contentId_date: {
|
||||||
|
contentType: view.contentType,
|
||||||
|
contentId: view.contentId,
|
||||||
|
date: extractDate(view.timestamp)
|
||||||
|
}
|
||||||
|
},
|
||||||
|
update: {
|
||||||
|
viewCount: { increment: view._count.id },
|
||||||
|
uniqueCount: { increment: view._count.sessionHash }
|
||||||
|
},
|
||||||
|
create: {
|
||||||
|
contentType: view.contentType,
|
||||||
|
contentId: view.contentId,
|
||||||
|
contentSlug: view.contentSlug,
|
||||||
|
date: extractDate(view.timestamp),
|
||||||
|
viewCount: view._count.id,
|
||||||
|
uniqueCount: view._count.sessionHash
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
// 3. Delete old raw PageView records
|
||||||
|
await prisma.pageView.deleteMany({
|
||||||
|
where: { timestamp: { lt: cutoffDate } }
|
||||||
|
});
|
||||||
|
|
||||||
|
console.log(`Aggregated and cleaned up views older than ${cutoffDate}`);
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Run via**:
|
||||||
|
- Cron (if available): `0 2 * * * cd /app && npm run analytics:aggregate`
|
||||||
|
- Railway Cron Jobs (if supported)
|
||||||
|
- Manual trigger from admin panel
|
||||||
|
- Scheduled serverless function
|
||||||
|
|
||||||
|
#### Retention Policy
|
||||||
|
|
||||||
|
- **Detailed data**: 90 days (in PageView table)
|
||||||
|
- **Aggregated data**: Forever (in AggregatedView table)
|
||||||
|
- **Daily summaries**: Minimal storage footprint
|
||||||
|
- **Total storage estimate**: ~10MB per year for typical traffic
|
||||||
|
|
||||||
|
### Session Hash Implementation
|
||||||
|
|
||||||
|
```typescript
|
||||||
|
// src/lib/server/analytics-hash.ts
|
||||||
|
import crypto from 'crypto';
|
||||||
|
import redis from './redis-client';
|
||||||
|
|
||||||
|
export async function generateSessionHash(
|
||||||
|
ip: string,
|
||||||
|
userAgent: string
|
||||||
|
): Promise<string> {
|
||||||
|
// Get or create daily salt
|
||||||
|
const today = new Date().toISOString().split('T')[0]; // YYYY-MM-DD
|
||||||
|
const saltKey = `analytics:salt:${today}`;
|
||||||
|
|
||||||
|
let salt = await redis.get(saltKey);
|
||||||
|
if (!salt) {
|
||||||
|
salt = crypto.randomBytes(32).toString('hex');
|
||||||
|
await redis.set(saltKey, salt, 'EX', 86400); // 24 hour TTL
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create session hash
|
||||||
|
const data = `${ip}|${userAgent}|${salt}`;
|
||||||
|
return crypto
|
||||||
|
.createHash('sha256')
|
||||||
|
.update(data)
|
||||||
|
.digest('hex');
|
||||||
|
}
|
||||||
|
|
||||||
|
// Helper to extract IP from request (handles proxies)
|
||||||
|
export function getClientIP(request: Request): string {
|
||||||
|
const forwarded = request.headers.get('x-forwarded-for');
|
||||||
|
if (forwarded) {
|
||||||
|
return forwarded.split(',')[0].trim();
|
||||||
|
}
|
||||||
|
|
||||||
|
const realIP = request.headers.get('x-real-ip');
|
||||||
|
if (realIP) {
|
||||||
|
return realIP;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Fallback to connection IP (may not be available in serverless)
|
||||||
|
return 'unknown';
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Performance Considerations
|
||||||
|
|
||||||
|
#### Write Performance
|
||||||
|
- PageView inserts are async (fire-and-forget from client)
|
||||||
|
- No transaction overhead
|
||||||
|
- Batch inserts for high traffic (future optimization)
|
||||||
|
- Index optimization for common queries
|
||||||
|
|
||||||
|
#### Read Performance
|
||||||
|
- Redis caching for all admin queries
|
||||||
|
- Aggressive cache TTLs (5-15 minutes acceptable)
|
||||||
|
- Pre-aggregated data for historical queries
|
||||||
|
- Efficient indexes on timestamp and content fields
|
||||||
|
|
||||||
|
#### Database Growth
|
||||||
|
- ~100 bytes per PageView record
|
||||||
|
- 1,000 views/day = ~100KB/day = ~3.6MB/year (raw)
|
||||||
|
- Aggregation reduces to ~10KB/year after 90 days
|
||||||
|
- Negligible compared to media storage
|
||||||
|
|
||||||
|
## Implementation Phases
|
||||||
|
|
||||||
|
### Phase 1: Foundation & Database (Week 1)
|
||||||
|
|
||||||
|
**Tasks**:
|
||||||
|
- [x] Design PageView and AggregatedView schema
|
||||||
|
- [ ] Create Prisma migration for analytics tables
|
||||||
|
- [ ] Add indexes for common query patterns
|
||||||
|
- [ ] Test migrations on local database
|
||||||
|
- [ ] Create seed data for testing
|
||||||
|
|
||||||
|
**Deliverables**:
|
||||||
|
- Database schema ready
|
||||||
|
- Migrations tested and working
|
||||||
|
|
||||||
|
### Phase 2: Tracking Infrastructure (Week 1)
|
||||||
|
|
||||||
|
**Tasks**:
|
||||||
|
- [ ] Implement session hash generation utilities
|
||||||
|
- [ ] Create `POST /api/analytics/track` endpoint
|
||||||
|
- [ ] Add IP extraction and User-Agent handling
|
||||||
|
- [ ] Implement rate limiting
|
||||||
|
- [ ] Create analytics utility functions
|
||||||
|
- [ ] Add error handling and logging
|
||||||
|
|
||||||
|
**Deliverables**:
|
||||||
|
- Tracking endpoint functional
|
||||||
|
- Privacy-preserving hash working
|
||||||
|
|
||||||
|
### Phase 3: Frontend Integration (Week 2)
|
||||||
|
|
||||||
|
**Tasks**:
|
||||||
|
- [ ] Create `trackPageView()` utility function
|
||||||
|
- [ ] Add tracking to Universe post pages
|
||||||
|
- [ ] Add tracking to Photo pages
|
||||||
|
- [ ] Add tracking to Album pages
|
||||||
|
- [ ] Add tracking to Project pages
|
||||||
|
- [ ] Test tracking across all page types
|
||||||
|
- [ ] Verify data appearing in database
|
||||||
|
|
||||||
|
**Deliverables**:
|
||||||
|
- All content pages tracking views
|
||||||
|
- PageView data accumulating
|
||||||
|
|
||||||
|
### Phase 4: Analytics API Endpoints (Week 2)
|
||||||
|
|
||||||
|
**Tasks**:
|
||||||
|
- [ ] Implement `GET /api/admin/analytics/overview`
|
||||||
|
- [ ] Implement `GET /api/admin/analytics/content`
|
||||||
|
- [ ] Implement `GET /api/admin/analytics/trending`
|
||||||
|
- [ ] Implement `GET /api/admin/analytics/referrers`
|
||||||
|
- [ ] Add authentication middleware
|
||||||
|
- [ ] Write analytics query utilities
|
||||||
|
- [ ] Implement date range filtering
|
||||||
|
|
||||||
|
**Deliverables**:
|
||||||
|
- All admin API endpoints working
|
||||||
|
- Query performance optimized
|
||||||
|
|
||||||
|
### Phase 5: Redis Caching (Week 3)
|
||||||
|
|
||||||
|
**Tasks**:
|
||||||
|
- [ ] Implement cache key strategy
|
||||||
|
- [ ] Add caching to overview endpoint
|
||||||
|
- [ ] Add caching to content endpoint
|
||||||
|
- [ ] Add caching to trending endpoint
|
||||||
|
- [ ] Add caching to referrers endpoint
|
||||||
|
- [ ] Implement cache warming
|
||||||
|
- [ ] Test cache invalidation
|
||||||
|
|
||||||
|
**Deliverables**:
|
||||||
|
- Redis caching active
|
||||||
|
- Response times under 100ms
|
||||||
|
|
||||||
|
### Phase 6: Admin Dashboard UI (Week 3-4)
|
||||||
|
|
||||||
|
**Tasks**:
|
||||||
|
- [ ] Create `/admin/analytics` route
|
||||||
|
- [ ] Build overview page layout
|
||||||
|
- [ ] Implement period selector component
|
||||||
|
- [ ] Create stat cards component
|
||||||
|
- [ ] Build time series chart component
|
||||||
|
- [ ] Create top content table
|
||||||
|
- [ ] Create top referrers table
|
||||||
|
- [ ] Add loading and error states
|
||||||
|
- [ ] Style dashboard to match admin theme
|
||||||
|
- [ ] Test responsive design
|
||||||
|
|
||||||
|
**Deliverables**:
|
||||||
|
- Analytics dashboard fully functional
|
||||||
|
- UI matches admin design system
|
||||||
|
|
||||||
|
### Phase 7: Content Detail Pages (Week 4)
|
||||||
|
|
||||||
|
**Tasks**:
|
||||||
|
- [ ] Create `/admin/analytics/[type]/[id]` route
|
||||||
|
- [ ] Build content detail page layout
|
||||||
|
- [ ] Implement detailed metrics display
|
||||||
|
- [ ] Create referrer breakdown table
|
||||||
|
- [ ] Add navigation back to overview
|
||||||
|
- [ ] Add content preview/link
|
||||||
|
- [ ] Implement CSV export option
|
||||||
|
|
||||||
|
**Deliverables**:
|
||||||
|
- Content detail pages working
|
||||||
|
- Drill-down functionality complete
|
||||||
|
|
||||||
|
### Phase 8: Data Aggregation & Cleanup (Week 5)
|
||||||
|
|
||||||
|
**Tasks**:
|
||||||
|
- [ ] Write aggregation script
|
||||||
|
- [ ] Test aggregation with sample data
|
||||||
|
- [ ] Create manual trigger endpoint
|
||||||
|
- [ ] Set up scheduled job (cron or Railway)
|
||||||
|
- [ ] Add aggregation status logging
|
||||||
|
- [ ] Test data retention policy
|
||||||
|
- [ ] Document aggregation process
|
||||||
|
|
||||||
|
**Deliverables**:
|
||||||
|
- Aggregation job running daily
|
||||||
|
- Old data cleaned automatically
|
||||||
|
|
||||||
|
### Phase 9: Polish & Testing (Week 5)
|
||||||
|
|
||||||
|
**Tasks**:
|
||||||
|
- [ ] Add analytics link to admin navigation
|
||||||
|
- [ ] Create quick stats widget for admin dashboard
|
||||||
|
- [ ] Add today's view count badge
|
||||||
|
- [ ] Performance optimization pass
|
||||||
|
- [ ] Error handling improvements
|
||||||
|
- [ ] Write documentation
|
||||||
|
- [ ] Create user guide for analytics
|
||||||
|
- [ ] End-to-end testing
|
||||||
|
|
||||||
|
**Deliverables**:
|
||||||
|
- System fully integrated
|
||||||
|
- Documentation complete
|
||||||
|
|
||||||
|
### Phase 10: Monitoring & Launch (Week 6)
|
||||||
|
|
||||||
|
**Tasks**:
|
||||||
|
- [ ] Set up logging for analytics endpoints
|
||||||
|
- [ ] Monitor database query performance
|
||||||
|
- [ ] Check Redis cache hit rates
|
||||||
|
- [ ] Verify aggregation job running
|
||||||
|
- [ ] Test with production traffic
|
||||||
|
- [ ] Create runbook for troubleshooting
|
||||||
|
- [ ] Announce analytics feature
|
||||||
|
|
||||||
|
**Deliverables**:
|
||||||
|
- Production analytics live
|
||||||
|
- Monitoring in place
|
||||||
|
|
||||||
|
## Success Metrics
|
||||||
|
|
||||||
|
### Functional Requirements
|
||||||
|
- ✅ Track views for all content types (posts, photos, albums, projects)
|
||||||
|
- ✅ Provide unique visitor estimates (session-based)
|
||||||
|
- ✅ Show trending content over different time periods
|
||||||
|
- ✅ Display traffic sources (referrers)
|
||||||
|
- ✅ Admin dashboard accessible and intuitive
|
||||||
|
|
||||||
|
### Performance Requirements
|
||||||
|
- API response time < 100ms (cached queries)
|
||||||
|
- Tracking endpoint < 50ms response time
|
||||||
|
- No performance impact on public pages
|
||||||
|
- Database growth < 100MB/year
|
||||||
|
- Analytics page load < 2 seconds
|
||||||
|
|
||||||
|
### Privacy Requirements
|
||||||
|
- No cookies or client-side storage
|
||||||
|
- No IP addresses stored
|
||||||
|
- Session hashing non-reversible
|
||||||
|
- Data retention policy enforced
|
||||||
|
- GDPR compliant by design
|
||||||
|
|
||||||
|
### User Experience
|
||||||
|
- Admin can view analytics in < 3 clicks
|
||||||
|
- Dashboard updates within 5-10 minutes
|
||||||
|
- Clear visualization of trends
|
||||||
|
- Easy to identify popular content
|
||||||
|
- Referrer sources actionable
|
||||||
|
|
||||||
|
## Technical Decisions & Rationale
|
||||||
|
|
||||||
|
### Why Self-Hosted?
|
||||||
|
- **Privacy control**: Full ownership of analytics data
|
||||||
|
- **No third parties**: Data never leaves our servers
|
||||||
|
- **Cost**: Zero ongoing cost vs. paid analytics services
|
||||||
|
- **Customization**: Tailored to our exact content types
|
||||||
|
|
||||||
|
### Why PostgreSQL for Storage?
|
||||||
|
- **Already in stack**: Leverages existing database
|
||||||
|
- **Relational queries**: Perfect for analytics aggregations
|
||||||
|
- **JSON support**: Flexible for future extensions
|
||||||
|
- **Reliability**: Battle-tested for high-volume writes
|
||||||
|
|
||||||
|
### Why Redis for Caching?
|
||||||
|
- **Already in stack**: Existing Redis instance available
|
||||||
|
- **Speed**: Sub-millisecond cache lookups
|
||||||
|
- **TTL support**: Automatic expiration for stale data
|
||||||
|
- **Simple**: Key-value model perfect for cache
|
||||||
|
|
||||||
|
### Why Session Hashing?
|
||||||
|
- **Privacy**: Can't reverse to identify users
|
||||||
|
- **Deduplication**: Approximate unique visitors
|
||||||
|
- **Daily rotation**: Limits tracking window to 24 hours
|
||||||
|
- **No cookies**: Works without user consent
|
||||||
|
|
||||||
|
### Why 90-Day Retention?
|
||||||
|
- **Privacy**: Limit detailed tracking window
|
||||||
|
- **Performance**: Keeps PageView table size manageable
|
||||||
|
- **Historical data**: Aggregated summaries preserved forever
|
||||||
|
- **Balance**: Fresh data for trends, long-term for insights
|
||||||
|
|
||||||
|
## Future Enhancements
|
||||||
|
|
||||||
|
### Phase 2 Features (Post-Launch)
|
||||||
|
- [ ] Real-time analytics (WebSocket updates)
|
||||||
|
- [ ] Geographic data (country-level, privacy-preserving)
|
||||||
|
- [ ] View duration tracking (time on page)
|
||||||
|
- [ ] Custom events (video plays, downloads, etc.)
|
||||||
|
- [ ] A/B testing support
|
||||||
|
- [ ] Conversion tracking (email signups, etc.)
|
||||||
|
|
||||||
|
### Advanced Analytics
|
||||||
|
- [ ] Cohort analysis
|
||||||
|
- [ ] Funnel tracking
|
||||||
|
- [ ] Retention metrics
|
||||||
|
- [ ] Bounce rate calculation
|
||||||
|
- [ ] Exit page tracking
|
||||||
|
|
||||||
|
### Integrations
|
||||||
|
- [ ] Export to CSV/JSON
|
||||||
|
- [ ] Scheduled email reports
|
||||||
|
- [ ] Slack notifications for milestones
|
||||||
|
- [ ] Public analytics widget (opt-in)
|
||||||
|
|
||||||
|
### Admin Improvements
|
||||||
|
- [ ] Custom date range selection
|
||||||
|
- [ ] Saved analytics views
|
||||||
|
- [ ] Compare time periods
|
||||||
|
- [ ] Annotations on charts
|
||||||
|
- [ ] Predicted trends
|
||||||
|
|
||||||
|
## Testing Strategy
|
||||||
|
|
||||||
|
### Unit Tests
|
||||||
|
- Session hash generation
|
||||||
|
- Date range utilities
|
||||||
|
- Aggregation logic
|
||||||
|
- Cache key generation
|
||||||
|
|
||||||
|
### Integration Tests
|
||||||
|
- Tracking endpoint
|
||||||
|
- Analytics API endpoints
|
||||||
|
- Redis caching layer
|
||||||
|
- Database queries
|
||||||
|
|
||||||
|
### End-to-End Tests
|
||||||
|
- Track view from public page
|
||||||
|
- View analytics in admin
|
||||||
|
- Verify cache behavior
|
||||||
|
- Test aggregation job
|
||||||
|
|
||||||
|
### Load Testing
|
||||||
|
- Simulate 100 concurrent tracking requests
|
||||||
|
- Test admin dashboard under load
|
||||||
|
- Verify database performance
|
||||||
|
- Check Redis cache hit rates
|
||||||
|
|
||||||
|
## Documentation Requirements
|
||||||
|
|
||||||
|
### Developer Documentation
|
||||||
|
- API endpoint specifications
|
||||||
|
- Database schema documentation
|
||||||
|
- Caching strategy guide
|
||||||
|
- Aggregation job setup
|
||||||
|
|
||||||
|
### User Documentation
|
||||||
|
- Admin analytics guide
|
||||||
|
- Interpreting metrics
|
||||||
|
- Privacy policy updates
|
||||||
|
- Troubleshooting guide
|
||||||
|
|
||||||
|
### Operational Documentation
|
||||||
|
- Deployment checklist
|
||||||
|
- Monitoring setup
|
||||||
|
- Backup procedures
|
||||||
|
- Incident response
|
||||||
|
|
||||||
|
## Security Considerations
|
||||||
|
|
||||||
|
### Rate Limiting
|
||||||
|
- Tracking endpoint: 10 requests/minute per session
|
||||||
|
- Admin endpoints: 100 requests/minute per user
|
||||||
|
- Prevents abuse and DoS attacks
|
||||||
|
|
||||||
|
### Authentication
|
||||||
|
- All admin analytics endpoints require authentication
|
||||||
|
- Use existing admin auth system
|
||||||
|
- No public access to analytics data
|
||||||
|
|
||||||
|
### Data Privacy
|
||||||
|
- Never log raw IPs in application logs
|
||||||
|
- Session hashes rotated daily
|
||||||
|
- No cross-session tracking
|
||||||
|
- Complies with GDPR "legitimate interest" basis
|
||||||
|
|
||||||
|
### SQL Injection Prevention
|
||||||
|
- Use Prisma ORM (parameterized queries)
|
||||||
|
- Validate all input parameters
|
||||||
|
- Sanitize referrer URLs
|
||||||
|
|
||||||
|
## Open Questions
|
||||||
|
|
||||||
|
1. **Chart Library**: Use lightweight SVG solution or import charting library?
|
||||||
|
- Option A: Simple SVG line charts (custom, lightweight)
|
||||||
|
- Option B: Chart.js or similar (feature-rich, heavier)
|
||||||
|
- **Decision**: Start with simple SVG, upgrade if needed
|
||||||
|
|
||||||
|
2. **Real-time Updates**: Should analytics dashboard update live?
|
||||||
|
- Option A: Manual refresh only (simpler)
|
||||||
|
- Option B: Auto-refresh every 30 seconds (nicer UX)
|
||||||
|
- Option C: WebSocket real-time (complex)
|
||||||
|
- **Decision**: Auto-refresh for Phase 1
|
||||||
|
|
||||||
|
3. **Export Functionality**: CSV export priority?
|
||||||
|
- **Decision**: Include in Phase 2, not critical for MVP
|
||||||
|
|
||||||
|
4. **Geographic Data**: Track country-level data?
|
||||||
|
- **Decision**: Future enhancement, requires IP geolocation
|
||||||
|
|
||||||
|
## Appendix
|
||||||
|
|
||||||
|
### Example Queries
|
||||||
|
|
||||||
|
**Total views for a piece of content**:
|
||||||
|
```sql
|
||||||
|
SELECT COUNT(*) FROM PageView
|
||||||
|
WHERE contentType = 'photo' AND contentId = 123;
|
||||||
|
```
|
||||||
|
|
||||||
|
**Unique visitors (approximate)**:
|
||||||
|
```sql
|
||||||
|
SELECT COUNT(DISTINCT sessionHash) FROM PageView
|
||||||
|
WHERE contentType = 'photo' AND contentId = 123
|
||||||
|
AND timestamp > NOW() - INTERVAL '7 days';
|
||||||
|
```
|
||||||
|
|
||||||
|
**Top content in last 7 days**:
|
||||||
|
```sql
|
||||||
|
SELECT contentType, contentId, contentSlug,
|
||||||
|
COUNT(*) as views,
|
||||||
|
COUNT(DISTINCT sessionHash) as unique_visitors
|
||||||
|
FROM PageView
|
||||||
|
WHERE timestamp > NOW() - INTERVAL '7 days'
|
||||||
|
GROUP BY contentType, contentId, contentSlug
|
||||||
|
ORDER BY views DESC
|
||||||
|
LIMIT 10;
|
||||||
|
```
|
||||||
|
|
||||||
|
**Views by day**:
|
||||||
|
```sql
|
||||||
|
SELECT DATE(timestamp) as date,
|
||||||
|
COUNT(*) as views,
|
||||||
|
COUNT(DISTINCT sessionHash) as unique_visitors
|
||||||
|
FROM PageView
|
||||||
|
WHERE contentType = 'photo' AND contentId = 123
|
||||||
|
GROUP BY DATE(timestamp)
|
||||||
|
ORDER BY date DESC;
|
||||||
|
```
|
||||||
|
|
||||||
|
### Database Migration Template
|
||||||
|
|
||||||
|
```prisma
|
||||||
|
-- CreateTable
|
||||||
|
CREATE TABLE "PageView" (
|
||||||
|
"id" SERIAL PRIMARY KEY,
|
||||||
|
"contentType" VARCHAR(50) NOT NULL,
|
||||||
|
"contentId" INTEGER NOT NULL,
|
||||||
|
"contentSlug" VARCHAR(255) NOT NULL,
|
||||||
|
"sessionHash" VARCHAR(64) NOT NULL,
|
||||||
|
"referrer" VARCHAR(500),
|
||||||
|
"timestamp" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP
|
||||||
|
);
|
||||||
|
|
||||||
|
-- CreateTable
|
||||||
|
CREATE TABLE "AggregatedView" (
|
||||||
|
"id" SERIAL PRIMARY KEY,
|
||||||
|
"contentType" VARCHAR(50) NOT NULL,
|
||||||
|
"contentId" INTEGER NOT NULL,
|
||||||
|
"contentSlug" VARCHAR(255) NOT NULL,
|
||||||
|
"date" DATE NOT NULL,
|
||||||
|
"viewCount" INTEGER NOT NULL DEFAULT 0,
|
||||||
|
"uniqueCount" INTEGER NOT NULL DEFAULT 0
|
||||||
|
);
|
||||||
|
|
||||||
|
-- CreateIndex
|
||||||
|
CREATE INDEX "PageView_contentType_contentId_idx" ON "PageView"("contentType", "contentId");
|
||||||
|
CREATE INDEX "PageView_timestamp_idx" ON "PageView"("timestamp");
|
||||||
|
CREATE INDEX "PageView_sessionHash_timestamp_idx" ON "PageView"("sessionHash", "timestamp");
|
||||||
|
CREATE INDEX "PageView_contentType_timestamp_idx" ON "PageView"("contentType", "timestamp");
|
||||||
|
|
||||||
|
-- CreateIndex
|
||||||
|
CREATE UNIQUE INDEX "AggregatedView_contentType_contentId_date_key" ON "AggregatedView"("contentType", "contentId", "date");
|
||||||
|
CREATE INDEX "AggregatedView_contentType_contentId_idx" ON "AggregatedView"("contentType", "contentId");
|
||||||
|
CREATE INDEX "AggregatedView_date_idx" ON "AggregatedView"("date");
|
||||||
|
```
|
||||||
|
|
||||||
|
### Environment Variables
|
||||||
|
|
||||||
|
No new environment variables required - uses existing:
|
||||||
|
- `DATABASE_URL` (PostgreSQL)
|
||||||
|
- `REDIS_URL` (Redis)
|
||||||
|
|
||||||
|
## Conclusion
|
||||||
|
|
||||||
|
This privacy-friendly analytics system provides essential insights into content performance while maintaining strict privacy standards. By leveraging existing infrastructure and implementing smart caching, it delivers a lightweight, performant solution that respects user privacy and complies with modern data protection regulations.
|
||||||
|
|
||||||
|
The phased approach allows for incremental delivery, with the core tracking and basic dashboard available within 2-3 weeks, and advanced features rolled out progressively based on actual usage and feedback.
|
||||||
Loading…
Reference in a new issue