AI, Code Quality, Context Window, Semantic Analysis • 15 min read

Why Your Codebase is Invisible to AI (And What to Do About It)

Peng Cao

January 22, 2025

Key Insight: AI can only see what you make visible. Code structure, duplication, and fragmentation all impact AI's ability to help you.

I watched GitHub Copilot suggest the same validation logic three times in one week. Different syntax. Different variable names. Same exact purpose.

The AI wasn't broken. My codebase was invisible.

AI models have limited visibility. Logic that is fragmented across files never makes it into the model's context, so the AI cannot see it.

Here's the problem: AI can write code, but it can't see your patterns. Not the way humans do. When you have the same logic scattered across different files with different names, AI treats each one as unique. So it solves it again. And again.

Warning: This isn't just annoying. It's expensive.

The Context Window Crisis

Every time your AI assistant helps with code, it needs context. It reads your file, follows imports, understands dependencies. All of this costs tokens. The more fragmented your code, the more tokens you burn.
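To make the cost concrete, here is a minimal sketch of how you might estimate the tokens an assistant burns just reading files. It assumes roughly 4 characters per token, which is only a rule of thumb (real tokenizers vary by model), and `SourceFile`, `estimateTokens`, and `estimateContextCost` are hypothetical names made up for this illustration.

```typescript
// Assumption: roughly 4 characters per token. Real tokenizers vary by model.
const CHARS_PER_TOKEN = 4;

// Hypothetical shape: a file path plus its full text.
interface SourceFile {
  path: string;
  content: string;
}

// Estimate the token count of a single piece of text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Sum the estimates for every file the assistant has to read.
function estimateContextCost(files: SourceFile[]): number {
  return files.reduce((total, file) => total + estimateTokens(file.content), 0);
}

// Usage with made-up file contents.
const filesInContext: SourceFile[] = [
  { path: "a.ts", content: "export const a = 1;" },
  { path: "b.ts", content: "import { a } from './a';\nexport const b = a + 1;" },
];
console.log(estimateContextCost(filesInContext));
```

The estimate is crude, but it is enough to compare two layouts of the same logic: every extra file the assistant has to open adds its full length to the total.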

Let me show you a real example from building ReceiptClaimer.

Example 1: User Validation - The Hard Way

I had user validation logic spread across 8 files:

api/auth/validate-email.ts
api/auth/validate-password.ts
api/users/check-email-exists.ts
api/users/validate-username.ts
lib/validators/email.ts
lib/validators/password-strength.ts
utils/auth/email-format.ts
utils/validation/user-fields.ts

Total context cost: 12,450 tokens per request.

Example 2: User Validation - The Smart Way

After refactoring, I consolidated to 2 files:

lib/user-validation/index.ts - All validation logic
lib/user-validation/types.ts - Shared types
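As a rough idea of what a consolidated module can look like, here is a minimal sketch of a single validation entry point. It is not the actual ReceiptClaimer code: the rules (email pattern, minimum password length, username format) and every name below are assumptions chosen for illustration, and in a real `index.ts` these functions would be exported.

```typescript
// Hypothetical shared types (these would live in types.ts).
interface UserFields {
  email: string;
  password: string;
  username: string;
}

interface ValidationResult {
  valid: boolean;
  errors: string[];
}

// Assumed rules, for illustration only.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
const USERNAME_PATTERN = /^[a-zA-Z0-9_]{3,20}$/;
const MIN_PASSWORD_LENGTH = 8;

function validateEmail(email: string): boolean {
  return EMAIL_PATTERN.test(email);
}

function validatePassword(password: string): boolean {
  return password.length >= MIN_PASSWORD_LENGTH;
}

function validateUsername(username: string): boolean {
  return USERNAME_PATTERN.test(username);
}

// One entry point: callers (and the AI) only need to read this file.
function validateUser(fields: UserFields): ValidationResult {
  const errors: string[] = [];
  if (!validateEmail(fields.email)) errors.push("Invalid email");
  if (!validatePassword(fields.password)) errors.push("Password too short");
  if (!validateUsername(fields.username)) errors.push("Invalid username");
  return { valid: errors.length === 0, errors };
}
```

With one entry point, the assistant sees every rule in a single read instead of following imports across eight files.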

Three Ways Your Code Becomes Invisible

1. Semantic Duplicates: Same Logic, Different Disguise

typescript

// File: api/receipts/validate.ts
function checkReceiptData(data: any): boolean {
  if (!data.merchant) return false;
  if (!data.amount) return false;
  if (data.amount <= 0) return false;
  if (!data.date) return false;
  return true;
}

// File: lib/validators/receipt-validator.ts
export function isValidReceipt(receipt: ReceiptInput): boolean {
  // Boolean(...) keeps this a real boolean so the declared return type holds
  const hasRequiredFields = Boolean(receipt.merchant &&
                                    receipt.amount &&
                                    receipt.date);
  const hasPositiveAmount = receipt.amount > 0;
  return hasRequiredFields && hasPositiveAmount;
}
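One way to collapse a pair of duplicates like this is to keep a single implementation and point the old name at it. The sketch below assumes a `ReceiptInput` shape with optional `merchant`, `amount`, and `date` fields; the real type is not shown in this post, so treat the shape as hypothetical.

```typescript
// Hypothetical shape; the real ReceiptInput type is not shown in the article.
interface ReceiptInput {
  merchant?: string;
  amount?: number;
  date?: string;
}

// Single source of truth for "is this receipt usable?"
function isValidReceipt(receipt: ReceiptInput): boolean {
  const hasMerchant = Boolean(receipt.merchant);
  const hasDate = Boolean(receipt.date);
  // A missing, zero, or negative amount all fail this check.
  const hasPositiveAmount =
    typeof receipt.amount === "number" && receipt.amount > 0;
  return hasMerchant && hasDate && hasPositiveAmount;
}

// The old name becomes a thin alias, so existing call sites keep working
// while there is only one piece of logic left to read.
const checkReceiptData = isValidReceipt;
```

Once call sites are migrated, the alias can be deleted and the duplicate is gone for both humans and the AI.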

2. Domain Fragmentation: Scattered Logic That Bleeds Tokens

text

src/
  api/
    receipts/
      upload.ts          # Handles file upload
      extract.ts         # Calls OCR service
      parse.ts           # Parses OCR response
  lib/
    ocr/
      google-vision.ts   # Google Vision integration
      openai-vision.ts   # OpenAI Vision integration
    parsers/
      receipt-parser.ts  # Parsing logic
  services/
    receipt-service.ts   # Business logic
  utils/
    file-upload.ts       # S3 upload helper
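A quick way to put a number on this kind of scatter is to count how many directories one feature touches. The sketch below is a simple proxy, not a standard metric: `directoryOf` and `distinctDirectories` are hypothetical helpers, and the paths are the ones from the tree above.

```typescript
// Return the directory part of a path, or "." if there is none.
function directoryOf(path: string): string {
  const idx = path.lastIndexOf("/");
  return idx === -1 ? "." : path.slice(0, idx);
}

// Collect the distinct directories a set of files lives in.
function distinctDirectories(paths: string[]): string[] {
  const seen: Record<string, true> = {};
  for (const p of paths) {
    seen[directoryOf(p)] = true;
  }
  return Object.keys(seen);
}

// The receipt-processing files from the tree above.
const receiptFiles: string[] = [
  "src/api/receipts/upload.ts",
  "src/api/receipts/extract.ts",
  "src/api/receipts/parse.ts",
  "src/lib/ocr/google-vision.ts",
  "src/lib/ocr/openai-vision.ts",
  "src/lib/parsers/receipt-parser.ts",
  "src/services/receipt-service.ts",
  "src/utils/file-upload.ts",
];

const receiptDirs = distinctDirectories(receiptFiles);
console.log(`${receiptFiles.length} files across ${receiptDirs.length} directories`);
// prints "8 files across 5 directories"
```

Eight files in five directories for a single feature means the assistant has to hop between folders before it can reason about one upload flow.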

3. Low Cohesion: Mixed Concerns That Confuse Everyone

typescript

// lib/utils/helpers.ts (820 lines)
export function formatCurrency(amount: number): string { ... }
export function parseDate(dateStr: string): Date { ... }
export function uploadToS3(file: Buffer): Promise<string> { ... }
export function validateEmail(email: string): boolean { ... }
export function generateToken(): string { ... }
export function calculateGST(amount: number): number { ... }
export function hashPassword(pwd: string): Promise<string> { ... }
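To see how mixed a file like this is, you can tag each export with the concern it serves and group them. The concern labels below are my own assumptions for illustration, and `groupByConcern` is a hypothetical helper, not part of any tool.

```typescript
// Each helper tagged with an assumed concern label.
interface HelperExport {
  name: string;
  concern: string;
}

const helperExports: HelperExport[] = [
  { name: "formatCurrency", concern: "formatting" },
  { name: "parseDate", concern: "dates" },
  { name: "uploadToS3", concern: "storage" },
  { name: "validateEmail", concern: "validation" },
  { name: "generateToken", concern: "auth" },
  { name: "calculateGST", concern: "tax" },
  { name: "hashPassword", concern: "auth" },
];

// Group export names by concern; each group is a candidate module.
function groupByConcern(helpers: HelperExport[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const helper of helpers) {
    const names = groups[helper.concern] || [];
    names.push(helper.name);
    groups[helper.concern] = names;
  }
  return groups;
}

const helperGroups = groupByConcern(helperExports);
console.log(Object.keys(helperGroups).length); // prints 6
```

Seven exports covering six unrelated concerns is a sign that the file should be split so each concern can be loaded into context on its own.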

*Peng Cao is the founder of ReceiptClaimer and creator of aiready, an open-source suite for measuring and optimising codebases for AI adoption.*
