AI my Stock Portfolio

2026-04-09 9 min read Anoff

My stock portfolio is spread across multiple brokers in different countries — Japanese stocks with one broker, European positions with another, investment funds in a third account. Each has its own dashboard, export format, and way of presenting performance. Relying on these dashboards as my single source of truth felt uncomfortable. What if a broker changes its interface or suspends my account? How do I compare positions across currencies, time horizons, and purchase dates?

I wanted local, inspectable, portable data ownership — files on my own disk that don’t disappear if a broker changes policy. Not just the raw transaction records, but the analysis process itself.

This post tells the story of how I built a suite of tools to solve that problem. But the real journey wasn’t about writing Python scripts — it was about discovering how to think about stock performance in the first place. This was also a trial run in using AI end-to-end: Gemini for strategic thinking and high-level concepts, Claude and VS Code Copilot for implementation and debugging.

I Did Not Know How To Analyze Stocks
AI as a Thinking Partner
Performance Tool: performance.py
- The Fuzzy Signal System
From Retrospective to Prospective: The Conceptual Shift
Research Tool: research.py
- Expanded Metrics: Five-Category Scoring
Shared Architecture
The Complete Framework

I Did Not Know How To Analyze Stocks

Having CSV files locally was necessary, but not sufficient. The harder problem was that I had no idea what “fair comparison” even meant across different holdings.

Raw percentage return is misleading when purchase dates differ. One stock I bought two months ago: +15% return. Another held for three years: +25% return. Which is actually performing better on a velocity basis?

Position size complicated intuition further. A large position with small upside felt emotionally safer than a small position with explosive growth, even though that’s irrational. My judgment was distorted by exposure size, not performance quality.

The problem: How can I compare unlike positions on equal footing, independent of holding period and position size?

This wasn’t a coding problem — it was an analytical problem. I didn’t know what metrics to calculate.

AI as a Thinking Partner

I used AI — primarily Gemini for strategic discussions — not to “tell me what metrics to use,” but to help me reason through why raw returns weren’t comparable.

Me: I have two stocks. One is up 15% in 2 months, the other is up 25% in 3 years. How do I know which is performing better?
AI: You’re comparing absolute returns over different time periods. To normalize for holding duration, you need to annualize the returns. The standard metric is CAGR — Compound Annual Growth Rate.
Me: So CAGR adjusts for time. But what if both stocks are just riding a bull market?
AI: That’s where alpha comes in. Alpha is your return minus a benchmark return over the same period. If you hold Japanese stocks, compare against TOPIX. If you hold US stocks, compare against the S&P 500.

Two key insights emerged:

CAGR (Compound Annual Growth Rate) normalizes for time. A stock held 6 months with +20% return = CAGR of ~44%. A stock held 2 years with +40% return = CAGR of ~18%. The shorter-held position is compounding faster.

Alpha CAGR = stock CAGR minus benchmark CAGR. A stock returning +15% annualized sounds good until you realize the S&P 500 returned +20% annualized over the same period. Your alpha is -5 percentage points: you’re underperforming despite absolute gains.
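The arithmetic behind both metrics fits in a few lines. This is a minimal sketch rather than the code from `performance.py`; the function names `cagr` and `alpha_cagr` are made up for illustration, and returns are expressed as fractions (0.20 means +20%).

```python
def cagr(total_return: float, years_held: float) -> float:
    """Annualize a total return (0.20 means +20%) over a holding period in years."""
    if years_held <= 0:
        raise ValueError("years_held must be positive")
    if total_return < -1.0:
        raise ValueError("total_return cannot be below -100%")
    return (1.0 + total_return) ** (1.0 / years_held) - 1.0


def alpha_cagr(stock_cagr: float, benchmark_cagr: float) -> float:
    """Benchmark-relative performance: stock CAGR minus benchmark CAGR."""
    return stock_cagr - benchmark_cagr


short_hold = cagr(0.20, 0.5)     # +20% in 6 months
long_hold = cagr(0.40, 2.0)      # +40% in 2 years
alpha = alpha_cagr(0.15, 0.20)   # +15% annualized vs. a benchmark at +20%

print(f"{short_hold:.1%} {long_hold:.1%} {alpha:+.1%}")
```

Running it reproduces the numbers above: roughly 44% and 18% for the two holdings, and an alpha of -5 percentage points.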

Once I had the conceptual framework from Gemini, I switched to Claude and VS Code Copilot for implementation — writing functions, debugging edge cases, refining the signal logic.

Performance Tool: `performance.py`

Purpose: “How are my current holdings performing?”

The script imports broker CSVs (Japanese Shift-JIS encoded, European German-formatted), normalizes them into a unified structure, fetches live prices via yfinance, then calculates:

CAGR: Time-normalized returns
Alpha CAGR: Benchmark-relative performance
Confidence dampening: Positions under 3 months get low confidence, ramping to 100% at 8+ months
Short-term signals: 1-month (pulse) and 6-month (trend) returns

Rather than binary buy/sell commands, it produces a fuzzy signal from a composite score.
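The import step mentioned above is mostly about encodings and number formats. The sketch below shows one way to normalize the two kinds of export into a single structure using only the standard library. The column names, the delimiter, the date formats, and the `Position` fields are all hypothetical; real broker exports will differ.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Position:
    ticker: str
    quantity: float
    buy_price: float
    buy_date: date
    currency: str


def parse_german_number(text: str) -> float:
    """Convert '1.234,56' (dot thousands, comma decimals) to 1234.56."""
    return float(text.replace(".", "").replace(",", "."))


def load_japanese(raw: bytes) -> list[Position]:
    # Hypothetical column names for a Shift-JIS encoded export.
    reader = csv.DictReader(io.StringIO(raw.decode("shift_jis")))
    return [
        Position(
            ticker=row["銘柄コード"] + ".T",
            quantity=float(row["数量"].replace(",", "")),
            buy_price=float(row["取得単価"].replace(",", "")),
            buy_date=datetime.strptime(row["約定日"], "%Y/%m/%d").date(),
            currency="JPY",
        )
        for row in reader
    ]


def load_european(raw: bytes) -> list[Position]:
    # Hypothetical layout: semicolon-separated, German number and date formats.
    reader = csv.DictReader(io.StringIO(raw.decode("utf-8")), delimiter=";")
    return [
        Position(
            ticker=row["Symbol"],
            quantity=parse_german_number(row["Stück"]),
            buy_price=parse_german_number(row["Kaufkurs"]),
            buy_date=datetime.strptime(row["Kaufdatum"], "%d.%m.%Y").date(),
            currency="EUR",
        )
        for row in reader
    ]


# Tiny in-memory samples standing in for real export files.
jp_csv = '銘柄コード,数量,取得単価,約定日\n7011,100,"1,250",2024/03/15\n'.encode("shift_jis")
eu_csv = "Symbol;Stück;Kaufkurs;Kaufdatum\nSAP.DE;12,5;1.234,56;02.01.2023\n".encode("utf-8")

positions = load_japanese(jp_csv) + load_european(eu_csv)
for position in positions:
    print(position)
```

Keeping one small loader per broker means a changed export format only touches one function, while everything downstream sees the same `Position` records.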

The Fuzzy Signal System

The core insight: investment decisions aren’t binary. A position might have strong fundamentals but weak momentum, or vice versa. Rather than forcing a hard buy/sell choice, the script combines multiple dimensions into a contextual recommendation.

The composite score formula:

Score = (0.45 × alpha_CAGR_score + 0.35 × CAGR_score + 0.20 × 6M_score) × confidence

Weight rationale:

45% alpha — Benchmark-relative performance is the strongest signal. Absolute returns mean less if the market delivered the same.
35% CAGR — Long-term compounding velocity matters, even if alpha is neutral.
20% short-term trend — Recent 6-month momentum captures turning points.
Confidence dampening — Positions held < 3 months get multiplied by a low confidence factor (e.g., 0.3), ramping linearly to 1.0 at 8+ months. This prevents noise from new positions dominating the analysis.
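Put together, the formula and the dampening rule could look roughly like this. It is a sketch under stated assumptions, not the actual `lib.py`: the function names are invented, the 0.3 floor is the example value from the text, and I assume the factor stays flat at the floor until month 3 and then ramps linearly to 1.0 at month 8.

```python
def confidence(months_held: float, floor: float = 0.3,
               ramp_start: float = 3.0, ramp_end: float = 8.0) -> float:
    """Low confidence up to ramp_start, then a linear ramp to 1.0 at ramp_end."""
    if months_held <= ramp_start:
        return floor
    if months_held >= ramp_end:
        return 1.0
    progress = (months_held - ramp_start) / (ramp_end - ramp_start)
    return floor + (1.0 - floor) * progress


def composite_score(alpha_score: float, cagr_score: float,
                    six_month_score: float, months_held: float) -> float:
    """Weighted 0-100 sub-scores (45/35/20), dampened by holding-period confidence."""
    weighted = 0.45 * alpha_score + 0.35 * cagr_score + 0.20 * six_month_score
    return weighted * confidence(months_held)


# The same sub-scores at three different holding periods.
for months in (2, 5.5, 12):
    print(months, round(composite_score(80, 60, 40, months), 2))
```

With sub-scores of 80, 60 and 40 the undampened score is 65. The same position scores 19.5 at two months and 42.25 at five and a half months, which is the point of the dampening: a young position cannot dominate the ranking.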

Each metric (alpha, CAGR, 6M return) is normalized to a 0–100 scale based on empirically observed thresholds:

Alpha > +10%/year → score 100 (exceptional outperformance)
Alpha near 0% → score 50 (tracking benchmark)
Alpha < -10%/year → score 0 (significant underperformance)
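The three anchor points above leave open what happens in between. A simple reading, and only an assumption on my part here, is a straight line from -10% to +10% with clamping at both ends. The name `alpha_to_score` is illustrative.

```python
def alpha_to_score(alpha_per_year: float) -> float:
    """Map annual alpha (0.10 means +10%/year) to 0-100.

    Assumes linear interpolation between the anchors -10% -> 0, 0% -> 50,
    +10% -> 100, and clamps anything outside that range.
    """
    clamped = max(-0.10, min(0.10, alpha_per_year))
    return 50.0 + clamped / 0.10 * 50.0


for alpha in (-0.25, -0.10, 0.0, 0.05, 0.10):
    print(f"{alpha:+.0%} -> {alpha_to_score(alpha):.0f}")
```

The same shape works for the CAGR and 6-month scores with different thresholds; those thresholds are not given in this post, so they are left out.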

The final composite score (0–100) maps to contextual signals:

| Score Range | Signal | Meaning |
|---|---|---|
| 75–100 | 🟢 Hold | Solid performer, no action needed |
| 60–74 (high CAGR, fading momentum) | 🟣 Take Profit | Strong CAGR (>25%) but 6M momentum weakening; consider realizing gains |
| 60–74 (fundamentals ok, recent dip) | 🔵 Buy More | Good long-term metrics but recent pullback; potential averaging opportunity |
| 40–59 | 🟡 Watch | Mixed signals, monitor closely |
| 0–39 | 🔴 Sell | Sustained underperformance, consider cutting losses |
| (any, if held < 3 months) | ⏳ Too Early | Not enough data to judge |
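As code, the table reads as a short cascade of checks. The sketch below fills three gaps with assumptions that are mine, not the script's: "fading momentum" is taken to mean a negative 6-month return, "recent dip" a negative 1-month return, and a 60–74 score that matches neither condition falls back to Hold. The function name is illustrative.

```python
def performance_signal(score: float, months_held: float, cagr: float,
                       six_month_return: float, one_month_return: float) -> str:
    """Map a composite score plus context to one of the table's signals."""
    if months_held < 3:
        return "⏳ Too Early"
    if score >= 75:
        return "🟢 Hold"
    if score >= 60:
        # Assumed reading of "high CAGR, fading momentum".
        if cagr > 0.25 and six_month_return < 0:
            return "🟣 Take Profit"
        # Assumed reading of "fundamentals ok, recent dip".
        if one_month_return < 0:
            return "🔵 Buy More"
        # Assumption: the table does not cover this case.
        return "🟢 Hold"
    if score >= 40:
        return "🟡 Watch"
    return "🔴 Sell"


print(performance_signal(68, 24, 0.30, -0.04, -0.01))
print(performance_signal(66, 24, 0.12, 0.03, -0.06))
```

The holding-period check comes first so that a young position is never labelled Sell only because its score was dampened.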

This isn’t algorithmic trading — it’s decision support. The script doesn’t execute trades; it surfaces patterns I might miss when looking at percentages in isolation.

Output: timestamped Markdown reports + PNG charts (CAGR bars, returns heatmap, alpha scatter).

From Retrospective to Prospective: The Conceptual Shift

Once the portfolio reporting framework became trustworthy, something shifted.

I stopped thinking of it as purely retrospective. The same analytical lens — CAGR, alpha, benchmark-relative thinking — could apply to evaluating new opportunities.

The question changed from “How are my holdings doing?” to “How should I evaluate possible next positions using the same vocabulary?”

This is the point where the tool suite moved from tracking to decision support.

It wasn’t a separate idea. It was an extension of the same mental model the performance tool had established. The portfolio tool proved the framework worked for held positions. The research tool would apply it to candidates.

Key insight: If I trust CAGR and alpha to evaluate what I already own, why not use the same metrics to evaluate what I’m considering buying?

The research tool answers: “How do I evaluate new candidates using the same benchmark-relative, normalized thinking?”

Where the performance tool says “🔵 Buy More” (for a held position with a recent dip), the research tool says “BUY” (for a new candidate meeting the same criteria).

Same philosophy. Different entry point.

graph LR
A["📊 Retrospective<br/>What do I hold?<br/>How is it performing?"]
B["🧠 Framework<br/>CAGR, alpha,<br/>confidence, signals"]
C["🔍 Prospective<br/>How do I evaluate<br/>new candidates?"]
A --> B
B --> C
style A fill:#fff4e6
style B fill:#e7f3ff
style C fill:#e8f5e9

Research Tool: `research.py`

Purpose: “How do new candidates compare?”

Once the portfolio framework proved reliable, I extended it to evaluate candidates before buying. Same CAGR and alpha foundation, but adds five-category fundamental scoring.

Expanded Metrics: Five-Category Scoring

Beyond price-based metrics (CAGR, alpha, momentum), the research tool adds fundamental analysis through a quantitative scoring engine:

1. Valuation — Are you overpaying?

P/E ratio (price-to-earnings)
P/B ratio (price-to-book)
EV/EBITDA (enterprise value to earnings before interest, taxes, depreciation, amortization)

Scoring: Lower is better. A P/E of 10 scores higher than a P/E of 50. Scores are normalized against bands derived from the top holdings of the MSCI ACWI.

2. Quality — Is the business efficient?

ROE (return on equity) — how well the company uses shareholder capital
Operating margin — operational efficiency
Gross margin — pricing power and cost structure

Scoring: Higher is better. ROE > 20% scores near 100; ROE < 5% scores near 0.

3. Health — Is it financially stable?

Debt-to-equity ratio — leverage risk
Current ratio — short-term liquidity (current assets / current liabilities)
Free cash flow — ability to fund operations and growth without external capital

Scoring: Lower debt, higher liquidity, positive FCF → higher scores.

4. Growth — Is it expanding?

Earnings growth (YoY, quarterly trends)
Revenue growth (top-line expansion)

Scoring: Sustained growth > 10%/year scores well; negative growth scores poorly.

5. Momentum — Is the market recognizing it?

1-month return (the “pulse”)
6-month return (the “trend”)
Alpha trends (outperforming vs. benchmark)

Scoring: Same logic as the performance tool — recent positive momentum scores higher.
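The pulse and trend figures are plain trailing returns. Here is a minimal sketch that computes them from a sorted list of dated closes. The window lengths of 30 and 182 days are my assumption for "1 month" and "6 months", and the synthetic weekly prices stand in for downloaded data so the example runs offline.

```python
from bisect import bisect_right
from datetime import date, timedelta
from typing import Optional


def trailing_return(history: list[tuple[date, float]], days: int) -> Optional[float]:
    """Return over the last `days` days, or None if history is too short.

    `history` must be sorted by date. The reference price is the last close
    on or before the start of the window.
    """
    dates = [day for day, _ in history]
    latest_day, latest_close = history[-1]
    idx = bisect_right(dates, latest_day - timedelta(days=days)) - 1
    if idx < 0:
        return None
    return latest_close / history[idx][1] - 1.0


# Synthetic weekly closes standing in for downloaded prices.
start = date(2025, 1, 1)
history = [(start + timedelta(weeks=i), 100.0 + i) for i in range(40)]

pulse = trailing_return(history, days=30)    # 1-month return
trend = trailing_return(history, days=182)   # 6-month return
print(f"pulse {pulse:+.1%}, trend {trend:+.1%}")
```

Returning `None` for short histories keeps a freshly listed stock from getting a misleading momentum score.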

Each category is normalized 0–100 via benchmark bands derived from MSCI ACWI (All Country World Index) top holdings. This means a stock scoring 50 in Valuation is priced at the ACWI average; 100 means it’s in the top percentile (cheap), 0 means expensive relative to the global index.
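One way to picture band normalization is a clamped linear map between two band edges, flipped for metrics where lower is better. This is a sketch, not the contents of `scoring.py`. The ROE edges of 5% and 20% come from the text above. The P/E edges of 10 and 50 are hypothetical placeholders, and a straight line will not necessarily put the ACWI average at exactly 50.

```python
def band_score(value: float, low: float, high: float,
               higher_is_better: bool = True) -> float:
    """Linearly map `value` onto 0-100 between two band edges, clamped."""
    if high <= low:
        raise ValueError("high must be greater than low")
    position = (value - low) / (high - low)
    position = max(0.0, min(1.0, position))
    if not higher_is_better:
        position = 1.0 - position
    return 100.0 * position


roe_score = band_score(0.125, low=0.05, high=0.20)          # ROE of 12.5%
pe_score = band_score(20.0, low=10.0, high=50.0,
                      higher_is_better=False)               # hypothetical P/E band
print(round(roe_score), round(pe_score))
```

A category score would then be some combination of its metric scores, for example their mean; the post does not say which combination the script uses.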

The five category scores are weighted and combined into a final composite score (0–100) that maps to a fuzzy signal:

STRONG_BUY (score 80–100)
BUY (score 65–79)
HOLD (score 40–64)
SELL (score 20–39)
STRONG_SELL (score 0–19)
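A sketch of the final step, assuming equal weights because the actual category weights are not stated in this post. The names `CATEGORY_WEIGHTS` and `research_signal` are illustrative. Non-integer scores are assigned by lower bound, so 79.5 still counts as BUY.

```python
# Assumption: equal weights. The real weights are not given in the post.
CATEGORY_WEIGHTS = {
    "valuation": 0.2,
    "quality": 0.2,
    "health": 0.2,
    "growth": 0.2,
    "momentum": 0.2,
}


def research_signal(scores: dict[str, float]) -> tuple[float, str]:
    """Combine five 0-100 category scores and map the result to a signal."""
    composite = sum(weight * scores[name] for name, weight in CATEGORY_WEIGHTS.items())
    if composite >= 80:
        return composite, "STRONG_BUY"
    if composite >= 65:
        return composite, "BUY"
    if composite >= 40:
        return composite, "HOLD"
    if composite >= 20:
        return composite, "SELL"
    return composite, "STRONG_SELL"


composite, label = research_signal(
    {"valuation": 78, "quality": 52, "health": 81, "growth": 25, "momentum": 72}
)
print(f"{composite:.1f} {label}")
```

With the category scores from the sample output in this post (78, 52, 81, 25, 72), equal weights give 61.6 and therefore HOLD. That matches the sample, though it does not prove the weights are equal.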

Visual output shows both the final signal and the per-category breakdown:

HOLD (🟢🟡🟢🔴🟢)
     └─ Valuation: Good (78/100)
        Quality: Neutral (52/100)
        Health: Good (81/100)
        Growth: Weak (25/100)
        Momentum: Good (72/100)

uv run research.py --tickers MSFT,AAPL,7011.T

Output goes to stdout (fast, composable). I pipe it to files or integrate with GitHub Actions for on-demand research reports.
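For completeness, the breakdown shown above can be rendered with a small formatter. This is a sketch: the cutoffs for Good (65 and up) and Neutral (40 and up) are guesses that happen to reproduce the sample, and the function names are illustrative.

```python
def category_label(score: float) -> tuple[str, str]:
    """Return (emoji, word) for one category score. Cutoffs are assumptions."""
    if score >= 65:
        return "🟢", "Good"
    if score >= 40:
        return "🟡", "Neutral"
    return "🔴", "Weak"


def render_signal(signal: str, scores: dict[str, int]) -> str:
    """Format the signal line followed by one line per category."""
    labels = {name: category_label(value) for name, value in scores.items()}
    dots = "".join(emoji for emoji, _ in labels.values())
    lines = [f"{signal} ({dots})"]
    for i, (name, value) in enumerate(scores.items()):
        prefix = "     └─ " if i == 0 else "        "
        lines.append(f"{prefix}{name}: {labels[name][1]} ({value}/100)")
    return "\n".join(lines)


report = render_signal(
    "HOLD",
    {"Valuation": 78, "Quality": 52, "Health": 81, "Growth": 25, "Momentum": 72},
)
print(report)
```

Because the function returns a string instead of printing directly, the same text can go to stdout, a file, or a CI job summary.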

Shared Architecture

Both tools import from shared modules:

lib.py — price helpers, fuzzy signal engine, confidence logic
scoring.py — quantitative evaluation framework

Same thresholds, same dampening, same philosophy. Different entry points (held vs. candidate), same vocabulary.

graph TB
A["performance.py<br/>Current holdings"]
B["research.py<br/>New candidates"]
C["lib.py<br/>Signals & confidence"]
D["scoring.py<br/>Quantitative scoring"]
E["Shared Framework<br/>CAGR, alpha, benchmarks"]
A --> C
A --> D
B --> C
B --> D
C --> E
D --> E

The Complete Framework

This evolved into a broker-agnostic, reproducible investing workflow.

The starting point was fragmented data and unclear comparison methods. The end point is a tool suite that embodies a specific way of thinking:

Benchmark-relative (alpha matters more than absolute return)
Time-normalized (CAGR matters more than raw percentages)
Confidence-weighted (skeptical of recency bias)
Multi-dimensional (price + fundamentals)

Files are plain text: CSV inputs, Markdown outputs, PNG charts. No proprietary lock-in. No cloud dependency. No SaaS subscription.

The AI collaboration wasn’t about outsourcing thinking — it was about turning vague discomfort into precise questions, then building tools that reflect the answers.

Gemini helped me discover what to build (CAGR, alpha, confidence framework). Claude and Copilot helped me build how (parsing, scoring, signals).

The tools don’t make decisions. They surface information in a way that makes informed decisions easier.

If you’re dealing with fragmented portfolio data or unclear performance comparison, maybe the framework (CAGR, alpha, confidence dampening) is more valuable than the specific scripts.

The code is in a private repo; feel free to reach out if you want a copy. But the real takeaway isn’t the implementation. It’s the mental model it encodes.