Diego Garcia

BUILDING

A personal financial intelligence system. NLP signals fused with quantitative market models into a single portfolio decision layer.

TRACK A · NLP
TRACK B · QUANT

SYSTEM OVERVIEW

The Architecture

Two parallel tracks, one qualitative and one quantitative, that independently process information and converge into a single decision layer. Neither track alone produces reliable signals. The edge lives in the convergence.

TRACK A

NLP Pipeline

Raw text → structured financial signal. What is the market narrative saying, and how strong is that signal?

TRACK B

Quant Pipeline

Raw market data → pattern recognition. What is price behaviour telling us, and what regime are we in?

FUSION

Convergence Layer

When the news says X and price does Y, what typically happens next? A learned map of how information flows into price.
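
One way to picture the "news says X, price does Y" input is as a single divergence number. The sketch below is a hypothetical illustration, not the Fusion Layer itself: the function name `divergence_score`, the tanh squashing, and the 2% default for `return_scale` are all assumptions made for the example.

```python
import math


def divergence_score(sentiment: float, price_return: float,
                     return_scale: float = 0.02) -> float:
    """Hypothetical divergence between narrative and price.

    sentiment    : text signal, assumed to lie in [-1, 1]
    price_return : realised return over the same window (0.01 == 1%)
    return_scale : assumed "typical" move used to squash returns into (-1, 1)
    """
    if return_scale <= 0:
        raise ValueError("return_scale must be positive")
    # Squash the return so it is comparable with the sentiment scale.
    price_signal = math.tanh(price_return / return_scale)
    # Positive: news is more bullish than price. Negative: the opposite.
    return sentiment - price_signal


# Bullish news while price falls 2% gives a large positive divergence.
bullish_news_falling_price = divergence_score(0.8, -0.02)
```

A score near zero means text and price agree. In a learned fusion model, a feature like this would be one input among many rather than a signal on its own.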

[Architecture diagram. TRACK A · NLP: News / Filings, FinBERT Model, Sentiment Signal, Topic + Urgency. TRACK B · QUANT: OHLCV + Indicators, Regime Detection, Price Patterns, Options + Insider. CONVERGENCE: both tracks feed the FUSION LAYER (XGBoost · Divergence Score), which feeds the OUTPUT LAYER (Signals · Allocations · Digest).]

MODEL STACK

The Five Models

Each model serves a distinct purpose. They are designed to be built sequentially. Each feeds the next. Do not skip Model 4 to reach the exciting parts.

01
THE EARS

Financial NLP

Fine-tuned FinBERT

Extracts structured signals from financial text: sentiment scores, mentioned tickers, topic classification, and urgency ratings. Not summaries. Machine-readable signal.

WHY THIS ARCHITECTURE

Pre-trained on financial corpora. As a rough rule of thumb, produces about 90% of the useful output at about 10% of the complexity of a full LLM. Cheaper, faster, debuggable.

OUTPUT

JSON per document: { ticker, sentiment_score, topic, urgency, entities }
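
A minimal sketch of what one per-document record could look like in code. The field names come from the output spec above; the class name `DocumentSignal`, the value ranges (sentiment in [-1, 1], urgency in [0, 1]), and the sample values are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DocumentSignal:
    ticker: str
    sentiment_score: float  # assumed range: -1.0 (bearish) to 1.0 (bullish)
    topic: str
    urgency: float          # assumed range: 0.0 (low) to 1.0 (high)
    entities: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject out-of-range values early so bad records never reach fusion.
        if not -1.0 <= self.sentiment_score <= 1.0:
            raise ValueError("sentiment_score must be in [-1, 1]")
        if not 0.0 <= self.urgency <= 1.0:
            raise ValueError("urgency must be in [0, 1]")

    def to_json(self) -> str:
        """Serialise to the one-object-per-document JSON shape."""
        return json.dumps(asdict(self))


# Placeholder values, not real model output.
example = DocumentSignal("ACME", 0.42, "earnings", 0.8, ["ACME Corp"])
payload = example.to_json()
```

Validating at construction time keeps the record machine-readable downstream: anything that parses is already in range.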

02
THE CONTEXT

Regime Detection

XGBoost classifier

Answers the meta-question before any signal is read: what kind of market are we in right now? Running a momentum strategy in a ranging market destroys returns. The regime label prevents that.

WHY THIS ARCHITECTURE

HMMs assume fixed transition probabilities. Real regimes are driven by macro catalysts that break stationarity entirely. XGBoost is more robust and produces interpretable feature importances.

OUTPUT

Regime label + confidence score daily: trending bull · trending bear · ranging low-vol · ranging high-vol
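
A sketch of how the daily label and confidence could be derived from a classifier's per-class probabilities. It assumes the model emits one probability per regime in the order listed above; that ordering and the names `REGIMES` and `regime_from_probs` are assumptions, and the classifier itself is not shown.

```python
from typing import Sequence, Tuple

# Assumed class order, matching the four regimes named in the output spec.
REGIMES = ("trending bull", "trending bear", "ranging low-vol", "ranging high-vol")


def regime_from_probs(probs: Sequence[float]) -> Tuple[str, float]:
    """Return (regime label, confidence) from per-class probabilities."""
    if len(probs) != len(REGIMES):
        raise ValueError("expected one probability per regime")
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Renormalise defensively in case the inputs do not sum exactly to 1.
    normalized = [p / total for p in probs]
    best = max(range(len(normalized)), key=normalized.__getitem__)
    return REGIMES[best], normalized[best]


# Placeholder probabilities, not real model output.
label, confidence = regime_from_probs([0.1, 0.2, 0.6, 0.1])
```

Keeping the confidence next to the label lets downstream models discount a regime call that only narrowly wins.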

03
THE EYES

Price Pattern Model

Temporal Fusion Transformer

Learns temporal patterns in price and volume. Given the current quantitative conditions, what has historically happened over the next 1, 5, and 20 days?

WHY THIS ARCHITECTURE

Significantly better than LSTM for multivariate financial time series. Handles variable-length lookbacks and produces interpretable attention weights showing which features matter.

OUTPUT

Probability distribution of price movement over 3 horizons, conditioned on current regime

04
THE GUARD

Risk Model

Rolling covariance + VaR + CVaR

Understands correlation between positions and estimates portfolio-level risk before any position is taken. Stops you from being 'diversified' across assets that crash together.

WHY THIS ARCHITECTURE

Well-understood mathematics, not deep learning. Build it before the optimizer. Risk management bolted on late gets bolted on badly.

OUTPUT

Risk score per position · Max drawdown estimate · Position size ceiling per asset
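
A minimal sketch of the VaR and CVaR part, using the plain historical-simulation method on a list of portfolio returns. The function name, the quantile convention, and the sample returns are assumptions; the rolling covariance step and the mapping to position ceilings are not shown.

```python
from typing import Sequence, Tuple


def historical_var_cvar(returns: Sequence[float],
                        alpha: float = 0.95) -> Tuple[float, float]:
    """Historical VaR and CVaR, both reported as positive loss fractions.

    returns : periodic portfolio returns (0.01 == +1%)
    alpha   : confidence level, e.g. 0.95
    """
    if not returns:
        raise ValueError("returns must not be empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    # Convert returns to losses and sort from best outcome to worst.
    losses = sorted(-r for r in returns)
    # Index of the alpha-quantile loss (simple convention, no interpolation).
    k = min(int(alpha * len(losses)), len(losses) - 1)
    var = losses[k]
    # CVaR: average loss in the tail at or beyond the VaR threshold.
    tail = losses[k:]
    cvar = sum(tail) / len(tail)
    return var, cvar


# Made-up sample returns for illustration only.
sample = [-0.05, -0.02, 0.01, 0.03, -0.10, 0.02, 0.00, 0.01, -0.01, 0.04]
var_80, cvar_80 = historical_var_cvar(sample, alpha=0.8)
```

On this sample, the 80% VaR is a 5% loss and the CVaR is 7.5%: the average of the two worst days. CVaR is never smaller than VaR, which is why it is the more conservative input for a position size ceiling.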

05
THE HANDS

Portfolio Optimizer

Markowitz mean-variance + ML expected returns

Given signals from the Fusion Layer and risk constraints from Model 4, computes the optimal allocation across assets, including when to rebalance and by how much.

WHY THIS ARCHITECTURE

Markowitz is sensitive to expected return inputs. Small Fusion Layer errors get amplified into extreme allocations. Hard weight constraints from Model 4 are mandatory, not optional.

OUTPUT

Allocation percentages with hard caps · Rebalancing recommendations · Kelly-informed position sizing
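
A sketch of Kelly-informed sizing with a hard cap, for a single asset. It uses the continuous-time Kelly approximation (weight = expected excess return / variance), scaled by a fractional-Kelly multiplier and clipped to the ceiling from Model 4. The function name, the half-Kelly default, and the long-only floor at zero are assumptions for illustration.

```python
def capped_kelly_weight(expected_excess_return: float, variance: float,
                        cap: float, kelly_fraction: float = 0.5) -> float:
    """Fractional-Kelly weight for one asset, clipped to [0, cap].

    expected_excess_return : forecast return above cash for the period
    variance               : return variance over the same period
    cap                    : hard position ceiling from the risk model
    kelly_fraction         : assumed multiplier (0.5 == half Kelly)
    """
    if variance <= 0:
        raise ValueError("variance must be positive")
    if cap < 0:
        raise ValueError("cap must be non-negative")
    # Continuous-time Kelly approximation: f* = mu / sigma^2.
    raw = kelly_fraction * expected_excess_return / variance
    # Long-only floor at zero, then the hard cap. The cap always wins.
    return max(0.0, min(raw, cap))


# An optimistic forecast asks for 100% of capital; the cap holds it at 25%.
weight = capped_kelly_weight(0.08, 0.04, cap=0.25)
```

The example shows the amplification problem in miniature: an 8% return forecast on 4% variance would put the whole portfolio in one asset even at half Kelly, so the cap has to be enforced after the sizing formula, not folded into the forecast.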

DATA SOURCES · ALL FREE TIER DURING DEVELOPMENT

OHLCV · Yahoo Finance
Market · Alpha Vantage
Macro · FRED
Filings · SEC EDGAR
Text · NewsAPI
Options · Unusual Whales
Short Interest · FINRA
Sentiment · Reddit API
Training Data · HuggingFace

WHY THIS EXISTS

Most retail investors are flying blind. Generic tools, generic signals, no memory of what worked before.

Kairox Vector is my attempt to build infrastructure that compounds. Not a strategy. Not a screener. A system that gets incrementally better informed the longer it runs, because the data it accumulates and the correlations it learns are specific to how I think about markets. Every phase adds a layer of judgment I did not have before.

PHASE 3 OF 9 · REGIME DETECTION · IN PROGRESS