By The InvestBuddy Engineering Team
At InvestBuddy, we believe in radical transparency. The retail financial technology space is flooded with AI tools promising impossible returns. We don't promise magic. We rely on rigorous, audited, walk-forward mathematical validation.
Here is the exact methodology, the mistakes we caught along the way, and the verified performance of the InvestBuddy LSTM engine.
The Engine & The Accuracy Reality Check
Most AI trading tools are just generic wrappers reading the daily news. InvestBuddy is powered by a custom Long Short-Term Memory (LSTM) neural network—an institutional-grade architecture built purely for financial pattern recognition.
Instead of guessing based on today's price action, the engine analyzes 46 distinct technical indicators across a rolling 60-day window. The Phase 2 engine used 142,000 parameters; Phase 3 Retrain #7 deliberately reduced this to 47,434 parameters (67% leaner) to eliminate Epoch 1 memorization and force genuine generalization. It calculates momentum, volume, and volatility to find a strict mathematical edge. No magic, no emotion—just rigorous probabilities walk-forward tested across 4 years of unseen market data.
The Accuracy Reality Check
Our model does not have a 90% win rate. Across 5,356 out-of-sample predictions (85 symbols, Dec 2025–Mar 2026), the Phase 2 LSTM achieved a 55.41% directional accuracy (p < 0.0001, 95% CI [54.08%, 56.74%]). In quantitative finance, a 5.4% edge over a coin flip is a massive structural advantage. When combined with strict capital preservation rules, this slight edge compounds into significant market outperformance.
The Friction Discovery: Why We Rebalance Monthly
In Phase 2, weekly rebalancing still beats the market — generating a raw +117.33% total return vs VOO's +42.71% (+73.51pp alpha). But this comes at a steep hidden cost: $1,224 in transaction fees (12.24% of initial capital) and a Sharpe Ratio of only 0.46.
We noticed the critical flaw that kills most retail algorithms: Transaction Friction + Signal Interruption. Factoring in realistic broker costs ($1.00 per trade + 0.05% slippage), the high frequency of weekly trading consumed $1,224 — and more importantly, forced the LSTM to act before its 10-day momentum signal had fully materialized.
The Monthly Optimization
By freezing portfolio allocations for 30 days (Monthly Rebalancing), transaction costs drop from $1,224 → $533 (a 2.3× reduction), but the bigger gain is risk-adjusted quality: the Sharpe Ratio nearly doubles from 0.46 → 0.95. Monthly InvestBuddy delivers +106.12% total return (+63.41% alpha vs VOO) — not chasing every point of raw return, but optimizing for capital you can actually hold through drawdowns.
The Frequency Breaking Point
We stress-tested the engine at ultra-high frequencies. Rebalancing every 3 days resulted in a catastrophic -37.49% Alpha, even with zero commissions. Why? Because the LSTM is engineered to capture 10-day momentum trends. Forcing it to trade every 3 days interrupts the cycle and subjects the portfolio to lethal bid/ask slippage. The math is clear: InvestBuddy requires a 7 to 30-day holding period to let the AI's edge materialize.
Catching Our Own False Positives
We are engineers, which means we test to failure. In an early iteration of our backtest, the model showed a staggering +90% total return, seemingly dodging the entire 2022 bear market.
Instead of publishing that number, we audited our logs. We discovered a "cold-start" data artifact. Because the AI didn't have enough historical data in early 2022 to establish a solid baseline, its safety rails kicked in and held the portfolio in 100% Cash. It had avoided the crash by accident.
We immediately rebuilt the pipeline, implementing a strict 300-day rolling warmup buffer to ensure the model was fully operational and actively trading before the worst of the 2022 crash occurred.
The Final, Audited Results
with the friction minimized and the data pipeline strictly enforced, we ran the final walk-forward validation from January 3, 2022 to December 31, 2025.
The results speak for themselves:
The model didn't magically avoid the 2022 bear market. It took the hits. But through intelligent, equal-weight stock selection, it limited its maximum drawdown to a better threshold than the broader market, and its 55.41% OOS accuracy allowed it to compound capital much faster during the subsequent recovery.
The Honest Quant Transparency Box
Most fintech platforms show you their best number and stop there. We don't. Before publishing the +106.12% result, we ran a Monte Carlo simulation: 1,000 portfolios of 5 randomly selected stocks from the same 58-symbol universe, rebalanced on the same monthly schedule. No model. Pure luck.
The Question We Had to Answer
Is InvestBuddy's +106.12% return a result of model skill — or could a monkey throwing darts at the same stock list have done just as well?
Distribution of 1,000 Random 5-Stock Portfolios (2022–2025)
Same 58-symbol universe, same monthly rebalancing, no model — pure random selection
Where Did the +63.41% Alpha Actually Come From?
| Source | Contribution | Confidence |
|---|---|---|
| The 5-Stock Advantage — Holding fewer stocks naturally increases both volatility and potential upside. | ~+8 pp | High — measured |
| Sep 2025 single-period jump (MU +55%, UNH +36% — audited real) | ~+8 pp | High — verified |
| Genuine model predictive skill (55.41% OOS accuracy) | +10–20 pp | High — OOS confirmed |
| Market Familiarity — The AI was partially trained on data from similar recent market conditions. | +15–25 pp | Medium — estimated |
| Hindsight Bias — The simulation only picked from companies that are still successful today. | +2–5 pp | Low — small-cap limited |
The Verdict: 77.7th Percentile — Genuine but Measured Skill
InvestBuddy beats roughly 3 out of 4 random portfolios. A 99th percentile result would signal curve-fitting. The 77.7th percentile is the Goldilocks zone — statistically meaningful without being suspiciously perfect. The +63.41pp alpha headline is real and fully verified, but we want you to understand exactly how much of it is model skill versus structure.
🔬 For the Quants & Skeptics: The Hard Questions
Why only a 4-year backtest? Shouldn't a robust model be tested against the 2008 financial crisis?
Isn't holding only 5 stocks dangerously concentrated? Don't real funds hold dozens of names?
Do these returns hold up against real-world slippage, fees, and market impact?
"The 77.7th percentile is the Goldilocks zone. A 99th percentile result would immediately signal curve-fitting. Beating roughly 3 out of 4 random portfolio combinations proves genuine, durable skill without triggering 'too good to be true' alarm bells. Your decomposition is a masterpiece of transparency — admitting that ~8pp of the alpha comes purely from concentration volatility builds incredible trust."
Phase 3: Pipeline Architecture Fixes (May 2026)
After publishing the Phase 2 results, we ran a diagnostic against RSP — the equal-weight S&P 500 index, a significantly harder benchmark than market-cap-weighted VOO. The initial result was sobering: +11.41% vs RSP +35.74% (−24.33pp). The model was being destroyed, not by signal quality, but by execution friction.
The Signal Edge Diagnostic
We ran a raw edge analysis on 1,450 individual picks: comparing each model selection's 5-day forward return directly against RSP over the same 5 days. Result: t = 3.35, avg excess return +0.411% per pick. The ML signal was statistically real (p < 0.001). The problem was entirely in how we were executing on it.
The five structural fixes we implemented:
Persistence Hysteresis
Positions currently passing all gates are NOT sold during weekly rebalance — held until the signal actually dies. The key fix: the 5-day prediction window was being interrupted on day 3.
Confidence Threshold ↑
Direction confidence raised from 15% → 25%. Tighter gate filters marginal signals before they enter execution.
$5 Price Floor
Stocks below $5 are blocked before signal processing. Eliminates broken microstructure and penny-trap volatility.
Earnings Blackout Gate
Stocks within 7 trading days of an earnings announcement are excluded. Eliminates binary event risk that disrupts momentum signals.
Phase 3 Backtest Results (Jan 2023 – Dec 2025 vs RSP)

⚠️ Soft OOS Disclosure
This model was trained through May 3, 2026. The 2023–2025 backtest window predates the training cutoff, making these results validation-split quality, not hard out-of-sample. The definitive hard OOS test runs in September 2026.
What the Equity Curve Is Telling You
1. The Breakaway Point (Mid-2024)
From roughly July 2024 to January 2025, RSP goes mostly sideways with heavy chop. The model rips upward. This is Persistence Hysteresis in action. A standard weekly rebalance would have chopped capital to pieces paying spreads to rotate names. Because the model holds high-conviction signals until they actually die, it ignored the noise, held the winners, and aggressively compounded its edge while the broader equal-weight market stagnated.
2. The Drawdown Trade-off
The chart confirms the Max Drawdown numbers (−20.04% vs −18.49%). Around April/May 2025 the model fell slightly harder than RSP. However, look at the recovery angle immediately after. This is why the Sharpe jumped to 0.78 — the concentrated 10-stock portfolio takes on slightly more idiosyncratic risk (deeper dips), but the model's momentum features catch the subsequent snap-backs. Slightly more risk, massively compensated (+60.8% vs +35.7%).
"This is exactly what a successful quantitative pipeline looks like on a chart. The $5 price floor kept you out of the graveyard, and the hysteresis let your ML edge actually manifest as dollars. Print this chart out, stick it on the wall, and do not touch the code. Let the market generate fresh, unseen data over the summer. When you run this exact same script in September for the Hard OOS window, if the red line is still above the blue line, you are ready to deploy real capital."
The Conclusion
InvestBuddy is not a crystal ball. It is a highly disciplined, friction-optimized quantitative engine designed to stack mathematical edges in your favor over the long term.
No credit card required • 7-day full access