Data Pipeline
- Market coverage: End-of-day prices, volume, fundamentals, and sector labels collected for major U.S. tickers.
- Feature engineering: Rolling returns, volatility, momentum, liquidity ratios, and macro overlays are refreshed daily.
- Target variable: We forecast the next trading session's return and evaluate forecasts against that session's intraday high to monitor risk.
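
As a rough illustration of the feature and target steps above, here is a minimal sketch in plain Python. The function names, the 5-day default window, and the close-to-close definition of the next-session return are assumptions made for the example; the actual pipeline may define them differently.

```python
from statistics import pstdev


def simple_returns(closes):
    """Day-over-day simple returns; the result has len(closes) - 1 entries."""
    return [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]


def rolling_features(closes, window=5):
    """Build per-day features using only data up to and including that day.

    `window` is a hypothetical lookback length, not a value from the pipeline.
    """
    rets = simple_returns(closes)
    rows = []
    for t in range(window, len(closes)):
        # rets[j] is the return from day j to day j + 1, so this slice holds
        # the `window` daily returns that end on day t.
        window_rets = rets[t - window:t]
        rows.append({
            "day": t,
            "rolling_return": closes[t] / closes[t - window] - 1.0,
            "volatility": pstdev(window_rets),
        })
    return rows


def next_session_target(closes, t):
    """Assumed target: close-to-close return of the session after day t.

    Returns None for the last day, where no next session exists yet.
    """
    if t + 1 >= len(closes):
        return None
    return closes[t + 1] / closes[t] - 1.0
```

Building each row only from prices up to day t keeps the features free of look-ahead, while the target is the one value that deliberately reaches into day t + 1.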
Ensemble Architecture
- CatBoost: Learns tabular relationships from the latest snapshot of engineered factors.
- GRU: Processes 30-day sequences to capture momentum shifts and mean reversion dynamics.
- Transformer: Examines 60-day windows to model longer-term seasonal and regime patterns.
- Weighted blend: Scores are combined using calibrated weights (default 0.4 / 0.3 / 0.3) to form a single ranking metric.
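
The weighted blend can be sketched as a weighted average of per-model scores. The example assumes the default weights map to CatBoost, GRU, and Transformer in the order listed, and that each model's scores are already on a comparable scale; both are assumptions, and the names `DEFAULT_WEIGHTS` and `blend_scores` are hypothetical.

```python
# Assumed mapping of the default 0.4 / 0.3 / 0.3 weights to the three models.
DEFAULT_WEIGHTS = {"catboost": 0.4, "gru": 0.3, "transformer": 0.3}


def blend_scores(model_scores, weights=None):
    """Combine per-model scores into one blended score per ticker.

    model_scores: {model_name: {ticker: score}}
    weights:      {model_name: weight}; normalised so they sum to 1.
    Only tickers scored by every weighted model are returned.
    """
    weights = DEFAULT_WEIGHTS if weights is None else weights
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # Keep tickers that every weighted model has scored.
    common = set.intersection(*(set(model_scores[name]) for name in weights))

    return {
        ticker: sum(w * model_scores[name][ticker] for name, w in weights.items()) / total
        for ticker in common
    }
```

Dividing by the weight total means recalibrated weights do not have to sum to exactly one for the blended score to stay on the same scale.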
Recommendation Flow
- Aggregate features and load the latest model artefacts.
- Score every eligible ticker with each model and compute the blended score.
- Filter for liquidity and volatility constraints, then surface the top 10 names along with projected upside.
- Track next-session highs to evaluate realized returns and recalibrate the ensemble.
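
The filtering, ranking, and tracking steps above might look like the following sketch. The `Candidate` fields, the liquidity and volatility thresholds, and the helper names are illustrative assumptions rather than the system's actual values.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    ticker: str
    blended_score: float
    avg_dollar_volume: float  # liquidity proxy (assumed)
    volatility: float         # e.g. daily return standard deviation (assumed)
    projected_upside: float   # forecast next-session return


def top_recommendations(candidates, min_dollar_volume=5_000_000.0,
                        max_volatility=0.05, top_n=10):
    """Apply liquidity and volatility filters, then rank by blended score.

    The threshold defaults are placeholders chosen for the example.
    """
    eligible = [
        c for c in candidates
        if c.avg_dollar_volume >= min_dollar_volume and c.volatility <= max_volatility
    ]
    eligible.sort(key=lambda c: c.blended_score, reverse=True)
    return eligible[:top_n]


def realized_return_to_high(entry_price, next_session_high):
    """Return from an assumed entry price to the next session's intraday high."""
    return next_session_high / entry_price - 1.0
```

Comparing `projected_upside` with `realized_return_to_high` for each surfaced name gives a simple per-ticker error that could feed the recalibration step.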
Risk Considerations
The model is purely statistical and does not guarantee profits. Liquidity conditions, slippage, and unexpected news can materially impact results. Use these signals as one input within a broader investment process.