DRL Alpha Bot

Welcome — Step-by-Step Guide

✓

Bot is Running

The DRL Alpha Bot process is live on port 5050, with real Binance WebSocket data feeding BTC and ETH 5-minute candles.

2

Resolve Circuit Breakers

Check the Risk tab for any active halts. When the daily loss limit is triggered during paper trading, you can safely reset it to continue learning.

3

Train the PPO Agent

The DRL model has no trained weights yet. Run the CSCV training pipeline: it uses 90 days of real Binance 5-min candles, runs 50 hyperparameter trials, and applies the Bailey et al. PBO overfitting filter. Training takes about 30 minutes.

4

Validate Model Quality (PBO < 10%)

After training, the PBO (Probability of Backtest Overfitting) score must be below 10%. Only then will the bot switch from heuristic fallback to the trained PPO agent. Check the Agent tab for results.

5

Connect to Real Polymarket Markets

Currently using synthetic markets derived from Binance momentum — this is normal when no "BTC Up/Down 5-min" contracts are live on Polymarket. The Markets tab shows when real contracts appear. Strategy signals are identical in both modes.

6

Analyze Performance & Iterate

Monitor the Overview tab for equity curve, win rate, and Sharpe ratio. The 12-dim feature vector (Volume, RSI, DX, UltOsc, OBV, HT Phase, Price Divergence, Time Left, Spread, 5m Return, 1m Momentum, Vol Spike) drives the PPO policy. Retrain when win rate drops below 48% or drawdown exceeds 15%.
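The retrain rule in step 6 can be sketched as a small check. This is a minimal illustration, not the bot's own code: the function names are hypothetical, drawdown is assumed to be measured peak-to-trough on the equity curve, and an empty trade history is assumed not to trigger a retrain.

```python
# Thresholds quoted in the guide text: retrain below 48% win rate
# or above 15% drawdown.
MIN_WIN_RATE = 0.48
MAX_DRAWDOWN = 0.15


def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def should_retrain(wins, total_trades, equity):
    """True when win rate < 48% or drawdown > 15% (hypothetical helper)."""
    # Assumption: with no trades yet, the win rate alone never forces a retrain.
    win_rate = wins / total_trades if total_trades else 1.0
    return win_rate < MIN_WIN_RATE or max_drawdown(equity) > MAX_DRAWDOWN
```

For example, `should_retrain(47, 100, [100, 101])` is true because of the 47% win rate, while `should_retrain(50, 100, [100, 90, 95])` is false (50% win rate, 10% drawdown).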

How the 5-Min Prediction Model Works

📡

Binance Latency Edge

Real-time WebSocket stream — price data arrives 2-3s before Polymarket updates

📊

12-Dim Feature Vector

Volume, RSI, DX, UltOsc, OBV, HT Phase, Price Divergence, Time Left, Spread, 5m Return, 1m Momentum, Vol Spike

🤖

PPO Agent (Stable-Baselines3)

Discrete(5) actions: HOLD, BUY YES small/large, BUY NO small/large
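One way to picture the five discrete actions is as an enumeration decoded into a side and a size. The index order below is an assumption taken from the order in which the actions are listed; the dashboard does not state the actual mapping, and `decode_action` is a hypothetical helper.

```python
from enum import IntEnum


class Action(IntEnum):
    # Assumed index order, following the listing above.
    HOLD = 0
    BUY_YES_SMALL = 1
    BUY_YES_LARGE = 2
    BUY_NO_SMALL = 3
    BUY_NO_LARGE = 4


def decode_action(index):
    """Map a Discrete(5) index to (side, size); HOLD gives (None, None)."""
    action = Action(index)  # raises ValueError for indices outside 0..4
    if action is Action.HOLD:
        return (None, None)
    _, side, size = action.name.split("_")
    return (side, size.lower())
```

Under this assumed ordering, `decode_action(2)` gives `("YES", "large")`.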

🛡️

Three-Layer Risk Stack

CVIX kill switch + Drawdown circuit breakers + Quarter-Kelly position sizing

🔬

CSCV Overfitting Filter

Bailey et al. (2016) — rejects models with PBO > 10% as overfitted to historical data

Data Sources — All Live

Binance OHLCV

api.binance.com + wss://stream.binance.com

● LIVE

Polymarket Gamma API

gamma-api.polymarket.com — scanned every 10s

● LIVE

Trade Execution

Paper mode only — no real funds at risk

● PAPER

⚙️

Training PPO… —

PAPER

Portfolio Balance

$—

—

Live Prices

BTC / USDT

$—

—

ETH / USDT

$—

—

Performance

Return 📈

—

Cumulative

Win Rate 🎯

—

All trades

Sharpe ⚡

—

Risk-adjusted

Drawdown 📉

—

Peak-to-trough

Equity Curve

$— Paper portfolio

Recent Activity

No trades yet

Waiting for markets...

Active Markets

Scanning Polymarket...

5-min crypto UP/DOWN contracts

Open Positions

No open positions

Trade Statistics

Total Trades

—

Open Positions

—

Total P&L

—

Profit Factor

—

Avg Trade P&L

—

Trades / Day

—

Strategy Signal Vector

Loading signals...

DRL Agent Status

🤖

Algorithm

Heuristic fallback — no model loaded

—

📅

Trained

Never

⚙️

Decision Mode

Binance divergence heuristic (≥2% signal → trade)
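The fallback rule can be read as a simple threshold on a signed signal. The sketch below assumes the signal is a signed divergence expressed as a fraction (positive when Binance momentum points above the Polymarket-implied price) and that negative signals map to the NO side. Neither detail is stated on the dashboard, so treat both as assumptions.

```python
SIGNAL_THRESHOLD = 0.02  # the 2% trigger quoted above


def heuristic_decision(signal):
    """Hypothetical fallback: trade only when |signal| reaches 2%."""
    if signal >= SIGNAL_THRESHOLD:
        return "BUY_YES"
    if signal <= -SIGNAL_THRESHOLD:
        return "BUY_NO"
    return "HOLD"
```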

Probability of Backtest Overfitting (PBO)

Train model to evaluate

Target

PBO < 10%

Hyperparameters

Learning Rate

—

Batch Size

—

Gamma (γ)

—

Network Arch

—

Steps (n_steps)

—

RL Environment (MDP)

Observation Space

Box(18,) float32

Action Space

Discrete(5)

CSCV Splits

C(5,2) = 10
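The count C(5,2) = 10 comes from choosing 2 of 5 data blocks per split. A minimal sketch, assuming the 2 chosen blocks are held out and the remaining 3 are used for training (the dashboard does not say which side is which):

```python
from itertools import combinations
from math import comb


def cscv_splits(n_blocks=5, n_test=2):
    """Enumerate every (train_blocks, test_blocks) pair of block indices."""
    blocks = range(n_blocks)
    splits = []
    for test in combinations(blocks, n_test):
        train = tuple(b for b in blocks if b not in test)
        splits.append((train, test))
    return splits


splits = cscv_splits()
assert len(splits) == comb(5, 2)  # 10 splits
```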

Latency Edge

~2.7s

Resolution Fee

2.0%
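A fee on resolution raises the win probability needed to break even. The sketch below assumes the 2.0% fee is charged on net winnings only, which the dashboard does not confirm. Under that assumption, buying at price c with win probability p has expected value p(1 - c)(1 - f) - (1 - p)c per contract, which is zero at p = c / (c + (1 - c)(1 - f)).

```python
def break_even_probability(price, fee=0.02):
    """Win probability at which expected value is zero.

    Assumes the fee is taken from net winnings (1 - price) only.
    """
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    return price / (price + (1.0 - price) * (1.0 - fee))
```

At a price of 0.50 this gives about 0.505, so the model would need roughly half a percentage point of edge just to cover the fee under this assumption.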

Actions

Volatility Kill Switch (CVIX)

CVIX —

Scale 0 (low volatility) to 120 · Threshold 90.1

Status

INACTIVE

Position Sizing (Quarter-Kelly)

Kelly Fraction 25%

⅛ Kelly · ¼ Kelly (target) · ½ Kelly

Open Exposure

0.0%

Exposure vs 15% Max

Max Per Trade

5.0%
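The sizing limits above can be combined into one stake calculation. This is a sketch under stated assumptions: it uses the standard Kelly fraction for a binary contract bought at price c with estimated win probability p, namely (p - c) / (1 - c), ignores the resolution fee, and expresses the stake as a fraction of the portfolio. The function name and signature are hypothetical.

```python
def quarter_kelly_stake(p_win, price, open_exposure=0.0,
                        kelly_fraction=0.25, max_per_trade=0.05,
                        max_exposure=0.15):
    """Stake as a fraction of the portfolio, after all three caps."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    full_kelly = (p_win - price) / (1.0 - price)
    if full_kelly <= 0.0:
        return 0.0  # no edge, no trade
    stake = kelly_fraction * full_kelly
    # Room left under the 15% total exposure cap.
    room = max(0.0, max_exposure - open_exposure)
    return min(stake, max_per_trade, room)
```

With p = 0.55 and a price of 0.50, full Kelly is 10% and the quarter-Kelly stake is 2.5%. With p = 0.60 the quarter-Kelly value of about 5% sits right at the 5% per-trade cap.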

Circuit Breakers

Daily Loss

10% → 1h pause · 20% → full day pause

0.0%

Weekly Loss

25% → triggers model retrain

0.0%

Total Drawdown

40% → permanent full stop

0.0%

Volatility (CVIX)

Halt all trading when CVIX > 90.1

OK
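The breaker thresholds listed above can be sketched as a single decision function. The priority order, the use of "at or above" for the loss limits, and the returned labels are all assumptions made for illustration; only the numeric thresholds come from the panel.

```python
def risk_action(daily_loss, weekly_loss, total_drawdown, cvix):
    """Return the most severe action implied by the current risk state.

    Losses are fractions of the portfolio (0.10 means 10%).
    """
    if total_drawdown >= 0.40:
        return "FULL_STOP"        # permanent stop
    if cvix > 90.1:
        return "HALT_VOLATILITY"  # CVIX kill switch
    if daily_loss >= 0.20:
        return "PAUSE_DAY"
    if daily_loss >= 0.10:
        return "PAUSE_1H"
    if weekly_loss >= 0.25:
        return "RETRAIN"
    return "OK"
```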

Training Pipeline

Status

Pipeline Status

IDLE

Progress —

PPO Results

PPO

PBO: — · Return: —

Volatility: —

—

TD3 / SAC

Incompatible — require continuous (Box) action space; env uses Discrete(5)

N/A

Training Method

📄

Bailey et al. (2016) CSCV

Combinatorial Symmetric Cross-Validation — PBO threshold α = 10%
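To make the PBO number concrete, here is a simplified sketch of the CSCV procedure. It assumes an even number of blocks split half in-sample and half out-of-sample, uses mean return as the performance metric where a pipeline might use Sharpe ratio, and counts a logit of exactly zero as overfit. It is an illustration of the idea, not the bot's training code.

```python
from itertools import combinations
from math import log


def pbo(returns, n_blocks=4):
    """Probability of backtest overfitting from a returns matrix.

    returns: list of rows; each row holds one period's return for every
    candidate strategy. Rows beyond n_blocks * block_size are ignored.
    """
    n_strats = len(returns[0])
    block_size = len(returns) // n_blocks
    blocks = [range(i * block_size, (i + 1) * block_size)
              for i in range(n_blocks)]

    def mean_perf(block_ids):
        rows = [r for b in block_ids for r in blocks[b]]
        return [sum(returns[r][s] for r in rows) / len(rows)
                for s in range(n_strats)]

    logits = []
    for is_blocks in combinations(range(n_blocks), n_blocks // 2):
        oos_blocks = [b for b in range(n_blocks) if b not in is_blocks]
        is_perf = mean_perf(is_blocks)
        oos_perf = mean_perf(oos_blocks)
        # Strategy that looks best in-sample.
        best = max(range(n_strats), key=is_perf.__getitem__)
        # Its relative rank out-of-sample, kept strictly inside (0, 1).
        rank = sum(1 for p in oos_perf if p <= oos_perf[best])
        omega = rank / (n_strats + 1)
        logits.append(log(omega / (1.0 - omega)))
    # Share of splits where the in-sample winner is at or below the
    # out-of-sample median.
    return sum(1 for x in logits if x <= 0.0) / len(logits)
```

A strategy that is best in every block yields a PBO of 0, and a model would pass the filter described above when the result stays below 0.10.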

🔢

Hyperparameter Search

2,700 combinations · 50 trials sampled · Optuna-style
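Sampling 50 trials from a 2,700-point grid can be sketched with the standard library. The grid values below are hypothetical, chosen only so that the sizes multiply to 2,700 (6 × 5 × 5 × 6 × 3) over the five hyperparameters named on the Agent tab. Uniform sampling without replacement stands in for whatever sampler the pipeline uses; Optuna's default sampler is adaptive rather than uniform.

```python
import itertools
import random

# Hypothetical search grid: 6 * 5 * 5 * 6 * 3 = 2,700 combinations.
GRID = {
    "learning_rate": [1e-5, 3e-5, 1e-4, 3e-4, 1e-3, 3e-3],
    "batch_size": [32, 64, 128, 256, 512],
    "gamma": [0.90, 0.95, 0.97, 0.99, 0.999],
    "net_arch": ["64x64", "128x128", "256x256",
                 "64x64x64", "128x128x128", "256x256x256"],
    "n_steps": [512, 1024, 2048],
}


def sample_trials(grid, n_trials=50, seed=0):
    """Draw n_trials distinct configurations uniformly from the grid."""
    keys = list(grid)
    combos = list(itertools.product(*(grid[k] for k in keys)))
    rng = random.Random(seed)
    return [dict(zip(keys, combo)) for combo in rng.sample(combos, n_trials)]


trials = sample_trials(GRID)
```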

📦

Dataset

90 days · 25,920 BTC/ETH 5-min candles · 80/20 holdout split
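The candle count follows from the bar size: a day holds 288 five-minute candles, so 90 days gives 25,920. That figure matches one symbol's history; whether the dashboard counts per symbol or combined is not stated. The sketch below also assumes the 80/20 split is chronological, with the most recent 20% held out.

```python
CANDLES_PER_DAY = 24 * 60 // 5  # 288 five-minute candles per day


def dataset_sizes(days=90, train_percent=80):
    """Return (total, train, holdout) candle counts for one symbol."""
    total = days * CANDLES_PER_DAY
    train = total * train_percent // 100  # integer math avoids float rounding
    return total, train, total - train


total, train, holdout = dataset_sizes()
```

This gives 25,920 candles in total, 20,736 for training and 5,184 held out.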

5-Min Prediction Bot — Live Fund

USDC Balance

—

Total P&L

—

Win Rate

—

Drawdown

—

💰

Peak Balance

—

📈

Avg Edge per Trade

—

🔁

Total Trades / Cycles

—

📡

Bot Status

—

🛑

Risk / Halt

—

🌐

WebSocket

—

🧾

Mode

—

🔗

Open Full 5-Min Bot Dashboard

Charts, trades, signals, AI recommendations →

🖥️

Infrastructure Dashboard

System health, service control, log viewer → (auth required: seelauser)
