Genesis 2 — Reference Guide | Cascade MoE Neural Network

Complete Data Flow

Input Query: "configure nginx"
→ Hash Embedding: NeuronPoolEmbedding.encode() (neuron_embedding.py:163)
→ ANN Search: cosine similarity (genesis2_gen.py:370)
→ Seed Expert: + input-match boost (genesis2_gen.py:386)
→ Cascade: reverse index spread (genesis2_gen.py:420)
→ Composer: nn.Sequential (genesis2_gen.py:155)
→ Response: + exec command (genesis2_web.py:845)

Academic License — What's Included

Item	Contents
⚙ genesis2_core.py	Core architecture: neurons, experts, cascade (831 lines)
✨ genesis2_gen.py	Generation engine: composer, memory, boost (1053 lines)
🌐 genesis2_web.py	Web UI + REST API + command executor (2068 lines)
🧠 neuron_embedding.py	Hash embedding without external models (280 lines)
💾 genesis2_trained_full.pt	Pre-trained model: 10,800 experts (3.5 GB)
📂 datasets/ (30 files)	Professional+ only
📜 patent/	Enterprise+ only
🔧 Utility scripts (4)	Bundle only

⚙ genesis2_core.py (All Tiers)

The foundational architecture. Defines neurons, experts, cascade inference, and the 1-step learning algorithm.

class EmbeddingEngine

def __init__(self, model_name="paraphrase-multilingual-MiniLM-L12-v2")

Line 34

Loads SentenceTransformer model for initial embedding. Used during training phase only — production uses NeuronPoolEmbedding instead.

Change model_name to try other multilingual models (e.g., "all-MiniLM-L6-v2" for speed).

def encode(self, text) -> tensor

Line 42

Encodes single text to normalized 384-dim vector. Results are cached to avoid re-encoding.

def encode_batch(self, texts) -> tensor

Line 47

Batch encoding with automatic handling of cache misses. Returns a tensor of shape (N, 384).

class TrainableNeuron(nn.Module)

def __init__(self, dim, neuron_id, concept="")

Line 64

Creates a single shared neuron. Architecture: Linear(dim→dim) + LayerNorm + GELU + gate. Xavier init with gain=0.1 for stability. Gate bias starts at 1.0 (open).

Change gain=0.1 in xavier_uniform_ — higher values = faster learning but less stable. Change gate bias from 1.0 to control initial neuron contribution.

def forward(self, x) -> tensor

Line 88

Residual transform: x + sigmoid(gate(x)) * gelu(norm(linear(x))). The gate (sigmoid 0-1) controls how much this neuron contributes.

class ExpertRoute

def __init__(self, eid, route, text, embedding, concepts, dim=384)

Line 109

An expert = a route through the shared neuron pool. Stores: route (list of neuron IDs), text, embedding, concepts. Creates per-expert micro_head (Linear dim→dim, identity-initialized).

The micro_head is initialized as identity matrix — try different initializations to change expert specialization behavior.

class CascadeMoETrainable(nn.Module)

def __init__(self, emb_engine, hidden_dim=384, output_dim=384)

Line 142

Main model. Contains: neuron_pool (ModuleDict), experts (dict), reverse_index (neuron→experts), concept_map, OutputHead.

def _threshold(self) -> float

Line 175

Adaptive neuron reuse threshold: min(0.95, base + 0.025 * log2(n/5)). Grows as pool expands — forces new neurons for novel concepts.

Change base_threshold (0.82): lower = more reuse, higher = more unique neurons. Change growth rate 0.025 and cap 0.95.
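The threshold formula above can be evaluated directly to see how quickly it saturates. This is a minimal sketch, not the shipped `_threshold` method: the function name `reuse_threshold` and the handling of pools with five or fewer neurons are assumptions, since the guide does not say how small pools are treated.

```python
import math

def reuse_threshold(n_neurons: int, base: float = 0.82,
                    growth: float = 0.025, cap: float = 0.95) -> float:
    """Adaptive reuse threshold: min(cap, base + growth * log2(n / 5))."""
    # Assumption: pools of five or fewer neurons use the base value unchanged.
    if n_neurons <= 5:
        return base
    return min(cap, base + growth * math.log2(n_neurons / 5))

# Show how the threshold grows with pool size until it hits the cap.
for n in (5, 40, 640):
    print(n, round(reuse_threshold(n), 4))
```

With the default values, the formula as written reaches the 0.95 cap at roughly 184 neurons, so lowering the growth rate is the main lever for keeping reuse likely in larger pools.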

def _get_or_create_neuron(self, concept, concept_emb, context="")

Line 184

Checks if a matching neuron exists (exact name first, then cosine similarity). If sim ≥ threshold AND usage < cap: reuse existing. Otherwise: create new neuron. Returns (neuron_id, was_reused).

Change neuron_cap (80) to control when neurons get "full" and new ones are created.
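The reuse-or-create decision can be illustrated with plain Python lists in place of tensors. This is a rough sketch under stated assumptions: `NeuronPool` and `get_or_create` are hypothetical names, the threshold is fixed rather than adaptive, and the usage cap is applied to exact-name matches as well as similarity matches, which the guide does not confirm.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class NeuronPool:
    """Hypothetical stand-in for the pool bookkeeping, not the shipped class."""

    def __init__(self, threshold=0.82, neuron_cap=80):
        self.threshold = threshold    # fixed here; the guide's threshold adapts
        self.neuron_cap = neuron_cap
        self.neurons = {}             # neuron_id -> {"concept", "emb", "usage"}

    def _reuse(self, neuron_id):
        self.neurons[neuron_id]["usage"] += 1
        return neuron_id, True

    def get_or_create(self, concept, concept_emb):
        # Only neurons below the usage cap are candidates for reuse.
        open_ids = [nid for nid, n in self.neurons.items()
                    if n["usage"] < self.neuron_cap]
        # 1) Exact concept-name match comes first.
        for nid in open_ids:
            if self.neurons[nid]["concept"] == concept:
                return self._reuse(nid)
        # 2) Otherwise reuse the most similar neuron if it clears the threshold.
        if open_ids:
            best = max(open_ids,
                       key=lambda nid: cosine(concept_emb, self.neurons[nid]["emb"]))
            if cosine(concept_emb, self.neurons[best]["emb"]) >= self.threshold:
                return self._reuse(best)
        # 3) No suitable neuron: create a new one.
        neuron_id = len(self.neurons)
        self.neurons[neuron_id] = {"concept": concept,
                                   "emb": list(concept_emb), "usage": 1}
        return neuron_id, False

pool = NeuronPool(neuron_cap=2)
print(pool.get_or_create("nginx", [1.0, 0.0]))        # creates a neuron
print(pool.get_or_create("web server", [0.99, 0.1]))  # similar enough to reuse
print(pool.get_or_create("dhcp", [0.0, 1.0]))         # unrelated, creates another
```

Once a neuron reaches the cap it drops out of the candidate list, so the next matching concept creates a fresh neuron even if the name is identical.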

def learn(self, text, target_text, concepts=None)

Line 280

THE CORE LEARNING METHOD. 1-step learning:

Encode input and target text
Find/create neurons for each concept (reuse if similar enough)
Build route — ordered list of neuron IDs
Freeze ALL shared neurons, unfreeze ONLY exclusive (non-shared) ones
Create ExpertRoute with identity-initialized micro_head
Train: shared neurons get 1 gradient step, micro_head gets 10 steps
Loss = MSE + cosine_embedding_loss

Returns (expert_id, route, final_loss, reused_count).

Change n_steps=10 for micro-head training, weight_decay=0.01, gradient clip norm 1.0. More steps = better fit but slower learning.

def cascade_infer(self, query_text, top_k=5)

Line 517

CASCADE INFERENCE. Finds seed expert, spreads activation through shared neurons via reverse index. Scoring: 0.4*cascade + 0.3*output_sim + 0.2*input_sim + 0.1*text_bonus.

All scoring weights are tunable. cascade_threshold (0.2) controls spread depth. top_k controls how many candidates to evaluate.
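To get a feel for how the four scoring weights trade off, the weighted sum can be computed on a couple of made-up candidates. The function name `cascade_score` and the candidate values are illustrative assumptions; only the weights come from the guide.

```python
def cascade_score(cascade, output_sim, input_sim, text_bonus,
                  weights=(0.4, 0.3, 0.2, 0.1)):
    """Weighted sum: 0.4*cascade + 0.3*output_sim + 0.2*input_sim + 0.1*text_bonus."""
    w_cascade, w_output, w_input, w_text = weights
    return (w_cascade * cascade + w_output * output_sim
            + w_input * input_sim + w_text * text_bonus)

# Made-up component scores for two hypothetical experts.
candidates = {
    "expert_a": (0.9, 0.5, 0.4, 0.0),  # strong cascade activation, weak text match
    "expert_b": (0.3, 0.8, 0.9, 1.0),  # weak cascade, strong similarity and text bonus
}
ranked = sorted(candidates, key=lambda e: cascade_score(*candidates[e]), reverse=True)
print(ranked)
```

In this example the second expert wins despite a much lower cascade score, which shows why raising the cascade weight changes rankings noticeably.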

def save(self, path)

Line 594

Saves entire model state: neuron pool, head, all experts, config, counters.

✨ genesis2_gen.py (All Tiers)

Generation engine. Extends core with concept chains, composer network, dialogue memory, and exec commands.

class ConceptNeuron(nn.Module)

def __init__(self, nid, dim, concept="")

Line 48

Extended neuron: Linear + LayerNorm + GELU + weight_gate + episodic memory (max 30 responses). Gate bias starts at 0.5.

Change max_responses=30 for episodic memory depth. Change gate bias to control initial neuron activation level.

def store_response(self, input_text, response_text, response_emb)

Line 85

Stores (input, response, embedding) tuple in neuron's memory. FIFO eviction at max capacity. This is how neurons "remember" multiple responses.

def best_response(self, query_emb) -> (text, score)

Line 91

Finds best matching stored response by cosine similarity with query embedding.
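The FIFO memory and the similarity lookup can be sketched with a bounded deque. This is an illustration only: `EpisodicMemory` is a hypothetical standalone class (in the guide these methods live on `ConceptNeuron`), embeddings are plain lists, and returning `(None, 0.0)` for an empty memory is an assumption.

```python
import math
from collections import deque

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class EpisodicMemory:
    def __init__(self, max_responses=30):
        # deque(maxlen=...) drops the oldest entry automatically (FIFO eviction).
        self.items = deque(maxlen=max_responses)

    def store_response(self, input_text, response_text, response_emb):
        self.items.append((input_text, response_text, list(response_emb)))

    def best_response(self, query_emb):
        best_text, best_score = None, float("-inf")
        for _, text, emb in self.items:
            score = cosine(query_emb, emb)
            if score > best_score:
                best_text, best_score = text, score
        # Assumption: an empty memory yields (None, 0.0).
        return (best_text, best_score) if best_text is not None else (None, 0.0)

memory = EpisodicMemory(max_responses=2)
memory.store_response("q1", "r1", [1.0, 0.0])
memory.store_response("q2", "r2", [0.0, 1.0])
memory.store_response("q3", "r3", [1.0, 1.0])  # evicts the q1 entry
print(memory.best_response([0.0, 1.0]))
```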

class CascadeGenerator(nn.Module)

def __init__(self, emb_engine, dim=None)

Line 118

Core generation model. Components:

neurons — shared neuron pool (ModuleDict)
routes — expert routes with input/response/exec/micro_head
reverse — neuron_key → set of expert_ids (cascade index)
memory_update — Linear(3*dim → 2*dim) + GELU + LN + Linear(→dim)
route_query — Linear(2*dim → dim) + GELU + Linear(→dim)
stop_gate — Linear(2*dim → dim/2) + GELU + Linear(→1) + Sigmoid
composer — Linear(dim) + GELU + LN + Linear(dim) [patent-critical]
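The reverse index listed above is what lets activation spread from one expert to others that share neurons. A minimal sketch of that idea follows; the function names, the string neuron keys, and the breadth-first spreading rule are assumptions, and the real implementation also weights activations, which is omitted here.

```python
from collections import defaultdict

def build_reverse_index(routes):
    """Map each neuron key to the set of expert IDs whose route uses it."""
    reverse = defaultdict(set)
    for expert_id, neuron_keys in routes.items():
        for key in neuron_keys:
            reverse[key].add(expert_id)
    return reverse

def cascade(seed_expert, routes, reverse, depth=2):
    """Spread from the seed through shared neurons for `depth` iterations."""
    activated = {seed_expert}
    frontier = {seed_expert}
    for _ in range(depth):
        reached = set()
        for expert_id in frontier:
            for key in routes[expert_id]:
                reached |= reverse[key]
        frontier = reached - activated  # only newly reached experts spread further
        if not frontier:
            break
        activated |= frontier
    return activated

# Hypothetical routes: expert ID -> ordered list of neuron keys.
routes = {
    "e1": ["nginx", "config"],
    "e2": ["config", "ssl"],
    "e3": ["ssl", "certbot"],
    "e4": ["docker"],
}
reverse = build_reverse_index(routes)
print(sorted(cascade("e1", routes, reverse, depth=1)))
print(sorted(cascade("e1", routes, reverse, depth=2)))
```

Each extra iteration reaches experts one more shared neuron away from the seed, while experts with no shared neurons are never activated.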

def learn(self, input_text, response_text, concepts=None, exec_cmd=None)

Line 221

Learns input→response mapping with optional exec command. Creates micro_head per expert (identity init). Trains micro_head + composer together (10 steps, lr=0.005). All shared neurons stay FROZEN.

Change composer lr (0.005), training steps (10), loss weights. The frozen-shared/trainable-unique split is the core of zero-forgetting guarantee.

def generate(self, query, top_k=5, depth=2, session_id=None)

Line 318

THE MAIN GENERATION METHOD. Full cascade activation pipeline:

Extract concepts from query (_auto_concepts)
Seed selection: ANN cosine + concept coverage + input-match boost (+1.0 exact, +0.2 substring)
Cascade: spread through reverse index at configurable depth
Adaptive cascade threshold: min(0.7, 0.25 + 0.05 * log2(routes/50))
Gate-weighted fragment collection from activated neurons
Composer transforms cascade state to response space
Expert scoring: 0.4*sem_query + 0.4*sem_composed + overlap + input-match
Micro-head refinement: 0.7*base_score + 0.3*micro_sim
Response composition from fragments (threshold 0.35)
Exec command selection (local-first, top-10 experts)

Returns: {response, fragments, concepts, score, neurons_activated, seed_expert, exec, thinking}.

Key parameters: top_k (candidates), depth (cascade iterations), fragment threshold (0.35), all scoring weights. Try depth=3 for deeper associations.
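Three of the numeric rules in the pipeline above are simple enough to write out. The sketch below is illustrative: the function names are hypothetical, the floor of 0.25 for 50 or fewer routes is an assumption (the formula as written goes lower), and so are the case-insensitive comparison and the two-way substring test.

```python
import math

def cascade_threshold(n_routes: int) -> float:
    """Adaptive cascade threshold: min(0.7, 0.25 + 0.05 * log2(routes / 50))."""
    # Assumption: 50 or fewer routes use the 0.25 starting value.
    if n_routes <= 50:
        return 0.25
    return min(0.7, 0.25 + 0.05 * math.log2(n_routes / 50))

def input_match_boost(query: str, route_input: str) -> float:
    """+1.0 for an exact input match, +0.2 for a substring match, else 0.0."""
    q, r = query.strip().lower(), route_input.strip().lower()
    if not q or not r:
        return 0.0
    if q == r:
        return 1.0
    return 0.2 if (q in r or r in q) else 0.0

def refine(base_score: float, micro_sim: float) -> float:
    """Micro-head refinement: 0.7 * base_score + 0.3 * micro_sim."""
    return 0.7 * base_score + 0.3 * micro_sim

print(cascade_threshold(800))
print(input_match_boost("nginx", "configure nginx"))
print(refine(0.8, 0.5))
```

Because the exact-match boost is as large as the whole semantic score range, a stored input that matches the query verbatim will usually dominate seed selection.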

def pretrain_route_generator(self, epochs=3, batch_size=32)

Line 623

Pretrains memory_update, route_query, stop_gate, composer on all learned routes. Contrastive training: positive neurons score higher than negative by margin=0.5.

Change epochs, batch_size, lr=0.001, margin=0.5. More epochs = better routing but diminishing returns after 5.
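The guide does not spell out the contrastive loss, so the following is only one common way to express "positive scores higher than negative by a margin": a hinge loss on the score difference. The function name and the hinge form are assumptions.

```python
def margin_loss(pos_score: float, neg_score: float, margin: float = 0.5) -> float:
    """Hinge-style loss: zero once pos_score exceeds neg_score by at least margin."""
    return max(0.0, margin - (pos_score - neg_score))

print(margin_loss(0.9, 0.2))  # gap of 0.7 already exceeds the margin
print(margin_loss(0.6, 0.4))  # gap of 0.2 is penalized
```

A larger margin keeps pushing positive and negative neurons apart for longer, at the cost of slower convergence.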

def save_state(self, path)

Line 766

Atomic save: writes to temp file, verifies size, renames. Saves in float16 to reduce file size (~50% smaller). Includes neurons, routes, composer, all networks, counters.

The float16 save is critical for fitting the 3.5GB state in 16GB RAM. Change to float32 if you have 32GB+ RAM for slightly higher precision.
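The write-verify-rename pattern described for save_state can be sketched with the standard library. This is not the shipped code: `atomic_write` is a hypothetical helper that works on raw bytes, and the size check simply compares against the payload length.

```python
import os
import tempfile

def atomic_write(path: str, data: bytes) -> None:
    """Write to a temp file in the target directory, verify size, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes are on disk
        if os.path.getsize(tmp_path) != len(data):
            raise OSError("temp file size does not match payload size")
        # os.replace swaps the file in one step when both paths share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

with tempfile.TemporaryDirectory() as demo_dir:
    target = os.path.join(demo_dir, "state.bin")
    atomic_write(target, b"old state")
    atomic_write(target, b"new state")
    with open(target, "rb") as f:
        print(f.read())
```

Creating the temp file in the same directory as the target matters, because a rename across filesystems is not atomic.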

def load_state(self, path)

Line 817

Loads state, converts float16 back to float32. Uses mmap for speed. Logs progress every 2000 items.

🧠 neuron_embedding.py (All Tiers)

Patent-critical: native hash embedding from neuron pool. No external models (no MiniLM, no BERT).

class HashEncoder

def __init__(self, dim=384)

Line 33

Creates deterministic random projection matrix (65536 buckets × dim), seed=42. Tracks IDF (inverse document frequency) for term weighting.

Change dim (384) for embedding size. Change _num_buckets (65536) — more buckets = fewer collisions but more memory.

def _tokenize(self, text) -> list

Line 44

Splits text into: words + character trigrams (#xxx#) + bigrams (@xx@). Multi-level tokenization captures both semantics and character patterns.
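A multi-level tokenizer of this kind might look like the sketch below. The details are assumptions: lowercasing, the `\w+` word pattern, and building n-grams per word without boundary padding are not confirmed by the guide, which only gives the `#xxx#` and `@xx@` marker formats.

```python
import re

def tokenize(text: str) -> list:
    """Return words, then #trigram# and @bigram@ tokens for each word."""
    tokens = []
    for word in re.findall(r"\w+", text.lower()):
        tokens.append(word)
        # Character trigrams wrapped in '#', e.g. "#ngi#".
        tokens.extend(f"#{word[i:i + 3]}#" for i in range(len(word) - 2))
        # Character bigrams wrapped in '@', e.g. "@ng@".
        tokens.extend(f"@{word[i:i + 2]}@" for i in range(len(word) - 1))
    return tokens

print(tokenize("nginx"))
```

The n-gram tokens are what let a misspelled query such as "ngnix" still share several buckets with the correct word.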

def encode(self, text) -> tensor

Line 72

Encodes text: tokenize → MD5 hash → IDF-weighted average of projection vectors → L2 normalize. Word weight=1.0, n-gram weight=0.3.

Change word/n-gram weights to prioritize semantic meaning (words) vs character patterns (n-grams).
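The hash-and-project step can be illustrated without any external library. This sketch makes several assumptions: it uses a small dimension and bucket count for readability, generates each bucket's vector lazily from a seeded generator instead of holding a full projection matrix, picks the bucket from the first 8 bytes of the MD5 digest, and treats a missing IDF entry as 1.0.

```python
import hashlib
import math
import random

DIM = 16            # the guide uses dim=384
NUM_BUCKETS = 1024  # the guide uses 65536 buckets
SEED = 42

def bucket_of(token: str) -> int:
    """Map a token to a bucket via its MD5 digest."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS

def bucket_vector(bucket: int) -> list:
    """Deterministic pseudo-random vector standing in for one projection row."""
    rng = random.Random(SEED * NUM_BUCKETS + bucket)
    return [rng.gauss(0.0, 1.0) for _ in range(DIM)]

def token_weight(token: str) -> float:
    """Words weigh 1.0; n-gram tokens (marked with # or @) weigh 0.3."""
    return 0.3 if token[0] in "#@" else 1.0

def encode(tokens, idf=None) -> list:
    """IDF-weighted average of bucket vectors, then L2 normalization."""
    idf = idf or {}
    total = [0.0] * DIM
    weight_sum = 0.0
    for token in tokens:
        if not token:
            continue
        weight = token_weight(token) * idf.get(token, 1.0)
        vector = bucket_vector(bucket_of(token))
        for i in range(DIM):
            total[i] += weight * vector[i]
        weight_sum += weight
    if weight_sum == 0.0:
        return total  # nothing to encode: zero vector
    average = [x / weight_sum for x in total]
    norm = math.sqrt(sum(x * x for x in average))
    return [x / norm for x in average] if norm else average

vector = encode(["nginx", "#ngi#", "@ng@"])
print(len(vector), round(sum(x * x for x in vector), 6))
```

Because every step is deterministic, the same text always maps to the same vector, with no model download or training required.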

class NeuronPoolEmbedding

def __init__(self, neuron_pool=None, dim=384)

Line 115

Full embedding engine. Uses HashEncoder for base encoding, enriches with neuron pool concepts. Cache max 10000 entries.

def encode(self, text) -> tensor

Line 163

3-STEP ENCODING (patent-critical):

Hash encoding (raw projection)
Find matching neurons via inverted word index (F1 > 0.15)
If no text match: fallback to cosine similarity (top-5, threshold > 0.2)
Adaptive mixing: neuron_mix = min(0.85, 0.2 + max_weight*0.5 + matched*0.01)
Result = normalize(raw_mix * raw + neuron_mix * neuron_context)

F1 threshold (0.15), cosine fallback (0.2), mixing formula, softmax temperature (5.0) — all affect how much neuron knowledge enriches the embedding.
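The mixing formula above can be tried on small vectors. In this sketch the function names are hypothetical and `raw_mix = 1 - neuron_mix` is an assumption, since the guide gives the formula for `neuron_mix` but not for `raw_mix`.

```python
import math

def normalize(v):
    """L2-normalize a vector (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def neuron_mix_weight(max_weight: float, matched: int) -> float:
    """neuron_mix = min(0.85, 0.2 + max_weight * 0.5 + matched * 0.01)."""
    return min(0.85, 0.2 + max_weight * 0.5 + matched * 0.01)

def mix(raw, neuron_context, max_weight, matched):
    """Blend the raw hash embedding with the neuron context vector."""
    neuron_mix = neuron_mix_weight(max_weight, matched)
    raw_mix = 1.0 - neuron_mix  # assumption: the two weights sum to 1
    blended = [raw_mix * r + neuron_mix * c for r, c in zip(raw, neuron_context)]
    return normalize(blended)

print(neuron_mix_weight(0.6, 3))
print(neuron_mix_weight(1.0, 50))  # capped
print(mix([1.0, 0.0], [0.0, 1.0], max_weight=0.6, matched=0))
```

The cap means the raw hash signal always keeps some share of the final embedding, however many neurons match.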

🌐 genesis2_web.py (All Tiers)

Web interface, REST API, SSH agent, command executor, output analysis. The largest file (2068 lines).

class CommandExecutor

def execute(self, cmd, timeout=15, auto_run=True)

Line 47

Executes shell commands safely. Strips sudo on macOS. Blocks dangerous commands (rm, mkfs, dd, shutdown). Checks OS compatibility for Linux-only commands.

Modify SAFE_COMMANDS, BLOCKED, LINUX_ONLY sets to change security policy. Change timeout for long-running commands.
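A block-list check along these lines might look like the sketch below. It is illustrative only and not a security boundary: the `BLOCKED` set contains just the four examples named above (the real set contents are not shown in this guide), `is_allowed` is a hypothetical helper, and the simple splitting on shell operators does not understand quoting or `sudo` options.

```python
import os
import re
import shlex

# Only the examples named in the guide; the shipped BLOCKED set may differ.
BLOCKED = {"rm", "mkfs", "dd", "shutdown"}

def is_allowed(cmd: str) -> bool:
    """Reject a command line if any pipeline segment starts with a blocked program."""
    if not cmd.strip():
        return False
    for segment in re.split(r"\|\||&&|[|;&]", cmd):
        try:
            tokens = shlex.split(segment)
        except ValueError:
            return False  # unbalanced quotes: refuse rather than guess
        if tokens and tokens[0] == "sudo":
            tokens = tokens[1:]  # look at the program behind sudo
        if not tokens:
            continue
        program = os.path.basename(tokens[0])
        # Also catch variants such as mkfs.ext4.
        if program in BLOCKED or program.split(".")[0] in BLOCKED:
            return False
    return True

for example in ("ping -c 1 192.168.1.1", "sudo mkfs.ext4 /dev/sda1",
                "echo ok; shutdown -h now"):
    print(example, "->", is_allowed(example))
```

An allow-list such as SAFE_COMMANDS is generally the stricter policy, since a block-list can only reject programs someone thought to name.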

class WiFiExpert — Main Knowledge Engine

def query(self, question, session_id='default')

Line 845

MAIN QUERY ENTRY POINT. Full pipeline:

Check dialogue state machine (awaiting credentials/scan)
Decompose complex queries (split by "+", "and", commas)
For each sub-query: gen.generate() with cascade
Substitute params (IP, domain, port, MAC) into response
Execute command if provided
Analyze output (conntrack/ping/ARP analysis)
Agent loop: if empty result, try alternative approach

Returns: {question, answer, concepts, fragments, score, time_ms, commands, exec_results, neurons, thinking}.
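The decomposition step in the pipeline above can be approximated with one regular expression. This is a sketch: `decompose` is a hypothetical name, and matching "and" only as a whole word, case-insensitively, is an assumption about how the split is done.

```python
import re

# Split on "+", commas, or the standalone word "and".
_SPLITTER = re.compile(r"\s*(?:\+|,|\band\b)\s*", re.IGNORECASE)

def decompose(question: str) -> list:
    """Split a compound question into non-empty sub-queries."""
    return [part for part in _SPLITTER.split(question.strip()) if part]

print(decompose("check dns, ping 8.8.8.8 and show uptime"))
```

The word-boundary check keeps words such as "standard" or "android" intact.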

def _substitute_params(self, question, exec_cmd, result=None)

Line 349

Smart parameter extraction: finds IPv4, IPv6, domains, ports, MACs in the question text and substitutes them into exec command templates. Builds real commands from placeholders.
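Parameter substitution of this kind can be sketched with a few regular expressions. The placeholder syntax (`{ip}`, `{port}`, `{mac}`) is hypothetical, as the guide does not show the template format, and the sketch covers only IPv4 addresses, MACs, and ports introduced by the word "port".

```python
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
MAC = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")
PORT = re.compile(r"\bport\s+(\d{1,5})\b", re.IGNORECASE)

def extract_params(question: str) -> dict:
    """Pull the first IPv4 address, MAC address, and port out of the question."""
    params = {}
    ip_match = IPV4.search(question)
    if ip_match:
        params["ip"] = ip_match.group(0)
    mac_match = MAC.search(question)
    if mac_match:
        params["mac"] = mac_match.group(0)
    port_match = PORT.search(question)
    if port_match:
        params["port"] = port_match.group(1)
    return params

def substitute_params(question: str, template: str) -> str:
    """Fill {name} placeholders in the template with values found in the question."""
    command = template
    for name, value in extract_params(question).items():
        command = command.replace("{" + name + "}", value)
    return command

print(substitute_params("is port 443 open on 10.0.0.5?", "nc -zv {ip} {port}"))
```

Placeholders with no matching value are left untouched, which makes an incomplete command easy to detect before execution.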

def _analyze_output(self, stdout, question)

Line 590

Comprehensive output analysis: conntrack traffic (80+ port identifiers), netstat (known IP ranges: Google, Telegram, Apple, Meta, AWS, Azure...), ping quality, ARP device detection.

Add new port mappings, IP ranges, or analysis patterns to extend detection capabilities.

def _find_remote_command(self, question, session)

Line 159

Maps 17 keyword patterns to SSH commands: time, wifi config, clients, uptime, interfaces, firewall, dhcp, dns, logs, firmware, channel, network, backup, packages, ping, speed, restart, reboot.

def learn(self, question, answer, concepts=None, exec_cmd=None)

Line 1001

Learn new Q&A pair via API. Updates embedding index after learning. API fields: "question" and "answer" (NOT "input"/"response"!).

HTTP Server

POST /api/query — JSON query

Standard query: {"question": "..."} → {"answer": "...", "commands": [...]}

POST /api/query-stream — SSE streaming

Real-time streaming with thinking steps. Shows cascade activation progress.

POST /api/learn — teach new fact

{"question": "...", "answer": "...", "exec": "..."} — learns in 130-550ms.

POST /api/save — save model state

Saves to disk. Checks disk space (>5GB required).

POST /api/pretrain — pretrain composer

Starts pretrain in background thread. Improves response quality.
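A small client for these endpoints can be built on the standard library. The sketch assumes the server runs on localhost at the default port 8765 and accepts a JSON body with a Content-Type header; `build_request` and `post_json` are hypothetical helper names.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8765"  # assumption: local server on the default port

def build_request(endpoint: str, payload: dict) -> request.Request:
    """Create a JSON POST request for one of the /api/* endpoints."""
    body = json.dumps(payload).encode("utf-8")
    return request.Request(BASE_URL + endpoint, data=body,
                           headers={"Content-Type": "application/json"},
                           method="POST")

def post_json(endpoint: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with request.urlopen(build_request(endpoint, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example calls, to be run against a live server:
#   post_json("/api/learn", {"question": "how to check disk space",
#                            "answer": "Use df -h.", "exec": "df -h"})
#   post_json("/api/query", {"question": "how to check disk space"})
query = build_request("/api/query", {"question": "configure nginx"})
print(query.get_method(), query.full_url)
```

Note the field names: the API expects "question" and "answer", not "input" and "response".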

Professional License — Everything in Academic, plus:

Item	Contents
⚙ genesis2_core.py	Core architecture (831 lines)
✨ genesis2_gen.py	Generation engine (1053 lines)
🌐 genesis2_web.py	Web UI + REST API (2068 lines)
🧠 neuron_embedding.py	Hash embedding (280 lines)
💾 genesis2_trained_full.pt	Pre-trained model (3.5 GB)
📂 datasets/ (30 JSONL files)	10,600+ training facts across 22 domains
📜 patent/	Enterprise+ only
🔧 Utility scripts (4)	Bundle only

📂 Training Datasets — 30 Domains

Each dataset is a JSONL file with format: {"question": "...", "answer": "...", "concepts": [...], "exec": "..."}

Dataset	Domain	Content
networking.jsonl	Network Engineering	Routing, switching, VLANs, subnets
cisco.jsonl	Cisco IOS	CLI commands, OSPF, BGP, ACLs
mikrotik.jsonl	MikroTik RouterOS	Firewall, NAT, DHCP, wireless
linux.jsonl	Linux Admin	Package management, services, filesystem
security.jsonl	Security	Hardening, fail2ban, firewalld, SELinux
docker_k8s.jsonl	Containers	Docker, Kubernetes, Helm, pods
monitoring.jsonl	Monitoring	Zabbix, Prometheus, Grafana
devops.jsonl	DevOps	Ansible, Terraform, CI/CD
databases.jsonl	Databases	PostgreSQL, MySQL, backup, replication
web_servers.jsonl	Web Servers	Nginx, Apache, reverse proxy, SSL
dns_dhcp.jsonl	DNS/DHCP	BIND, dnsmasq, ISC DHCP
vpn.jsonl	VPN	WireGuard, OpenVPN, IPsec
wifi.jsonl	WiFi	Configuration, troubleshooting, security
cloud.jsonl	Cloud	AWS, GCP, Azure basics
scada.jsonl	SCADA/ICS	Industrial protocols, PLC, HMI
voip.jsonl	VoIP	Asterisk, SIP, RTP
windows.jsonl	Windows	Active Directory, GPO, PowerShell
macos.jsonl	macOS	System preferences, brew, networksetup
virtualization.jsonl	Virtualization	KVM, Proxmox, VMware
automation.jsonl	Automation	Scripting, cron, systemd timers
backup.jsonl	Backup	rsync, borgbackup, snapshots
troubleshooting.jsonl	Troubleshooting	Diagnostics, log analysis, recovery
network_recon.jsonl	Recon	nmap, port scanning, discovery
traffic.jsonl	Traffic	tcpdump, Wireshark, packet analysis
scan_vpn_aware.jsonl	Scanning	VPN-aware network scanning
iot.jsonl	IoT	MQTT, CoAP, edge devices
russian_infra.jsonl	RU Infra	Russian software and infrastructure
general.jsonl	General	Common IT tasks and queries
exec_general.jsonl	Exec Commands	Pure executable command mappings
video_access.jsonl	Video/Access	IP cameras, access control, NVR

With these datasets you can retrain the model from scratch, add your own domains, or modify existing knowledge.
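When adding your own domain, it helps to validate records before training. The sketch below parses JSONL text in the format shown above; treating `concepts` and `exec` as optional is an assumption based on the `learn()` signature, and `parse_jsonl` is a hypothetical helper.

```python
import json

REQUIRED_FIELDS = ("question", "answer")

def parse_jsonl(lines):
    """Parse JSONL records, checking required fields and filling optional ones."""
    facts = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fact = json.loads(line)
        missing = [key for key in REQUIRED_FIELDS if not fact.get(key)]
        if missing:
            raise ValueError(f"line {lineno}: missing {', '.join(missing)}")
        fact.setdefault("concepts", [])
        fact.setdefault("exec", "")
        facts.append(fact)
    return facts

sample = [
    '{"question": "how to check disk space", "answer": "Use df -h to list free space.", '
    '"concepts": ["disk", "storage"], "exec": "df -h"}',
    '{"question": "what is a VLAN", "answer": "A logical layer-2 network segment."}',
]
for fact in parse_jsonl(sample):
    print(fact["question"], "->", fact["exec"] or "(no exec)")
```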

How to Train from Datasets

python3 genesis2_web.py

On first run without a saved state, the server automatically loads all datasets from the datasets/ directory and trains the model. Each fact is learned in 130-550ms.

POST /api/learn {"question":"...", "answer":"...", "exec":"..."}

Add new facts at runtime via API. The model learns instantly without restarting.

POST /api/save

Save updated model to disk after adding new knowledge.

POST /api/pretrain

After adding many facts, pretrain the composer and routing networks for better response quality.

All functions from Academic tier are also included.

Enterprise License — Everything in Professional, plus:

Item	Contents
⚙ All source code (4 files)	4,232 lines total
💾 genesis2_trained_full.pt	Pre-trained model (3.5 GB)
📂 datasets/ (30 JSONL files)	10,600+ training facts
📜 patent/ (13 files)	Full patent documentation + drawings
📖 Network_Automation_with_AI_FULL.pdf	132-page book
🔧 Utility scripts (4)	Bundle only

📜 Patent Documentation

Full patent filed at FIPS Russia, 31.05.2026 | IPC G06N 3/04 | 8 claims (2 independent + 6 dependent)

File	Description
PATENT_APPLICATION_EN.md	Full patent text in English
PATENT_APPLICATION_RU.md	Full patent text in Russian (official)
PRIOR_ART_SEARCH_REPORT.md	Analysis of prior art and differentiation
CASCADE_MOE_DEPOSIT.md	RCIS blockchain deposit certificate
Technical Drawings (PNG + JPG, 300dpi):
Fig.1 Architecture	Complete system architecture overview
Fig.2 Learning	1-step learning process with frozen/unfrozen neurons
Fig.3 Cascade Activation	How cascade spreads through reverse index
Fig.4 Hash Embedding	NeuronPoolEmbedding encoding pipeline
Official Documents (PDF):
Description (RU)	Official utility model description
Claims (RU)	Patent claims formula
Abstract (RU)	Patent abstract

All functions from Academic + Professional tiers included.

Source + Patent Bundle — Everything, plus:

Item	Contents
⚙ All source code (4 files)	4,232 lines total
💾 genesis2_trained_full.pt	Pre-trained model (3.5 GB)
📂 datasets/ (30 JSONL files)	10,600+ training facts
📜 patent/ (15 files)	Patent + PRIORITY_EVIDENCE + STRATEGY
📖 Network_Automation_with_AI_FULL.pdf	132-page book
🔧 clean_datasets.py	Dataset cleaning utility (240 lines)
🔧 enrich_exec.py	Add exec commands to facts (260 lines)
🔧 fix_dumb_exec.py	Environment-aware exec fixing (200 lines)
🔧 convert_atomic.py	Dataset format converter (330 lines)

🔧 Utility Scripts (Bundle Only)

Complete data pipeline for creating, cleaning, and enriching training datasets.

clean_datasets.py (240 lines)

Dataset Cleaner

Removes duplicate entries ("details N" patterns), removes facts without exec from exec-heavy files, filters irrelevant general knowledge, merges fragmented entries. Use this to clean your own datasets before training.

Modify cleaning rules to match your domain. Add/remove patterns for your specific data sources.

enrich_exec.py (260 lines)

Exec Enricher

Adds executable commands to theory-only facts. Maps concepts to known patterns: ping, dns, ssl, cpu, memory, network, storage, etc. Converts "how to check disk space" into actual df -h commands.

Add your own concept-to-command mappings for custom domains (e.g., cloud CLI commands, proprietary tools).

fix_dumb_exec.py (200 lines)

Environment-Aware Fixer

Makes exec commands environment-aware: auto-detects network interface, checks VPN status, handles macOS vs Linux differences. Ensures commands actually work on the target system.

convert_atomic.py (330 lines)

Format Converter

Converts datasets from various formats to atomic JSONL: {question, answer, concepts, exec}. Splits long answers into digestible chunks, deduplicates, drops irrelevant domains.

Add new format parsers for your data sources (CSV, SQL dumps, API responses, etc.).

Bundle-Exclusive: Patent Strategy & Priority Evidence

File	Description
PRIORITY_EVIDENCE.md	RCIS blockchain certificate with hash proof — establishes priority date and authorship
PATENT_STRATEGY.md	Filing strategy, claim structure, international expansion plan, prior art positioning

These documents are essential for white-label buyers who need to understand the IP landscape.

All functions from all tiers included.

Key Constants & Thresholds

These values control model behavior. All are modifiable in the source code.

Constant	File : Line	Value	What It Controls
base_threshold	core:169	0.82	Cosine threshold for neuron reuse (lower=more sharing)
neuron_cap	core:167	80	Max times a neuron can be reused before splitting
lr_neuron	core:170	0.01	Learning rate for neuron weights
lr_head	core:171	0.005	Learning rate for micro-head
route_len	core:172	3	Number of neurons per expert route
micro_head steps	core:384	10	Training iterations per learn() call
threshold (gen)	gen:179	0.82	Neuron reuse threshold for generator
neuron_cap (gen)	gen:179	60	Generator neuron usage limit
compose_steps	gen:180	50	Composer training iterations
cascade depth	gen:318	2	Cascade spread iterations
fragment threshold	gen:526	0.35	Min weight to include response fragment
input-match exact	gen:386	+1.0	Exact input text match boost
input-match substr	gen:388	+0.2	Substring match boost
hash buckets	emb:37	65536	Projection matrix size for hash encoding
cache max	emb:119	10000	Embedding cache entries
F1 threshold	emb:204	0.15	Min quality for neuron-text match
neuron_mix max	emb:241	0.85	Max neuron contribution to embedding
softmax temp	emb:235	5.0	Temperature for neuron weight softmax
max_responses	gen:67	30	Episodic memory slots per neuron
session neurons	gen:604	10	Dialogue memory limit per session
pretrain margin	gen:719	0.5	Contrastive loss margin
exec timeout	web:47	15s	Command execution timeout
server port	web:2048	8765	Web UI port

Get Genesis 2

Choose the edition that fits your needs:

Academic: $299
Professional: $1,499
Enterprise: $4,999
Source + Patent: $5,000

Contact: avlarionov@hotmail.com | GitHub: larionovavi-stack