How It Works — Live Animation
Watch how Genesis 2 creates neurons from data, builds expert routes, and performs cascade activation.
Complete Data Flow
Academic License — What's Included
⚙ genesis2_core.py (All Tiers)
The foundational architecture. Defines neurons, experts, cascade inference, and the 1-step learning algorithm.
class EmbeddingEngine
Adjust model_name to try other multilingual models (e.g., "all-MiniLM-L6-v2" for speed).
class TrainableNeuron(nn.Module)
Linear(dim→dim) + LayerNorm + GELU + gate. Xavier init with gain=0.1 for stability. Gate bias starts at 1.0 (open).
Adjust gain=0.1 in xavier_uniform_: higher values give faster learning but less stability. Change the gate bias from 1.0 to control the initial neuron contribution.
Forward pass: x + sigmoid(gate(x)) * gelu(norm(linear(x))). The gate (sigmoid, range 0-1) controls how much this neuron contributes.
class ExpertRoute
micro_head (Linear dim→dim, identity-initialized).
class CascadeMoETrainable(nn.Module)
Adaptive reuse threshold: min(0.95, base + 0.025 * log2(n/5)). It grows as the pool expands, which forces new neurons for novel concepts.
Adjust base_threshold (0.82): lower = more reuse, higher = more unique neurons. Change the growth rate 0.025 and the cap 0.95.
Adjust neuron_cap (80) to control when neurons get "full" and new ones are created.
learn() steps:
- Encode input and target text
- Find/create neurons for each concept (reuse if similar enough)
- Build route — ordered list of neuron IDs
- Freeze ALL shared neurons, unfreeze ONLY exclusive (non-shared) ones
- Create ExpertRoute with identity-initialized micro_head
- Train: the unfrozen (exclusive) route neurons get 1 gradient step, micro_head gets 10 steps
- Loss = MSE + cosine_embedding_loss
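The reuse, routing, and freezing steps above can be sketched in plain Python. This is a minimal illustration and not the code in genesis2_core.py: `NeuronPool`, `find_or_create`, `build_route`, and `combined_loss` are hypothetical names, vectors are plain lists instead of tensors, and clamping the threshold to `base` for pools of five or fewer neurons is an assumption (the documented formula would otherwise dip below `base` or be undefined at n=0). The loss mirrors MSE plus PyTorch's `cosine_embedding_loss` with target 1, which equals 1 - cos.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def reuse_threshold(n, base=0.82, growth=0.025, cap=0.95):
    """min(cap, base + growth * log2(n / 5)); clamped to base for n <= 5 (assumption)."""
    if n <= 5:
        return base
    return min(cap, base + growth * math.log2(n / 5))

class NeuronPool:
    """Hypothetical stand-in for the shared neuron pool and its reverse index."""

    def __init__(self, neuron_cap=80):
        self.vectors = {}   # neuron id -> concept embedding
        self.usage = {}     # neuron id -> number of times used
        self.reverse = {}   # neuron id -> set of expert ids that use it
        self.neuron_cap = neuron_cap

    def find_or_create(self, vec):
        """Reuse the most similar neuron at or above the threshold, else create one."""
        best_id, best_sim = None, reuse_threshold(len(self.vectors))
        for nid, stored in self.vectors.items():
            if self.usage[nid] >= self.neuron_cap:
                continue  # a "full" neuron is skipped
            sim = cosine(vec, stored)
            if sim >= best_sim:
                best_id, best_sim = nid, sim
        if best_id is None:
            best_id = f"n{len(self.vectors)}"
            self.vectors[best_id] = list(vec)
            self.usage[best_id] = 0
            self.reverse[best_id] = set()
        self.usage[best_id] += 1
        return best_id

    def build_route(self, expert_id, concept_vecs):
        """Return (route, frozen, trainable) for a new expert."""
        route = [self.find_or_create(v) for v in concept_vecs]
        # A neuron already used by another expert is shared, so it stays frozen.
        frozen = [n for n in route if self.reverse[n]]
        trainable = [n for n in route if not self.reverse[n]]
        for n in route:
            self.reverse[n].add(expert_id)
        return route, frozen, trainable

def combined_loss(pred, target):
    """MSE plus (1 - cosine), i.e. cosine embedding loss with target = 1."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return mse + (1.0 - cosine(pred, target))
```

In a PyTorch version, the `frozen` and `trainable` lists would decide which neuron parameters get `requires_grad` switched off or on before the single gradient step.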
Adjust n_steps=10 for micro-head training, weight_decay=0.01, gradient clip norm 1.0. More steps = better fit but slower learning.
Inference score: 0.4*cascade + 0.3*output_sim + 0.2*input_sim + 0.1*text_bonus.
top_k controls how many candidates to evaluate.
✨ genesis2_gen.py (All Tiers)
Generation engine. Extends core with concept chains, composer network, dialogue memory, and exec commands.
class ConceptNeuron(nn.Module)
Adjust max_responses=30 for episodic memory depth. Change the gate bias to control the initial neuron activation level.
class CascadeGenerator(nn.Module)
- neurons: shared neuron pool (ModuleDict)
- routes: expert routes with input/response/exec/micro_head
- reverse: neuron_key → set of expert_ids (cascade index)
- memory_update: Linear(3*dim → 2*dim) + GELU + LN + Linear(→dim)
- route_query: Linear(2*dim → dim) + GELU + Linear(→dim)
- stop_gate: Linear(2*dim → dim/2) + GELU + Linear(→1) + Sigmoid
- composer: Linear(dim) + GELU + LN + Linear(dim) [patent-critical]
generate() pipeline:
- Extract concepts from query (_auto_concepts)
- Seed selection: ANN cosine + concept coverage + input-match boost (+1.0 exact, +0.2 substring)
- Cascade: spread through reverse index at configurable depth
- Adaptive cascade threshold: min(0.7, 0.25 + 0.05 * log2(routes/50))
- Gate-weighted fragment collection from activated neurons
- Composer transforms cascade state to response space
- Expert scoring: 0.4*sem_query + 0.4*sem_composed + overlap + input-match
- Micro-head refinement: 0.7*base_score + 0.3*micro_sim
- Response composition from fragments (threshold 0.35)
- Exec command selection (local-first, top-10 experts)
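The cascade spread, the adaptive threshold, and the micro-head refinement from the steps above can be illustrated with a small sketch. It is not the generator's actual code: `build_reverse`, `cascade`, `cascade_threshold`, and `refine_score` are hypothetical names, the spread ignores similarity gating entirely, and holding the threshold at 0.25 for 50 or fewer routes is an assumption.

```python
import math

def cascade_threshold(n_routes):
    """min(0.7, 0.25 + 0.05 * log2(routes / 50)); held at 0.25 for <= 50 routes (assumption)."""
    if n_routes <= 50:
        return 0.25
    return min(0.7, 0.25 + 0.05 * math.log2(n_routes / 50))

def build_reverse(routes):
    """Map each neuron key to the set of expert ids whose route contains it."""
    reverse = {}
    for expert_id, neuron_keys in routes.items():
        for key in neuron_keys:
            reverse.setdefault(key, set()).add(expert_id)
    return reverse

def cascade(seeds, routes, reverse, depth=2):
    """Spread activation from seed experts to experts that share neurons with them."""
    active = set(seeds)
    frontier = set(seeds)
    for _ in range(depth):
        reached = set()
        for expert_id in frontier:
            for key in routes[expert_id]:
                reached |= reverse.get(key, set())
        reached -= active
        if not reached:
            break  # nothing new was activated, so stop early
        active |= reached
        frontier = reached
    return active

def refine_score(base_score, micro_sim):
    """Micro-head refinement: 0.7 * base_score + 0.3 * micro_sim."""
    return 0.7 * base_score + 0.3 * micro_sim

# Toy data: e1 and e2 share neuron "b", e2 and e3 share neuron "c", e4 is isolated.
routes = {"e1": ["a", "b"], "e2": ["b", "c"], "e3": ["c", "d"], "e4": ["x"]}
reverse = build_reverse(routes)
one_hop = cascade({"e1"}, routes, reverse, depth=1)
two_hops = cascade({"e1"}, routes, reverse, depth=2)
```

With `depth=1` only the direct neighbour `e2` is reached from `e1`; with `depth=2` the spread also reaches `e3`, which is the kind of deeper association a larger depth setting trades against extra work.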
Adjust top_k (candidates), depth (cascade iterations), fragment threshold (0.35), and all scoring weights. Try depth=3 for deeper associations.
Pretraining: epochs, batch_size, lr=0.001, margin=0.5. More epochs = better routing, with diminishing returns after 5.
🧠 neuron_embedding.py (All Tiers)
Patent-critical: native hash embedding from neuron pool. No external models (no MiniLM, no BERT).
class HashEncoder
Adjust dim (384) for embedding size. Change _num_buckets (65536): more buckets = fewer collisions but more memory.
class NeuronPoolEmbedding
- Hash encoding (raw projection)
- Find matching neurons via inverted word index (F1 > 0.15)
- If no text match: fallback to cosine similarity (top-5, threshold > 0.2)
- Adaptive mixing: neuron_mix = min(0.85, 0.2 + max_weight*0.5 + matched*0.01)
- Result = normalize(raw_mix * raw + neuron_mix * neuron_context)
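The matching and mixing steps above can be sketched as follows. This is an illustration under stated assumptions rather than the NeuronPoolEmbedding implementation: the function names are hypothetical, the F1 score is computed over word sets, and `raw_mix = 1 - neuron_mix` is an assumption because the page does not define `raw_mix`.

```python
import math

def word_f1(query_words, neuron_words):
    """F1 overlap between the query's words and the words indexed for a neuron."""
    q, n = set(query_words), set(neuron_words)
    overlap = len(q & n)
    if overlap == 0:
        return 0.0
    precision = overlap / len(n)
    recall = overlap / len(q)
    return 2 * precision * recall / (precision + recall)

def neuron_mix_weight(max_weight, matched):
    """neuron_mix = min(0.85, 0.2 + max_weight * 0.5 + matched * 0.01)."""
    return min(0.85, 0.2 + max_weight * 0.5 + matched * 0.01)

def mix_embeddings(raw, neuron_context, max_weight, matched):
    """normalize(raw_mix * raw + neuron_mix * neuron_context)."""
    neuron_mix = neuron_mix_weight(max_weight, matched)
    raw_mix = 1.0 - neuron_mix  # assumption: the two weights sum to 1
    mixed = [raw_mix * r + neuron_mix * c for r, c in zip(raw, neuron_context)]
    norm = math.sqrt(sum(x * x for x in mixed))
    return [x / norm for x in mixed] if norm else mixed

# A neuron would count as a text match when word_f1(...) > 0.15.
score = word_f1(["restart", "nginx"], ["nginx", "reload", "restart", "service"])
vector = mix_embeddings([1.0, 0.0], [0.0, 1.0], max_weight=1.0, matched=100)
```

With no matched neurons the weight stays at its floor of 0.2, so the raw hash projection dominates; strong matches push the result toward the neuron context, up to the 0.85 cap.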
🌐 genesis2_web.py (All Tiers)
Web interface, REST API, SSH agent, command executor, output analysis. The largest file (2068 lines).
class CommandExecutor
Adjust the SAFE_COMMANDS, BLOCKED, and LINUX_ONLY sets to change security policy. Change timeout for long-running commands.
class WiFiExpert (Main Knowledge Engine)
- Check dialogue state machine (awaiting credentials/scan)
- Decompose complex queries (split by "+", "and", commas)
- For each sub-query: gen.generate() with cascade
- Substitute params (IP, domain, port, MAC) into response
- Execute command if provided
- Analyze output (conntrack/ping/ARP analysis)
- Agent loop: if empty result, try alternative approach
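Query decomposition and parameter substitution from the steps above might look like the sketch below. Everything here is illustrative: the function names, the regular expressions, and the `{ip}` / `{mac}` placeholder syntax are assumptions, since the page does not show how genesis2_web.py marks parameters in a response.

```python
import re

# Split on "+", commas, or the standalone word "and" (so "android" is left alone).
_SPLIT = re.compile(r"\s*(?:\+|,|\band\b)\s*", re.IGNORECASE)
_IP = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
_MAC = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")

def decompose(query):
    """Break a compound query into non-empty sub-queries."""
    return [part for part in (p.strip() for p in _SPLIT.split(query)) if part]

def extract_params(query):
    """Collect the first IPv4-looking and MAC-looking token found in the query."""
    params = {}
    ip = _IP.search(query)
    if ip:
        params["ip"] = ip.group(0)
    mac = _MAC.search(query)
    if mac:
        params["mac"] = mac.group(0)
    return params

def substitute(template, params):
    """Fill {name} placeholders; unknown placeholders are left untouched."""
    return re.sub(r"\{(\w+)\}", lambda m: params.get(m.group(1), m.group(0)), template)

parts = decompose("ping 10.0.0.1 and show arp table, check dns")
command = substitute("ping -c 3 {ip}", extract_params(parts[0]))
```

Leaving unknown placeholders in place, instead of raising, makes it easy to spot a response that still needs a parameter before any command is executed.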
HTTP Server
Query: {"question": "..."} → {"answer": "...", "commands": [...]}
Learn: {"question": "...", "answer": "...", "exec": "..."} (learns in 130-550 ms).
Professional License - Everything in Academic, plus:
📂 Training Datasets — 30 Domains
Each dataset is a JSONL file with format: {"question": "...", "answer": "...", "concepts": [...], "exec": "..."}
| Dataset | Domain | Content |
|---|---|---|
| networking.jsonl | Network Engineering | Routing, switching, VLANs, subnets |
| cisco.jsonl | Cisco IOS | CLI commands, OSPF, BGP, ACLs |
| mikrotik.jsonl | MikroTik RouterOS | Firewall, NAT, DHCP, wireless |
| linux.jsonl | Linux Admin | Package management, services, filesystem |
| security.jsonl | Security | Hardening, fail2ban, firewalld, SELinux |
| docker_k8s.jsonl | Containers | Docker, Kubernetes, Helm, pods |
| monitoring.jsonl | Monitoring | Zabbix, Prometheus, Grafana |
| devops.jsonl | DevOps | Ansible, Terraform, CI/CD |
| databases.jsonl | Databases | PostgreSQL, MySQL, backup, replication |
| web_servers.jsonl | Web Servers | Nginx, Apache, reverse proxy, SSL |
| dns_dhcp.jsonl | DNS/DHCP | BIND, dnsmasq, ISC DHCP |
| vpn.jsonl | VPN | WireGuard, OpenVPN, IPsec |
| wifi.jsonl | WiFi | Configuration, troubleshooting, security |
| cloud.jsonl | Cloud | AWS, GCP, Azure basics |
| scada.jsonl | SCADA/ICS | Industrial protocols, PLC, HMI |
| voip.jsonl | VoIP | Asterisk, SIP, RTP |
| windows.jsonl | Windows | Active Directory, GPO, PowerShell |
| macos.jsonl | macOS | System preferences, brew, networksetup |
| virtualization.jsonl | Virtualization | KVM, Proxmox, VMware |
| automation.jsonl | Automation | Scripting, cron, systemd timers |
| backup.jsonl | Backup | rsync, borgbackup, snapshots |
| troubleshooting.jsonl | Troubleshooting | Diagnostics, log analysis, recovery |
| network_recon.jsonl | Recon | nmap, port scanning, discovery |
| traffic.jsonl | Traffic | tcpdump, Wireshark, packet analysis |
| scan_vpn_aware.jsonl | Scanning | VPN-aware network scanning |
| iot.jsonl | IoT | MQTT, CoAP, edge devices |
| russian_infra.jsonl | RU Infra | Russian software and infrastructure |
| general.jsonl | General | Common IT tasks and queries |
| exec_general.jsonl | Exec Commands | Pure executable command mappings |
| video_access.jsonl | Video/Access | IP cameras, access control, NVR |
With these datasets you can retrain the model from scratch, add your own domains, or modify existing knowledge.
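Before retraining or adding a domain, it can help to validate records against the JSONL format described above. The loader below is a minimal sketch: `load_jsonl` is a hypothetical helper, and treating `question` and `answer` as required while defaulting `concepts` and `exec` is an assumption about the format, not documented behavior.

```python
import json

REQUIRED_FIELDS = ("question", "answer")  # assumption: these two must be non-empty

def load_jsonl(lines):
    """Parse an iterable of JSONL lines into validated record dicts."""
    records = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        missing = [field for field in REQUIRED_FIELDS if not record.get(field)]
        if missing:
            raise ValueError(f"line {lineno}: missing {', '.join(missing)}")
        record.setdefault("concepts", [])  # optional fields get neutral defaults
        record.setdefault("exec", "")
        records.append(record)
    return records

sample = [
    '{"question": "How do I check disk space?", "answer": "Use df -h.", '
    '"concepts": ["disk", "space"], "exec": "df -h"}',
    "",
    '{"question": "What is a VLAN?", "answer": "A virtual LAN."}',
]
records = load_jsonl(sample)
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `load_jsonl(open(path, encoding="utf-8"))`.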
How to Train from Datasets
The training run reads the datasets/ directory and trains the model. Each fact is learned in 130-550 ms.
All functions from the Academic tier are also included.
Enterprise License — Everything in Professional, plus:
📜 Patent Documentation
Full patent filed at FIPS Russia, 31.05.2026 | IPC G06N 3/04 | 8 claims (2 independent + 6 dependent)
| File | Description |
|---|---|
| PATENT_APPLICATION_EN.md | Full patent text in English |
| PATENT_APPLICATION_RU.md | Full patent text in Russian (official) |
| PRIOR_ART_SEARCH_REPORT.md | Analysis of prior art and differentiation |
| CASCADE_MOE_DEPOSIT.md | RCIS blockchain deposit certificate |
| Technical Drawings (PNG + JPG, 300dpi): | |
| Fig.1 Architecture | Complete system architecture overview |
| Fig.2 Learning | 1-step learning process with frozen/unfrozen neurons |
| Fig.3 Cascade Activation | How cascade spreads through reverse index |
| Fig.4 Hash Embedding | NeuronPoolEmbedding encoding pipeline |
| Official Documents (PDF): | |
| Description (RU) | Official utility model description |
| Claims (RU) | Patent claims formula |
| Abstract (RU) | Patent abstract |
All functions from the Academic and Professional tiers are included.
Source + Patent Bundle — Everything, plus:
🔧 Utility Scripts (Bundle Only)
Complete data pipeline for creating, cleaning, and enriching training datasets.
clean_datasets.py (240 lines)
enrich_exec.py (260 lines)
df -h commands.
fix_dumb_exec.py (200 lines)
convert_atomic.py (330 lines)
{question, answer, concepts, exec}. Splits long answers into digestible chunks, deduplicates, drops irrelevant domains.
Bundle-Exclusive: Patent Strategy & Priority Evidence
| File | Description |
|---|---|
| PRIORITY_EVIDENCE.md | RCIS blockchain certificate with hash proof — establishes priority date and authorship |
| PATENT_STRATEGY.md | Filing strategy, claim structure, international expansion plan, prior art positioning |
These documents are essential for white-label buyers who need to understand the IP landscape.
All functions from all tiers are included.
Key Constants & Thresholds
These values control model behavior. All are modifiable in the source code.
| Constant | File : Line | Value | What It Controls |
|---|---|---|---|
| base_threshold | core:169 | 0.82 | Cosine threshold for neuron reuse (lower=more sharing) |
| neuron_cap | core:167 | 80 | Max times a neuron can be reused before splitting |
| lr_neuron | core:170 | 0.01 | Learning rate for neuron weights |
| lr_head | core:171 | 0.005 | Learning rate for micro-head |
| route_len | core:172 | 3 | Number of neurons per expert route |
| micro_head steps | core:384 | 10 | Training iterations per learn() call |
| threshold (gen) | gen:179 | 0.82 | Neuron reuse threshold for generator |
| neuron_cap (gen) | gen:179 | 60 | Generator neuron usage limit |
| compose_steps | gen:180 | 50 | Composer training iterations |
| cascade depth | gen:318 | 2 | Cascade spread iterations |
| fragment threshold | gen:526 | 0.35 | Min weight to include response fragment |
| input-match exact | gen:386 | +1.0 | Exact input text match boost |
| input-match substr | gen:388 | +0.2 | Substring match boost |
| hash buckets | emb:37 | 65536 | Projection matrix size for hash encoding |
| cache max | emb:119 | 10000 | Embedding cache entries |
| F1 threshold | emb:204 | 0.15 | Min quality for neuron-text match |
| neuron_mix max | emb:241 | 0.85 | Max neuron contribution to embedding |
| softmax temp | emb:235 | 5.0 | Temperature for neuron weight softmax |
| max_responses | gen:67 | 30 | Episodic memory slots per neuron |
| session neurons | gen:604 | 10 | Dialogue memory limit per session |
| pretrain margin | gen:719 | 0.5 | Contrastive loss margin |
| exec timeout | web:47 | 15s | Command execution timeout |
| server port | web:2048 | 8765 | Web UI port |
Get Genesis 2
Choose the edition that fits your needs:
Contact: avlarionov@hotmail.com | GitHub: larionovavi-stack