DataFlood Suite Documentation
Welcome to the documentation for the DataFlood Suite, a platform for generating realistic synthetic data from DataFlood's human-editable models.
Quick Start
New to DataFlood? Start here:
- Core Concepts - Understand the fundamentals
- DataFlood CLI Getting Started - Command-line quick start
- FloodGate API Quick Start - API quick start
Components
DataFlood (CLI)
Command-line tool for schema generation and document creation.
- Getting Started - Quick introduction to DataFlood CLI
- Core Concepts - Schemas, models, and histograms
- CLI Reference - Complete command reference
DataFloodEditor (GUI)
Visual editor for creating schemas and designing sequences.
- Overview - Introduction to the GUI application
- Model Editor - Visual schema creation
- Tides Editor - Time-based sequence design
- Model Merge Tool - Combining schemas
FloodGate (API)
RESTful API service for programmatic access.
- Quick Start - Get started with the API
- API Reference - Complete endpoint documentation
Common Topics
Guides & Tutorials
- Use Cases & Examples - Real-world scenarios
- Best Practices & Performance - Optimization guide
- Troubleshooting Guide - Common issues and solutions
- Data Formats - JSON, CSV, JSONL
Documentation by Task
I want to...
Generate synthetic data from existing samples
- Use DataFlood CLI to analyze your data
- Or import into DataFloodEditor
Create a schema from scratch
- Use DataFloodEditor Model Editor
- Define properties and constraints
- Configure statistical behavior for each data element
- Test generation
Design time-based sequences
- Open Tides Editor
- Add document generation steps
- Configure parent-child relationships
- Set timing patterns
Merge multiple schemas
- Use Model Merge Tool
- Drag and drop properties
- Resolve conflicts
- Save merged DataFlood model
Integrate with my application
- Start FloodGate API
- Use REST endpoints
Generate large datasets
- Use DataFlood CLI for batch generation
- Or FloodGate API serving mode
- Follow performance tips
- Use streaming options
Import data from CSV or JSON
- Use DataFloodEditor Import
- Configure import settings
- Review generated schema
- Enhance by editing the DataFlood model
Troubleshoot issues
- Check Troubleshooting Guide
- Review Best Practices
- Validate DataFlood models
- Test with small batches
Architecture Overview
┌────────────────────────────────────────────────────────┐
│                    User Interfaces                     │
├─────────────────┬─────────────────┬────────────────────┤
│    DataFlood    │ DataFloodEditor │   FloodGate API    │
│       CLI       │       GUI       │    REST Service    │
├─────────────────┴─────────────────┴────────────────────┤
│                 DataFlood Core Library                 │
│      • Schema Generation    • Document Generation      │
│      • Statistical Models   • Sequence Execution       │
└────────────────────────────────────────────────────────┘
Component Relationships
- DataFlood CLI: Command-line interface for DataFlood model generation and document creation
- DataFloodEditor: Desktop application for visual editing and project management
- FloodGate API: HTTP service for programmatic access and integration
Key Features
Statistical Modeling
- String Models: Pattern recognition, n-grams, vocabularies
- Histograms: Numeric distributions
- Format Detection: Automatic recognition of emails, URLs, dates
- Entropy Control: Fine-tune randomness
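To make the histogram idea concrete, here is a minimal sketch of histogram-based sampling. It is written in Python for brevity and is not taken from the DataFlood code base; the function names `build_histogram` and `sample_from_histogram`, the equal-width binning, and the sample values are all assumptions made for illustration. DataFlood's own model format and algorithms may differ.

```python
import random


def build_histogram(values, bin_count=4):
    """Bucket numeric samples into equal-width bins; returns (edges, counts)."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every sample is identical.
    width = (hi - lo) / bin_count or 1.0
    edges = [lo + i * width for i in range(bin_count + 1)]
    counts = [0] * bin_count
    for v in values:
        # Clamp so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), bin_count - 1)
        counts[idx] += 1
    return edges, counts


def sample_from_histogram(edges, counts, rng):
    """Pick a bin in proportion to its count, then a uniform value inside it."""
    idx = rng.choices(range(len(counts)), weights=counts, k=1)[0]
    return rng.uniform(edges[idx], edges[idx + 1])


# Hypothetical observed values, e.g. order totals from a sample file.
observed = [1, 2, 2, 3, 10, 11, 12, 20]
edges, counts = build_histogram(observed, bin_count=4)
rng = random.Random(0)
synthetic = [sample_from_histogram(edges, counts, rng) for _ in range(100)]
```

Generated values follow the shape of the observed distribution (densely populated bins are sampled more often) without copying any individual input value.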
Data Generation
- Realistic Data: Maintains statistical properties
- Multiple Formats: JSON, CSV, JSONL output
- Reproducible: Seed-based generation
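Seed-based generation can be illustrated with a short, generic sketch. This is not DataFlood's implementation: the function names, the document fields (`id`, `price`, `in_stock`), and the value ranges are hypothetical. It only shows the principle that a fixed seed yields identical output on every run, here serialized as JSONL.

```python
import json
import random


def generate_documents(count, seed):
    """Generate `count` documents; the same seed always gives the same result."""
    rng = random.Random(seed)  # private generator, global random state is untouched
    return [
        {
            "id": i,
            "price": round(rng.uniform(1, 100), 2),
            "in_stock": rng.random() < 0.8,
        }
        for i in range(count)
    ]


def to_jsonl(documents):
    """Serialize documents as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(doc, sort_keys=True) for doc in documents)


first_run = generate_documents(5, seed=42)
second_run = generate_documents(5, seed=42)
other_seed = generate_documents(5, seed=43)
jsonl_output = to_jsonl(first_run)
```

Using a dedicated `random.Random(seed)` instance instead of the module-level functions keeps runs reproducible even when other code in the same process also draws random numbers.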
Tides Sequence Design
- Time-Based: Generate documents over time
- Relationships: Parent-child document links
- Transactions: Triggered generation
- Orchestration: Multiple models working together
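The sequence concepts listed above (documents emitted over time, parent-child links, triggered generation) can be sketched generically as follows. This is an illustrative assumption, not the Tides file format or engine: the `run_sequence` function, the `order` and `line_item` document types, and all field names are hypothetical.

```python
import random
from datetime import datetime, timedelta


def run_sequence(start, steps, interval, seed):
    """Emit one parent document per step; each parent triggers 1-3 children."""
    rng = random.Random(seed)
    documents = []
    for step in range(steps):
        timestamp = start + step * interval
        order_id = f"order-{step}"
        documents.append(
            {"type": "order", "id": order_id, "at": timestamp.isoformat()}
        )
        # Triggered generation: children follow their parent by 1-30 seconds.
        for n in range(rng.randint(1, 3)):
            child_time = timestamp + timedelta(seconds=rng.randint(1, 30))
            documents.append(
                {
                    "type": "line_item",
                    "id": f"{order_id}-item-{n}",
                    "parent_id": order_id,  # link back to the parent document
                    "at": child_time.isoformat(),
                }
            )
    # ISO 8601 timestamps in one format sort chronologically as strings.
    documents.sort(key=lambda d: d["at"])
    return documents


timeline = run_sequence(
    start=datetime(2025, 1, 1), steps=3, interval=timedelta(minutes=1), seed=7
)
```

Every child carries its parent's identifier, so downstream systems under test can join the two document types the same way they would with production data.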
Integration
- REST API: Full HTTP interface
- Swagger: Interactive documentation
- Docker: Container deployment
- Batch Operations: Large-scale generation
Sample Use Cases
E-commerce Testing
Generate realistic product catalogs, customer profiles, orders, and transactions for testing e-commerce systems. See complete example.
IoT Simulation
Create sensor data streams with realistic patterns, anomalies, and time-based variations. See IoT example.
Financial Data
Generate banking transactions, account records, and settlement data with proper relationships. See banking example.
Healthcare Records
Create synthetic patient records, appointments, and medical data for system testing. See healthcare example.
Log Generation
Produce application logs, security events, and audit trails for testing log analysis systems. See logging example.
Getting Help
Documentation
- Browse this documentation
- Check component-specific guides
- Review examples and tutorials
Interactive Help
- FloodGate Swagger UI at http://localhost:5000/swagger
- DataFloodEditor tooltips and help menu
- DataFlood CLI --help flag
Troubleshooting
- See Troubleshooting Guide
- Check Best Practices
- Review error messages
Version Information
- Current Version: 1.0.0
- Documentation Updated: August 2025
- .NET Version: 9.0
- License: Commercial (contact for licensing)
v1.0, all documentation copyright SmallMinds 2025