I build intelligent document processing systems and AI-powered applications, specializing in OCR, PDF parsing, and synthetic data generation for educational technology and document automation.
Advanced optical character recognition systems using multiple backends (DeepSeek OCR, Mathpix, Gemini) with GPU acceleration and intelligent routing.
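As a rough illustration of what routing across OCR backends can look like, here is a minimal sketch. Everything in it is an assumption made for the example: the `Page` fields, the routing rules in `choose_backend`, and the stub functions, which stand in for real DeepSeek OCR, Mathpix, and Gemini clients and do not call any of them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Page:
    """Hypothetical page descriptor; the fields are illustrative only."""
    image_path: str
    has_math: bool = False
    is_handwritten: bool = False


# Stub backends: placeholders for real OCR clients, which are not shown here.
def deepseek_stub(page: Page) -> str:
    return f"[deepseek] text from {page.image_path}"


def mathpix_stub(page: Page) -> str:
    return f"[mathpix] text from {page.image_path}"


def gemini_stub(page: Page) -> str:
    return f"[gemini] text from {page.image_path}"


BACKENDS: Dict[str, Callable[[Page], str]] = {
    "deepseek": deepseek_stub,
    "mathpix": mathpix_stub,
    "gemini": gemini_stub,
}


def choose_backend(page: Page) -> str:
    """Pick a preferred backend from simple page features (assumed rules)."""
    if page.has_math:
        return "mathpix"
    if page.is_handwritten:
        return "gemini"
    return "deepseek"


def run_ocr(
    page: Page,
    backends: Optional[Dict[str, Callable[[Page], str]]] = None,
) -> Tuple[str, str]:
    """Try the preferred backend first, then fall back to the others."""
    backends = BACKENDS if backends is None else backends
    first = choose_backend(page)
    order = [first] + [name for name in backends if name != first]
    for name in order:
        try:
            return name, backends[name](page)
        except Exception:
            continue  # this backend failed or is missing; try the next one
    raise RuntimeError("all OCR backends failed")
```

Keeping the routing decision in `choose_backend` and the fallback loop in `run_ocr` means the rules can change without touching the error handling.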
Custom PDF parsing and rendering engine built from scratch with Cairo graphics, FreeType fonts, and hierarchical document structure analysis.
Generating synthetic training data for OCR models that mimics real-world exam papers and documents, complete with ground-truth bounding boxes.
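To show why synthetic pages come with ground-truth boxes for free, here is a minimal sketch of the idea: the generator decides where each line goes, so the label is known before anything is rendered. The fixed character width, page size, jitter ranges, and the `Box` and `layout_page` names are assumptions for illustration; a real pipeline would measure text with an actual font and render it to an image.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """One ground-truth label: the text and its pixel rectangle."""
    text: str
    x: int
    y: int
    w: int
    h: int


def layout_page(
    lines: List[str],
    page_w: int = 800,
    margin: int = 40,
    char_w: int = 10,   # assumed fixed-width glyphs, for simplicity
    line_h: int = 24,
    seed: int = 0,
) -> List[Box]:
    """Place lines top to bottom with small random jitter and record boxes."""
    rng = random.Random(seed)  # seeded so the same page can be regenerated
    boxes: List[Box] = []
    y = margin
    for text in lines:
        # Clamp the width so the box stays inside the page margins.
        w = min(len(text) * char_w, page_w - 2 * margin)
        x = margin + rng.randint(0, 5)   # slight horizontal jitter
        boxes.append(Box(text, x, y, w, line_h))
        y += line_h + rng.randint(4, 12)  # variable line spacing
    return boxes


if __name__ == "__main__":
    sample = ["1. Solve for x: 2x + 3 = 11", "2. Define photosynthesis."]
    for box in layout_page(sample):
        print(box)
```

The same `Box` records can drive both the renderer and the label file, so the image and its annotations cannot drift apart.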
Fine-tuning vision-language models with LoRA for parameter-efficient adaptation, with vLLM serving and Gemini API integration for intelligent document processing.
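For a sense of why LoRA is cheap: it freezes a weight matrix of shape d_out x d_in and trains two low-rank factors of shapes d_out x r and r x d_in instead. The short sketch below only does that arithmetic; the 4096 x 4096 layer size and rank 8 are example values, not figures from any particular model.

```python
from typing import Tuple


def lora_param_counts(d_out: int, d_in: int, rank: int) -> Tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix.

    full: training the whole d_out x d_in matrix.
    lora: training only the factors B (d_out x rank) and A (rank x d_in).
    """
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


if __name__ == "__main__":
    full, lora = lora_param_counts(4096, 4096, rank=8)  # example sizes only
    print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

With these example sizes the adapter trains 65,536 parameters against 16,777,216 for the full matrix, a factor of 256 fewer.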
Building end-to-end document intelligence systems with Flask backends, React frontends, and automated deployment pipelines for scalable processing.
Exam paper processing, automated question generation, tutoring platforms, and learning management systems.
Intelligent document processing, form extraction, contract analysis, and archival digitization.
OCR model development, synthetic data research, document intelligence algorithms, and open-source tools.