Production-Grade AI Data Engineering: From Raw Multimodal Data to Active Agents