
👋 I'm Patrick the AI Engineer

I am an adaptable professional who enjoys creative problem-solving and collaborative work. I integrate easily into diverse teams and stay flexible in both approach and execution. Working across multiple projects has sharpened my ability to change strategy without losing team cohesion or project momentum. I keep learning and look for unconventional solutions, whether I am working independently or as part of a team.
Building type-safe AI applications with structured outputs using the AI SDK and Zod
How to Build a Video Scene Detector that Runs in Your Browser
Create a voice-assisted CLI tool that allows you to talk to a computer
How to extract audio from video and generate transcripts with timestamps and speaker diarization
How to Build a Real-Time Video Captioning App in the Browser that helps with Accessibility
How to place text between a foreground subject and background in videos using an AI model that runs in the browser
How to Build a Virtual Outfit Try-On App with Gemini 2.5 Flash Image Generation

Latest Tutorials

Guides and tutorials on design and development

Build Karaoke-Style Video Captions in the Browser with Whisper

Create word-level, karaoke-style captions entirely in the browser using Whisper on WebGPU/WASM, then burn them into videos with Mediabunny

Extract Structured Data From PDF Images With Vision Models

Render PDF pages as screenshots and use OpenAI's vision models with structured outputs to extract typed invoice data from scanned documents and complex layouts

Extract Structured Data From PDFs With AI SDK

Build a PDF invoice parser that extracts text server-side and uses LLM structured outputs to convert unstructured invoice data into typed JSON objects
Copyright © 2025