MakeThisVid and D-ID solve different AI video problems. MakeThisVid generates original AI video scenes from a text prompt or product photo — synthesized footage with ambient audio baked in, built for ads and social content. D-ID specializes in AI talking portrait videos — animating a still photo of a person to lip-sync to a script or audio track. D-ID also offers a generative avatar presenter feature.
MakeThisVid vs D-ID: Scene Generation vs AI Talking Portrait Video
D-ID is best known for its core technology, the talking portrait: taking a still photograph of a person and animating it to lip-sync to an audio recording or script. This capability is used in digital memorial applications, personalized video creation, and avatar presenter videos for marketing and training. D-ID also offers broader AI avatar presenter functionality similar to other platforms in that category. D-ID offers a 14-day free trial (no credit card required) with limited video minutes and a full-screen watermark on exports; paid plans start at around $5.90/month for personal use, scaling up for commercial rights and higher volume.
MakeThisVid does something different: it generates original video scenes, meaning synthesized footage of environments, products, and atmospheric moments from a text description or photo. No portraits, no lip-sync, no avatars. The output is a downloadable MP4 with audio built into every clip and no watermark, either 6 seconds at 720p or 8 seconds at 1080p, purpose-built for paid ads and social media. Entry starts at $2.99 for one credit with no subscription required.
MakeThisVid vs D-ID: 7 criteria
| Criterion | MakeThisVid | D-ID |
|---|---|---|
| Output type | AI scene generator (synthesized footage) | Avatar/talking-head generator (not scene generation) |
| Clip length | 6s at 720p or 8s at 1080p — fixed short-form format | Narration-paced; length follows the audio script |
| Audio | Always included in every render — no upgrade needed | AI voice + lip-sync to a still photo |
| Watermark | No watermark on any pack or plan | Watermark on trial and entry tier; no watermark on higher paid tiers |
| Commercial use | Included on every pack and plan from $2.99 | Requires a paid plan above the entry tier |
| Entry price | From $2.99 (one-time, no subscription required) | 14-day free trial; paid plans start around $5.90/mo |
| Best for | Short-form ads, social, branded scene clips | Talking-head avatars and lip-sync from a still photo |
How to Use MakeThisVid
From prompt to downloadable MP4, ready to deploy.
Core technology difference
D-ID's core technology is photo animation: taking a still image of a person and animating it to lip-sync to audio. MakeThisVid generates AI video scenes from text descriptions or product photos, producing synthesized footage of places, objects, and moments. These are fundamentally different capabilities.
Pricing comparison
MakeThisVid: one-time packs from $2.99 (1 credit) with no subscription required; subscriptions at $19.99/mo (10 credits), $49.99/mo (30 credits), $79.99/mo (60 credits). D-ID: 14-day free trial with limited video minutes; paid plans start around $5.90/month for personal use — commercial rights and removal of watermarks require higher tiers. D-ID Studio measures usage in video minutes, deducted per render and rounded up to the nearest 15 seconds.
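To make the pricing figures above easier to compare, here is a minimal Python sketch of the arithmetic. It uses only the prices and credit counts quoted above, plus the stated rule that D-ID Studio rounds each render up to the nearest 15 seconds. The function names `cost_per_credit` and `billed_seconds` are hypothetical helpers for illustration, not part of either product, and how each vendor actually computes a bill is an assumption here.

```python
import math

# Prices and credit counts as quoted in the pricing comparison above.
MAKETHISVID_PLANS = {
    "one-time pack (1 credit)": (2.99, 1),
    "subscription (10 credits)": (19.99, 10),
    "subscription (30 credits)": (49.99, 30),
    "subscription (60 credits)": (79.99, 60),
}


def cost_per_credit(price: float, credits: int) -> float:
    """Effective price of one credit, rounded to the nearest cent."""
    return round(price / credits, 2)


def billed_seconds(render_seconds: float, increment: int = 15) -> int:
    """Round a render's length up to the next multiple of `increment` seconds.

    Assumption: this mirrors the "rounded up to the nearest 15 seconds"
    rule described for D-ID Studio; the real billing logic may differ.
    """
    return math.ceil(render_seconds / increment) * increment


if __name__ == "__main__":
    for name, (price, credits) in MAKETHISVID_PLANS.items():
        print(f"{name}: ${cost_per_credit(price, credits):.2f} per credit")

    # A 22-second render would be deducted as 30 seconds under the 15-second rule.
    print(billed_seconds(22))
```

Under these figures the effective per-credit price falls as the plan size grows, while a render billed in 15-second increments is always charged for at least its full length.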
Talking portrait vs scene footage
D-ID's output is a moving, speaking portrait of a person — realistic lip-sync animation on a still photo. MakeThisVid's output is a synthesized scene — an environment or product moment in motion. Neither tool can produce what the other does.
Personalized video use case
D-ID's photo-to-video technology enables personalized video at scale — creating individual video messages using a speaker's photo and a personalized script. This is a sales and marketing use case (video prospecting, personalized outreach) that MakeThisVid doesn't support.
Advertising and social media creative
MakeThisVid's 6-second and 8-second scene clips are purpose-built for paid ad placements and social media creative — audio always on, no watermark, commercial use on every pack. D-ID's output — talking portraits and avatar presenters — is better suited for personalized video messages, training content, and explainer ads featuring a presenter.
Who Uses MakeThisVid for This
MakeThisVid for product and scene-based ad creative
Original synthesized footage of products, environments, and lifestyle moments — for TikTok, Instagram, and Meta paid ads. Visual-first creative where the scene does the work. Audio is built into every clip at no extra cost.
D-ID for personalized video outreach
Sales teams use D-ID to send personalized video messages, animating a presenter's photo with a custom script for each prospect. This is a specific sales enablement use case built on D-ID's core technology.
D-ID for talking portrait content
Animating historical photos, creating digital memorial videos, or producing avatar presenter content from a still image. D-ID's talking portrait capability has applications across entertainment, education, and enterprise communications.
Generate original AI video scenes
Describe the moment or drop a product photo. About a minute to a downloadable MP4 — audio included, no watermark, commercial use licensed.
Try MakeThisVid