Updated for 2026

MakeThisVid and D-ID solve different AI video problems. MakeThisVid generates original AI video scenes from a text prompt or product photo — synthesized footage with ambient audio baked in, built for ads and social content. D-ID specializes in AI talking portrait videos — animating a still photo of a person to lip-sync to a script or audio track. D-ID also offers a generative avatar presenter feature.

MakeThisVid vs D-ID: Scene Generation vs AI Talking Portrait Video

D-ID is best known for its core technology, the talking portrait: taking a still photograph of a person and animating it to lip-sync to an audio recording or script. This capability is used in digital memorial applications, personalized video creation, and avatar presenter videos for marketing and training. D-ID also offers broader AI avatar presenter functionality similar to other platforms in that category. D-ID offers a 14-day free trial (no credit card required) with limited video minutes and a full-screen watermark on exports; paid plans start at around $5.90/month for personal use, scaling up for commercial rights and higher volume.

MakeThisVid does something different: it generates original video scenes, meaning synthesized footage of environments, products, and atmospheric moments, from a text description or photo. No portraits, no lip-sync, no avatars. The output is a downloadable MP4 with audio built into every clip and no watermark, either 6 seconds at 720p or 8 seconds at 1080p, purpose-built for paid ads and social media. Entry starts at $2.99 for one credit with no subscription required.

MakeThisVid vs D-ID: 7 criteria

Criterion | MakeThisVid | D-ID
Output type | AI scene generator (synthesized footage) | Avatar/talking-head generator (not scene generation)
Clip length | 6s at 720p or 8s at 1080p (fixed short-form format) | Narration-paced; length follows the audio script
Audio | Always included in every render; no upgrade needed | AI voice + lip-sync to a still photo
Watermark | No watermark on any pack or plan | Watermark on trial and entry tier; no watermark on higher paid tiers
Commercial use | Included on every pack and plan from $2.99 | Requires a paid plan above the entry tier
Entry price | From $2.99 (one-time, no subscription required) | 14-day free trial; paid plans start around $5.90/mo
Best for | Short-form ads, social, branded scene clips | Talking-head avatars and lip-sync from a still photo

How to Use MakeThisVid

From prompt to downloadable MP4, ready to deploy.

  1. Core technology difference

    D-ID's core technology is photo animation: taking a still image of a person and animating it to lip-sync to audio. MakeThisVid generates AI video scenes from text descriptions or product photos, producing synthesized footage of places, objects, and moments. These are fundamentally different capabilities.

  2. Pricing comparison

    MakeThisVid: one-time packs from $2.99 (1 credit) with no subscription required; subscriptions at $19.99/mo (10 credits), $49.99/mo (30 credits), $79.99/mo (60 credits). D-ID: 14-day free trial with limited video minutes; paid plans start around $5.90/month for personal use — commercial rights and removal of watermarks require higher tiers. D-ID Studio measures usage in video minutes, deducted per render and rounded up to the nearest 15 seconds.

  3. Talking portrait vs scene footage

    D-ID's output is a moving, speaking portrait of a person — realistic lip-sync animation on a still photo. MakeThisVid's output is a synthesized scene — an environment or product moment in motion. Neither tool can produce what the other does.

  4. Personalized video use case

    D-ID's photo-to-video technology enables personalized video at scale — creating individual video messages using a speaker's photo and a personalized script. This is a sales and marketing use case (video prospecting, personalized outreach) that MakeThisVid doesn't support.

  5. Advertising and social media creative

    MakeThisVid's 6-second and 8-second scene clips are purpose-built for paid ad placements and social media creative — audio always on, no watermark, commercial use on every pack. D-ID's output — talking portraits and avatar presenters — is better suited for personalized video messages, training content, and explainer ads featuring a presenter.
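
The pricing figures in item 2 can be turned into a quick side-by-side calculation. The sketch below is illustrative only: it computes the price per credit for each MakeThisVid option quoted above and shows how rounding up to 15-second increments affects billed D-ID Studio time. The function names and the offers dictionary are hypothetical, and the rounding is assumed to be a plain ceiling applied to each render; check each vendor's current pricing page before relying on the numbers.

```python
import math

# Prices and credit counts as quoted in the pricing comparison above.
# The dictionary layout is a hypothetical convenience, not a vendor API.
MAKETHISVID_OFFERS = {
    "one-time pack": (2.99, 1),
    "subscription, 10 credits/mo": (19.99, 10),
    "subscription, 30 credits/mo": (49.99, 30),
    "subscription, 60 credits/mo": (79.99, 60),
}


def cost_per_credit(price: float, credits: int) -> float:
    """Return the price of a single credit, rounded to cents."""
    return round(price / credits, 2)


def billed_seconds(render_seconds: float, increment: int = 15) -> int:
    """Round a render's duration up to the next billing increment.

    Assumes a plain per-render ceiling, as described for D-ID Studio above.
    """
    return math.ceil(render_seconds / increment) * increment


if __name__ == "__main__":
    for name, (price, credits) in MAKETHISVID_OFFERS.items():
        print(f"{name}: ${cost_per_credit(price, credits):.2f} per credit")

    # A 20-second talking-portrait render is billed as 30 seconds
    # under the assumed rounding rule.
    print(billed_seconds(20))
```

Under these assumptions the per-credit price falls from $2.99 on the one-time pack to about $1.33 on the 60-credit subscription, while a 20-second D-ID render consumes 30 seconds of the plan's video minutes.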

Who Uses MakeThisVid for This

MakeThisVid for product and scene-based ad creative

Original synthesized footage of products, environments, and lifestyle moments — for TikTok, Instagram, and Meta paid ads. Visual-first creative where the scene does the work. Audio is built into every clip at no extra cost.

D-ID for personalized video outreach

Sales teams use D-ID to send personalized video messages in which a presenter's photo is animated with a custom script for each prospect. This is a specific sales enablement use case that depends on D-ID's core technology.

D-ID for talking portrait content

Animating historical photos, creating digital memorial videos, or producing avatar presenter content from a still image. D-ID's talking portrait capability has applications across entertainment, education, and enterprise communications.

Frequently Asked Questions

D-ID's primary capabilities are animating still photos of people to lip-sync with audio and producing avatar presenter videos. It does not synthesize environmental scenes or product footage from visual descriptions the way MakeThisVid does.
MakeThisVid cannot animate a portrait to lip-sync with audio. It generates AI scene videos from descriptions and product photos; of the two platforms, only D-ID offers talking portrait technology.
MakeThisVid is better suited for product scene ads — cinematic product footage from a photo or description, audio included, no watermark. D-ID could work for presenter-style product walkthrough videos, but that's a more complex production workflow than MakeThisVid's photo-to-video mode.
D-ID offers a 14-day free trial with limited video minutes and no credit card required, but exports carry a full-screen watermark. MakeThisVid has no free tier — credits are required for all renders, starting at $2.99 for one credit. Failed renders are automatically refunded.
MakeThisVid's 6-second and 8-second cinematic clips are purpose-built for short-form paid ad placements — audio baked in, no watermark, commercial use included. D-ID's talking portrait and avatar content can work in certain ad formats but is not optimized for the visual-first, short-form ad creative that performs best on TikTok and Instagram Reels.
MakeThisVid does not add a watermark. Every output is a clean downloadable MP4, regardless of which pack or plan you use. Commercial use is included on all packs and plans.

Generate original AI video scenes

Describe the moment or drop a product photo. About a minute to a downloadable MP4 — audio included, no watermark, commercial use licensed.

Try MakeThisVid