Apple's SHARP can turn a photo into a 3D scene in under a second

Apple's AI developments have been much mocked, but could the Cupertino giant emerge as a surprise leader in AI-driven 3D? A host of tech companies are researching tools for simpler, faster creation of 3D scenes, environments and digital twins, and Apple's just made a pretty big leap.

SHARP is an experimental AI model that can quickly turn 2D images into 3D gaussian splats that can then be viewed on Vision Pro. Some now think that through a combination of its hardware and software, Apple could have the edge in developing AI-driven 3D workflows.

Instead of traditional polygons, gaussian splatting uses millions of fuzzy 3D ellipsoids with defined position, size, orientation, colour and transparency to represent and render intricate 3D scenes in real-time so that they look highly accurate from a particular viewpoint.
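To make the idea of a "fuzzy ellipsoid" concrete, here is a minimal Python sketch of a single splat and the opacity it contributes at a point in space. It is an illustration only, not SHARP's code: the `Splat` class, its field names and the `alpha_at` helper are hypothetical, and real renderers project each splat onto the screen and blend many of them together rather than sampling points in 3D.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Splat:
    """One 3D Gaussian (hypothetical layout, for illustration only)."""
    position: Tuple[float, float, float]          # centre of the ellipsoid
    scale: Tuple[float, float, float]             # spread along each local axis
    rotation: Tuple[float, float, float, float]   # orientation quaternion (w, x, y, z)
    colour: Tuple[float, float, float]            # RGB, each in [0, 1]
    opacity: float                                # peak opacity at the centre, in [0, 1]


def rotation_matrix(q):
    """Convert a quaternion (w, x, y, z) into a 3x3 rotation matrix."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def alpha_at(splat, point):
    """Opacity the splat contributes at a 3D point: a Gaussian falloff."""
    R = rotation_matrix(splat.rotation)
    offset = [p - c for p, c in zip(point, splat.position)]
    # Express the offset in the splat's own axes (multiply by R transposed).
    local = [sum(R[r][c] * offset[r] for r in range(3)) for c in range(3)]
    # Squared distance measured in units of the splat's spread on each axis.
    m = sum((v / s) ** 2 for v, s in zip(local, splat.scale))
    return splat.opacity * math.exp(-0.5 * m)
```

A scene is simply a very long list of these records. The splat is at its most opaque in the centre and fades smoothly with distance, which is what gives the technique its soft, painterly look up close.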

Most techniques require lots – sometimes hundreds – of images of a scene from different angles (see our pick of the best 3D scanners). But Apple’s SHARP uses AI to predict the scene from just one photo in under a second on a standard GPU.

Apple trained SHARP on swathes of synthetic and real-world data to teach it to identify frequent depth and geometry patterns so it can predict the position and appearance of 3D Gaussians via a single forward pass through a neural network.

According to the research paper, distances and scale remain consistent in real-world terms. The representation is metric, with absolute scale, supporting metric camera movements.
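Why does metric scale matter? A rough sketch, assuming a simple pinhole camera rather than anything taken from the paper: when scene units are real metres, sliding the virtual camera 10cm sideways shifts each object on screen by an amount that depends on its true distance. The function names and the focal length in pixels below are assumptions made for illustration.

```python
def project(point, camera_position, focal_px):
    """Pinhole projection of a 3D point (in metres) for a camera looking down +z."""
    x = point[0] - camera_position[0]
    y = point[1] - camera_position[1]
    z = point[2] - camera_position[2]
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal_px * x / z, focal_px * y / z)


def parallax_px(point, shift_m, focal_px):
    """Horizontal on-screen shift when the camera moves shift_m metres sideways."""
    before = project(point, (0.0, 0.0, 0.0), focal_px)
    after = project(point, (shift_m, 0.0, 0.0), focal_px)
    return before[0] - after[0]


# An object 2m away moves twice as far on screen as one 4m away.
near = parallax_px((0.0, 0.0, 2.0), 0.1, 1000.0)
far = parallax_px((0.0, 0.0, 4.0), 0.1, 1000.0)
```

If the scale were arbitrary, a headset would have no way of knowing how much parallax a real head movement of a few centimetres should produce.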

The compromise is that SHARP only accurately renders nearby viewpoints, not unseen parts of the scene, which means users can't venture far from that viewpoint.

The code is available on GitHub, and people have been testing out the tool and sharing the results on social media. Others are wondering why Apple chose to illustrate the model with an image of a horse that appears

This week also saw the launch of SpAItial AI's Echo, which can turn 2D images into editable 3D worlds to which users can apply different styles. The company hopes to add full prompt-based scene manipulation, allowing users to add, remove, rearrange or restyle objects.

Joe Foley
Freelance journalist and editor

Joe is a regular freelance journalist and editor at Creative Bloq. He writes news, features and buying guides and keeps track of the best equipment and software for creatives, from video editing programs to monitors and accessories. A veteran news writer and photographer, he now works as a project manager at the London and Buenos Aires-based design, production and branding agency Hermana Creatives. There he manages a team of designers, photographers and video editors who specialise in producing visual content and design assets for the hospitality sector. He also dances Argentine tango.
