How does Augmented Reality work?

December 3, 2020

Augmented Reality is the modification of a real-life environment by the addition of sound, visual elements, or other sensory stimuli.

Once confined to academic experimentation and niche industrial use cases, Augmented Reality has since become a widespread consumer phenomenon and found its way into many popular apps.

AR App for Education

There is now a wide range of Augmented Reality experiences on the market, including AR games, AR marketing campaigns, and AR furniture apps. You might have encountered some of them already: Pokémon Go and Snapchat filters are just two examples of AR in action.

But how exactly does AR work? How does a smartphone scan an object and display a virtual image on its screen?


Every AR system is composed of components that fall into two overarching categories:

  1. Hardware
  2. Software

The diagram below gives a simplified view of how the components of your smartphone fit together to produce an AR experience.

Smartphone Augmented Reality Flowchart


AR Hardware consists of the various physical devices and sensors required to make sense of an environment, create and render a digital scene, and display the resulting visual information to your eyes.

The purpose of the hardware is to acquire data, process the data, and display it.

Breakdown of a Smartphone Camera

Most consumer AR experiences are accessed through a smartphone camera. To run AR, most smartphones also contain the following:

Input Sensors:

Your average smartphone is chock-full of sensors. These sensors gather information from the environment that guides and aids various apps and processes.

Various smartphone sensors

In particular, the following sensors are instrumental for AR:

  • Depth Sensor: Measures depth and distance
  • Gyroscope: Measures angular or rotational movement
  • Accelerometer: Measures acceleration (changes in velocity), which lets the phone track motions such as shaking, tilting, and swinging
  • Light Sensor: Measures the amount of ambient light present
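
To make the gyroscope and accelerometer entries a little more concrete, here is a minimal Python sketch of one common way their readings can be combined into a single tilt estimate, a complementary filter. This is an illustration only, not how any particular phone or AR SDK does it: the function names, the axis convention (gravity along the z axis when the phone lies flat), and the 0.98 blend weight are all assumptions made for the example.

```python
import math


def tilt_from_accelerometer(ax, ay, az):
    """Estimate tilt (degrees) from the direction of gravity.

    Assumes the device is roughly at rest, so the accelerometer
    reading is dominated by gravity. Axis convention is hypothetical.
    """
    return math.degrees(math.atan2(ax, math.sqrt(ay * ay + az * az)))


def integrate_gyroscope(angle_deg, rate_deg_per_s, dt):
    """Advance an angle by integrating the gyroscope's angular rate over dt seconds."""
    return angle_deg + rate_deg_per_s * dt


def complementary_filter(angle_deg, rate_deg_per_s, accel_angle_deg, dt, alpha=0.98):
    """Blend the gyroscope estimate (smooth, but drifts over time)
    with the accelerometer estimate (noisy, but does not drift)."""
    gyro_angle = integrate_gyroscope(angle_deg, rate_deg_per_s, dt)
    return alpha * gyro_angle + (1 - alpha) * accel_angle_deg


# Example: phone lying flat, then rotating at 90 degrees/second for half a second.
flat = tilt_from_accelerometer(0.0, 0.0, 9.81)
turned = integrate_gyroscope(flat, 90.0, 0.5)
```

Calling `complementary_filter` once per sensor sample would keep the quick response of the gyroscope while the accelerometer slowly corrects its drift.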

Some devices (e.g. some of Apple’s newer iPhone and iPad models) employ LIDAR sensors. LIDAR stands for light detection and ranging. It measures distance (ranging) by illuminating a target with laser light and timing how long the reflection takes to return to a sensor. Differences in laser return times and wavelengths can then be used to build a digital 3-D representation of the target.
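
The ranging step comes down to simple arithmetic: light travels to the target and back, so the distance is half the round-trip time multiplied by the speed of light. Here is a small Python sketch of that calculation. The function name and the 20-nanosecond example are assumptions for illustration; real sensors add calibration and noise handling that is not shown here.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in a vacuum


def distance_from_round_trip(round_trip_seconds):
    """Distance to the target in meters, given the laser pulse's round-trip time.

    The pulse covers the distance twice (there and back), hence the division by 2.
    """
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2


# A pulse that returns after 20 nanoseconds hit something roughly 3 meters away.
distance_m = distance_from_round_trip(20e-9)
```

The tiny time scales involved (nanoseconds for objects a few meters away) are one reason this kind of sensing needs dedicated hardware.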

Processing Hardware:

Processor (CPU): The processor is akin to the ‘brain’ of the phone. It receives and executes instructions, performing billions of calculations per second. The processor’s performance directly affects the speed of your phone and whether it can run AR experiences well.

Graphics Processing Unit (GPU): The GPU helps to render the visual elements of a phone’s display. In an AR experience, the GPU is responsible for creating and rendering digital content on your screen. The more capable the GPU, the higher the resolution and the smoother the motion.

Output Hardware:

In the case of smartphones, the output hardware is the smartphone’s own display. Other types of output hardware include heads-up displays, AR glasses, and projectors.



An AR Software Development Kit (SDK) is the core software engine that powers the development of AR experiences. Its role is to fuse digital content and information with the real world.

AR platforms, or Software Development Kits, are required for AR content to run in applications or browsers. SDKs typically target specific frameworks and hardware. In this article, I discuss two of the most widely used SDKs, though other SDKs exist as well.

ARCore and ARKit are Google’s and Apple’s respective Augmented Reality development platforms. Both work by tracking the position of the mobile device as it moves and building their own understanding of the real world around it. They provide three fundamental technologies that allow developers to build augmented reality experiences.

  1. Motion Tracking: Allows your phone to understand its position relative to the real world. The phone camera identifies interesting points, called features, and tracks how these points move over time. Combined with data from the motion sensors, the software can determine the position and orientation of the phone as it moves through space. This process is called simultaneous localization and mapping, also known as SLAM.
  2. Environmental Understanding: Allows your phone to detect the size and location of various surfaces and how they are oriented (vertical, horizontal, angled). ARCore/ARKit looks for clusters of feature points that appear to lie on the same surface and makes these surfaces available to the app as planes.
  3. Light Estimation: Allows your phone to estimate the environment’s current lighting conditions. Virtual objects can then be lit to match those conditions, which enhances realism.
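
To give a feel for the last two ideas, here is a deliberately simplified Python sketch. It is not ARCore's or ARKit's actual algorithm or API: every function name, the height-bucketing approach, the 2 cm tolerance, and the luminance weights are assumptions made for illustration. The first function looks for a cluster of feature points at a similar height and reports it as a horizontal plane; the other two estimate overall brightness from camera pixels and dim a virtual object's color to match.

```python
from collections import defaultdict


def find_horizontal_plane(points, tolerance=0.02, min_points=3):
    """Find the largest group of feature points that share a similar height.

    points: list of (x, y, z) positions in meters, with y pointing up.
    Returns a dict describing the plane, or None if no group is large enough.
    Note: points sitting right on a bucket boundary may be split across
    two buckets; a real system would use a more robust clustering method.
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        buckets[round(y / tolerance)].append((x, y, z))
    best = max(buckets.values(), key=len, default=[])
    if len(best) < min_points:
        return None
    xs = [p[0] for p in best]
    ys = [p[1] for p in best]
    zs = [p[2] for p in best]
    return {
        "height": sum(ys) / len(ys),
        "extent_x": max(xs) - min(xs),
        "extent_z": max(zs) - min(zs),
        "point_count": len(best),
    }


def estimate_ambient_intensity(pixels):
    """Average brightness of a camera frame, scaled to the range 0.0 to 1.0.

    pixels: non-empty list of (r, g, b) tuples with values from 0 to 255.
    Uses the Rec. 709 luminance weights as a rough approximation.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    total = sum(0.2126 * r + 0.7152 * g + 0.0722 * b for r, g, b in pixels)
    return total / (255 * len(pixels))


def shade(color, intensity):
    """Scale a virtual object's (r, g, b) color by the estimated light intensity."""
    return tuple(round(channel * intensity) for channel in color)


# Four feature points near the floor plus one stray point on a table.
feature_points = [
    (0.0, 0.000, 0.0),
    (1.0, 0.004, 0.0),
    (0.0, 0.002, 2.0),
    (1.0, 0.001, 2.0),
    (0.5, 0.750, 1.0),
]
floor = find_horizontal_plane(feature_points)
dimmed = shade((200, 100, 50), 0.5)
```

In this example the four floor points are grouped into one plane roughly 1 m by 2 m in size, while the single table point is ignored because it has no neighbors at the same height.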


Finally, the AR experience is usually viewed through a browser window, in the case of WebAR, or through a downloaded application.
