Building a Golf Simulator From Two Phones and 3D Projection Math

How we’re experimenting with dual-phone capture, projection matrices, and physics models to simulate ball flight without a single radar unit.

When people hear “golf simulator,” they usually picture a launch monitor and a projector: radar, proprietary cameras, and a big box that costs as much as a car. Our goal with Penguin is different.

We're exploring how far we can go with two phones, good calibration, and honest math. Instead of asking hardware to do everything, we lean on 3D reconstruction, projection matrices, and physics to estimate ball flight from the same devices coaches already use on the range.

This article is a look under the hood at how we think about a "two-phone simulator": what data we need, how 3D projection comes into play, and what's realistic to simulate without a radar unit watching every shot.

What we mean by a “two-phone simulator”

Before diving into math, it's worth clarifying what we're aiming for and what we're not.

Our target experience looks like this:

  • You set up two phones on tripods around the hitting area (e.g., one down-the-line, one face-on or offset).
  • Penguin guides you through a quick calibration flow so we understand the cameras and the space.
  • You hit a ball. From the two videos, we estimate club motion, impact geometry, and the early ball trajectory.
  • We feed those estimates into a physics model and show you a simulated ball flight and landing on a virtual fairway or green.

We're not trying to perfectly recreate every dimple or spin component. We are trying to generate consistent, useful approximations that help coaches and players see how mechanics translate into flight — using only the devices they already carry.

From pixels to rays: the role of projection matrices

At the heart of this whole approach is a simple idea from projective geometry: a pinhole camera maps 3D points in space onto a 2D image using a projection matrix.

In practice, a camera's projection matrix combines two kinds of information:

  • Intrinsics: focal length, principal point, and distortion parameters — how the camera sees the world.
  • Extrinsics: where the camera is positioned and how it's oriented relative to the scene.

With those in hand, a pixel isn't just a dot on a screen anymore. It represents a ray in 3D space: a set of possible points that could have projected to that pixel.

For a single phone, that ray never collapses to a single point. With two phones, we can intersect rays from both cameras and recover approximate 3D positions. That's the core of how we turn dual-phone video into trajectories and launch estimates.
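To make that concrete, here's a minimal sketch of the pinhole model in Python, using made-up intrinsics and a simple camera pose rather than values from a real phone calibration:

```python
import numpy as np

# Hypothetical intrinsics for one phone: focal length and principal point, in pixels.
K = np.array([[1500.0,    0.0, 960.0],
              [   0.0, 1500.0, 540.0],
              [   0.0,    0.0,   1.0]])

# Extrinsics: rotation R and translation t describe where the camera sits in the world.
R = np.eye(3)                        # camera axes aligned with world axes (assumption)
t = np.array([[0.0], [0.0], [2.0]])  # camera two metres from the world origin

# Projection matrix P = K [R | t] maps homogeneous 3D points to pixels.
P = K @ np.hstack([R, t])

# Forward: project a 3D point (say, the ball on the tee) into the image.
X = np.array([0.1, 0.0, 0.0, 1.0])   # homogeneous world point, metres
u = P @ X
pixel = u[:2] / u[2]                 # divide by depth to get pixel coordinates

# Backward: the same pixel defines a ray in world space, not a single point.
ray_cam = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
ray_world = R.T @ ray_cam            # ray direction in world coordinates
camera_centre = -R.T @ t.flatten()   # every point camera_centre + s * ray_world projects to that pixel
```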

Step 1: dual-phone capture and calibration

A simulator built on cameras lives or dies on calibration. Before we can simulate anything, we need to understand:

  • each phone's intrinsic parameters, and
  • how the two phones are positioned relative to each other.

Our pipeline starts by treating each phone as a pinhole camera and estimating its intrinsics (or loading them from prior calibration). Then we estimate the extrinsic transform between the two phones using shared observations — patterns, mats, markers, or the environment.

The end result is a pair of calibrated cameras with known projection matrices. Once that's in place, every frame of video becomes a structured 3D measurement, not just a picture.
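As a sketch of what that looks like in code, here's one way to do it with OpenCV, assuming corner detections from a shared calibration pattern have already been collected in both views (the names below are placeholders, not our production pipeline):

```python
import cv2
import numpy as np

def calibrate_phone_pair(obj_pts, img_pts_1, img_pts_2, K1, d1, K2, d2, image_size):
    """Estimate phone 2's pose relative to phone 1 and build both projection matrices.

    obj_pts: per-view (N, 3) pattern points in the calibration target's coordinates.
    img_pts_1 / img_pts_2: matching (N, 2) detections of those points in each phone's frames.
    K1, d1, K2, d2: per-phone intrinsics and distortion from earlier single-camera calibration.
    """
    flags = cv2.CALIB_FIX_INTRINSIC  # trust the per-phone intrinsics; solve only for relative pose
    rms, _, _, _, _, R, T, _, _ = cv2.stereoCalibrate(
        obj_pts, img_pts_1, img_pts_2,
        K1, d1, K2, d2, image_size, flags=flags)

    # Treat phone 1 as the world origin; phone 2 is rotated by R and translated by T from it.
    P1 = K1 @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K2 @ np.hstack([R, T])
    return P1, P2, rms  # rms reprojection error is a quick sanity check on the calibration
```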

Step 2: detecting club, ball, and impact geometry

Ball flight starts at impact, so that's where our measurement stack focuses. On each frame near impact, we aim to recover:

  • Club pose: 3D position and orientation of the shaft and head.
  • Club velocity: how the club head is moving just before and just after impact.
  • Ball position and early velocity: where the ball is in the first few frames after launch and how fast it's moving.

We get there by combining pose estimation (for body and club) with object tracking (for the ball) across both views. Given the calibrated cameras, each detected keypoint or ball location in image space can be triangulated into a 3D point in world space.

If we estimate those points across a handful of frames, we can also approximate 3D velocities: direction and speed for both club and ball.
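A simplified version of that triangulation-and-velocity step might look like this, assuming the two phones are frame-synchronised and the ball has already been detected and matched in both views (the function name is illustrative):

```python
import cv2
import numpy as np

def triangulate_ball_track(P1, P2, pixels_1, pixels_2, fps):
    """Recover 3D ball positions from matched detections in two views, then estimate velocity.

    pixels_1 / pixels_2: (N, 2) pixel positions of the ball in N consecutive frames, one per phone.
    fps: shared frame rate of the synchronised captures.
    """
    # OpenCV expects 2 x N arrays of pixel coordinates.
    pts_h = cv2.triangulatePoints(P1, P2, pixels_1.T.astype(float), pixels_2.T.astype(float))
    pts_3d = (pts_h[:3] / pts_h[3]).T          # (N, 3) world-space positions, one per frame

    # Finite differences over time give per-frame 3D velocity vectors.
    dt = 1.0 / fps
    velocities = np.diff(pts_3d, axis=0) / dt  # (N - 1, 3), in world units per second

    v0 = velocities[0]                          # earliest post-launch velocity estimate
    return pts_3d, v0, np.linalg.norm(v0)       # positions, initial velocity vector, ball speed
```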

Step 3: inferring launch conditions from motion

Traditional launch monitors directly measure ball speed, launch angle, spin, and sometimes detailed spin axis information. With two phones, we're instead inferring these quantities from:

  • the 3D motion of the club before and after impact,
  • the early 3D motion of the ball, and
  • the known geometry of the hitting area (mat, tee, ground).

Our goal isn't to perfectly replicate every launch metric, but to estimate a set of effective launch parameters:

  • Initial ball velocity: magnitude and direction of the ball's initial 3D velocity vector.
  • Launch angles: vertical and horizontal angles relative to a target line.
  • Spin proxies: features that correlate with spin, such as attack angle, dynamic loft, face-to-path, and how the ball accelerates in early frames.

Even when we can't fully reconstruct spin, we can often classify its type and relative magnitude (e.g., “high-draw spin,” “low-cut spin”) well enough to drive a useful flight model.
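For ball speed and launch angles, the mapping from an initial velocity estimate to launch numbers is straightforward geometry. Here's a sketch, assuming a world frame with Y pointing up and the target line along +Z:

```python
import numpy as np

def launch_from_velocity(v0):
    """Turn an initial 3D ball velocity (Y up, target line along +Z) into launch parameters."""
    v0 = np.asarray(v0, dtype=float)
    ball_speed = np.linalg.norm(v0)

    # Vertical launch angle: elevation of the velocity vector above the ground plane.
    ground_speed = np.hypot(v0[0], v0[2])
    launch_angle = np.degrees(np.arctan2(v0[1], ground_speed))

    # Horizontal launch direction: signed angle between the shot and the target line.
    launch_direction = np.degrees(np.arctan2(v0[0], v0[2]))

    return ball_speed, launch_angle, launch_direction

# Example: a triangulated initial velocity of roughly 70 m/s, launched upward and slightly right.
speed, vert, horiz = launch_from_velocity([2.0, 17.0, 68.0])
```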

Step 4: feeding a physics engine, not a black box

Once we have effective launch parameters, we plug them into a physics model rather than treating the simulator as a black box. At a high level, our flight engine simulates:

  • Gravity: pulling the ball down with a constant acceleration.
  • Drag: air resistance that slows the ball, especially at higher speeds.
  • Lift (from spin): a simplified model of how backspin and sidespin create lift and curvature.
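A minimal sketch of that flight engine, using explicit Euler integration and placeholder aerodynamic coefficients rather than our tuned values:

```python
import numpy as np

MASS = 0.0459              # golf ball mass, kg
AREA = np.pi * 0.02135**2  # cross-sectional area from a 21.35 mm radius, m^2
RHO = 1.225                # air density at sea level, kg/m^3
CD, CL = 0.27, 0.15        # illustrative drag and lift coefficients, not tuned values
G = np.array([0.0, -9.81, 0.0])

def simulate_flight(v0, spin_axis, dt=0.01, max_t=12.0):
    """Integrate gravity + drag + spin lift until the ball comes back to the ground."""
    pos = np.zeros(3)
    vel = np.asarray(v0, dtype=float)
    spin_axis = np.asarray(spin_axis, dtype=float)
    path = [pos.copy()]
    for _ in range(int(max_t / dt)):
        speed = np.linalg.norm(vel)
        drag = -0.5 * RHO * CD * AREA * speed * vel                          # opposes motion
        lift = 0.5 * RHO * CL * AREA * speed**2 * np.cross(spin_axis, vel / speed)
        vel = vel + (G + (drag + lift) / MASS) * dt                          # explicit Euler step
        pos = pos + vel * dt
        path.append(pos.copy())
        if pos[1] < 0.0:                                                     # ball has landed
            break
    return np.array(path)

# Example: ~70 m/s ball speed at a ~14 degree launch; in this frame a spin axis of
# [-1, 0, 0] produces upward lift, i.e. backspin on a shot down the +Z target line.
trajectory = simulate_flight(v0=[0.0, 17.0, 68.0], spin_axis=[-1.0, 0.0, 0.0])
carry_metres = trajectory[-1][2]
```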

The key idea is that our simulator is transparent and tunable. As we collect more data from real-world shots (and, where possible, compare with radar or high-end systems), we can refine the mapping from observed motion to effective launch parameters and the constants in the flight model.

That means Penguin's simulator can get smarter over time without new hardware — just better math and better learned mappings.

Why projection math is so central to this approach

When you strip away the implementation details, almost every step comes back to projection:

  • using projection matrices to turn pixels into rays,
  • intersecting rays from two cameras to reconstruct 3D points,
  • differentiating 3D positions over time to get velocities, and
  • feeding those velocities into a simulator that maps them back into an imagined 3D world you can see on screen.

In other words, projection math doesn't just help us interpret the video. It's the bridge between:

  • what the cameras saw, and
  • the virtual environment we're rendering the ball into.

If we get the calibration and projection right, the rest of the system has a solid foundation to stand on.

Design constraints we have to respect

As fun as the math is, we keep a few very practical constraints in mind while designing the simulator experience:

  • Setup complexity: two-phone calibration can't feel like a lab experiment. We're working toward flows that feel more like "align to this guide, take a short clip" than "spend 20 minutes placing markers."
  • Robustness in messy environments: ranges and indoor facilities have reflections, other players, and noisy backgrounds. Our tracking and reconstruction need to handle imperfect footage.
  • Latency: a simulator that takes 30–60 seconds to render a shot isn't very fun. That’s where our edge-compute work comes in: doing as much as possible on or near the devices so feedback feels immediate.
  • Honesty about limits: there will be ranges, lighting conditions, or swing types where we can't get a trustworthy read. In those cases, we'd rather tell you that clearly than show you a confident-but-wrong ball flight.

What this means for coaches and programs

A two-phone simulator doesn't try to replace high-end launch monitors outright. Instead, it opens up a new space:

  • More reps, more contexts: simulate flight on the range, in a net bay, or in a temporary setup without special hardware.
  • Tighter connection between mechanics and outcome: see how specific 3D changes in club path, low point, and face behavior translate into directional and curvature changes, even when you can't see the full ball flight.
  • Accessible tech stack: build data-rich programs using phones your players already own, rather than needing a dedicated room and budget for every team.

Over time, we see this blending with everything else Penguin does: stereo vision for 3D motion, edge compute for real-time feedback, and long-term tracking so you can see how changes in mechanics influence simulated (and real) ball flight over seasons, not just sessions.

Where we're experimenting next

We're still in the experimental phase of this two-phone simulator journey, and there's a lot left to explore:

  • Better spin estimation: using more of the early ball flight and club–ball interaction to sharpen our spin classification and magnitude estimates.
  • Adaptive calibration: quietly refining camera understanding over many swings instead of relying on a single calibration moment.
  • Interactive visuals: letting coaches orbit the camera, overlay paths, and compare simulated flights between swings or players.

The constant through all of it is the same theme: take solid geometry and physics, pair them with careful product design, and build experiences that feel closer to magic — while still being grounded in math you can explain on a whiteboard.