
Introducing: The crowdsourced 3D world reality model (let’s make sure we are ready for it!)

For those of you who are semi-regular readers of this blog, you know that I have been talking for several years about the exciting convergence of low cost reality capture technologies (active or passive), compute solutions (GPU or cloud), and new processing algorithms (KinFu to VSLAM). I am excited about how this convergence has transformed the way reality is captured, interacted with (AR), and even reproduced (remaining digital, turned into something physical, or a hybrid of both). I ultimately believe we are on the path towards the creation and continuous update of a 3D “world model” populated with data coming from various types of consumer, professional and industrial sensors. This excitement is only mildly tempered by the compelling legal, policy and perhaps even national security implications that have yet to be addressed or resolved.

My first exposure to reality capture hardware and reconstruction tools came in the late 90s when I was at Bentley Systems and we struck up a partnership with Cyra Technologies (prior to their acquisition by Leica Geosystems). I ultimately negotiated an agreement for Cyra's CloudWorx toolset to be distributed within MicroStation, which we announced in late 2002. I remember that Greg Bentley (the CEO of Bentley Systems) strongly believed that reality capture was going to be transformative for the AEC ecosystem. As can be seen from their continuing investments in this space, he must still believe it, and it is paying dividends for Bentley customers (active imaging systems, photogrammetric reconstructions, and everything in between)!

Fast forward to circa 2007, when Microsoft announced the first incarnation of Photosynth to the world at TED 2007 (approx. 2:30 min mark). Photosynth stitched together multiple 2D photos and related them spatially (by back-computing the camera positions of the individual shots and then organizing them in 3D space). Blaise Aguera y Arcas (then at Microsoft, now leading Machine Intelligence at Google) showed a point cloud of Notre-Dame Cathedral (approx. 3:40 min mark) generated computationally from photos downloaded from Flickr. One of the "by-products" of Photosynth was the ability to create 3D point clouds of real-world objects. Of course, photogrammetric reconstruction techniques (2D photo to 3D) have been known for a long time – but this was an illustration of a cloud-based service, working at scale, enabling computational 3D reconstruction using photos provided by many. This was 11 years ago. It was stupefying to me. I immediately started looking at all of the hacks to extract point clouds from the Photosynth service. In 2014, an expanded version, Photosynth 3D, was launched, but it never achieved critical mass. Even though Photosynth was ultimately shut down in early 2017, it was bleeding edge, and it was amazing.
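To make that photogrammetric idea concrete, here is a minimal two-view structure-from-motion sketch using OpenCV: match features between two photos, recover the relative camera pose from the essential matrix, and triangulate a sparse point cloud. The image paths and camera intrinsics below are placeholder assumptions, and a service like Photosynth layered many more views, bundle adjustment and dense reconstruction on top of this basic step.

```python
import cv2
import numpy as np

# Two hypothetical photos of the same facade (placeholder paths).
img1 = cv2.imread("facade_a.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("facade_b.jpg", cv2.IMREAD_GRAYSCALE)

# Detect and match ORB features between the two views.
orb = cv2.ORB_create(5000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
matches = sorted(matches, key=lambda m: m.distance)[:500]
pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Assumed pinhole intrinsics (focal length and principal point) -- real
# pipelines calibrate these or pull them from EXIF metadata.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

# "Back compute" the relative camera pose from the essential matrix.
E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

# Triangulate the matches into a sparse 3D point cloud.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
points3d = (pts4d[:3] / pts4d[3]).T
print(points3d.shape)   # N x 3 sparse reconstruction
```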

It was likewise exciting (to a geek like me) when I was at Geomagic and the first hacks of the Microsoft Kinect (powered by the PrimeSense stack) began appearing in late 2010, and particularly when Microsoft Research published their KinectFusion paper (describing algorithms for dense, real-time scene reconstruction using depth sensors). While there is no doubt that much of this work stood on the shoulders of giants (years of structure-from-motion and SLAM research), the thought that room-sized spaces could be reconstructed in real time using a handheld depth sensor was groundbreaking. This was happening alongside the parallel rise of cheap desktop (and mobile) "supercomputer"-like GPU compute solutions. I knew the reality capture ecosystem had changed forever.
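The heart of KinectFusion is a truncated signed distance function (TSDF) volume into which each incoming depth frame is fused. Below is a rough, CPU-only NumPy sketch of that fusion step under assumed grid dimensions and depth-camera intrinsics; it leaves out the ICP-based camera tracking and the GPU implementation that make the real system run in real time.

```python
import numpy as np

GRID = 128                 # 128^3 voxel volume (~2.5 m cube)
VOXEL = 0.02               # 2 cm voxels
TRUNC = 0.08               # truncation distance in metres
fx, fy, cx, cy = 525.0, 525.0, 319.5, 239.5   # assumed depth intrinsics

tsdf = np.ones((GRID, GRID, GRID), dtype=np.float32)
weight = np.zeros_like(tsdf)

def integrate(depth, cam_to_world):
    """Fuse one HxW depth image (in metres) given a 4x4 camera-to-world pose."""
    # Voxel centres in world coordinates (homogeneous).
    i, j, k = np.indices((GRID, GRID, GRID))
    world = np.stack([(i + 0.5) * VOXEL, (j + 0.5) * VOXEL,
                      (k + 0.5) * VOXEL, np.ones_like(i, dtype=np.float64)],
                     axis=-1).reshape(-1, 4)
    # Project every voxel into the depth image.
    cam = world @ np.linalg.inv(cam_to_world).T
    x, y, z = cam[:, 0], cam[:, 1], cam[:, 2]
    u = np.round(x * fx / np.maximum(z, 1e-6) + cx).astype(int)
    v = np.round(y * fy / np.maximum(z, 1e-6) + cy).astype(int)
    ok = (z > 0) & (u >= 0) & (u < depth.shape[1]) & (v >= 0) & (v < depth.shape[0])
    d = np.where(ok, depth[np.clip(v, 0, depth.shape[0] - 1),
                           np.clip(u, 0, depth.shape[1] - 1)], 0.0)
    # Truncated signed distance along the viewing ray, then a per-voxel
    # weighted running average (the update rule used in the KinectFusion paper).
    sdf = d - z
    valid = (ok & (d > 0) & (sdf > -TRUNC)).reshape(GRID, GRID, GRID)
    new = np.clip(sdf / TRUNC, -1.0, 1.0).reshape(GRID, GRID, GRID)
    w = weight[valid]
    tsdf[valid] = (tsdf[valid] * w + new[valid]) / (w + 1.0)
    weight[valid] = w + 1.0

# Usage: call integrate(depth_frame_metres, np.eye(4)) for each tracked frame;
# a surface can then be extracted from `tsdf` with marching cubes.
```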

There has been a ton of progress on the mobile handset side as well, leveraging primarily "passive" sensor fusion (accelerometer plus computer vision techniques). Both Apple and Google have exposed development platforms (ARKit and ARCore, respectively, both now released) to accelerate the creation of reality interaction solutions. I have previously written about how the release of the iPhone X widely exposed an active scanning solution to millions of users in a mobile handset. Time will tell how that tech is leveraged.
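For a flavor of what the inertial half of that passive sensor fusion looks like, here is a toy complementary filter that blends gyroscope and accelerometer readings into a single tilt estimate. ARKit and ARCore go much further, fusing signals like this with camera feature tracking (visual-inertial odometry); the sample rate, blend factor and sensor values below are made up for illustration.

```python
import math

DT = 0.01       # assumed 100 Hz IMU sample rate
ALPHA = 0.98    # trust the gyro short-term, the accelerometer long-term

def update_pitch(pitch, gyro_rate_y, accel_x, accel_z):
    """pitch in radians; gyro rate in rad/s; accelerations in m/s^2."""
    gyro_pitch = pitch + gyro_rate_y * DT            # integrate angular rate
    accel_pitch = math.atan2(-accel_x, accel_z)      # tilt from the gravity vector
    return ALPHA * gyro_pitch + (1.0 - ALPHA) * accel_pitch

# Made-up samples: a handset slowly tipping forward for one second.
pitch = 0.0
for _ in range(100):
    pitch = update_pitch(pitch, gyro_rate_y=0.05, accel_x=-0.5, accel_z=9.8)
print(round(pitch, 3))   # roughly 0.05 rad of pitch
```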

I have long been interested in the crowdsourced potential that various sensor platforms (mobile handsets, "traditional" DSLRs, UAVs, autonomous vehicles) will unlock. It was exciting to see the work done by Mapillary in building a crowdsourced, photo-based model of the world (leveraging Mapbox and OpenStreetMap data). Mapbox itself recently announced an impressive AR toolkit and platform called Mapbox AR, which provides developers with access to live location data from 300 million monthly users combined with geotagged information from 125 million locations, along with 3D DTM models and satellite imagery at various resolutions.

I was therefore intrigued to read about 6D.ai (not much there on the website), which is emerging from Oxford's Active Vision Lab. 6D.ai is building a reality-mesh platform for mobile devices leveraging ARCore and ARKit. Their solution will provide the necessary spatial context for AR applications: it will create and store 3D reconstructions generated as a background process, which will then be uploaded and merged with other contributions to fill out a crowdsourced reconstruction of spaces. My guess a few years ago was that this type of platform for near-scale reconstructions would be built on depth data from passive capture solutions (e.g. light field cameras) rather than on 2D imagery, but it absolutely makes sense that for certain workflows this is the path forward – in particular when leveraging the reconstruction frameworks exposed in each of the respective handset AR toolkits.
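As a rough sketch of what "merged with other contributions" can look like at the point-cloud level (not 6D.ai's actual pipeline, which has not been published), here is a minimal align-and-merge step using Open3D. The file names and the identity initial transform are placeholders; a real service would seed the alignment with GPS/compass data or a global place-recognition match before refining with ICP.

```python
import numpy as np
import open3d as o3d

# Existing crowdsourced tile and a newly uploaded local reconstruction
# (both placeholder file names).
world = o3d.io.read_point_cloud("crowd_model_tile.ply")
contrib = o3d.io.read_point_cloud("new_contribution.ply")

# Point-to-plane ICP needs surface normals (estimated here for both clouds).
for pcd in (world, contrib):
    pcd.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))

# Refine a coarse prior (identity here, purely for illustration) with ICP.
result = o3d.pipelines.registration.registration_icp(
    contrib, world,
    max_correspondence_distance=0.05,
    init=np.eye(4),
    estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPlane())

# Transform the contribution into the shared frame, merge, and thin the overlap.
merged = world + contrib.transform(result.transformation)
merged = merged.voxel_down_sample(voxel_size=0.02)
o3d.io.write_point_cloud("crowd_model_tile_updated.ply", merged)
```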

It will be incredibly exciting to watch the continuing progress that 6D.ai and others make in capturing reality data as a necessary predicate for AR applications of all sorts. We are consuming all types of reality data to create rich information products at Allvision, a new company that I have co-founded along with Elmer Bol, Ryan Frenz and Aaron Morris. This team knows a little bit about reality data; more on that in the coming weeks and months.

The era of the crowdsourced 3D world model is truly upon us – let’s make sure we are ready for it!

Apple’s iPhone X – Bringing PrimeSense 3D Scanning Technology to the Masses

Way back in 2013 (it feels way back given how fast the market continues to move on reality capture hardware and software, AR/VR applications, etc.), I blogged about Apple's acquisition of PrimeSense and what it meant for the potential future of low cost 3D capture devices. At the time of the acquisition, PrimeSense technology was being incorporated into a host of low cost (and admittedly relatively low accuracy) 3D capture devices, almost all leveraging the Microsoft Research KinectFusion algorithms developed against the original Microsoft Kinect (which was itself based on PrimeSense tech).

I, and many others, have wondered when the PrimeSense technology would see the light of day. After many rumored uses (gesture control for Apple TV, among others), the PrimeSense tech pipeline has emerged as the core technology behind the 3D face recognition that replaces the fingerprint reader on the iPhone X. Apple has branded the PrimeSense module as the "TrueDepth" camera.

It would surprise me if there weren't work already underway to use the PrimeSense technology in the iPhone X as a general-purpose 3D scanner of objects, ultimately enabled by/through Apple's ARKit. Others, like those at Apple Insider, have come to the same conclusion. As one example, the TrueDepth camera could be used to capture higher quality models of objects to be placed within the scenes that ARKit can otherwise detect and map to (surfaces, etc.). In another, the TrueDepth camera, combined with data from the onboard sensor package, known SLAM implementations, and cloud processing, could turn the iPhone X into a mapping and large scene capture device, as well as helping it localize itself in environments where it currently struggles (e.g. a relatively featureless space). The challenge with all active sensing technologies (the Apple "TrueDepth" camera, the Intel RealSense camera, or the host of commercial data acquisition devices that are available) is that they are relatively power hungry, and therefore a poor fit for a small form factor, mobile sensing device (that, oh yeah, also needs to be a phone and have long battery life).
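To illustrate the basic step behind treating a depth camera as a scanner, here is a minimal back-projection of a depth frame into a camera-space point cloud using pinhole intrinsics. The intrinsic values and the synthetic frame are placeholders, not figures from Apple's hardware; a full scanning pipeline would also track the device pose frame to frame and fuse the resulting clouds.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an HxW depth image (metres) into Nx3 camera-space points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]                        # drop pixels with no depth return

# Synthetic frame standing in for a sensor capture: a flat wall 1.5 m away.
frame = np.full((480, 640), 1.5)
cloud = depth_to_points(frame, fx=590.0, fy=590.0, cx=320.0, cy=240.0)
print(cloud.shape)   # (307200, 3)
```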

Are we at the point where new mobile sensor packages (whether consumer or professional), coupled with new algorithms, fast(er) data transmission and cloud based GPU compute solutions, will create the platform to enable crowdsourced 3D capture of the world (a Mapillary for the 3D world)? The potential applications working against such a dataset are virtually limitless (and truly exciting!).

MapBox, Geo Software Platform, Maps $10M from Foundry Group

It is great to see continuing venture capital and public market interest in areas such as data acquisition, unmanned aerial systems, manufacturing, AEC and GIS solutions providers.

MapBox (@MapBox) announced yesterday that it had taken a Series A investment of $10M from Foundry Group (@FoundryGroup). After three years of bootstrapping the MapBox business, in the words of Eric Gundersen (@ericg), the funding "lets us plan for years of building the future of geo software, from the ground up."

MapBox is a cloud-based platform that allows developers to embed geo-rich content into their web and mobile offerings. MapBox sources its mapping data from OpenStreetMap, keeping its operating costs low and avoiding ties to proprietary back-end mapping databases. It will be interesting to see how MapBox navigates the GIS/geo software playing field over the coming years – but more developer choice, built on crowd-sourced mapping data, could be quite transformational indeed.

Foundry Group continues its string of investments in the technical solutions space. They were part of a group that invested $30M into Chris Anderson's (@chr1sa) unmanned aerial systems company 3D Robotics (@3DRobotics) a few weeks ago, which I blogged about here, and were also investors in MakerBot (@Makerbot), which was acquired by the 3D printing company Stratasys (@Stratasys) in mid-August 2013 for $403M (plus up to $201M in earn-outs). Seth Levine (@sether) explained some of Foundry Group's rationale for the MapBox investment here.

Foundry Group is currently also an investor in Occipital (@Occipital), which recently developed the Structure Sensor, a 3D capture device that connects to an iPad. Occipital currently has a Kickstarter campaign going for the Structure Sensor, and as of today they are only a few thousand dollars shy of the $1M mark. In June 2013 Occipital acquired ManCTL, adding to an already deep computer vision bench a strong team – one with the chops to do real-time 3D scene reconstruction from PrimeSense-powered devices (a/k/a the Microsoft Kinect). Foundry Group put $8M into Occipital in August of 2011.

I am very excited to ultimately see what comes from both MapBox and Occipital!

It will be interesting to see whether Andreessen Horowitz (@a16z) looks for a big data, geo-centric sector investment as well.