Insert Data from Picture Generally Available on the iPhone Excel Application

3 June 2019


Insert Data from Picture is now Generally Available to iPhone users.  Both the Android version of the app (available since March) and the new iPhone version support no fewer than 21 languages.

This tabular recognition feature for Excel combines advanced Optical Character Recognition (OCR) technology and machine learning models to transform paper-based information into digital data.  Insert Data from Picture brings printed, tabular data directly into Excel, where you can perform various kinds of analysis that are time-consuming or even impossible with pen and paper.  

Simply open the Excel app on your phone, snap a picture of your paper-based data table, crop and review the image, and you’re done—no technical skills required.  The data table is automatically embedded and ready for analysis in Excel.

Going from analogue to digital frees up time for more important tasks and is more secure.  In addition, once a printed copy is converted, the digital records are easier to organise and search, which eliminates paper waste and reduces the need for physical storage space.

Here are just a few examples of how users can benefit from Insert Data from Picture:

  • Consolidate dozens, hundreds, or even thousands of rows of paper-based data in a flash—all without a single pencil mark
  • Create illustrative charts and graphs to summarise information that was extremely difficult to communicate before
  • Use Ideas in Excel to surface new trends and dependencies you might have missed when your data only existed on paper
  • Easily archive data documents for future reference and compliance purposes

To enable seamless data extraction from an image, Insert Data from Picture reuses many of the same OCR and Layout technologies previously released for Word (e.g. the PDF Reflow feature), Office Lens, and Seeing AI.

The image is analysed to detect the main building blocks of the document, like text and graphical elements (e.g. table borders).  For this, Microsoft enhanced their OCR engine to handle images with scattered text, which is often the case in tabular data, and leveraged various image processing techniques to detect graphical elements.

Once the image is decomposed into main building blocks, Insert Data from Picture starts inferring the layout of the table.  The most important part is detecting the grid of the table, which is done by generating grid candidates from horizontal and vertical lines (for bordered tables) and empty spaces between text (for borderless tables).  After all the candidates are generated, the feature uses a combination of various heuristics and machine learning models to filter false positives and produce the final grid that will be reconstructed in the output.  Producing that final grid relies on the analysis of each cell to build out other structures like paragraphs, font properties, and lists.
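
To make the borderless-table case more concrete, here is a minimal Python sketch of how grid candidates might be generated from the empty spaces between text.  It is an illustration only, not Microsoft's implementation: the function names, the bounding-box format, and the `min_gap` threshold are all assumptions, and the filtering step with heuristics and machine learning models is not shown.

```python
from typing import List, Tuple

# Assumed input: one (left, top, right, bottom) box per text fragment from OCR.
Box = Tuple[float, float, float, float]


def gap_candidates(intervals: List[Tuple[float, float]], min_gap: float) -> List[float]:
    """Return the midpoint of every empty gap of at least min_gap between intervals."""
    candidates: List[float] = []
    covered_until = None  # right-most coordinate covered by text so far
    for start, end in sorted(intervals):
        if covered_until is not None and start - covered_until >= min_gap:
            candidates.append((covered_until + start) / 2)
        covered_until = end if covered_until is None else max(covered_until, end)
    return candidates


def grid_candidates(boxes: List[Box], min_gap: float = 5.0):
    """Project the text boxes onto each axis and look for empty bands."""
    columns = gap_candidates([(left, right) for left, _, right, _ in boxes], min_gap)
    rows = gap_candidates([(top, bottom) for _, top, _, bottom in boxes], min_gap)
    return columns, rows


# Hypothetical boxes for a 2-row, 3-column borderless table.
boxes = [
    (10, 10, 40, 20), (60, 10, 95, 20), (120, 10, 150, 20),
    (10, 30, 35, 40), (60, 30, 90, 40), (120, 30, 155, 40),
]
columns, rows = grid_candidates(boxes)
print(columns)  # [50.0, 107.5]
print(rows)     # [25.0]
```

In this sketch every sufficiently wide empty band becomes a candidate separator.  A real table with wrapped text or uneven spacing would yield false positives, which is where the filtering described above comes in.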

If you have used OCR-based features before, you know they don’t always get everything right.  Insert Data from Picture takes special care to highlight potential errors, so you can focus on individual entries rather than rechecking the whole table.  The good news is that this is an AI-powered feature designed for continuous improvement, meaning that data accuracy will increase over time.  To achieve this, Microsoft leveraged a collection of machine learning models, where each model detects a specific case of misrepresented content (e.g. missed or added characters).
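
The idea of a collection of detectors, each responsible for one kind of error, can be sketched in Python as follows.  This is a hedged illustration: simple hand-written rules stand in for the machine learning models, and the `Cell` type, the detector names, and the 0.8 confidence threshold are hypothetical rather than anything the feature is known to use.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Cell:
    row: int
    col: int
    text: str
    confidence: float  # hypothetical OCR confidence in [0, 1]


# Each detector checks for one specific problem and returns a reason, or None.
Detector = Callable[[Cell], Optional[str]]


def low_confidence(cell: Cell) -> Optional[str]:
    return "low OCR confidence" if cell.confidence < 0.8 else None


def letter_between_digits(cell: Cell) -> Optional[str]:
    # Catches strings such as "1O5", where a letter sits between two digits.
    return "letter between digits" if re.search(r"\d[A-Za-z]\d", cell.text) else None


DETECTORS: List[Detector] = [low_confidence, letter_between_digits]


def flag_cells(cells: List[Cell]) -> Dict[Tuple[int, int], List[str]]:
    """Run every detector on every cell and collect the reasons for review."""
    flagged: Dict[Tuple[int, int], List[str]] = {}
    for cell in cells:
        reasons = [reason for reason in (d(cell) for d in DETECTORS) if reason]
        if reasons:
            flagged[(cell.row, cell.col)] = reasons
    return flagged


cells = [
    Cell(0, 0, "Total", 0.98),
    Cell(0, 1, "1O5", 0.91),
    Cell(1, 0, "Rent", 0.55),
]
flagged = flag_cells(cells)
print(flagged)
# {(0, 1): ['letter between digits'], (1, 0): ['low OCR confidence']}
```

Keeping each detector separate means a new error case can be covered by adding one more entry to the list, without touching the others.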

You can download Excel for your phone and start using these apps today.

 
