Real estate data firm HARNESS has launched a tool to help the commercial property industry collect data from PDF investment brochures, tenancy schedules, and marketing brochures.



The machine learning algorithm behind the platform, HARNESS PDF Extractor (PDFx), has been trained to extract data from complex forms and tables in tenancy schedules and is attuned to the language used by the property industry to locate data.

It can extract data from 1,200 PDFs in the time it takes a human to enter data from one. The software matches extracted addresses and cross-references them against Unique Property Reference Numbers.

“PDF data extraction has been a long-standing bugbear of the industry, and our solution utilises the best in machine learning to considerably reduce the friction of data use, whilst saving time and money for users,” said Ben Mein, chief executive of HARNESS Property Intelligence.

“Most importantly, the self-service platform allows for a client to instantly access and unlock valuable data from investment brochures at the point of need to enhance their revenue potential