IGF 2020 WS #175 OCR engine for data rescue in various fields


Organizer 1: Private Sector, African Group
Organizer 2: Private Sector, African Group
Organizer 3: Private Sector, African Group

Speaker 1: Chomora Mikeka, Private Sector, African Group
Speaker 2: Chomora Mikeka, Private Sector, African Group
Speaker 3: Chomora Mikeka, Private Sector, African Group
Speaker 4: Chomora Mikeka, Private Sector, African Group


Tutorial - Classroom - 30 Min

Policy Question(s)

This workshop addresses two policy questions.

1. 5) Data access, quality, interoperability, competition & innovation. Topics: data concentration, data trusts/pools, data quality, technical standards, interoperability, open data, data portability, competition, innovation. Workshop focus: innovative methods for rescuing data that would otherwise be lost (handwritten on paper) by converting it to digital format for archiving and analysis, using an OCR engine that the workshop speakers developed with machine learning algorithms. Examples will cover the rescue and digitization of weather data in Malawi for the past two decades, since the year 2000.

2. 3) Data-driven emerging technologies. Topics: artificial intelligence, IoT, algorithms, facial recognition, blockchain, automated decision making, machine learning, data for good. Workshop focus: demonstrating how emerging digital technologies can be used to generate data that improves transport systems (a paper is attached).

Most data, especially in developing countries, is paper-based and is often lost over time to climatic disasters, damage by mice, theft, or general lack of care in handling paper files. Nevertheless, such data is critically important for time series analysis and trend forecasting, which generate foresight data to support decision making in fields such as health and agriculture and in addressing the SDGs, for example in estimating food baskets and planning interventions. Such data is also important for transport modernization in Africa and globally: corridor management, trade, traffic decongestion, and mass transportation, for example, can all draw on data-driven emerging digital technologies.


GOAL 9: Industry, Innovation and Infrastructure
GOAL 13: Climate Action


1. Data description and formats in various fields
2. Data flow and data pipeline algorithms for rescue
3. Data rescue examples using an OCR engine built on machine learning
4. Digital emerging technologies, including but not limited to the 5G-IoT specification
5. Digital data usage for transport systems innovation

Items 1 to 5 are planned to fill Monday to Friday (a five-day workshop) but could be shortened to two or three days. Workshop slides, exercises, and presentation tasks will be used to increase participation. A link to ITU's 5G/ML challenge, part of which also wrestles with data, will be introduced. A debate will be deliberately set up to unearth underlying issues about data challenges, and from it an issues paper manuscript could be developed.
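As an illustration of what agenda items 2 and 3 might involve, the following is a minimal sketch of a post-OCR cleanup step for daily weather rows. It is not the speakers' OCR engine. The row layout (date, maximum temperature, minimum temperature, rainfall), the character-confusion table, the plausibility limits, and the function names `parse_row`, `rescue`, and `to_csv` are all assumptions made for this example.

```python
import csv
import io
from datetime import date

# Characters assumed (for illustration) to be commonly misread in place of digits.
CONFUSIONS = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5", "B": "8"})

# Hypothetical plausibility limits; real limits would come from the data owner.
LIMITS = {"tmax_c": (-10.0, 50.0), "tmin_c": (-10.0, 50.0), "rain_mm": (0.0, 500.0)}


def parse_row(line):
    """Parse one OCR'd line 'date tmax tmin rain'; return a record or None."""
    fields = line.translate(CONFUSIONS).split()
    if len(fields) != 4:
        return None
    try:
        day = date.fromisoformat(fields[0])
        values = [float(f) for f in fields[1:]]
    except ValueError:
        return None
    record = {"date": day.isoformat()}
    for (name, (low, high)), value in zip(LIMITS.items(), values):
        if not low <= value <= high:
            return None  # implausible value: leave the row for manual review
        record[name] = value
    return record


def rescue(lines):
    """Split OCR'd lines into parsed records and rows needing manual review."""
    good, rejected = [], []
    for line in lines:
        record = parse_row(line)
        if record is None:
            rejected.append(line)
        else:
            good.append(record)
    return good, rejected


def to_csv(records):
    """Serialize parsed records as CSV text for digital archiving."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["date", *LIMITS])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        "2OO3-O1-l5 3l.5 l8.2 O.0",   # digit confusions, recoverable
        "2003-01-16 29.0 17.5 12.4",  # clean row
        "2003-01-17 92.0 17.0 0.0",   # implausible maximum temperature
        "smudged row",                # unreadable
    ]
    good, rejected = rescue(sample)
    print(to_csv(good))
    print("needs manual review:", rejected)
```

Rows that fail parsing or fall outside the limits are kept in their original form rather than discarded, so a person can check them against the paper record.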

Expected Outcomes

1. Publication
2. Follow-up events
3. Opportunities to train other countries in data rescue, or to take on consultancies in data rescue or in innovative transport system design using digital emerging technologies

The organizers are university professors and a research associate with extensive international exposure. At a minimum, they will employ pedagogical (tutor-student) and andragogical (collegial, among peers) instruction techniques, with learner-centered approaches being key.

Relevance to Internet Governance: Data is central to the IGF globally. The Internet generates huge volumes of data but is also ever hungry for new data. We aspire to work on protocols and engines that generate correct data for the Internet to help in policy formulation and decision making.

Relevance to Theme: The session will focus on data rescue, data digitization, and abstraction of digital data to bring about transport system innovation, which in turn improves livelihoods across sectors: health, education, agriculture, trade, and security, to name but a few.

Online Participation


Usage of IGF Official Tool.